2010 05 hands_on
![Page 1: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/1.jpg)
HandsOn
![Page 2: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/2.jpg)
QUARRY / VAMPIRSERVER
Using PuTTY for port forwarding
![Page 3: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/3.jpg)
Start VM
• If not done already – start the virtual machine:
1. Start → All Programs → MS Virtual PC
2. Next → Add an existing VM → Next
3. Browse → Select Windows XP VM
4. Next → Finish
5. Start Windows XP VM
• Full-screen mode is found under the Action menu entry
• The VM should adjust its resolution automatically
The hands-on is done entirely in the VM
![Page 4: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/4.jpg)
Start PuTTY
Select Quarry, click Load and then Open
![Page 5: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/5.jpg)
Login to compute-node
• There is a script in your home-folder that should connect you to the correct node:
% ./logon_to_compute_node
![Page 6: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/6.jpg)
Start VampirServer
% vampirserver
Once connected, type in “vampirserver”
![Page 7: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/7.jpg)
Open Second PuTTY
Select Quarry, click Load, but do NOT click Open
![Page 8: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/8.jpg)
Port forwarding
On the left, select SSH and then Tunnels
![Page 9: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/9.jpg)
Port forwarding
Source port: 30000
Destination: see vampirserver
Click Add
![Page 10: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/10.jpg)
Port forwarding
Click Open and then:
% ./logon_to_compute_node
This terminal can be used normally for compile and run commands
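For readers without PuTTY, the same tunnel can be sketched with the OpenSSH client. The helper below only prints the command and does not run it. The node name, server port and login host are placeholders (assumptions); the real node and port come from the vampirserver terminal.

```shell
#!/bin/sh
# Print an OpenSSH command equivalent to the PuTTY tunnel described above.
# All four arguments are placeholders; take the real node and port
# from the output of vampirserver.
tunnel_cmd() {
    local_port=$1    # port Vampir connects to on 127.0.0.1 (30000 on the slides)
    node=$2          # compute node running VampirServer (hypothetical name)
    server_port=$3   # port reported by vampirserver (hypothetical value)
    login=$4         # user@login-host (hypothetical)
    printf 'ssh -L %s:%s:%s %s\n' "$local_port" "$node" "$server_port" "$login"
}

tunnel_cmd 30000 node042 30055 user@login.example.org
```

With such a tunnel open, Vampir would connect to 127.0.0.1 on the local port, exactly as with the PuTTY setup.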
![Page 11: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/11.jpg)
Vampir Remote Open
Open GUI, click on File, then click on Remote Open
![Page 12: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/12.jpg)
Vampir Remote Open
Server: 127.0.0.1
Port: 30000
Click on Connect
![Page 13: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/13.jpg)
Vampir Remote Open
To avoid having to wait for all user-home folders to be loaded, add the path manually.
Path: /N/u/hpstrn##
##: 01-15 (see vampirserver terminal for your specific username)
![Page 14: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/14.jpg)
Vampir Remote Open
In your Home under Quarry/traces/p_8, click on Semtex_original_8cpu and then Open
![Page 15: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/15.jpg)
Vampir 7 GUI
Take time to get acquainted with the different displays and the options each display offers
![Page 16: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/16.jpg)
Use of VampirTrace
• Instrument your application with VampirTrace
– Edit your Makefile and change the underlying compiler
# original compilers
CC = icc
CXX = icpc
F77 = ifort
F90 = ifort
MPICC = icc
MPIF90 = ifort
# VampirTrace compiler wrappers
CC = vtcc
CXX = vtcxx
F77 = vtf77
F90 = vtf90
MPICC = vtcc
MPIF90 = vtf90
– Tell VampirTrace the parallelization type of your application
-vt:<seq|mpi|mt|hyb>
# seq = sequential
# mpi = parallel (uses MPI)
# mt = parallel (uses OpenMP/POSIX threads)
# hyb = hybrid parallel (MPI+Threads)
– Optional: Choose instrumentation type for your application
-vt:inst <gnu|pgi|sun|xl|ftrace|openuh|manual|dyninst>
# DEFAULT: automatic instrumentation by compiler
# manual: manual by using VT’s API (see manual)
# dyninst: binary instrumentation using Dyninst
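Before editing the Makefile it can help to confirm that the wrappers are reachable at all. The following is a minimal sketch, assuming the wrappers are expected on PATH under the names used on this slide; the function name check_wrappers is made up for illustration.

```shell
#!/bin/sh
# Report which VampirTrace compiler wrappers are reachable via PATH.
# The wrapper names are the ones listed on the slide.
check_wrappers() {
    for w in vtcc vtcxx vtf77 vtf90; do
        if command -v "$w" >/dev/null 2>&1; then
            echo "$w: found"
        else
            echo "$w: missing"
        fi
    done
}

check_wrappers
```

A "missing" line usually means the VampirTrace environment has not been loaded in the current shell.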
![Page 17: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/17.jpg)
HANDS-ON EXERCISE
Getting to know the GUI
![Page 18: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/18.jpg)
Hands-on: The Ping-Pong Example
• Hands-on: The Ping-Pong example with VampirTrace and Vampir
– Go to the ping_pong.c example program
%> cd ./examples/ping_pong
– Compile and run with pristine version
• Always check that target application compiles and runs without errors
%> mpicc -g -O3 ping_pong.c -o ping_pong
%> mpirun -np 2 ./ping_pong
– Compile with VampirTrace compiler wrapper
%> vtcc -vt:cc mpicc -g -O3 ping_pong.c -o ping_pong
– Run normally
%> mpirun -np 2 ./ping_pong
![Page 19: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/19.jpg)
Hands-on: The Ping-Pong Example
– After trace run, there are additional output files in the working directory:
2.2K ping_pong.0.def.z
29 ping_pong.0.marker.z
954 ping_pong.1.events.z
935 ping_pong.2.events.z
12 ping_pong.otf
– The event trace in Open Trace Format (OTF)
• Anchor file *.otf
• Definitions in *.def.z
• Events in *.events.z, one per process/rank/thread by default
• Markers in *.marker.z for advanced usage
– Open *.otf with Vampir
– Command line tools to access or modify OTF traces
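A quick sanity check after a trace run is to count the event files that belong to one trace. The sketch below assumes the naming scheme shown in the listing above (prefix.N.events.z); the function name and the stand-in files are made up for illustration.

```shell
#!/bin/sh
# Count the per-process event files that belong to one OTF trace.
# $1 is the trace prefix, i.e. the anchor file name without ".otf".
count_event_files() {
    n=0
    for f in "$1".*.events.z; do
        # An unmatched glob stays literal, so test for existence.
        [ -e "$f" ] && n=$((n + 1))
    done
    echo "$n"
}

# Demonstration on empty stand-in files that mimic the ping-pong listing.
demo=$(mktemp -d)
touch "$demo/ping_pong.otf" "$demo/ping_pong.0.def.z" \
      "$demo/ping_pong.1.events.z" "$demo/ping_pong.2.events.z"
count_event_files "$demo/ping_pong"
rm -rf "$demo"
```

For the two-rank ping-pong run the count should match the number of MPI processes.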
![Page 20: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/20.jpg)
Hands-on: The Ping-Pong Example
Timeline and Profile: time mostly spent in VT init and MPI finish
Time interval indicator: entire time shown
![Page 21: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/21.jpg)
Hands-on: The Ping-Pong Example
zoomed to the actual activity
![Page 22: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/22.jpg)
Hands-on: The Ping-Pong Example
further zoomed, ping-pong messages become visible
MPI time still dominating!
average message bandwidth
![Page 23: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/23.jpg)
Hands-on: The Ping-Pong Example
zoomed to single message pair
different behavior on both ranks
details for selected second message
![Page 24: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/24.jpg)
VAMPIR / VAMPIRTRACE HANDS-ON EXERCISE
Guided Exercise with NPB 3.3 BT-MPI
Center for Information Services and High Performance Computing (ZIH)
![Page 25: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/25.jpg)
Hands-on: NPB 3.3 BT-MPI
– Move into tutorial directory in your home directory
% cd NPB3.3-MPI
– Select the VampirTrace compiler wrappers
% vim config/make.def
-> comment out line 32, resulting in:
...
32: #MPIF77 = mpif77
...
-> remove the comment from line 38, resulting in:
...
38: MPIF77 = vtf77 -vt:f77 mpif77
...
-> comment out line 88, resulting in:
...
88: #MPICC = mpicc
...
-> remove the comment from line 94, resulting in:
...
94: MPICC = vtcc -vt:cc mpicc
...
![Page 26: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/26.jpg)
Hands-on: NPB 3.3 BT-MPI
• Build benchmark
% make clean; make suite
• Launch as MPI application
% cd bin.vampir; export VT_FILE_PREFIX=bt_1_initial
% mpiexec -np 16 bt_W.16
NAS Parallel Benchmarks 3.3 -- BT Benchmark
Size: 24x 24x 24
Iterations: 200 dt: 0.0008000
Number of active processes: 16
Time step 1
...
Time step 180
[0]VampirTrace: Maximum number of buffer flushes reached \
(VT_MAX_FLUSHES=1)
[0]VampirTrace: Tracing switched off permanently
Time step 200
...
![Page 27: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/27.jpg)
Hands-on: NPB 3.3 BT-MPI
• Resulting trace files
% ls -alh
4,1M bt_1_initial.16
4,9K bt_1_initial.16.0.def.z
29 bt_1_initial.16.0.marker.z
12M bt_1_initial.16.10.events.z
12M bt_1_initial.16.1.events.z
11M bt_1_initial.16.2.events.z
12M bt_1_initial.16.3.events.z
...
11M bt_1_initial.16.c.events.z
12M bt_1_initial.16.d.events.z
12M bt_1_initial.16.e.events.z
12M bt_1_initial.16.f.events.z
66 bt_1_initial.16.otf
• Visualization with Vampir7
![Page 28: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/28.jpg)
Hands-on: NPB 3.3 BT-MPI
![Page 29: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/29.jpg)
Hands-on: NPB 3.3 BT-MPI
![Page 30: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/30.jpg)
Hands-on: NPB 3.3 BT-MPI
• Decrease number of buffer flushes by increasing the buffer size
• Set a new file prefix
• Launch as MPI application
% export VT_FILE_PREFIX=bt_2_buffer_120M
% export VT_MAX_FLUSHES=1 VT_BUFFER_SIZE=120M
% mpiexec -np 16 bt_W.16
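Larger buffers cost memory on every rank. A back-of-the-envelope estimate, assuming one buffer of VT_BUFFER_SIZE per MPI process (an assumption; check the VampirTrace manual for the exact allocation rules), can be sketched as follows. The function name is made up for illustration.

```shell
#!/bin/sh
# Rough estimate of the total trace-buffer memory for one run.
# Assumption: one buffer of VT_BUFFER_SIZE per MPI process.
buffer_total_mb() {
    per_process_mb=$1   # e.g. 120 for VT_BUFFER_SIZE=120M
    processes=$2        # e.g. 16 for mpiexec -np 16
    echo $((per_process_mb * processes))
}

echo "$(buffer_total_mb 120 16) MB"
```

For the settings on this slide the estimate is 120 MB times 16 processes, so the nodes need that much free memory in addition to the application itself.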
![Page 31: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/31.jpg)
Hands-on: NPB 3.3 BT-MPI
On an SGI Altix4700
![Page 32: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/32.jpg)
Hands-on: NPB 3.3 BT-MPI
On an SGI Altix4700
![Page 33: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/33.jpg)
Hands-on: NPB 3.3 BT-MPI
• Generate filter specification file
% vtfilter -gen -fo filter.txt -r 10 -stats \
-p bt_2_buffer_120M.otf
• Set a new file prefix
% export VT_FILE_PREFIX=bt_3_filter
% export VT_FILTER_SPEC=/path/to/filter.txt
• Launch as MPI application
% mpiexec -np 16 bt_W.16
• For reference a manually written filterfile:
matmul_sub*;matvec_sub*;binvcrhs* -- 0
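The manually written filter can also be created from the shell. This is a minimal sketch: the file location is an arbitrary choice, and the rule is written with the function patterns separated by semicolons and the call limit after " -- ", where a limit of 0 is understood to suppress recording of the matching functions.

```shell
#!/bin/sh
# Write the manually written filter rule to a file and point
# VT_FILTER_SPEC at it. The file location is an arbitrary choice.
filter_file=${TMPDIR:-/tmp}/filter.txt
cat > "$filter_file" <<'EOF'
matmul_sub*;matvec_sub*;binvcrhs* -- 0
EOF

export VT_FILTER_SPEC="$filter_file"
echo "VT_FILTER_SPEC=$VT_FILTER_SPEC"
cat "$VT_FILTER_SPEC"
```

Printing the file back is a cheap way to catch a wrong path before launching a 16-process run.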
![Page 34: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/34.jpg)
Hands-on: NPB 3.3 BT-MPI
On an SGI Altix4700
![Page 35: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/35.jpg)
Hands-on: NPB 3.3 BT-MPI
On an SGI Altix4700
![Page 36: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/36.jpg)
PAPI
• PAPI counters can be included in traces
– If VampirTrace was built with PAPI support
– If PAPI is available on the platform
• VT_METRICS specifies a list of PAPI counters
• see also the PAPI commands papi_avail and papi_command_line
• PAPI is not available on quarry
– View traces Large/Small on your Windows machine
% export VT_METRICS=PAPI_FP_OPS:PAPI_L2_TCM
![Page 37: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/37.jpg)
Hands-on: NPB 3.3 BT-MPI
• Record I/O and Memory counters
• Set a new file prefix
• Launch as MPI application
% export VT_FILE_PREFIX=bt_4_papi
% export VT_MEMTRACE=yes
% export VT_IOTRACE=yes
% mpiexec -np 16 bt_W.16
![Page 38: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/38.jpg)
Hands-on: NPB 3.3 BT-MPI
On an SGI Altix4700
![Page 39: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/39.jpg)
FREE TRAINING
Examples:
Filtering: filter_mpi_omp
Instrumenting: instrument_ring
Profiling: profile_heat
Mixed: Cannon
![Page 40: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/40.jpg)
examples/filter_mpi_omp
• Look into the source-code
– Artificial example made of three parts
• Matrix multiply MPI-parallelized
• Matrix multiply OpenMP-parallelized
• Dummy functions
• Use automatic instrumentation and visualize
• Filter out the dummy functions, run and visualize
• Create a group-filter for dummy functions and matrix multiply functions
– Do not forget to switch off the function filter
![Page 41: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/41.jpg)
examples/instrument_ring
• Look at source-code and makefiles
• Run and visualize both versions
• Add additional instrumentation for while loop
• Run and visualize again
![Page 42: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/42.jpg)
examples/profile_heat
• Compile via “make all”
• export GMON_OUT_PREFIX=name
• Run the binaries (change prefix in between)
• Use gprof to combine the profiles: gprof -s
• Watch the output: gprof [-b] sum.txt | less
![Page 43: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/43.jpg)
WRAP-UP
How to solve issues when using VampirTrace
![Page 44: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/44.jpg)
HOW TO SOLVE ISSUES WHEN USING VAMPIRTRACE
For more details on VampirTrace and its features see also the manual.
![Page 45: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/45.jpg)
Incomplete Traces
• Issue: Tracing was switched off because the internal trace buffer was too small
• Result:
• Asynchronous behavior of the application due to buffer flush of the measurement system
• No tracing information available after flush operation
• Huge overhead due to flush operation
[0]VampirTrace: Maximum number of buffer flushes reached \
(VT_MAX_FLUSHES=1)
[0]VampirTrace: Tracing switched off permanently
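If the application output is saved to a file, this condition can be detected automatically. The sketch below searches a log for the message shown above; the log here is a stand-in created for the demonstration, and the function name is made up.

```shell
#!/bin/sh
# Return success if a run log contains the VampirTrace message that
# tracing was switched off, i.e. the resulting trace is incomplete.
trace_incomplete() {
    grep -q 'VampirTrace: Tracing switched off permanently' "$1"
}

# Demonstration on a stand-in log; for a real run, redirect the
# application output to a file first.
log=$(mktemp)
printf '%s\n' 'Time step 180' \
    '[0]VampirTrace: Tracing switched off permanently' > "$log"
if trace_incomplete "$log"; then
    echo "incomplete trace: increase VT_BUFFER_SIZE or use a filter"
fi
rm -f "$log"
```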
![Page 46: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/46.jpg)
Incomplete Traces - Solutions
• Increase trace buffer size
• Increase number of allowed buffer flushes (not recommended)
• Use filter mechanisms to reduce the number of recorded events
• Switch tracing on/off if your application works in an iterative manner to reduce the number of recorded events
%> export VT_BUFFER_SIZE=150M
%> export VT_MAX_FLUSHES=2
%> export VT_FILTER_SPEC=$HOME/filter.spec
![Page 47: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/47.jpg)
Way too large Traces
• Issue:
– Every function entry/exit and MPI event was recorded
• Result:
– Trace files become large even for short application runs
• Solutions:
– Use filter mechanisms to reduce the number of recorded events
– Use selective instrumentation of your application
– Switch tracing on/off if your application works in an iterative manner to reduce the number of recorded events
![Page 48: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/48.jpg)
Overhead
• Issue:
– Runtime filtering will be called for each event
• Result:
– Runtime filtering increases the runtime overhead
• Solutions:
– Use selective instrumentation of your application
– Use manual source instrumentation (high effort, error prone)
– Only instrument interesting source files with VampirTrace
– Switch tracing on/off if your application works in an iterative manner to reduce the number of recorded events
![Page 49: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/49.jpg)
Additional Information needed
• Issue:
– I’m interested in more events and hardware counters. What do I have to do?
• Solutions:
– Use the environment option VT_METRICS to enable recording of additional hardware counters like PAPI, CPC or NEC if available.
– Use the environment option VT_RUSAGE to record the Unix resource usage counters.
– Use the environment option VT_MEMTRACE, if available on your system, to intercept the libc allocation functions and to record memory allocation information.
– For additional events and recording hardware information see chapter 4 in the VampirTrace manual.
![Page 50: 2010 05 hands_on](https://reader033.vdocuments.site/reader033/viewer/2022042602/558cb5ddd8b42ae7408b459e/html5/thumbnails/50.jpg)
Thanks for your attention.