Survey of MPI Call Usage
Daniel Han, USC
Terry Jones, LLNL
August 12, 2004
UCRL-PRES-206265
Outline
Motivation
About the Applications
Statistics Gathered
Inferences
Future Work
Motivation
Info for app developers
– Information on the expense of basic MPI functions (recode?)
– Set expectations
Many tradeoffs available in MPI design
– Memory allocation decisions
– Protocol cutoff point decisions
– Where is additional code complexity worth it?
Information on MPI usage is scarce
New tools (e.g. mpiP) make profiling reasonable (sketched below)
– Easy to incorporate (no source code changes)
– Easy to interpret
– Unobtrusive observation (little performance impact)
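mpiP relies on the MPI standard's PMPI profiling interface: the tool supplies its own MPI_* entry points, records what it needs, and forwards to the PMPI_* names, so an unmodified application only has to be relinked against the profiling library. The following is a minimal sketch of that interposition mechanism, not mpiP's actual code; the counters and the output format are illustrative assumptions.

/* Sketch of PMPI link-time interposition: count and time MPI_Send calls
 * without touching application source.  Prototypes follow MPI-3 (const). */
#include <mpi.h>
#include <stdio.h>

static long   send_calls = 0;    /* number of MPI_Send calls observed   */
static double send_time  = 0.0;  /* total seconds spent inside MPI_Send */

int MPI_Send(const void *buf, int count, MPI_Datatype datatype,
             int dest, int tag, MPI_Comm comm)
{
    double t0 = PMPI_Wtime();
    int rc = PMPI_Send(buf, count, datatype, dest, tag, comm);
    send_time  += PMPI_Wtime() - t0;
    send_calls += 1;
    return rc;
}

int MPI_Finalize(void)
{
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    fprintf(stderr, "rank %d: %ld MPI_Send calls, %.3f s\n",
            rank, send_calls, send_time);
    return PMPI_Finalize();
}

Because the wrapper is resolved at link time, the application sees no source changes and very little overhead, which is what makes this style of survey practical.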
About the applications…
Amtran: discrete coordinate neutron transport
Ares: instability 3-D simulation in massive star supernova envelopes
Ardra: neutron transport/radiation diffusion code exploring new numerical algorithms and methods for the solution of the Boltzmann transport equation (e.g. nuclear imaging)
Geodyne: Eulerian adaptive mesh refinement (e.g. comet-earth impacts)
IRS: solves the radiation transport equation by the flux-limiting diffusion approximation using an implicit matrix solution
Mdcask: molecular dynamics code for the study of radiation damage in metals
Linpack/HPL: solves a random dense linear system
Miranda: hydrodynamics code simulating instability growth
Smg: a parallel semicoarsening multigrid solver for the linear systems arising from finite difference, finite volume, or finite element discretizations
Spheral: provides a steerable parallel environment for performing coupled hydrodynamical & gravitational numerical simulations (http://sourceforge.net/projects/spheral)
Sweep3d: solves a 1-group neutron transport problem
Umt2k: photon transport code for unstructured meshes
Percent of time to MPI
Overall for sampled applications: 60% MPI, 40% remaining application
Top MPI Point-to-Point Calls
Top MPI Collective Calls
Comparing Collective and Point-to-Point
Average Number of Calls for Most Common MPI Functions
MPI Function   Average Calls per Application ("Large" Runs)
AllGather              68.5
Allreduce          10,616.4
Alltoall            1,057.0
Barrier                56.9
Bcast               2,067.3
Gather                134.0
Gatherv               284.0
Irecv             246,531.0
Isend             222,527.9
Recv               53,648.3
Reduce                250.9
Send               80,337.4
Wait               65,881.4
Waitall            31,983.5
Waitany           562,436.5
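The dominance of Irecv, Isend, and the Wait family in these counts matches the nonblocking neighbor-exchange pattern most of these codes use. Below is a minimal sketch of that pattern, not taken from any of the surveyed codes; the 1-D periodic decomposition, 100 timesteps, and 8 KB payload are assumptions for illustration. Each rank posts receives, posts sends, and waits once per step, so the Irecv/Isend/Waitall counts grow with (steps x neighbors).

/* Illustrative nonblocking halo exchange on a 1-D periodic ring. */
#include <mpi.h>

#define N 1024  /* 1024 doubles = 8 KB per message (assumed) */

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    int rank, size;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    int left  = (rank - 1 + size) % size;   /* periodic neighbors */
    int right = (rank + 1) % size;

    double sendl[N], sendr[N], recvl[N], recvr[N];
    for (int i = 0; i < N; i++) sendl[i] = sendr[i] = (double)rank;

    for (int step = 0; step < 100; step++) {
        MPI_Request req[4];
        /* Post receives first, then sends, then wait on all four. */
        MPI_Irecv(recvl, N, MPI_DOUBLE, left,  0, MPI_COMM_WORLD, &req[0]);
        MPI_Irecv(recvr, N, MPI_DOUBLE, right, 1, MPI_COMM_WORLD, &req[1]);
        MPI_Isend(sendr, N, MPI_DOUBLE, right, 0, MPI_COMM_WORLD, &req[2]);
        MPI_Isend(sendl, N, MPI_DOUBLE, left,  1, MPI_COMM_WORLD, &req[3]);
        MPI_Waitall(4, req, MPI_STATUSES_IGNORE);
    }

    MPI_Finalize();
    return 0;
}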
Communication Patterns
Most dominant message size

Application   Smaller Runs             Larger Runs
              Kbytes        % of MPI   Kbytes        % of MPI
Amtran        23.6328125    94.79%     784.1796875   99.24%
Ardra         996.09375     95.17%     146.484375    81.35%
Ares          9.16          99.17%     17.87         97.33%
Geodyne       550.7         99.06%     639.64        97.22%
IRS           2.92          99.76%     2.23          98.49%
Mdcask        4.619         99.76%     2.68          99.62%
Linpack       1.5           91.05%     < 0.5         91.45%
Miranda       < 2           90.30%     < 0.5         95.84%
Smg2000       < 0.1         99.65%     1             99.85%
Spheral       < 3.6         100.00%    0.1           100.00%
Sppm          719           43.90%     719           40.02%
Sweep3d       45            100.00%    0.003         100.00%
Communication Patterns (continued)
Primary message size
Application   Smaller Runs             Larger Runs
              Kbytes        % of MPI   Kbytes        % of MPI
Ardra         996.09375     95.17%     146.484375    81.35%
Sppm          719           43.90%     719           40.02%

Secondary message size
Application   Smaller Runs             Larger Runs
              Kbytes        % of MPI   Kbytes        % of MPI
Ardra         -             -          5.53710938    17.31%
Sppm          1796          35.90%     1796          35.00%

Tertiary message size
Application   Smaller Runs             Larger Runs
              Kbytes        % of MPI   Kbytes        % of MPI
Ardra         996.09375     95.17%     -             -
Sppm          1171          11.32%     1123          11.14%
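A per-run "dominant message size" statistic like the one above can be gathered the same way as the call counts: a PMPI wrapper converts each (count, datatype) pair into bytes with MPI_Type_size and bins it. The sketch below is illustrative only, not mpiP's implementation; the power-of-two binning and the limited set of wrapped calls are assumptions.

/* Sketch: histogram point-to-point message sizes via PMPI interposition. */
#include <mpi.h>
#include <stdio.h>

#define NBINS 32                          /* power-of-two size bins */
static long send_bytes_hist[NBINS];

static void record_size(int count, MPI_Datatype datatype)
{
    int tsize;
    PMPI_Type_size(datatype, &tsize);
    long bytes = (long)count * tsize;
    int bin = 0;                          /* bin = floor(log2(bytes)) */
    while (bytes > 1 && bin < NBINS - 1) { bytes >>= 1; bin++; }
    send_bytes_hist[bin]++;
}

int MPI_Isend(const void *buf, int count, MPI_Datatype datatype, int dest,
              int tag, MPI_Comm comm, MPI_Request *request)
{
    record_size(count, datatype);
    return PMPI_Isend(buf, count, datatype, dest, tag, comm, request);
}

int MPI_Send(const void *buf, int count, MPI_Datatype datatype, int dest,
             int tag, MPI_Comm comm)
{
    record_size(count, datatype);
    return PMPI_Send(buf, count, datatype, dest, tag, comm);
}

int MPI_Finalize(void)
{
    /* Only rank 0's local histogram is printed; merging across ranks
     * is omitted in this sketch. */
    int rank;
    PMPI_Comm_rank(MPI_COMM_WORLD, &rank);
    if (rank == 0)
        for (int b = 0; b < NBINS; b++)
            if (send_bytes_hist[b])
                printf("~2^%d bytes: %ld sends\n", b, send_bytes_hist[b]);
    return PMPI_Finalize();
}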
Frequency of callsites by MPI functions
Scalability
Observations Summary
General
– People seem to scale code to ~60% MPI/communication
– Isend/Irecv/Wait many times more prevalent than Sendrecv and blocking send/recv
– Time spent in collectives predominantly divided among barrier, allreduce, broadcast, gather, and alltoall (see the sketch below)
– Most common message size is typically between 1 KB and 1 MB

Surprises
– Waitany most prevalent call
– Almost all point-to-point messages are the same size within a run
– Often, message size decreases with larger runs
– Some codes driven by alltoall performance
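The high Allreduce counts in the statistics are consistent with iterative solvers that perform one small reduction per iteration for dot products or convergence checks, so call counts track iteration counts while each message stays only a few bytes. A minimal sketch of that pattern follows; the residual update, tolerance, and iteration cap are made-up values for illustration, not taken from any surveyed code.

/* Illustrative convergence loop: one 8-byte Allreduce per iteration. */
#include <mpi.h>

int main(int argc, char **argv)
{
    MPI_Init(&argc, &argv);

    double local_residual = 1.0, global_residual = 1.0;
    int iter = 0;

    while (global_residual > 1e-8 && iter < 10000) {
        /* ... local relaxation / matrix-vector work would go here ... */
        local_residual *= 0.99;   /* stand-in for a real residual update */

        /* The per-iteration reduction that drives Allreduce call counts. */
        MPI_Allreduce(&local_residual, &global_residual, 1, MPI_DOUBLE,
                      MPI_MAX, MPI_COMM_WORLD);
        iter++;
    }

    MPI_Finalize();
    return 0;
}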
Future Work & Concluding Remarks
Further understanding of apps needed
– Results for other test configurations
– When can apps make better use of collectives?
– MPI-IO usage info needed
– Classified applications

Acknowledgements
mpiP is due to Jeffrey Vetter and Chris Chambreau: http://www.llnl.gov/CASC/mpip
This work was performed under the auspices of the U.S. Department of Energy by the University of California, Lawrence Livermore National Laboratory under Contract No. W-7405-Eng-48.