A Case Study of HPC Metrics Collection and Analysis
Philip Johnson and Michael Paulding, University of Hawaii, Honolulu, Hawaii.

Goals of the case study
• Provide a complete implementation of one Purpose Based Benchmark problem definition, called Optimal Truss Design
• Implement the Optimal Truss Design system in C++ using MPI on a 240 node Linux cluster at the University of Hawaii
• Develop and evaluate automated support for HPC process and product measurement using Hackystat
• Assess the utility of the metrics for understanding HPC development
Metrics Collected
• Size (Number of files, Total SLOC, “Parallel” SLOC containing an MPI directive, “Serial” SLOC not containing an MPI directive, Test code)
• Active Time (amount of time spent editing Optimal Truss Design files)
• Performance (wall clock time on 1, 2, 4, 8, 16, and 32 processors)
• Milestone Tests (indicates functional completeness)
• Command Line Invocations
Results: Basic Process and Product Measures
Total Source Lines of Code              3320 LOC
Total Test Lines of Code                 901 LOC
Total MPI Lines of Code                 1032 LOC
Total Days (calendar time)              1 year
Total Days (with development activity)  88 days
Total Active Time                       152 hours
Total Distinct MPI Directives           60 directives
Total Files                             56 files
Total Sequential Files (no MPI)         51 files
Total Parallel Files (containing MPI)   5 files
Execution Time                          126 sec. (1 processor)
                                         66 sec. (2 processors)
                                         33 sec. (4 processors)
                                         27 sec. (8 processors)
                                         39 sec. (16 processors)
                                         43 sec. (32 processors)
Results: Derived Process and Product Measures
Derived Metric                 Definition                                     Value
Productivity Proxy             LOC / Active Time                              22 LOC/hour
Average Daily Active Time      Total Active Time / Total Days                 1.73 hours/day
Test Code Density Percentage   Total Test LOC / Total LOC                     27%
MPI Code Density Percentage    Total MPI LOC / Total LOC                      31%
MPI File Density Percentage    Total MPI Files / Total Files                  9%
MPI Directive Frequency Ratio  Total MPI Directives : Total MPI LOC           1 directive : 17 LOC
Speedup                        Exec. Time (1 proc.) / Exec. Time (n procs.)   1.0 (1 processor)
                                                                              1.9 (2 processors)
                                                                              3.7 (4 processors)
                                                                              4.5 (8 processors)
                                                                              3.2 (16 processors)
                                                                              2.9 (32 processors)
Results: Process and Product Telemetry Charts
For More Information
• Understanding HPCS development through automated process and product measurement with Hackystat, Philip M. Johnson and Michael G. Paulding, Proceedings of the Second Workshop on Productivity and Performance in High-End Computing.
Results: Daily Diary with CLI and Most Active File
Insights and Lessons Learned
• Productivity (22 LOC/hour) and test code density (27%) appear to be in line with traditional software engineering metrics.
• Speedup data indicates almost linear speedup up to 4 processors, then falls off sharply, indicating that the current solution is not scalable.
• Parallel and serial LOC were equal at the start of the project; most subsequent effort was devoted to serial code, with some final enhancements to parallel code at the end of the project.
• Performance data was not comparable over the course of the project (only final numbers available; no telemetry).
• Hackystat provides effective infrastructure for collection of process and product metrics.
• This case study provides useful baseline data to compare with future studies.
• Future research:
  • Compare to an OpenMP or JavaParty implementation.
  • Gather metrics while improving the scalability of the system.
  • Compare metrics against other application types.
  • Analyze CLI data for patterns and bottlenecks.
Thanks to our sponsors