Motivation & Background :: Sampling and Modeling :: Design Optimization :: Conclusion
Statistical Inference for Efficient Microarchitectural Analysis
Benjamin C. Lee, David M. Brooks
www.deas.harvard.edu/~bclee
Division of Engineering and Applied Sciences
Harvard University
Boston Area Architecture Workshop
26 January 2007
Benjamin C. Lee, David M. Brooks 1 :: BARC :: 26 Jan 07
Outline
Motivation & Background
  Simulation Challenges
  Simulation Paradigm
  Regression Theory
Sampling and Modeling
  Experimental Methodology
  Model Evaluation
Design Optimization
  Pareto Frontier
  Multiprocessor Heterogeneity
Conclusion
Microarchitectural Design Space
Increasing diversity of interesting, viable designs
Examples :: POWER4, Pentium 4, UltraSPARC T1
Tractably quantify trends across comprehensive design space
Microarchitectural Simulation Challenges
Cycle-Accurate Simulation
  Accurately identifies trends in design space
  Tracks instructions’ progress through microprocessor
  Estimates performance, power, temperature, ...
Simulation Costs
  Long simulation times (minutes, hours per design)
  Number of potential simulations scales exponentially ($m^p$)
    p :: parameter count
    m :: parameter resolution
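The exponential growth noted above can be made concrete with a short calculation. This is a minimal sketch: the parameter count, resolution, and per-design simulation time are hypothetical values chosen for illustration, not figures from the talk.

```python
import math

def design_space_size(levels_per_parameter):
    """Count distinct designs as the product of per-parameter level counts."""
    return math.prod(levels_per_parameter)

def exhaustive_simulation_years(num_designs, minutes_per_design):
    """Serial wall-clock time, in years, to simulate every design once."""
    return num_designs * minutes_per_design / (60 * 24 * 365)

# Hypothetical values: p = 7 parameters, each at m = 10 levels,
# and 30 minutes of simulation per design.
p, m = 7, 10
num_designs = design_space_size([m] * p)  # equals m ** p
print(num_designs, "designs")
print(round(exhaustive_simulation_years(num_designs, 30)), "years of serial simulation")
```

When every parameter has the same resolution the product reduces to m ** p; the list form also covers parameters with differing level counts.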
Microarchitectural Sampling
Temporal Sampling
  Sample from instruction traces in time domain
  Reduce simulation costs via size of inputs
  Synthetic traces from profiled workloads [1]
  Sampled traces from phase analysis [2]
Spatial Sampling
  Sample from design space
  Reduce simulation costs via number of simulations
[1] Eeckhout+ [ISPASS’00]
[2] Sherwood+ [ASPLOS’02], Wunderlich+ [ISCA’03]
Simulation Paradigm
Comprehensively understand design space
  Specify large, high-resolution design space
  Consider all design parameters simultaneously
Selectively simulate modest number of designs
  Sample points randomly from design space for simulation
  Decouple resolution of design space and simulation
Efficiently leverage simulation data with inference
  Reveal trends, trade-offs from sparse sampling
  Enable predictions for metrics of interest
Regression Theory
Statistical Inference
  Models approximate solutions to intractable problems
  Requires initial data to train, formulate model
  Leverages correlations from initial data for prediction
Regression Models
  Low formulation costs (1K samples from 1B designs)
  Accurate inference (5-7% median error)
  Efficient computation (hundreds of predictions per second)
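The train-then-predict workflow above can be illustrated with a small fit. This is a sketch under stated assumptions, not the models used in the talk: it is plain linear least squares solved through the normal equations, the training data are synthetic, and the generating coefficients (2.0, -0.03, 0.05) and the names `fit_least_squares` and `predict` are made up for the example.

```python
import random

def fit_least_squares(rows, targets):
    """Fit y ~ b0 + b1*x1 + ... by solving the normal equations (X^T X) b = X^T y."""
    X = [[1.0] + list(r) for r in rows]  # prepend an intercept column
    k = len(X[0])
    # Augmented matrix [X^T X | X^T y].
    A = [[sum(x[i] * x[j] for x in X) for j in range(k)]
         + [sum(x[i] * y for x, y in zip(X, targets))]
         for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        for r in range(k):
            if r != col:
                factor = A[r][col] / A[col][col]
                A[r] = [a - factor * b for a, b in zip(A[r], A[col])]
    return [A[i][k] / A[i][i] for i in range(k)]

def predict(coeffs, row):
    """Evaluate the fitted linear model for one design."""
    return coeffs[0] + sum(c * x for c, x in zip(coeffs[1:], row))

# Synthetic training set: 1,000 sampled designs described by (depth, width).
rng = random.Random(0)
rows = [(rng.uniform(9, 36), rng.choice([4, 8, 16])) for _ in range(1000)]
targets = [2.0 - 0.03 * d + 0.05 * w + rng.gauss(0, 0.01) for d, w in rows]

coeffs = fit_least_squares(rows, targets)
print("fitted coefficients:", [round(c, 3) for c in coeffs])
print("prediction for depth=18, width=8:", round(predict(coeffs, (18, 8)), 3))
```

Once the coefficients are known, each prediction is a handful of multiply-adds, which is why a fitted model can be queried far faster than a simulator.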
Tools and Benchmarks
Simulation Framework
  Turandot :: a cycle-accurate, trace-driven simulator
  PowerTimer :: power models derived from circuit analyses
  Baseline simulator models POWER4/POWER5 architecture
Benchmarks
  SPEC2kCPU :: compute-intensive benchmarks
  SPECjbb :: Java server benchmark
Statistical Framework
  R :: software environment for statistical computing
  Hmisc and Design packages [3]
[3] Harrell [Springer,’01]
Predictors :: Microarchitecture
Ranges are written start::step::end.

Set                       Parameters              Measure         Range         |S|
S1 Depth                  depth                   FO4             9::3::36      10
S2 Width                  width                   insn b/w        4,8,16        3
                          L/S reorder queue       entries         15::15::45
                          store queue             entries         14::14::42
                          functional units        count           1,2,4
S3 Physical Registers     general purpose (GP)    count           40::10::130   10
                          floating-point (FP)     count           40::8::112
                          special purpose (SP)    count           42::6::96
S4 Reservation Stations   branch                  entries         6::1::15      10
                          fixed-point/memory      entries         10::2::28
                          floating-point          entries         5::1::14
S5 I-L1 Cache             i-L1 cache size         log2(entries)   7::1::11      5
S6 D-L1 Cache             d-L1 cache size         log2(entries)   6::1::10      5
S7 L2 Cache               L2 cache size           log2(entries)   11::1::15     5
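The ranges in the table above can be expanded and sampled uniformly at random, as the simulation paradigm calls for. The sketch below assumes that all parameters within a set move together, so one level index is drawn per set; the single |S| listed per set suggests this, but the slide does not state it. The dictionary keys and function names are illustrative.

```python
import random

def levels(start, step, stop):
    """Expand the table's start::step::end notation into a list of values."""
    return list(range(start, stop + 1, step))

# One entry per set; parameter names are illustrative labels for the table rows.
DESIGN_SETS = {
    "S1": {"depth_fo4": levels(9, 3, 36)},
    "S2": {"width": [4, 8, 16],
           "ls_reorder_queue": levels(15, 15, 45),
           "store_queue": levels(14, 14, 42),
           "functional_units": [1, 2, 4]},
    "S3": {"gp_registers": levels(40, 10, 130),
           "fp_registers": levels(40, 8, 112),
           "sp_registers": levels(42, 6, 96)},
    "S4": {"branch_rs": levels(6, 1, 15),
           "fixed_mem_rs": levels(10, 2, 28),
           "fp_rs": levels(5, 1, 14)},
    "S5": {"il1_log2_entries": levels(7, 1, 11)},
    "S6": {"dl1_log2_entries": levels(6, 1, 10)},
    "S7": {"l2_log2_entries": levels(11, 1, 15)},
}

def sample_design(rng):
    """Draw one design: pick a level index per set (assumption: sets vary as a unit)."""
    design = {}
    for params in DESIGN_SETS.values():
        num_levels = len(next(iter(params.values())))
        index = rng.randrange(num_levels)
        for name, values in params.items():
            design[name] = values[index]
    return design

rng = random.Random(42)
samples = [sample_design(rng) for _ in range(1000)]  # designs to hand to the simulator
print(samples[0])
```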
Model Evaluation I
Framework
  Formulate models with n = 1,000 samples
  Obtain 100 additional random samples for validation
  Quantify percentage error, $100 \times |\hat{y}_i - y_i| / y_i$
Comparison
  Simulator-reported performance, power
  Regression-predicted performance, power
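The validation metric above is straightforward to compute once simulator-reported and regression-predicted values are paired up. A minimal sketch; the three validation values are invented for illustration.

```python
from statistics import median

def percentage_errors(observed, predicted):
    """Per-sample error 100 * |predicted - observed| / observed."""
    return [100.0 * abs(p - o) / o for o, p in zip(observed, predicted)]

def median_percentage_error(observed, predicted):
    """Median of the per-sample percentage errors."""
    return median(percentage_errors(observed, predicted))

# Hypothetical validation data: simulator-reported vs. regression-predicted values.
simulated = [1.0, 2.0, 4.0]
predicted = [1.1, 1.9, 4.0]
print(round(median_percentage_error(simulated, predicted), 2))
```

The median is less sensitive than the mean to a few badly predicted designs.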
Model Evaluation II
Pareto Frontier Analysis
Background
  Pareto optimization improves at least one metric without negatively impacting any other metric
  Pareto frontier is the set of Pareto optima
Objective
  Construct Pareto frontiers for the power-delay space
Approach
  Simulate 1K samples from design space
  Formulate regression models for performance, power
  Exhaustively evaluate models to identify frontier
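The last step of the approach, picking the frontier out of exhaustively evaluated predictions, can be sketched as follows. The sketch assumes each design has been reduced to a (delay, power) pair with lower being better for both; the sample points and the function name are illustrative.

```python
def pareto_frontier(points):
    """Return the (delay, power) points not dominated by any other point.

    Lower is better for both metrics. After sorting by delay, a point is on
    the frontier only if its power is strictly below that of every point
    with smaller or equal delay.
    """
    frontier = []
    best_power = float("inf")
    for delay, power in sorted(set(points)):
        if power < best_power:
            frontier.append((delay, power))
            best_power = power
    return frontier

# Hypothetical model predictions as (delay, power) pairs.
predictions = [(1, 5), (2, 3), (2, 4), (3, 3), (4, 1), (5, 2)]
print(pareto_frontier(predictions))
```

Sorting first keeps the pass linear after an O(n log n) sort, which matters when the models are evaluated over the whole design space.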
Design Space Characterization
Pareto Frontier
Multiprocessor Heterogeneity
Background
  Prior heterogeneity studies constrained design options [4]
Objective
  Identify efficient heterogeneous design compromises
  Mitigate penalties from single design compromise
Approach
  Simulate 1K samples from design space
  Formulate regression models for performance, power
  Identify per-benchmark optima (bips^3/w) via regression
  Identify compromises via K-means clustering
[4] Kumar+ [ISCA’04], Kumar+ [PACT’06]
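The last two steps of the approach can be sketched in a few lines. The talk does not say how designs are encoded for clustering or which K-means variant is used, so this is plain Lloyd's algorithm on hypothetical per-benchmark optima given as unnormalized (depth, width) pairs; in practice the parameters would likely need scaling before distances are compared.

```python
import random

def bips3_per_watt(bips, watts):
    """Efficiency metric bips^3/w: performance cubed over power."""
    return bips ** 3 / watts

def kmeans(points, k, iterations=50, seed=0):
    """Lloyd's algorithm: returns (centroids, clusters) for a list of tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sum(
                (a - b) ** 2 for a, b in zip(p, centroids[c])))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean of its cluster.
        updated = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, clusters

# Hypothetical per-benchmark optima as (depth in FO4, width) pairs.
optima = [(12, 8), (15, 8), (12, 16), (27, 4), (30, 4), (33, 4)]
centroids, clusters = kmeans(optima, k=2)
for centroid, members in zip(centroids, clusters):
    print("compromise near", centroid, "serves", members)
```

Each centroid stands for one compromise core; the benchmarks in its cluster are the ones whose own optimum lies closest to it.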
Multiprocessor Heterogeneity II
Continuing Work I
Topology Visualization
Continuing Work II
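Topology Roughness Metrics [5]

$$R_1 = \int_x f''(x)^2 \, dx$$

$$R_2 = \int_{x_2} \int_{x_1} \left[ \left( \frac{\partial^2 f}{\partial x_1^2} \right)^2 + \left( \frac{\partial^2 f}{\partial x_1 \, \partial x_2} \right)^2 + \left( \frac{\partial^2 f}{\partial x_2^2} \right)^2 \right] dx_1 \, dx_2$$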
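For a fitted one-dimensional response f, the metric R1 can be approximated numerically. This is a sketch, assuming f can be evaluated anywhere on (and slightly beyond) the interval: it uses a central difference for the second derivative and the midpoint rule for the integral. The function name and step count are illustrative.

```python
def roughness_1d(f, lo, hi, steps=1000):
    """Approximate R1 = integral over [lo, hi] of f''(x)^2 dx.

    f'' comes from a central difference and the integral from the midpoint
    rule. Note that f is evaluated up to half a step outside [lo, hi].
    """
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * h
        second = (f(x - h) - 2.0 * f(x) + f(x + h)) / (h * h)
        total += second * second * h
    return total

# A straight line has no curvature; a parabola x^2 has f'' = 2 everywhere.
print(roughness_1d(lambda x: 3.0 * x + 1.0, 0.0, 1.0))
print(roughness_1d(lambda x: x * x, 0.0, 1.0))
```

Larger values indicate a more sharply curving response surface along that parameter.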
Optimization
  Heuristic search (e.g., gradient descent)
  Symbolic optimization
Chip Multiprocessor Design
  Decoupled models (e.g., core and interconnect)
  Larger parameter space (e.g., in-order execution)
[5] Green+ [Monographs Stat & App Prob]
Conclusion
Exploration Paradigm
  Comprehensively understand design space
  Selectively simulate modest number of designs
  Efficiently leverage simulation data with inference
Regression Modeling
  Statistical analysis for robust, efficient models
  Low formulation costs with accurate inference
  Computationally efficient
Model Evaluation
  7.2%, 5.4% median errors for performance, power
  Applicable to practical design optimization
Further Reading
www.deas.harvard.edu/~bclee

B.C. Lee, D.M. Brooks, B.R. de Supinski, M. Schulz, K. Singh, and S.A. McKee. Methods of inference and learning for performance modeling of parallel applications. PPoPP’07: Symposium on Principles and Practice of Parallel Programming, March 2007.

B.C. Lee and D.M. Brooks. Illustrative design space studies with microarchitectural regression models. HPCA-13: International Symposium on High Performance Computer Architecture, Feb 2007.

B.C. Lee and D.M. Brooks. Accurate, efficient regression modeling for microarchitectural performance, power prediction. ASPLOS-XII: International Conference on Architectural Support for Programming Languages and Operating Systems, Oct 2006.

B.C. Lee and D.M. Brooks. Statistically rigorous regression modeling for the microprocessor design space. MoBS-2: Workshop on Modeling, Benchmarking, and Simulation, June 2006.

B.C. Lee and D.M. Brooks. Regression modeling strategies for microarchitectural performance and power prediction. Harvard University Technical Report TR-08-06, March 2006.