Interpreting Performance Test Results


Posted on 09-Jan-2017






<p>Interpreting and Reporting Performance Test Results</p>
<p>Eric Proegler</p>
<p>#</p>
<p>About Me: 20 years in software, 14 in performance, context-driven for 12</p>
<p>Performance Engineer / Teacher / Consultant</p>
<p>Product Manager</p>
<p>Board of Directors</p>
<p>Lead Organizer</p>
<p>Mentor</p>
<p>#</p>
<p>Dan Downing's 5 Steps of Load Testing</p>
<p>1. Discover: ID processes &amp; SLAs; define use-case workflows; model the production workload</p>
<p>2. Develop: develop test scripts; configure environment monitors; run a shakedown test</p>
<p>3. Analyze: run tests; monitor system resources; analyze results</p>
<p>4. Fix: diagnose; fix; re-test</p>
<p>5. Report: interpret results; make recommendations; present to stakeholders</p>
<p>#</p>
<p>About this session: Participatory! Graphs from actual projects (have any?). Learn from each other.</p>
<p>Not about tools.</p>
<p>First half: making observations and forming hypotheses</p>
<p>Break (~10:00)</p>
<p>Second half: interpreting and reporting actionable results</p>
<p>#</p>
<p>What can we observe about this app?</p>
<p>#</p>
<p>What's this suggesting?</p>
<p>#</p>
<p>What's this suggesting?</p>
<p>#</p>
<p>And this?</p>
<p>#</p>
<p>Performance KPIs*: Scalability; Throughput; System Capacity; Workload Achievement</p>
<p>*Key Performance Indicators</p>
<p>#</p>
<p>How could we annotate this graph? Note the scale of each metric; mixed units (sec., count, %).</p>
<p>#</p>
<p>What does this SAY about capacity?</p>
<p>#</p>
<p>What observation can we make here?</p>
<p>#</p>
<p>And here?</p>
<p>#</p>
<p>Hmmm... yikes!</p>
<p>#</p>
<p>What can we say here?</p>
<p>#</p>
<p>What can we say here?</p>
<p>#</p>
<p>Describe what happened here.</p>
<p>#</p>
<p>Tell a plausible story about this.</p>
<p>#</p>
<p>What's the lesson from this graph? Hurricane Center average of US hurricane forecast models. Averages lie!</p>
<p>#</p>
<p>Measure what, where?</p>
<p>#</p>
<p>Measure what, where?</p>
<p>Load balancer: proper load balancing (really measured at the web/app servers); HW resources</p>
<p>Web servers: connections, queuing, errors; HW resources</p>
<p>App servers: JVM heap memory; DB connection pools; HW resources</p>
<p>Database: lock waits / deadlocks; SQL recompiles; full table scans; slowest queries</p>
<p>Storage: SAN IOPS; bandwidth throughput</p>
<p>Load injectors: injector capacity; load; response time; HW resources</p>
<p>#</p>
<p>Measure with what?</p>
<p>#</p>
<p>Anything concerning here? Before: the slowest transactions show spikes of 5-8 seconds every 10 minutes. After: spikes substantially reduced after VM memory was increased to 12 GB.</p>
<p>#</p>
<p>What are we looking at here? When does this become a problem? When heap space utilization keeps growing despite garbage collection and reaches its max allocation.</p>
<p>#</p>
<p>Any hypotheses about this? Before: abnormally high TCP retransmits between the web and app servers. After: network issues resolved.</p>
<p>#</p>
<p>Tell a data-supported story about this; annotate the graph.</p>
<p>#</p>
<p>How much load?</p>
<p>#</p>
<p>How much load?</p>
<p>#</p>
<p>How much load?</p>
<p>#</p>
<p>How much load?</p>
<p>#</p>
<p>Think CAVIAR: Collecting, Aggregating, Visualizing, Interpreting, Assessing, Reporting. For actionable performance results.</p>
<p>#</p>
<p>Collecting</p>
<p>Objective: gather all results from the test that help gain confidence in the results' validity; portray system scalability, throughput &amp; capacity; provide bottleneck / resource-limit diagnostics; help formulate hypotheses.</p>
<p>#</p>
<p>Aggregating</p>
<p>Objective: summarize measurements using various-sized time buckets to provide tree &amp; forest views; consistent time buckets across metric types to enable accurate correlation; meaningful statistics (scatter, min-max range, variance, percentiles); multiple metrics to triangulate and confirm (or invalidate) hypotheses.</p>
<p>#</p>
<p>Visualizing</p>
<p>Objective: gain a forest view of metrics relative to load. Turn barrels of numbers into a few pictures. Vary graph scale &amp; summarization
granularity to expose hidden facts; ID the load point where degradation begins; ID the system tier(s) where bottlenecks appear and the limiting resources.</p>
<p>#</p>
<p>Visualizing</p>
<p>My key graphs, in order of importance:</p>
<p>Errors over load (are the results valid?)</p>
<p>Bandwidth throughput over load (is there a system bottleneck?)</p>
<p>Response time over load (how does the system scale?): business process end-to-end; page level (min-avg-max-SD-90th percentile)</p>
<p>System resources (how's the infrastructure capacity?): server CPU over load; JVM heap memory/GC; DB lock contention, I/O latency</p>
<p>#</p>
<p>Interpreting</p>
<p>Objective: draw conclusions from observations and hypotheses. Make objective, quantitative observations from graphs / data. Correlate / triangulate graphs / data. Develop hypotheses from correlated observations. Test hypotheses and achieve consensus among tech teams. Turn validated hypotheses into conclusions.</p>
<p>#</p>
<p>Interpreting</p>
<p>Observations: "I observe that..."; no evaluation at this point!</p>
<p>Correlations: comparing graph A to graph B, relate observations to each other.</p>
<p>Hypotheses: "It appears as though..."; test these with the extended team; corroborate with other information (anecdotal observations, manual tests).</p>
<p>Conclusions: "From observations a, b, c, corroborated by d, I conclude that..."</p>
<p>#</p>
<p>Scalability: Response Time over Load</p>
<p>Is 2.5 sec / page acceptable? Need to drill down to page level to ID the key contributors; look at 90th or 95th percentiles (averages are misleading). Two styles for showing system scalability; the top graph shows load explicitly on its own y-axis. Note the consistent 0.5 sec / page up to ~20 users. Above that, response time degrades steeply, to 5x at max load.</p>
<p>#</p>
<p>Throughput plateau with load rising = bottleneck somewhere!</p>
<p>Note throughput tracking load through ~45 users, then leveling off. The culprit was an Intrusion Detection appliance limiting bandwidth to 60 Mbps. In a healthy system, throughput should closely track load.</p>
<p>#</p>
<p>Bandwidth tracking with load = healthy</p>
<p>All 3 web servers show network interface throughput tracking with load throughout the test. A healthy bandwidth graph looks like Mt. Fuji.</p>
<p>#</p>
<p>Errors over load: must explain! Note the relatively few errors, largely HTTP 404s on missing resources.</p>
<p>An error rate of 1% should be analyzed and fully explained. Sporadic bursts of HTTP 500 errors near the end of the test occurred while the customer was tuning the web servers.</p>
<p>#</p>
<p>End-user experience: SLA violations</p>
<p>Outlier, not on VPN</p>
<p>#</p>
<p>SLA violations drill-down: Felipe B. (Brazil), Feb 28th, 7:19 AM-1:00 PM CST, &gt; 20-second response on page Media Object Viewer.</p>
<p>#</p>
<p>Network Throughput - raw graph</p>
<p>#</p>
<p>Network Throughput - interpreted</p>
<p>#</p>
<p>Capacity: System Resources - raw</p>
<p>#</p>
<p>Capacity: System Resources - interpreted</p>
<p>Monitor resources liberally; provide (and annotate!)
graphs selectively: which resources tell the main story?</p>
<p>#</p>
<p>Assessing</p>
<p>Objective: turn conclusions into recommendations. Tie conclusions back to the test objectives: were the objectives met? Determine remediation options at the appropriate level: business, middleware, application, infrastructure, network. Perform agreed-to remediation. Re-test.</p>
<p>Recommendations: should be specific and actionable at a business or technical level; should be reviewed (and if possible, supported) by the teams that need to perform the actions (nobody likes surprises!); should quantify the benefit, if possible the cost, and the risk of not doing it. The final outcome is management's judgment, not yours.</p>
<p>#</p>
<p>Reporting</p>
<p>Objective: convey recommendations in stakeholders' terms. Identify the audience(s) for the report; write / talk in their language.</p>
<p>Executive Summary (3 pages max): summarize objectives, approach, target load, acceptance criteria; cite factual observations; draw conclusions based on observations; make actionable recommendations.</p>
<p>Supporting Detail: test parameters (date/time executed, business processes, load ramp, think-times, system tested: hw config, sw versions/builds); sections for errors, throughput, scalability, capacity; in each section: annotated graphs, observations, conclusions.</p>
<p>Associated Docs (if appropriate): full set of graphs, workflow detail, scripts, test assets.</p>
<p>#</p>
<p>Reporting</p>
<p>Step 1: *DO NOT* press Print on the tool's default report. Who is your audience? Why would they want to see 50 graphs and 20 tables? What will they be able to see? Data + Analysis = INFORMATION.</p>
<p>#</p>
<p>Reporting</p>
<p>Step 2: Understand what is important. What did you learn? Study your results; look for correlations. What are the 3 things you need to convey? What information is needed to support these 3 things? Discuss findings with technical team members: "What does this look like to you?"</p>
<p>#</p>
<p>Reporting</p>
<p>Step 3: So, what is important? Prepare a three-paragraph summary for email. Prepare a 30-second elevator summary for when someone asks you about the testing. More people will consume these than any test report. Get feedback.</p>
<p>#</p>
<p>Reporting</p>
<p>Step 4: Prepare your final report with the audience in mind. Your primary audience is usually the executive sponsors and the business; write the summary at the front of the report for them. Consider language, acronyms, and jargon; level of detail; correlation to business objectives.</p>
<p>#</p>
<p>Reporting</p>
<p>Step 5: Audience (cont.). Rich technical detail within: observations, including selected graphs; feedback from the technical team; conclusions; recommendations.</p>
<p>#</p>
<p>Reporting</p>
<p>Step 6: Present! Remember, no one is going to read the report. Gather your audience: executive, business, and technical. Present your results. Help shape the narrative. Explain the risks. Earn your keep. Call to action!
Recommend solutions.</p>
<p>#</p>
<p>Remember: CAVIAR!</p>
<p>Collecting, Aggregating, Visualizing, Interpreting, Assessing, Reporting</p>
<p>#</p>
<p>A Few Resources</p>
<p>WOPR (Workshop On Performance and Reliability): http://www.performance-workshop.org. Experience reports on performance testing; spring &amp; fall facilitated, theme-based peer conferences.</p>
<p>SOASTA Community: http://cloudlink.soasta.com. Papers, articles, and presentations on performance testing.</p>
<p>PerfBytes Podcast</p>
<p>Mark Tomlinson's blog</p>
<p>Richard Leeke's blog (data visualization)</p>
<p>Scott Barber's resource page</p>
<p>STP Resources: blogs and papers on a wide range of testing topics</p>
<p>#</p>
<p>Thanks for Attending</p>
<p>Please fill out an evaluation form</p>
<p>#</p>
<p>Resource Graphs</p>
<p>Examples of resource monitoring graphs. Dan Downing - www.mentora.com</p>
<p>Monitoring your load injectors -- localhost saturated at max load</p>
<p>Load average of all servers - Oracle Apps on Linux/RAC</p>
<p>CPU run queue size of all servers - Oracle Apps on Linux/RAC</p>
<p>Poor (no) web server load balancing - CPU</p>
<p>Poor (no) web server load balancing - web server NIC bandwidth</p>
<p>Web server request waits</p>
<p>DB CPU vs. web servers -- cause was missing indexes, no DB statistics</p>
<p>Example of healthy bandwidth throughput and web load balancing</p>
<p>Unhealthy bandwidth throughput indicating a system bottleneck</p>
<p>JVM heaps and garbage collection severely impacting performance</p>
<p>IO-bound server?</p>
<p>IO-constrained throughput on a SAN</p>
<p>Corroborating disk queue length on the SAN</p>
<p>HTTP 5xx errors</p>
<p>Resource Graphs</p>
<p>[Chart data omitted: 5-minute load average (LD-AVG5) for servers APP20, APP21, APP22, DB10, DB11, DB12 vs. concurrent users over the test session time (min). Title: "Server Resources: Server 5-Minute Load Average".]</p>
<p>Monitoring Tools</p>
<p>[Chart data omitted.]</p>
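The Aggregating step described in the deck (consistent time buckets; meaningful statistics such as min-max range and percentiles, because "averages lie") can be sketched in code. This is a minimal illustration, not from the original presentation; the bucket size, helper names, and sample data are assumptions:

```python
from bisect import insort

def percentile(sorted_vals, p):
    """Nearest-rank percentile of an already-sorted list."""
    if not sorted_vals:
        raise ValueError("no samples")
    k = max(0, int(round(p / 100.0 * len(sorted_vals))) - 1)
    return sorted_vals[k]

def aggregate(samples, bucket_secs=60):
    """Group (timestamp_secs, response_secs) samples into fixed time
    buckets and report min / avg / max / 90th percentile per bucket.
    Consistent bucket boundaries across metrics enable correlation."""
    buckets = {}
    for ts, rt in samples:
        insort(buckets.setdefault(ts // bucket_secs, []), rt)
    out = {}
    for b, vals in sorted(buckets.items()):
        out[b * bucket_secs] = {
            "min": vals[0],
            "avg": sum(vals) / len(vals),
            "max": vals[-1],
            "p90": percentile(vals, 90),
        }
    return out

# Illustrative data: fifty fast 0.5 s pages plus ten 8.0 s outliers,
# all inside the first one-minute bucket.
samples = [(t, 0.5) for t in range(0, 50)] + [(t, 8.0) for t in range(50, 60)]
stats = aggregate(samples, bucket_secs=60)
# avg = 1.75 s looks tolerable; p90 = 8.0 s exposes the slow pages ("averages lie").
```

The 90th percentile surfaces the outlier spikes that the average smooths over, which is exactly why the deck recommends drilling down to 90th/95th percentiles rather than trusting per-page averages.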
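The deck's bottleneck heuristic ("throughput plateau with load rising = bottleneck somewhere; in a healthy system, throughput should closely track load") can also be checked numerically. A minimal sketch, assuming simple (load, throughput) pairs and an illustrative tolerance threshold of my own choosing:

```python
def find_plateau(points, tolerance=0.8):
    """Given (load, throughput) pairs sorted by rising load, return the
    first load level where throughput-per-user falls below `tolerance`
    times the baseline per-user throughput, or None if throughput keeps
    tracking load (a healthy system). The threshold is illustrative."""
    baseline = points[0][1] / points[0][0]  # throughput per user at lowest load
    for load, tput in points[1:]:
        if tput / load < tolerance * baseline:
            return load
    return None

# Illustrative series: throughput tracks load up to ~45 users, then
# flattens near 60 (e.g. an appliance capping bandwidth at 60 Mbps).
series = [(10, 13), (20, 26), (30, 39), (45, 58), (60, 60), (80, 61), (100, 60)]
knee = find_plateau(series)  # first load level where scaling breaks down
```

On a healthy run where throughput stays proportional to load, the function returns None; any returned load level marks where to start hunting for the limiting resource.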