© 1998, Geoff Kuenning
Experimental Lifecycle
[Diagram boxes: Vague idea; “Groping around” experiences; Hypothesis; Model; Initial observations; Experiment; Data, analysis, interpretation; Results & final presentation]

Common Mistakes in Graphics
• Excess information
• Multiple scales
• Using symbols in place of text
• Poor scales
• Using lines incorrectly

Multiple Scales
• Another way to meet length limits
• Basically, two graphs overlaid on each other
• Confuses reader (which line goes with which scale?)
• Misstates relationships
  – Implies equality of magnitude that doesn’t exist

Some Especially Bad Multiple Scales
[Figure: Throughput and Response Time overlaid on one graph for x values 1 to 4, using one linear axis (0 to 45) and one logarithmic axis (10 to 1000)]

Using Symbols in Place of Text
• Graphics should be self-explanatory
  – Remember that the graphs often draw the reader in
• So use explanatory text, not symbols
• This means no Greek letters!
  – Unless your conference is in Athens...

It’s All Greek To Me...
[Figure: untitled curve labeled only with symbols such as “w”; y axis 0 to 12, x axis 0.1 to 0.9]

Explanation is Easy
[Figure: “Waiting Time as a Function of Offered Load”; y axis: Waiting Time (0 to 12); x axis: Offered Load (0.1 to 0.9)]

Poor Scales
• Plotting programs love non-zero origins
  – But people are used to zero
• Fiddle with axis ranges (and logarithms) to get your message across
  – But don’t lie or cheat
• Sometimes trimming off high ends makes things clearer
  – Brings out low-end detail

Nonzero Origins (Chosen by Microsoft)
[Figure: quarterly chart, 1st Qtr to 4th Qtr; y axis runs from 65 to 95]

Proper Origins
[Figure: quarterly chart, 1st Qtr to 4th Qtr; y axis runs from 0 to 100]

A Poor Axis Range
[Figure: x values 1 to 4; linear y axis from 0 to 12000]

A Logarithmic Range
[Figure: x values 1 to 4; logarithmic y axis from 1 to 10000]

A Truncated Range
[Figure: x values 1 to 4; y axis truncated to 0 to 60, with the off-scale value labeled 10000]

Using Lines Incorrectly
• Don’t connect points unless interpolation is meaningful
• Don’t smooth lines that are based on samples
  – Exception: fitted non-linear curves

Incorrect Line Usage
[Figure: y axis: Time (0 to 500); x axis: Replicas (0 to 9); series: cp, mab, rm]

Pictorial Games
• Non-zero origins and broken scales
• Double-whammy graphs
• Omitting confidence intervals
• Scaling by height, not area
• Poor histogram cell size

Non-Zero Origins and Broken Scales
• People expect (0,0) origins
  – Subconsciously
• So non-zero origins are a great way to lie
• More common than not in popular press
• Also very common to cheat by omitting part of scale
  – “Really, Your Honor, I included (0,0)”

Non-Zero Origins
[Figure: two quarterly charts of “Us” vs. “Them”, 1st Qtr to 4th Qtr; one with a y axis from 20 to 27, the other with a y axis from 0 to 100]

The Three-Quarters Rule
• Highest point should be 3/4 of scale or more
[Figure: quarterly chart of “Us” vs. “Them”, 1st Qtr to 4th Qtr; y axis from 0 to 30]

Double-Whammy Graphs
• Put two related measures on same graph
  – One is (almost) function of other
• Hits reader twice with same information
  – And thus overstates impact
[Figure: Sales ($) and Units Shipped plotted together by quarter, 1st Qtr to 4th Qtr; y axis from 0 to 60]

Omitting Confidence Intervals
• Statistical data is inherently fuzzy
• But means appear precise
• Giving confidence intervals can make it clear there’s no real difference
  – So liars and fools leave them out

Graph Without Confidence Intervals
[Figure: quarterly chart, 1st Qtr to 4th Qtr; y axis from 0 to 70; no confidence intervals shown]

Graph With Confidence Intervals
[Figure: the quarterly chart, 1st Qtr to 4th Qtr; y axis from 0 to 70; confidence intervals shown]
Confidence Intervals
• Sample mean value is only an estimate of the true population mean
• Bounds c1 and c2 such that there is a high probability, 1 − α, that the population mean μ is in the interval (c1, c2):
  Prob{c1 < μ < c2} = 1 − α
  where α is the significance level and 100(1 − α)% is the confidence level
• Overlapping confidence intervals are interpreted as “not statistically different”
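The interval (c1, c2) can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a large-sample normal approximation (for small samples a Student t quantile would be the usual choice, but it is not in Python's standard library), the function names are hypothetical, and the sample values are made up rather than taken from the slides.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(samples, alpha=0.05):
    """Return (c1, c2) for the population mean at confidence 100(1 - alpha)%.

    Assumption: normal approximation, reasonable only for larger samples.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    center = mean(samples)
    std_err = stdev(samples) / sqrt(n)
    z = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided quantile
    return center - z * std_err, center + z * std_err


def intervals_overlap(a, b):
    """True when intervals a and b share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


# Hypothetical measurements for two systems (not from the slides)
us = [41.0, 45.0, 39.0, 47.0, 43.0, 44.0, 40.0, 46.0]
them = [44.0, 48.0, 42.0, 50.0, 41.0, 47.0, 45.0, 43.0]

ci_us = mean_confidence_interval(us)
ci_them = mean_confidence_interval(them)
print("us:", ci_us)
print("them:", ci_them)
print("overlap (not statistically different):", intervals_overlap(ci_us, ci_them))
```

With these made-up samples the two means differ, but the intervals overlap, which is the situation a graph of bare means hides.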
Reporting Only One Run (tell-tale sign)
[Figure: y axis from 0 to 100, x axis from 100 to 500; annotation: “Probably a fluke (it’s likely that with multiple trials this would go away)”]

Scaling by Height Instead of Area
• Clip art is popular with illustrators:
[Figure: “Women in the Workforce”; clip-art figures for 1960 and 1980, scaled by height]
Any guesses? w1980/w1960 = ?

The Trouble with Height Scaling
• Previous graph had heights of 2:1
• But people perceive areas, not heights
  – So areas should be what’s proportional to data
• Tufte defines a lie factor: size of effect in graphic divided by size of effect in data
  – Not limited to area scaling
  – But especially insidious there (quadratic effect)
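A small sketch of the lie factor for the clip-art case. Assumptions, none of them stated on the slide: "size of effect" is taken as relative change, (second − first) / first; the underlying data ratio is 2:1; and the picture is scaled uniformly, so doubling the height also doubles the width. The function names are hypothetical.

```python
def effect_size(first, second):
    """Relative change from first to second (assumed meaning of 'size of effect')."""
    return abs(second - first) / abs(first)


def lie_factor(graphic_first, graphic_second, data_first, data_second):
    """Size of effect shown in the graphic divided by size of effect in the data."""
    return effect_size(graphic_first, graphic_second) / effect_size(data_first, data_second)


data_1960, data_1980 = 1.0, 2.0            # hypothetical 2:1 data
height_ratio = data_1980 / data_1960

# Height scaled to the data: width grows too, so perceived area grows quadratically.
area_when_height_scaled = height_ratio ** 2
# Area scaled to the data: perceived area matches the data.
area_when_area_scaled = height_ratio

print(lie_factor(1.0, area_when_height_scaled, data_1960, data_1980))  # 3.0
print(lie_factor(1.0, area_when_area_scaled, data_1960, data_1980))    # 1.0
```

A lie factor of 1.0 means the graphic shows the same effect as the data; under these assumptions the height-scaled picture overstates it threefold.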

Scaling by Area
• Here’s the same graph with 2:1 area:
[Figure: “Women in the Workforce”; clip-art figures for 1960 and 1980, scaled by area]

Histogram Cell Size
• Picking bucket size is always a problem
• Prefer 5 or more observations per bucket
• Choice of bucket size can affect results:
[Figure: histogram with buckets labeled 5, 10, 15, 20, 25, 30; y axis from 0 to 12]
[Figure: histogram with buckets labeled 10, 20, 30; y axis from 0 to 12]
[Figure: histogram with buckets labeled 15, 30; y axis from 0 to 15]
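The effect of bucket size can be reproduced with a few lines. This is a sketch on made-up samples (not the data behind the slides), with a hypothetical `histogram` helper; it also flags buckets that fall short of the "5 or more observations" guideline.

```python
from collections import Counter


def histogram(values, width):
    """Count values into buckets [k*width, (k+1)*width), keyed by bucket start."""
    counts = Counter((v // width) * width for v in values)
    return dict(sorted(counts.items()))


# Hypothetical samples, not the data from the slides
samples = [1, 3, 4, 6, 7, 8, 9, 11, 12, 13, 14, 16, 17, 18, 21, 22, 24, 26, 27, 29]

for width in (5, 10, 15):
    buckets = histogram(samples, width)
    sparse = [start for start, count in buckets.items() if count < 5]
    print(f"width {width}: {buckets}; buckets with fewer than 5 observations: {sparse}")
```

With width 5 every bucket here is sparse; with width 15 the guideline is met, but the two remaining buckets say little about the shape of the distribution.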

Don’t Quote Data Out of Context
[Figure: “Traffic Deaths and Enforcement of Speed Limits”; y axis from 250 to 350, x axis from 1954 to 1957; labels: “Before stricter enforcement”, “After stricter enforcement”]

The Same Data in Context
[Figure: “Connecticut Traffic Deaths, 1951-1959”; y axis from 0 to 350, x axis from 1950 to 1960]
Tell the Whole Truth
[Figure: y axis from 0 to 100, x axis from 100 to 500]
[Figure: second version, y axis from 0 to 120, x axis from 100 to 500]

Special-Purpose Charts
• Histograms
• Scatter plots
• Gantt charts
• Kiviat graphs

Tukey’s Box Plot
• Shows range, median, quartiles all in one:
[Figure: box plot labeled minimum, quartile, median, quartile, maximum]
• Variations:
[Figures: variations on the box plot]
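The five values a box plot draws can be computed directly. A minimal sketch, assuming the "inclusive" quartile method from Python's `statistics` module (quartile definitions vary between tools and textbooks); the function name and the samples are hypothetical.

```python
from statistics import quantiles


def five_number_summary(values):
    """Minimum, first quartile, median, third quartile, maximum."""
    data = sorted(values)
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    return {
        "minimum": data[0],
        "q1": q1,
        "median": median,
        "q3": q3,
        "maximum": data[-1],
    }


samples = [2, 4, 4, 5, 7, 8, 9, 11, 15]  # hypothetical observations
summary = five_number_summary(samples)
print(summary)
```

The box spans q1 to q3 with a mark at the median, and the whiskers reach to the minimum and maximum.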

Histograms
[Figure: example histogram by quarter, 1st Qtr to 4th Qtr; y axis from 0 to 80]

Scatter Plots
• Useful in statistical analysis
• Also excellent for huge quantities of data
  – Can show patterns otherwise invisible
[Figure: example scatter plot; x axis from 0 to 15, y axis from 0 to 25]

Gantt Charts
• Shows relative duration of Boolean conditions
• Arranged to make lines continuous
  – Each level after first follows FTTF pattern
[Figure: Gantt chart with rows CPU, I/O, and Network over a 0 to 100 axis]
[Figure: the same chart with each segment marked T (true) or F (false)]

Kiviat Graphs
• Also called “star charts” or “radar plots”
• Useful for looking at balance between HB (higher is better) and LB (lower is better) metrics
[Figure: Kiviat graph with axes labeled HB and LB]

Useful Reference Works
• Edward R. Tufte, The Visual Display of Quantitative Information, Graphics Press, Cheshire, Connecticut, 1983.
• Edward R. Tufte, Envisioning Information, Graphics Press, Cheshire, Connecticut, 1990.
• Edward R. Tufte, Visual Explanations, Graphics Press, Cheshire, Connecticut, 1997.
• Darrell Huff, How to Lie With Statistics, W.W. Norton & Co., New York, 1954.

Ratio Games
• Choosing a Base System
• Using Ratio Metrics
• Relative Performance Enhancement
• Ratio Games with Percentages
• Strategies for Winning a Ratio Game
• Correct Analysis of Ratios

Choosing a Base System
• Run workloads on two systems
• Normalize performance to chosen system
• Take average of ratios
• Presto: you control what’s best
Code Size Example
Program   RISC-1   Z8002   R/R   Z/R
F-bit        120     180   1.0   1.5
Acker        144     302   1.0   2.1
Towers        96     240   1.0   2.5
Puzzle      2796    1398   1.0   0.5
Sum         3156    2120   4.0   6.6
Average      789     530   1.0   1.6 or 0.67?
(1.6 is the average of the Z/R ratios; 0.67 is the ratio of the average sizes, 530/789.)
Simple Example
Program   System 1   System 2   1/2    2/1
A               50        100   0.5    2.0
B             1000        500   2.0    0.5
Sum           1050        600   1.75   0.57
(The Sum row gives the ratio of the sums: 1050/600 = 1.75 and 600/1050 = 0.57.)
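The Simple Example can be checked numerically. The sketch below uses the table's four measurements; the function and variable names are hypothetical. It normalizes to each system in turn and averages the ratios, then compares with the ratio of the totals.

```python
from statistics import mean

# (system 1, system 2) measurements from the Simple Example table
measurements = {"A": (50.0, 100.0), "B": (1000.0, 500.0)}


def average_ratio(measurements, base):
    """Average of per-program ratios other/base; base is 0 (system 1) or 1 (system 2)."""
    other = 1 - base
    return mean(row[other] / row[base] for row in measurements.values())


total_1 = sum(row[0] for row in measurements.values())
total_2 = sum(row[1] for row in measurements.values())

print("system 2 relative to system 1 (average of ratios):", average_ratio(measurements, 0))
print("system 1 relative to system 2 (average of ratios):", average_ratio(measurements, 1))
print("system 1 total / system 2 total:", total_1 / total_2)
```

Both averages come out to 1.25: whichever system is chosen as the base, the other one's numbers look 25% higher on average, while the totals give a single answer (1.75) that does not depend on the base.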

Using Ratio Metrics
• Pick a metric that is itself a ratio
  – power = throughput / response time
  – cost / performance
  – improvement ratio
• Handy because division is “hidden”

Relative Performance Enhancement
• Compare systems with incomparable bases
• Turn into ratios
• Example: compare Ficus 1 vs. 2 replicas with UFS vs. NFS (1 run on chosen day):
Comparison       "cp" Time         Ratio
Ficus 1 vs. 2    197.4   246.6     1.25
UFS vs. NFS      178.7   238.3     1.33
• “Proves” adding Ficus replica costs less than going from UFS to NFS

Ratio Games with Percentages
• Percentages are inherently ratios
  – But disguised
  – So great for ratio games
• Example: Passing tests
Test    A Runs  A Passes  A %   B Runs  B Passes  B %
1          300        60   20       32         8   25
2           50         2    4      500        40    8
Total      350        62   18      532        48    9
• A is worse, but looks better in total line!
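The passing-tests example can be recomputed from the run and pass counts in the table. A minimal sketch with hypothetical names; it prints the per-test and total pass percentages for both systems.

```python
# (runs, passes) per test, taken from the table
results = {
    "A": {"test 1": (300, 60), "test 2": (50, 2)},
    "B": {"test 1": (32, 8), "test 2": (500, 40)},
}


def pass_percent(runs, passes):
    """Percentage of runs that passed."""
    return 100.0 * passes / runs


def total_percent(tests):
    """Pass percentage over all runs, which weights each test by its run count."""
    runs = sum(r for r, _ in tests.values())
    passes = sum(p for _, p in tests.values())
    return pass_percent(runs, passes)


for system, tests in results.items():
    per_test = {name: pass_percent(r, p) for name, (r, p) in tests.items()}
    print(system, per_test, "total:", round(total_percent(tests), 1))
```

B passes a larger share of each individual test, yet A's total is higher because most of A's runs are on the test both systems pass more often.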

More on Percentages
• Psychological impact
  – 1000% sounds bigger than 10-fold (or 11-fold)
  – Great when both original and final performance are lousy
    • E.g., salary went from $40 to $80 per week
• Small sample sizes generate big lies
• Base should be initial, not final value
  – E.g., price can’t drop 400%
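A short sketch of the arithmetic behind these points, using the initial value as the base. The function names and the price of 100 are hypothetical; the salary figures are the ones on the slide.

```python
def percent_change(initial, final):
    """Change relative to the initial value, in percent."""
    return 100.0 * (final - initial) / initial


def growth_factor(percent_increase):
    """Multiplier corresponding to a percentage increase."""
    return 1.0 + percent_increase / 100.0


print(percent_change(40, 80))   # 100.0: a $40 weekly salary doubled
print(growth_factor(1000))      # 11.0: a 1000% increase is 11-fold
print(percent_change(100, 0))   # -100.0: the largest possible price drop
```

Because the base is the initial value, a price cannot fall by more than 100%; a quoted "400% drop" only arises when the final value is used as the base.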
True Confessions
[Figure: sequential page placement normalized to random placement for static policies (SPEC)]
[Figure: power state policies with random placement normalized to all active memory (SPEC)]

Strategies for Winning a Ratio Game
• Can you win?
• How to win

Can You Win the Ratio Game?
• If one system is better by all measures, a ratio game won’t work
  – But recall percent-passes example
  – And selecting the base lets you change the magnitude of the difference
• If each system wins on some measures, ratio games might be possible (but no promises)
  – May have to try all bases

How to Win Your Ratio Game
• For LB metrics, use your system as the base
• For HB metrics, use the other as a base
• If possible, adjust lengths of benchmarks
  – Elongate when your system performs best
  – Shorten when your system performs worst
  – This gives greater weight to your strengths