Copyright 2002
David M. Hassenzahl
Using r and χ² Statistics for Risk Analysis
Objectives
• Purpose: to compare a model to data
  – "validate model" (or not)
• Two techniques
  – r (correlation coefficient)
  – χ² (chi-squared)
• Apply to a familiar problem (barium decay)
Statistics
• Descriptive
• Comparison
  – Z-scores, hypotheses
  – Confidence levels
  – Evaluating models
  – Correlation and chi-squared
Confidence Levels
• Given 100 flips of a coin, would you bet $1000 that the next flip will yield heads if you had seen
  – 50 heads?
  – 90 heads?
  – 99 heads?
  – 999 heads out of the last 1000 flips?
• How about for $5? For 50% of your current net worth?
Statistical Significance
• Z = (sample occurrence − np) / √(np(1 − p))
  – Numerator: sample occurrence minus (number in sample, n, times expected probability, p)
  – Divide by the square root of np(1 − p)
• Student's t
• One-sided versus two-sided tests!
• "p values"
• Confidence intervals
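As a rough illustration of the Z formula above, the following Python sketch evaluates it for the coin-flip cases from the Confidence Levels slide. It assumes a fair coin (p = 0.5) as the hypothesis being tested; the function name `binomial_z` is hypothetical and not from the slides.

```python
import math


def binomial_z(occurrences: int, n: int, p: float) -> float:
    """Z = (observed count - n*p) / sqrt(n*p*(1-p))."""
    return (occurrences - n * p) / math.sqrt(n * p * (1 - p))


# Heads counts out of 100 flips, compared with a fair coin (assumed p = 0.5).
for heads in (50, 90, 99):
    print(heads, "heads -> Z =", round(binomial_z(heads, 100, 0.5), 1))
```

A count of 50 heads gives Z = 0 (exactly what a fair coin predicts), while 90 heads lies 8 standard deviations away from that expectation.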
Type I and II errors
• Type I: reject the truth! (accuracy)
• Type II: accept an untruth! (precision)
• This is important… there’s often a tradeoff here!
Z-scores Intuition
• Z score will be big if
  – Numerator: xbar >> μ0 OR μ0 >> xbar
  – s is very small
  – n is very big
• Bigger Z-score: confidence that xbar ≠ μ0
• Small Z-score: confidence that xbar ≈ μ0

$$Z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}}$$
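The three effects listed on this slide can be checked numerically. The sketch below uses made-up numbers (not from the slides) and a hypothetical helper `z_score`; it changes one quantity at a time relative to a baseline.

```python
import math


def z_score(xbar: float, mu0: float, s: float, n: int) -> float:
    """Z = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / math.sqrt(n))


# Hypothetical baseline: sample mean 10.5 against mu0 = 10, s = 2, n = 16.
baseline = z_score(10.5, 10.0, 2.0, 16)
bigger_gap = z_score(12.0, 10.0, 2.0, 16)   # xbar much farther from mu0
smaller_s = z_score(10.5, 10.0, 1.0, 16)    # less scatter in the sample
bigger_n = z_score(10.5, 10.0, 2.0, 64)     # more observations

print(baseline, bigger_gap, smaller_s, bigger_n)
```

Each change on its own raises Z above the baseline value.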
From Z's to r's and χ²
• r and χ² compare more than one estimate
• Compare
  – Set of model predictions to
  – Set of data or observations
• If r is SMALL (little correlation) the model doesn't fit
• If χ² is SMALL then the model does fit
“Goodness of fit”
• We say that r and χ² evaluate "goodness of fit"
• Note that a good fit does not mean that the model is right!
Barium Decay
• Theory: barium is removed at a rate that is a constant fraction of its concentration
• "Exponential decay"
• $C(T) = C(0)\,e^{kT}$
  – k = -0.007/min
  – C(0) = 0.16 mg Ba / liter blood
• (From SWRI pages 56–63; hypothetical)
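A minimal sketch of the decay model, using the k and C(0) values given on this slide. The function name `concentration` and the choice of hourly time points are illustrative assumptions.

```python
import math

C0 = 0.16    # mg Ba per liter of blood at time zero (from the slide)
K = -0.007   # per minute (from the slide)


def concentration(t_minutes: float) -> float:
    """C(T) = C(0) * exp(k * T), with T in minutes."""
    return C0 * math.exp(K * t_minutes)


# Model prediction at one-hour intervals (k is per minute, so hours * 60).
for hours in range(9):
    print(hours, "h:", round(concentration(60 * hours), 4), "mg/l")
```

Because k is expressed per minute, time must be converted to minutes before it is passed to the model.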
Exponential Decay Model
Figure 2-9 from Should We Risk It?
[Plot: blood barium concentration (mg/l), axis 0 to 0.16, versus time, axis 0 to 420]
Sample blood at 1 hour intervals
| Time (hours) | Measured concentration (mg/l) |
|---|---|
| 0 | 0.16 |
| 1 | 0.13 |
| 2 | 0.087 |
| 3 | 0.055 |
| 4 | 0.040 |
| 5 | 0.022 |
| 6 | 0.009 |
| 7 | 0.002 |
| 8 | 0.001 |
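Measured and Expected

| Time (hours) | Measured concentration (mg/l) | Predicted concentration (mg/l) |
|---|---|---|
| 0 | 0.16 | 0.16 |
| 1 | 0.13 | 0.11 |
| 2 | 0.087 | 0.070 |
| 3 | 0.055 | 0.045 |
| 4 | 0.040 | 0.030 |
| 5 | 0.022 | 0.020 |
| 6 | 0.009 | 0.013 |
| 7 | 0.002 | 0.0095 |
| 8 | 0.001 | 0.0056 |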
Graphical Comparison
After Figure 2-9 from Should We Risk It?
[Plot: measured and predicted blood barium concentration (mg/l), axis 0 to 0.16, versus time, axis 0 to 420]
How well does the model fit?
• Why do we care?
  – Future predictions
  – Is there a better model?
• Looks OK. Is that good enough?
• Try our two tools: r and χ²
r Conceptual
• Compares model predictions to the data
• Asks
  – "What if there is no relationship (or correlation) between model and data?"
  – Is the model as close to the average value of the x's as it is to the actual x's?
r terms or components
• Predicted mean and standard deviation
• Observed mean and standard deviation
• "Covariance"
  – Do they go up and down together?
  – If independent, covariance = 0
• r = Covariance(predicted, observed) / [(STDEV O) (STDEV P)]
Means
• Observed: xobar = (Σ xoi) / n
• xobar = (0.16 + 0.13 + 0.087 + 0.055 + 0.040 + 0.022 + 0.009 + 0.002 + 0.001) / 9 = 0.056
• Predicted: xpbar = (Σ xpi) / n = 0.051
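The two means can be reproduced from the measured and predicted columns of the hourly table. This is a small sketch; the list names `observed` and `predicted` and the helper `mean` are illustrative.

```python
# Hourly values from the measured/predicted table (mg/l), hours 0 through 8.
observed = [0.16, 0.13, 0.087, 0.055, 0.040, 0.022, 0.009, 0.002, 0.001]
predicted = [0.16, 0.11, 0.070, 0.045, 0.030, 0.020, 0.013, 0.0095, 0.0056]


def mean(values: list[float]) -> float:
    """Arithmetic mean: sum of the values divided by how many there are."""
    return sum(values) / len(values)


print("observed mean:", round(mean(observed), 3))
print("predicted mean:", round(mean(predicted), 3))
```

Rounded to three decimals, these agree with the 0.056 and 0.051 shown on the slide.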
Standard Deviations
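$$s_p = \sqrt{\frac{1}{n-1} \sum_i \left(x_{pi} - \bar{x}_p\right)^2} = 0.051$$

$$s_o = \sqrt{\frac{1}{n-1} \sum_i \left(x_{oi} - \bar{x}_o\right)^2} = 0.056$$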
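For readers who want to recompute the standard deviations, here is a short sketch using Python's standard `statistics.stdev`, which uses the n − 1 (sample) denominator. It works from the rounded hourly table values, so the results may differ in the third decimal place from the figures quoted on the slide.

```python
import statistics

# Hourly values from the measured/predicted table (mg/l), hours 0 through 8.
observed = [0.16, 0.13, 0.087, 0.055, 0.040, 0.022, 0.009, 0.002, 0.001]
predicted = [0.16, 0.11, 0.070, 0.045, 0.030, 0.020, 0.013, 0.0095, 0.0056]

# statistics.stdev divides the sum of squared deviations by n - 1.
s_o = statistics.stdev(observed)
s_p = statistics.stdev(predicted)

print(f"s_o = {s_o:.4f}, s_p = {s_p:.4f}")
```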
Covariance
$$\operatorname{Cov}(o, p) = \frac{1}{n-1} \sum_i \left(x_{oi} - \bar{x}_o\right)\left(x_{pi} - \bar{x}_p\right)$$
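The covariance can be sketched directly from its definition. The example below assumes the n − 1 denominator and uses the rounded hourly table values; the function name `covariance` is hypothetical.

```python
# Hourly values from the measured/predicted table (mg/l), hours 0 through 8.
observed = [0.16, 0.13, 0.087, 0.055, 0.040, 0.022, 0.009, 0.002, 0.001]
predicted = [0.16, 0.11, 0.070, 0.045, 0.030, 0.020, 0.013, 0.0095, 0.0056]


def covariance(xs: list[float], ys: list[float]) -> float:
    """Sample covariance: sum of paired deviation products over n - 1."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    products = ((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sum(products) / (n - 1)


cov_op = covariance(observed, predicted)
print(f"Cov(o, p) = {cov_op:.5f}")
```

The result is positive because measured and predicted concentrations fall together over time.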
Calculated r
• r = Covariance(predicted, observed) / [(STDEV O) (STDEV P)]

$$r = \frac{\operatorname{cov}(o, p)}{s_o \, s_p} = 0.99$$
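A compact way to check r is to note that the 1/(n − 1) factors in the covariance and in the two standard deviations cancel, leaving only sums of deviations. The sketch below uses the rounded hourly table values; variable names are illustrative.

```python
import math

# Hourly values from the measured/predicted table (mg/l), hours 0 through 8.
observed = [0.16, 0.13, 0.087, 0.055, 0.040, 0.022, 0.009, 0.002, 0.001]
predicted = [0.16, 0.11, 0.070, 0.045, 0.030, 0.020, 0.013, 0.0095, 0.0056]

n = len(observed)
mean_o = sum(observed) / n
mean_p = sum(predicted) / n

# Sums of deviation products; the common 1/(n - 1) factor cancels in r.
sum_op = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
sum_oo = sum((o - mean_o) ** 2 for o in observed)
sum_pp = sum((p - mean_p) ** 2 for p in predicted)

r = sum_op / math.sqrt(sum_oo * sum_pp)
print("r =", round(r, 2))
```

Rounded to two decimals this gives 0.99, in line with the slide.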
Intuition Behind r
• If there is no relationship between observed and predicted, r = 0
• If r > 0, positive correlation
• If r < 0, negative correlation
r Discussed
• 0.99 seems reasonably good
• Is there a better fit?
• What about theory?
• Limitations: even low correlations may be okay…just a screening tool
t test for r
• n-2 = 7 degrees of freedom
• Look it up in the Student’s-t table
• Accept model validity at 99% confidence level if Student’s t is greater than 2.998
$$t = \frac{r \sqrt{n-2}}{\sqrt{1 - r^2}}$$
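The t statistic for this example can be evaluated as follows. The sketch plugs in r = 0.99 and n = 9 from the slides and compares the result with the 2.998 threshold quoted above; the function name `t_from_r` is hypothetical.

```python
import math


def t_from_r(r: float, n: int) -> float:
    """t = r * sqrt(n - 2) / sqrt(1 - r^2), with n - 2 degrees of freedom."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)


T_CRITICAL = 2.998  # threshold quoted on the slide for 7 degrees of freedom

t = t_from_r(0.99, 9)
print("t =", round(t, 2), "| exceeds threshold:", t > T_CRITICAL)
```

With these inputs t is far above the threshold, so the correlation would pass the test at the stated confidence level.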
Chi-squared
• This formula “normalizes” to the size of the individual xoi
• If all xoi = xpi, χ² = 0
• Look up value in table (page 398)

$$\chi^2 = \sum_{i=1}^{n} \frac{\left(x_{oi} - x_{pi}\right)^2}{x_{pi}}$$
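The chi-squared sum can be sketched in a few lines. This example uses the rounded hourly table values, so the exact figure depends on that rounding and need not reproduce a hand calculation made from unrounded predictions; the function name `chi_squared` is hypothetical.

```python
# Hourly values from the measured/predicted table (mg/l), hours 0 through 8.
observed = [0.16, 0.13, 0.087, 0.055, 0.040, 0.022, 0.009, 0.002, 0.001]
predicted = [0.16, 0.11, 0.070, 0.045, 0.030, 0.020, 0.013, 0.0095, 0.0056]


def chi_squared(obs: list[float], pred: list[float]) -> float:
    """Sum over points of (observed - predicted)^2 / predicted."""
    return sum((o - p) ** 2 / p for o, p in zip(obs, pred))


chi2 = chi_squared(observed, predicted)
print(f"chi-squared = {chi2:.3f}")
```

Dividing each squared difference by the predicted value is what keeps the large early concentrations from dominating the sum. If every observation equals its prediction, every term is zero and the statistic is exactly 0.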
Chi-squared
• 9 data points
• Suppose we are concerned with the 99% confidence level
• We would need a chi-squared of greater than 21.7 to reject this line
• Calculating, we find that χ² = 0.06!
• Note that it still might be possible to find a better line, even with the exponential
Conclusion
• Both r and Chi-squared appear to validate this model
• Suggests that our theoretical idea about the model may be valid
• Doesn’t tell us we are right, just that we may be acceptably wrong!