Summary Statistics, Jake Blanchard, Spring 2008, Uncertainty Analysis for Engineers
TRANSCRIPT
![Page 1: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/1.jpg)
Uncertainty Analysis for Engineers 1
Summary Statistics
Jake Blanchard
Spring 2008
![Page 2: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/2.jpg)
Summarizing and Interpreting Data
It is useful to have some metrics for summarizing statistical data (both input and output)
Three key characteristics are:
◦ Central tendency (mean, median, mode)
◦ Dispersion (variance)
◦ Shape (skewness, kurtosis)
![Page 3: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/3.jpg)
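Central Tendency
Mean = expected value:
$$E(x) = \sum_{i=1}^{n} x_i\,p(x_i) \quad \text{(discrete)}$$
$$E(x) = \int_{-\infty}^{\infty} x\,f(x)\,dx \quad \text{(continuous)}$$
Median = point $z$ such that exactly half of the probability is associated with lower values and half with greater values:
$$\int_{-\infty}^{z} f(x)\,dx = 0.5$$
Mode = most likely value (maximum of pdf)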
![Page 4: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/4.jpg)
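For One Die
mean:
$$E(x) = \sum_{i=1}^{6} x_i\,p(x_i) = 1\cdot\tfrac{1}{6} + 2\cdot\tfrac{1}{6} + 3\cdot\tfrac{1}{6} + 4\cdot\tfrac{1}{6} + 5\cdot\tfrac{1}{6} + 6\cdot\tfrac{1}{6} = 3.5$$
median:
$$x_{median} = 3.5$$
mode: the slide lists 3.5, but all six faces are equally likely, so a single fair die has no unique mode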
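The fair-die mean and median can be checked numerically. This is a minimal sketch that assumes a uniform pmf over the faces 1 to 6; the variable names are illustrative.

```matlab
x = 1:6;                 % faces of one fair die
p = ones(1,6)/6;         % assumed uniform pmf, p(x_i) = 1/6
meanX = sum(x.*p);       % E(x) = sum of x_i * p(x_i)
medianX = median(x);     % midpoint of the two central faces (3 and 4)
fprintf('mean = %.2f, median = %.2f\n', meanX, medianX);
```

Because every face has the same probability, `median(x)` over the faces coincides with the median of the distribution here; that shortcut would not hold for a non-uniform pmf.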
![Page 5: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/5.jpg)
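Radioactive Decay
For our example, the mean, median, and mode are given by
mean:
$$E(t) = \int_0^{\infty} t\,f(t)\,dt = \int_0^{\infty} t\,\lambda e^{-\lambda t}\,dt = \frac{1}{\lambda}$$
median:
$$\int_0^{z} \lambda e^{-\lambda t}\,dt = 0.5 \quad\Rightarrow\quad z = \frac{\ln(2)}{\lambda}$$
The mode is t=0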
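The decay results can be checked numerically. The sketch below assumes an exponential pdf with a decay constant of 0.5 per unit time, a value chosen only for illustration, and compares numerical integration against the closed forms 1/lambda and ln(2)/lambda.

```matlab
lambda = 0.5;                                  % assumed decay constant (illustrative)
f = @(t) lambda*exp(-lambda*t);                % exponential pdf
meanT = integral(@(t) t.*f(t), 0, Inf);        % E(t), expected to be 1/lambda
% median: solve int_0^z f(t) dt = 0.5 for z on a bracketing interval
medianT = fzero(@(z) integral(f, 0, z) - 0.5, [0, 10/lambda]);
fprintf('mean = %.4f (1/lambda = %.4f)\n', meanT, 1/lambda);
fprintf('median = %.4f (ln(2)/lambda = %.4f)\n', medianT, log(2)/lambda);
```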
![Page 6: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/6.jpg)
Other Characteristics
We can calculate the expected value of any function of our random variable as
$$E[h(x)] = \sum_i h(x_i)\,p(x_i) \quad \text{(discrete)}$$
$$E[h(x)] = \int h(x)\,f(x)\,dx \quad \text{(continuous)}$$
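As an illustration of the discrete form, the sketch below evaluates E[h(x)] for a fair die with the hypothetical choice h(x) = x^2, and compares it with h(E(x)) to show that the two generally differ.

```matlab
x = 1:6;                   % faces of one fair die
p = ones(1,6)/6;           % assumed uniform pmf
h = @(x) x.^2;             % hypothetical function of the random variable
Eh = sum(h(x).*p);         % E[h(x)] = sum of h(x_i) * p(x_i) = 91/6
hE = h(sum(x.*p));         % h(E(x)) = 3.5^2 = 12.25
fprintf('E[h(x)] = %.4f, h(E(x)) = %.4f\n', Eh, hE);
```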
![Page 7: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/7.jpg)
Some Results
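$$E(c) = c$$
$$E(cx) = c\,E(x)$$
$$E\left(\sum_{j=1}^{n} x_j\right) = \sum_{j=1}^{n} E(x_j)$$
$$E\left(\sum_{j=1}^{n} b_j x_j\right) = \sum_{j=1}^{n} b_j\,E(x_j)$$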
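The linearity of expectation can be verified by brute force for two fair dice. The weights in `b` are hypothetical, and the sketch simply enumerates all 36 equally likely outcomes.

```matlab
[x1, x2] = meshgrid(1:6, 1:6);                  % all 36 equally likely outcomes
b = [2 3];                                      % hypothetical weights b_1, b_2
lhs = mean(b(1)*x1(:) + b(2)*x2(:));            % E(b_1*x_1 + b_2*x_2)
rhs = b(1)*mean(x1(:)) + b(2)*mean(x2(:));      % b_1*E(x_1) + b_2*E(x_2)
fprintf('lhs = %.2f, rhs = %.2f\n', lhs, rhs);
```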
![Page 8: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/8.jpg)
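Moments of Distributions
We can define many of these parameters in terms of moments of the distribution
$$\mu_1 = \int x\,f(x)\,dx$$
$$\mu_k = E\left[(x-\mu_1)^k\right] = \int (x-\mu_1)^k f(x)\,dx \quad \text{(continuous)}$$
$$\mu_k = \sum_i (x_i-\mu_1)^k\,p(x_i) \quad \text{(discrete)}$$
The mean is the first moment. The variance is the second moment about the mean.
The third and fourth moments about the mean are related to skewness and kurtosis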
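A short sketch of moments about the mean for a discrete distribution, again assuming a fair die; `centralMoment` is an illustrative helper, not a built-in.

```matlab
x = 1:6;                                        % faces of one fair die
p = ones(1,6)/6;                                % assumed uniform pmf
mu1 = sum(x.*p);                                % first moment (mean)
centralMoment = @(k) sum((x - mu1).^k .* p);    % k-th moment about the mean
mu2 = centralMoment(2);                         % variance, 35/12
mu3 = centralMoment(3);                         % zero up to round-off, by symmetry
fprintf('mu1 = %.4f, mu2 = %.4f, mu3 = %.4f\n', mu1, mu2, mu3);
```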
![Page 9: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/9.jpg)
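Spread (Variance)
Variance is a measure of spread or dispersion
$$\sigma^2 = \mu_2 = E\left[(x-\mu_1)^2\right] = \int (x-\mu_1)^2 f(x)\,dx$$
For discrete data sets, the biased variance is:
$$s^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^2$$
and the unbiased variance is
$$s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i-\bar{x})^2$$
The standard deviation is the square root of the variance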
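The two sample-variance formulas can be compared with MATLAB's built-in `var`, where `var(x,1)` normalizes by n and `var(x)` by n-1. The data vector below is hypothetical.

```matlab
x = [2 4 4 4 5 5 7 9];                      % hypothetical sample
n = numel(x);
xbar = mean(x);
varBiased = sum((x - xbar).^2)/n;           % biased: divide by n
varUnbiased = sum((x - xbar).^2)/(n - 1);   % unbiased: divide by n-1
fprintf('biased = %.4f (var(x,1) = %.4f)\n', varBiased, var(x,1));
fprintf('unbiased = %.4f (var(x) = %.4f)\n', varUnbiased, var(x));
fprintf('std (biased) = %.4f\n', sqrt(varBiased));
```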
![Page 10: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/10.jpg)
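Skewness
Skewness is a measure of asymmetry
$$\mu_3 = E\left[(x-\mu_1)^3\right] = \int (x-\mu_1)^3 f(x)\,dx$$
For discrete data sets, the biased skewness is related to:
$$m_3 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^3$$
The skewness is often defined as
$$\gamma_1 = \frac{\mu_3}{\sigma^3}$$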
![Page 11: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/11.jpg)
Skewness
![Page 12: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/12.jpg)
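Kurtosis
Kurtosis is a measure of peakedness
$$\mu_4 = E\left[(x-\mu_1)^4\right] = \int (x-\mu_1)^4 f(x)\,dx$$
For discrete data sets, the biased kurtosis is related to:
$$m_4 = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar{x})^4$$
The kurtosis is often defined as
$$\gamma_2 = \frac{\mu_4}{\sigma^4} - 3$$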
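Both shape measures can be computed directly from the biased sample moments, using only base MATLAB. The data vector is hypothetical. The Statistics Toolbox also provides `skewness` and `kurtosis`; note that its `kurtosis` does not subtract 3, so it differs from the definition used here by that constant.

```matlab
x = [2 4 4 4 5 5 7 9];               % hypothetical sample
d = x - mean(x);                     % deviations from the sample mean
m2 = mean(d.^2);                     % biased variance
m3 = mean(d.^3);                     % biased third moment about the mean
m4 = mean(d.^4);                     % biased fourth moment about the mean
skew = m3 / m2^(3/2);                % m_3 / s^3
exKurt = m4 / m2^2 - 3;              % m_4 / s^4 - 3
fprintf('skewness = %.5f, kurtosis = %.5f\n', skew, exKurt);
```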
![Page 13: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/13.jpg)
Kurtosis
Pdf of Pearson type VII distribution with kurtosis of infinity (red), 2 (blue), and 0 (black)
![Page 14: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/14.jpg)
Using Matlab
Sample data are the lengths of time a person was able to hold their breath (40 attempts)
Try a scatter plot:
load RobPracticeHolds;                 % data file containing breathholds
y = ones(size(breathholds));           % plot every point at the same height
h1 = figure('Position',[100 100 400 100],'Color','w');
scatter(breathholds,y);
![Page 15: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/15.jpg)
Adding Information
disp(['The mean is ',num2str(mean(breathholds)),' seconds (green line).']);
disp(['The median is ',num2str(median(breathholds)),' seconds (red line).']);
hold all;   % keep the scatter plot while adding the lines
line([mean(breathholds) mean(breathholds)],[0.5 1.5],'color','g');       % vertical line at the mean
line([median(breathholds) median(breathholds)],[0.5 1.5],'color','r');   % vertical line at the median
![Page 16: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/16.jpg)
Box Plot
title('Scatter with Min, 25%iqr, Median, Mean, 75%iqr, & Max lines');
xlabel('');
h3 = figure('Position',[100 100 400 100],'Color','w');
boxplot(breathholds,'orientation','horizontal','widths',.5);
set(gca,'XLim',[40 140]);
title('A Boxplot of the same data');
xlabel('');
set(gca,'Yticklabel',[]);
ylabel('');
![Page 17: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/17.jpg)
Box Plot
[Figure: box plot annotated with Min, Max, Median, and Outlier; the box represents the inter-quartile range (half of the data)]
![Page 18: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/18.jpg)
Empirical cdf
h3 = figure('Position',[100 100 600 400],'Color','w');
cdfplot(breathholds);
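The empirical cdf is the fraction of samples at or below a given value, so it can also be built by hand without the Statistics Toolbox. A minimal sketch, using hypothetical hold times rather than the course data file:

```matlab
x = [62 75 58 90 71 81];           % hypothetical hold times in seconds
ecdfAt = @(t) mean(x <= t);        % fraction of samples less than or equal to t
xs = sort(x);                      % evaluation points: the sorted samples
F = arrayfun(ecdfAt, xs);          % empirical cdf at each sorted sample
disp([xs(:) F(:)]);                % table of value and cumulative fraction
```

Calling `stairs(xs, F)` on these vectors would draw a step plot comparable to the `cdfplot` output.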
![Page 19: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/19.jpg)
Multivariate Data Sets
When there are multiple input variables, we need some additional ways to characterize the data
$$E[h(x,y)] = \int\!\!\int h(x,y)\,f(x,y)\,dx\,dy \quad \text{(continuous)}$$
$$E[h(x,y)] = \sum_i \sum_j h(x_i,y_j)\,p(x_i,y_j) \quad \text{(discrete)}$$
$$Cov(x,y) = E(xy) - E(x)\,E(y)$$
If x and y are independent, then Cov(x,y)=0
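For paired sample data, the covariance identity can be checked against MATLAB's `cov`. The sketch assumes equally weighted samples, so the expectations become sample means, and passes 1 as the third argument to `cov` so that it normalizes by n; the data are hypothetical.

```matlab
x = [1 2 3 4 5];                          % hypothetical input samples
y = [2 4 5 4 5];                          % hypothetical output samples
covXY = mean(x.*y) - mean(x)*mean(y);     % E(xy) - E(x)E(y)
C = cov(x, y, 1);                         % 2-by-2 covariance matrix, normalized by n
fprintf('Cov(x,y) = %.4f (cov gives %.4f)\n', covXY, C(1,2));
```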
![Page 20: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/20.jpg)
Correlation Coefficients
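Two random variables may be related
Define the correlation coefficient of input (x) and output (y) as
$$\rho_{x,y} = \frac{Cov(x,y)}{\sigma_x\,\sigma_y} = \frac{\sum_{k=1}^{m} (x_k-\bar{x})(y_k-\bar{y})}{\sqrt{\sum_{k=1}^{m} (x_k-\bar{x})^2 \, \sum_{k=1}^{m} (y_k-\bar{y})^2}}$$
ρ=1 implies linear dependence, positive slope
ρ=0 implies no linear dependence (the variables may still be related nonlinearly)
ρ=-1 implies linear dependence, negative slope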
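The sample form of the correlation coefficient can be evaluated term by term and compared with `corrcoef`, which returns a 2-by-2 matrix whose off-diagonal entry is the coefficient. The data below are hypothetical.

```matlab
x = [1 2 3 4 5];                                     % hypothetical input samples
y = [2 4 5 4 5];                                     % hypothetical output samples
dx = x - mean(x);                                    % deviations of x from its mean
dy = y - mean(y);                                    % deviations of y from its mean
rho = sum(dx.*dy) / sqrt(sum(dx.^2) * sum(dy.^2));   % sample correlation coefficient
R = corrcoef(x, y);                                  % built-in: R(1,2) is the coefficient
fprintf('rho = %.4f (corrcoef gives %.4f)\n', rho, R(1,2));
```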
![Page 21: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/21.jpg)
Example
[Figure: four scatter plots with correlation coefficients ρ = 0.98, ρ = -0.38, ρ = 1, and ρ = -0.98]
![Page 22: Summary Statistics Jake Blanchard Spring 2008 Uncertainty Analysis for Engineers1](https://reader034.vdocuments.site/reader034/viewer/2022051401/56649d0c5503460f949e0ea1/html5/thumbnails/22.jpg)
Example
x=rand(25,1)-0.5;
y=x;
corrcoef(x,y)
subplot(2,2,1), plot(x,y,'o')
y2=x+0.2*rand(25,1);
corrcoef(x,y2)
subplot(2,2,2), plot(x,y2,'o')
y3=-x+0.2*rand(25,1);
corrcoef(x,y3)
subplot(2,2,3), plot(x,y3,'o')
y4=rand(25,1)-0.5;
corrcoef(x,y4)
subplot(2,2,4), plot(x,y4,'o')