Epidemiology 9509 probability dist’ns (continued)
Epidemiology 9509
Principle of Biostatistics
Chapter 5
Probability Distributions (continued)
John Koval
Department of Epidemiology and Biostatistics
University of Western Ontario
What was covered previously
1. probability
   P(A)
   sets
   P(A and B); P(A or B)
2. probability distributions
   2.1 discrete
       2.1.1 equiprobable
       2.1.2 Bernoulli
       2.1.3 binomial
       2.1.4 Poisson
   2.2 continuous
       2.2.1 uniform
       2.2.2 normal
3. calculating probabilities
   3.1 discrete
       Pr(X = x)
   3.2 continuous
       intervals: Pr(X < a), Pr(a < X < b)
What is being covered now
Using SAS to
1. calculate probabilities
2. calculate and plot probability distributions
Calculating probabilities
SAS function PDF
title 'calculate binomial probability';
data binom1;
  /* Pr(X = 4) for X ~ Bin(n = 10, p = 0.4); arguments are (x, p, n) */
  prob = pdf('binomial', 4, 0.4, 10);
  output;
run;
proc print data=binom1;
run;
binomial probability
calculate binomial probability
Obs prob
1 0.25082
Does this agree with previous calculations?
0.251, Lecture Chapter 5, page 8
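The SAS value can also be cross-checked outside SAS. The following is a minimal Python sketch, not part of the original lecture, that evaluates the binomial formula directly with the standard library. The function name `binom_pmf` is an illustrative choice, and its argument order is (x, n, p), whereas the SAS call above takes (x, p, n).

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ Bin(n, p), computed from the binomial formula."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Same case as the SAS call pdf('binomial', 4, 0.4, 10)
prob = binom_pmf(4, 10, 0.4)
print(round(prob, 5))  # 0.25082
```

Rounded to three decimals this is 0.251, the hand-calculated value quoted from the lecture.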
Calculating probability distribution
title 'calculate binomial probability distribution';
data binom2;
  /* Pr(X = x) for every x = 0, 1, ..., 10, with X ~ Bin(10, 0.4) */
  do x = 0 to 10 by 1;
    prob = pdf('binomial', x, 0.4, 10);
    output;
  end;
run;
proc print data=binom2;
run;
proc gplot data=binom2;
  plot prob*x;
run;
quit;
binomial probability distribution
calculate binomial probability distribution
Obs x prob
1 0 0.00605
2 1 0.04031
3 2 0.12093
4 3 0.21499
5 4 0.25082
6 5 0.20066
7 6 0.11148
8 7 0.04247
9 8 0.01062
10 9 0.00157
11 10 0.00010
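As a sanity check on the table, the eleven probabilities should sum to 1 and peak at x = 4. A minimal Python sketch of that check follows; it is an illustration rather than part of the lecture, and the names `binom_pmf`, `dist`, `total` and `mode` are hypothetical.

```python
from math import comb


def binom_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.4
# Full distribution: one probability for each x = 0, 1, ..., n
dist = {x: binom_pmf(x, n, p) for x in range(n + 1)}

for x, prob in dist.items():
    print(f"{x:2d} {prob:.5f}")

total = sum(dist.values())        # should be 1 up to rounding error
mode = max(dist, key=dist.get)    # most probable value of x
print(total, mode)
```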
GPLOT of pdf
Calculating cumulative probabilities
Pr(X ≤ x): values up to and including x
SAS function CDF
title 'calculate cumulative binomial probability';
data binom3;
  /* Pr(X <= 7) for X ~ Bin(n = 20, p = 0.4) */
  prob = cdf('binomial', 7, 0.4, 20);
  output;
run;
proc print data=binom3;
run;
binomial cumulative distribution calculation
calculate cumulative binomial probability
Obs prob
1 0.41589
Does this agree with previous calculations?
0.4159, using R, Lecture Chapter 5, page 30
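The cumulative probability is just the sum of the individual binomial probabilities from 0 up to and including 7. A minimal Python sketch of that sum follows, assuming only the standard library; `binom_cdf` is an illustrative name, not a SAS or R function.

```python
from math import comb


def binom_cdf(x: int, n: int, p: float) -> float:
    """Pr(X <= x) for X ~ Bin(n, p): sum of Pr(X = k) for k = 0, ..., x."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(x + 1))


# Same case as the SAS call cdf('binomial', 7, 0.4, 20)
prob = binom_cdf(7, 20, 0.4)
print(round(prob, 5))  # compare with 0.41589 from SAS
```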
cumulative continuous probabilities
\[ \Pr(X < b) = \Pr\left(Z_N < \frac{b-\mu}{\sigma}\right) = \Phi\left(\frac{b-\mu}{\sigma}\right) \]
Φ() given by SAS function PROBNORM
example
Recall normal approximation to binomial
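For Bin(20, 0.4): $\mu = n\pi = 8$ and $\sigma = \sqrt{n\pi(1-\pi)} \approx 2.19$
want
\[ \Pr(X_{norm} < 7.5) = \Pr\left(Z_N < \frac{7.5 - 8}{2.19}\right) = \Phi(-0.228) \]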
title 'calculate Normal probability';
data norm1;
  prob = probnorm(-0.228);
  output;
run;
proc print data=norm1;
run;
normal cumulative probability calculation
calculate Normal probability
Obs prob
1 0.40982
Does this agree with previous calculations?
0.4098 by linear interpolation, see lecture Chapter 5, page 30
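The same value can be obtained without tables, because Φ can be written in terms of the error function: Φ(z) = ½[1 + erf(z/√2)]. The Python sketch below is an illustration under that identity, not the lecture's method; `std_normal_cdf` is a hypothetical name standing in for SAS's PROBNORM.

```python
from math import erf, sqrt


def std_normal_cdf(z: float) -> float:
    """Phi(z) for the standard normal, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


# Normal approximation to Bin(20, 0.4), with continuity correction at 7.5
n, p = 20, 0.4
mu = n * p                      # 8
sd = sqrt(n * p * (1 - p))      # about 2.19
z = (7.5 - mu) / sd
print(round(z, 3))              # -0.228

# Same case as the SAS call probnorm(-0.228)
print(round(std_normal_cdf(-0.228), 5))  # compare with 0.40982 from SAS
```

The approximate value 0.4098 is close to, but not the same as, the exact binomial probability 0.4159 calculated earlier with the CDF function.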
Probability of interval
\[ \Pr(17 < X < 22) = \Pr\left(\frac{17-20}{5} < Z_N < \frac{22-20}{5}\right) = \Pr(-0.6 < Z_N < 0.4) = \Phi(0.4) - \Phi(-0.6) \]
title 'calculate Normal probability for interval';
data norm2;
  a = -0.6;
  b = 0.4;
  proba = probnorm(a);
  probb = probnorm(b);
  probint = probb - proba;
  output;
run;
proc print data=norm2;
run;
normal interval probability calculation
calculate Normal probability for interval
Obs a b proba probb probint
1 -0.6 0.4 0.27425 0.65542 0.38117
Does this agree with previous calculations?
0.3809, see lecture Chapter 5, page 26
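The interval probability can also be computed on the original scale, letting the function do the standardization. The Python sketch below is an illustration only: `normal_cdf` and `normal_interval` are hypothetical helper names, and the mean 20 and standard deviation 5 are read off the standardization (17 − 20)/5 shown above.

```python
from math import erf, sqrt


def normal_cdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Pr(X < x) for X ~ N(mu, sigma^2), via the error function."""
    return 0.5 * (1.0 + erf((x - mu) / (sigma * sqrt(2.0))))


def normal_interval(a: float, b: float, mu: float, sigma: float) -> float:
    """Pr(a < X < b) as the difference of two cumulative probabilities."""
    return normal_cdf(b, mu, sigma) - normal_cdf(a, mu, sigma)


# Pr(17 < X < 22) with mean 20 and standard deviation 5
probint = normal_interval(17, 22, mu=20, sigma=5)
print(round(probint, 5))  # compare with 0.38117 from SAS
```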
Plotting normal density function
not usually done in practice
data norm3;
  do x = 0 to 10 by 0.05;
    /* density of a normal with mean 4 and sd 1.55 (about sqrt(2.4)) */
    density = pdf('normal', x, 4, 1.55);
    output;
  end;
run;
proc gplot data=norm3;
  plot density*x;
  symbol interpol=join;
run;
quit;
GPLOT of pdf of Normal N(4,2.4)
normal approximation to binomial
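title 'Normal approximation to binomial';
data normbinom;
  n = 20;
  pi = 0.4;
  mu = n*pi;                /* binomial mean */
  var = n*pi*(1-pi);        /* binomial variance */
  sd = sqrt(var);
  /* fine grid so that the binomial pmf plots as a step function */
  do i = 0 to 20.975 by 0.025;
    binompdf = pdf('binomial', floor(i), pi, n);
    x = i - 0.5;            /* centre each step on its integer value */
    normpdf = pdf('normal', x, mu, sd);
    output normbinom;
  end;
run;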
normal approximation to binomial (continued)
proc gplot data=normbinom;
  plot binompdf * x normpdf * x /
       haxis=-1 to 21 by 1 vaxis=0 to 0.2 by 0.05 overlay;
  symbol interpol=join;
run;
quit;
GPLOT of normal approximation to Bin(20,0.4)
another normal approximation to binomial
non-symmetric distribution Bin(10,.2)
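data normbinom2;
  n = 10;
  pi = 0.2;
  mu = n*pi;                /* binomial mean */
  var = n*pi*(1-pi);        /* binomial variance */
  sd = sqrt(var);
  /* fine grid so that the binomial pmf plots as a step function */
  do i = 0 to 10.975 by 0.025;
    binompdf = pdf('binomial', floor(i), pi, n);
    x = i - 0.5;            /* centre each step on its integer value */
    normpdf = pdf('normal', x, mu, sd);
    output normbinom2;
  end;
run;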
non-symmetric distribution (continued)
proc gplot data=normbinom2;
  plot binompdf * x normpdf * x /
       haxis=-1 to 11 by 1 vaxis=0 to 0.5 by 0.05 overlay;
  symbol interpol=join;
run;
quit;
Normal approximation to Bin(10,0.2)
the original distribution is asymmetric, so the normal curve is not a good fit
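The difference in fit between the two examples can be put into numbers as well as pictures. The Python sketch below is an illustration, not part of the lecture: it compares the binomial probability with the matching normal density at each integer, and also reports the standard binomial skewness (1 − 2π)/√(nπ(1 − π)). The names `max_abs_gap` and `skewness` are hypothetical, and the "largest gap at the integers" is just one simple, assumed measure of fit.

```python
from math import comb, exp, pi, sqrt


def binom_pmf(x: int, n: int, p: float) -> float:
    """Pr(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


def normal_pdf(x: float, mu: float, sd: float) -> float:
    """Density of a normal distribution with mean mu and standard deviation sd."""
    return exp(-0.5 * ((x - mu) / sd) ** 2) / (sd * sqrt(2 * pi))


def max_abs_gap(n: int, p: float) -> float:
    """Largest gap between binomial pmf and normal density over x = 0, ..., n."""
    mu, sd = n * p, sqrt(n * p * (1 - p))
    return max(abs(binom_pmf(k, n, p) - normal_pdf(k, mu, sd)) for k in range(n + 1))


def skewness(n: int, p: float) -> float:
    """Skewness of Bin(n, p); zero for a symmetric distribution."""
    return (1 - 2 * p) / sqrt(n * p * (1 - p))


for n, p in [(20, 0.4), (10, 0.2)]:
    print(n, p, round(max_abs_gap(n, p), 4), round(skewness(n, p), 3))
```

Under this measure Bin(10, 0.2) shows both the larger skewness and the larger gap, which matches what the two plots suggest.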