binomial normal distribution
TRANSCRIPT
-
7/30/2019 Binomial Normal Distribution
1/47
Normal Distribution
-
Normal Distribution
Why are normal distributions so important?
Many dependent variables are commonly assumed
to be normally distributed in the population
If a variable is approximately normally distributed we can make inferences about values of that variable
Example: Sampling distribution of the mean
-
TYPES OF HISTOGRAM
-
Normal Distribution
Symmetrical, bell-shaped curve
Also known as Gaussian distribution
Point of inflection = 1 standard deviation from mean
Mathematical formula:
$f(X) = \frac{1}{\sigma\sqrt{2\pi}} \, e^{-\frac{(X-\mu)^2}{2\sigma^2}}$
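The density formula can be evaluated directly. The following is a minimal sketch, assuming the standard form of the normal density with mean mu and standard deviation sigma; the function name `normal_pdf` is made up for illustration.

```python
import math


def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -((x - mu) ** 2) / (2.0 * sigma ** 2)
    return coeff * math.exp(exponent)


# Height of the standard normal curve (mu = 0, sigma = 1) at its centre
peak = normal_pdf(0.0, 0.0, 1.0)

# The curve is symmetrical about the mean
left = normal_pdf(-1.0, 0.0, 1.0)
right = normal_pdf(1.0, 0.0, 1.0)
```

The peak height is 1/sqrt(2*pi), roughly 0.399, and the values one standard deviation either side of the mean are equal.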
-
Since we know the shape of the curve, we can
calculate the area under the curve
The percentage of that area can be used to
determine the probability that a given value could be pulled from a given distribution
The area under the curve tells us about the probability; in
other words, we can obtain a p-value for our result (data)
by treating it as a normally distributed data set.
-
Problem:
Each normal distribution with its own values of
$\mu$ and $\sigma$ would need its own calculation of the area
under various points on the curve
-
The normal bell-shaped curve: $\mu = 100$, $\sigma^2 = 10$
[Figure: density f(x) plotted against x, for x from 90 to 110]
-
Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 5$, $\sigma^2 = 1$)
[Figure: the two densities plotted against x, for x from -2 to 8]
-
Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 0$, $\sigma^2 = 2$)
[Figure: the two densities plotted against x, for x from -3 to 3]
-
Normal curves: ($\mu = 0$, $\sigma^2 = 1$) and ($\mu = 2$, $\sigma^2 = 0.25$)
[Figure: the two densities plotted against x, for x from -2 to 8]
-
The standard normal curve: $\mu = 0$ and $\sigma^2 = 1$
[Figure: density plotted against x, for x from -3 to 3]
-
Normal Probability Distributions
Standard Normal Distribution N(0,1)
We agree to use the standard normal distribution
Bell shaped
$\mu = 0$
$\sigma = 1$
Note: not all bell-shaped distributions are normal distributions
-
Normal Distribution
The standard normal distribution will allow us
to make claims about the probabilities of
values related to our own data
How do we apply the standard normal
distribution to our data?
-
Z-score
If we know the population mean and
population standard deviation, for any value
of X we can compute a z-score by subtracting
the population mean and dividing the result by the population standard deviation:
$z = \frac{X - \mu}{\sigma}$
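As a minimal sketch of the z-score calculation: the numbers below (a population mean of 100, a standard deviation of 5, and an observed value of 110) are hypothetical values chosen only for illustration.

```python
def z_score(x, mu, sigma):
    """Number of population standard deviations that x lies from the population mean."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    return (x - mu) / sigma


# Hypothetical example: X = 110 from a population with mean 100 and SD 5
z = z_score(110.0, 100.0, 5.0)
```

Here the observed value sits two standard deviations above the mean, so it can be looked up on the standard normal curve at z = 2.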
-
Probabilities are depicted by areas under the curve
Total area under the curve is 1
The area in red is equal to p(z > 1)
The area in blue is equal to p(-1 < z
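Areas under the standard normal curve can be computed from the error function in Python's standard library. This is a sketch rather than the method used on the slide; the helper names are made up, and the interval from -1 to 1 is an illustrative choice, not necessarily the shaded region in the figure.

```python
import math


def std_normal_cdf(z):
    """Area under the standard normal curve to the left of z."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def area_between(a, b):
    """Area under the standard normal curve between a and b (a <= b)."""
    return std_normal_cdf(b) - std_normal_cdf(a)


# Upper-tail area: p(z > 1)
p_z_greater_1 = 1.0 - std_normal_cdf(1.0)

# Illustrative interval: area within one standard deviation of the mean
p_within_1 = area_between(-1.0, 1.0)
```

The upper-tail area comes out near 0.159, and the area within one standard deviation of the mean is about 0.683.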
-
Binomial Probability Distribution
A fixed number of observations (trials), n
e.g., 15 tosses of a coin; 20 patients; 1000 people surveyed
A binary random variable
e.g., head or tail in each toss of a coin; defective or not defective light bulb
Generally called "success" and "failure"
Probability of success is p, probability of failure is 1 - p
Constant probability for each observation
e.g., Probability of getting a tail is the same each time we
toss the coin
-
Binomial example
Take the example of 5 coin tosses. What's the
probability that you flip exactly 3 heads in 5
coin tosses?
-
Binomial distribution
Solution:
One way to get exactly 3 heads: HHHTT
What's the probability of this exact arrangement?
P(heads) x P(heads) x P(heads) x P(tails) x P(tails) = $(1/2)^3 \times (1/2)^2$
Another way to get exactly 3 heads: THHHT
Probability of this exact outcome = $(1/2)^1 \times (1/2)^3 \times (1/2)^1 = (1/2)^3 \times (1/2)^2$
-
Binomial distribution
In fact, $(1/2)^3 \times (1/2)^2$ is the probability of each
unique outcome that has exactly 3 heads and 2
tails.
So, the overall probability of 3 heads and 2 tails is:
$(1/2)^3 \times (1/2)^2 + (1/2)^3 \times (1/2)^2 + (1/2)^3 \times (1/2)^2 + \dots$
for as many unique arrangements as there are. But how many are there?
-
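Outcome    Probability
THHHT      $(1/2)^3 \times (1/2)^2$
HHHTT      $(1/2)^3 \times (1/2)^2$
TTHHH      $(1/2)^3 \times (1/2)^2$
HTTHH      $(1/2)^3 \times (1/2)^2$
HHTTH      $(1/2)^3 \times (1/2)^2$
THTHH      $(1/2)^3 \times (1/2)^2$
HTHTH      $(1/2)^3 \times (1/2)^2$
HHTHT      $(1/2)^3 \times (1/2)^2$
THHTH      $(1/2)^3 \times (1/2)^2$
HTHHT      $(1/2)^3 \times (1/2)^2$
Total: 10 arrangements x $(1/2)^3 \times (1/2)^2$
$(1/2)^3 \times (1/2)^2$ is the probability of each unique outcome (note: they are all equal)
$\binom{5}{3}$ = ways to arrange 3 heads in 5 trials
$\binom{5}{3}$ = 5C3 = 5!/3!2! = 10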
-
P(3 heads and 2 tails) = $\binom{5}{3}$ x P(heads)$^3$ x P(tails)$^2$ = 10 x $(1/2)^5$ = 31.25%
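The count of 10 arrangements can be checked by brute force. The sketch below lists all 32 possible sequences of 5 tosses, keeps those with exactly 3 heads, and compares the count with 5C3; it assumes a fair coin, as in the example.

```python
from itertools import product
from math import comb

# Every possible sequence of 5 tosses, e.g. "HHHTT"
outcomes = ["".join(seq) for seq in product("HT", repeat=5)]

# Keep the sequences with exactly 3 heads
three_heads = [o for o in outcomes if o.count("H") == 3]
n_arrangements = len(three_heads)

# Each arrangement has probability (1/2)^3 x (1/2)^2
prob_three_heads = n_arrangements * (0.5 ** 3) * (0.5 ** 2)

# The same count from the formula 5!/(3!2!)
n_from_formula = comb(5, 3)
```

Enumeration and the formula agree on 10 arrangements, which gives the probability 10/32 = 0.3125.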
-
Binomial distribution function: X = the number of heads tossed in 5 coin tosses
[Figure: p(x) plotted against the number of heads, x = 0 to 5]
-
Example 2
As voters exit the polls, you ask a representative
random sample of 6 voters if they voted for
proposition 100. If the true percentage of voters who
vote for the proposition is 55.1%, what is the probability that, in your sample, exactly 2 voted for
the proposition and 4 did not?
-
Solution:
Outcome    Probability
YYNNNN     $(.551)^2 \times (.449)^4$
NYYNNN     $(.449)^1 \times (.551)^2 \times (.449)^3 = (.551)^2 \times (.449)^4$
NNYYNN     $(.449)^2 \times (.551)^2 \times (.449)^2 = (.551)^2 \times (.449)^4$
NNNYYN     $(.449)^3 \times (.551)^2 \times (.449)^1 = (.551)^2 \times (.449)^4$
NNNNYY     $(.449)^4 \times (.551)^2 = (.551)^2 \times (.449)^4$
...
$\binom{6}{2}$ = ways to arrange 2 yes votes among 6 voters = 15
Total: 15 arrangements x $(.551)^2 \times (.449)^4$
P(2 yes votes exactly) = $\binom{6}{2}$ x $(.551)^2 \times (.449)^4$ = 18.5%
-
Binomial distribution, generally
Note the general pattern emerging: if you have only two possible outcomes (call them 1/0 or yes/no or success/failure) in n independent
trials, then the probability of exactly X successes =
$\binom{n}{X} \, p^X (1-p)^{n-X}$
n = number of trials
X = # successes out of n trials
p = probability of success
1-p = probability of failure
-
Definitions: Binomial
Binomial: Suppose that n independent experiments, or trials,
are performed, where n is a fixed number, and that each
experiment results in a "success" with probability p and a
"failure" with probability 1-p. The total number of successes,
X, is a binomial random variable with parameters n and p.
We write: X ~ Bin (n, p) {reads: "X is distributed binomially
with parameters n and p"}
And the probability that X = r (i.e., that there are exactly r
successes) is:
$P(X = r) = \binom{n}{r} \, p^r (1-p)^{n-r}$
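The binomial probability formula translates almost directly into code. This is a minimal sketch using `math.comb` for the binomial coefficient; the function name `binom_pmf` is an illustrative choice. It is checked against the two worked examples from the earlier slides (3 heads in 5 tosses, and 2 yes votes among 6 voters).

```python
from math import comb


def binom_pmf(r, n, p):
    """P(X = r) for X ~ Bin(n, p): exactly r successes in n independent trials."""
    if not 0 <= r <= n:
        return 0.0
    return comb(n, r) * p ** r * (1 - p) ** (n - r)


# 3 heads in 5 tosses of a fair coin
p_three_heads = binom_pmf(3, 5, 0.5)

# 2 yes votes among 6 voters when 55.1% of all voters vote yes
p_two_yes = binom_pmf(2, 6, 0.551)
```

These come out at 0.3125 and about 0.185, matching the 31.25% and 18.5% found by counting arrangements.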
-
Definitions: Bernoulli
Bernoulli trial: If there is only 1 trial with
probability of success p and probability of
failure 1-p, this is called a Bernoulli
distribution. (special case of the binomial with n=1)
Probability of success: $P(X = 1) = \binom{1}{1} \, p^1 (1-p)^{1-1} = p$
Probability of failure: $P(X = 0) = \binom{1}{0} \, p^0 (1-p)^{1-0} = 1 - p$
-
Binomial distribution: example
If I toss a coin 20 times, what's the
probability of getting exactly 10 heads?
$\binom{20}{10} (.5)^{10} (.5)^{10} = .176$
-
Binomial distribution: example
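If I toss a coin 20 times, what's the
probability of getting 2 or fewer
heads?
$P(X = 0) = \binom{20}{0} (.5)^0 (.5)^{20} = \frac{20!}{20!\,0!} (.5)^{20} = 9.5 \times 10^{-7}$
$P(X = 1) = \binom{20}{1} (.5)^1 (.5)^{19} = \frac{20!}{19!\,1!} (.5)^{20} = 20 \times 9.5 \times 10^{-7} = 1.9 \times 10^{-5}$
$P(X = 2) = \binom{20}{2} (.5)^2 (.5)^{18} = \frac{20!}{18!\,2!} (.5)^{20} = 190 \times 9.5 \times 10^{-7} = 1.8 \times 10^{-4}$
Sum: $P(X \le 2) = (1 + 20 + 190) \times 9.5 \times 10^{-7} \approx 2.0 \times 10^{-4}$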
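The three terms can be added exactly with rational arithmetic, which avoids any rounding in the intermediate steps. This is a sketch using `fractions.Fraction` from the standard library, assuming a fair coin as in the example.

```python
from fractions import Fraction
from math import comb

n = 20
half = Fraction(1, 2)

# P(X = k) for k = 0, 1, 2, kept as exact fractions
terms = [comb(n, k) * half ** k * half ** (n - k) for k in range(3)]

# P(X <= 2) is the sum of the three terms
p_two_or_fewer = sum(terms)
p_as_float = float(p_two_or_fewer)
```

The exact total is (1 + 20 + 190) / 2^20 = 211/1048576, which is roughly 2.0 x 10^-4.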
-
Things that follow a binomial
distribution
Cohort study (or cross-sectional):
The number of exposed individuals in your sample that
develop the disease
The number of unexposed individuals in your sample that
develop the disease
Case-control study:
The number of cases that have had the exposure
The number of controls that have had the exposure
-
Answer
1. You are performing a cohort study. If the probability of developing disease in the
exposed group is .05 for the study duration, then if you sample (randomly) 500
exposed people, how many do you expect to develop the disease? Give a margin
of error (+/- 1 standard deviation) for your estimate.
X ~ binomial (500, .05)
E(X) = 500 (.05) = 25
Var(X) = 500 (.05) (.95) = 23.75
StdDev(X) = square root (23.75) = 4.87
25 ± 4.87
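The same arithmetic as a short sketch, using the formulas E(X) = np and Var(X) = np(1-p) applied in the answer; the variable names are illustrative.

```python
import math

n, p = 500, 0.05  # sample size and probability of developing disease

mean = n * p                    # expected number of cases
variance = n * p * (1 - p)      # variance of the number of cases
std_dev = math.sqrt(variance)   # standard deviation

# Margin of error of +/- 1 standard deviation around the expected count
low, high = mean - std_dev, mean + std_dev
```

This gives an expected 25 cases with a one-standard-deviation range of roughly 20.1 to 29.9.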
-
Answer
2. What's the probability that at most 10 exposed subjects
develop the disease?
This is asking for a CUMULATIVE PROBABILITY: the probability of 0 getting the
disease or 1 or 2 or 3 or 4 or up to 10.
P(X ≤ 10) = P(X=0) + P(X=1) + P(X=2) + P(X=3) + P(X=4) + ... + P(X=10) =
$\binom{500}{0} (.05)^0 (.95)^{500} + \binom{500}{1} (.05)^1 (.95)^{499} + \binom{500}{2} (.05)^2 (.95)^{498} + \dots + \binom{500}{10} (.05)^{10} (.95)^{490} < .01$
(we'll learn how to approximate this long sum next week)
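Although the sum is tedious by hand, a computer can add the eleven terms directly. The following is a rough sketch with a made-up helper name, `binom_cdf`; it evaluates the exact binomial sum and does not use the approximation mentioned for next week.

```python
from math import comb


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Bin(n, p), summing the individual probabilities."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))


# Probability that at most 10 of the 500 exposed subjects develop the disease
p_at_most_10 = binom_cdf(10, 500, 0.05)
```

The result is a very small probability, well under .01, which makes sense because 10 is about three standard deviations below the expected count of 25.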
-
A brief distraction: Pascal's Triangle Trick
You'll rarely calculate the binomial by hand. However, it is good to know how to.
Pascal's Triangle Trick for calculating binomial coefficients
Recall from math in your past that Pascal's Triangle is used to get the coefficients for binomial expansion
For example, to expand: $(p + q)^5$
The powers follow a set pattern: $p^5 + p^4q^1 + p^3q^2 + p^2q^3 + p^1q^4 + q^5$
But what are the coefficients?
Use Pascal's Magic Triangle
-
Pascal's Triangle
1
1 1
1 2 1
1 3 3 1
1 4 6 4 1
1 5 10 10 5 1
1 6 15 20 15 6 1
1 7 21 35 35 21 7 1
Edges are all 1's
Add the two numbers in the row above to get the number below, e.g.: 3+1=4; 5+10=15
To get the coefficients for expanding to the 5th power, use the row that starts with 1 5.
$(p + q)^5 = 1p^5 + 5p^4q^1 + 10p^3q^2 + 10p^2q^3 + 5p^1q^4 + 1q^5$
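The rule for building the triangle (edges are 1, each inner number is the sum of the two above it) can be sketched in a few lines. The function name `pascal_rows` is illustrative, and the comparison with `math.comb` is included only as a cross-check.

```python
from math import comb


def pascal_rows(n_rows):
    """Return the first n_rows rows of Pascal's triangle as lists of integers."""
    if n_rows < 1:
        return []
    rows = [[1]]
    for _ in range(n_rows - 1):
        prev = rows[-1]
        # Each inner entry is the sum of the two neighbouring entries in the row above
        inner = [prev[i] + prev[i + 1] for i in range(len(prev) - 1)]
        rows.append([1] + inner + [1])
    return rows


triangle = pascal_rows(8)

# Row used to expand (p + q)^5; rows are indexed from 0
row_5 = triangle[5]

# The same coefficients from the factorial formula
row_5_from_formula = [comb(5, k) for k in range(6)]
```

Both approaches give the coefficients 1, 5, 10, 10, 5, 1 for the 5th power.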
-
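For example, X = # heads in 5 coin tosses:
X    P(X)                              Coefficient                               Value
0    $\binom{5}{0} (.5)^0 (.5)^5$      $\binom{5}{0}$ = 5!/0!5! = 1              $1(.5)^5$
1    $\binom{5}{1} (.5)^1 (.5)^4$      $\binom{5}{1}$ = 5!/1!4! = 5              $5(.5)^5$
2    $\binom{5}{2} (.5)^2 (.5)^3$      $\binom{5}{2}$ = 5!/2!3! = 5x4/2 = 10     $10(.5)^5$
3    $\binom{5}{3} (.5)^3 (.5)^2$      $\binom{5}{3}$ = 5!/3!2! = 10             $10(.5)^5$
4    $\binom{5}{4} (.5)^4 (.5)^1$      $\binom{5}{4}$ = 5!/4!1! = 5              $5(.5)^5$
5    $\binom{5}{5} (.5)^5 (.5)^0$      $\binom{5}{5}$ = 5!/5!0! = 1              $1(.5)^5$
Total: $32(.5)^5 = 1.0$
The coefficients come from line 5 of Pascal's triangle! (Note the symmetry!)
Same coefficients for X ~ Bin(5, p)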
-
Relationship between binomial probability
distribution and binomial expansion
If p + q = 1 (which is the case if they are binomial probabilities)
then: $(p + q)^5 = (1)^5 = 1$ or, equivalently:
$1p^5 + 5p^4q^1 + 10p^3q^2 + 10p^2q^3 + 5p^1q^4 + 1q^5 = 1$
(the probabilities sum to 1, making it a
probability distribution!)
Reading left to right, the six terms are P(X=0), P(X=1), P(X=2), P(X=3), P(X=4), P(X=5), where X counts the outcomes that have probability q.
-
Introduction to the Poisson Distribution
Poisson distribution is for counts: if events happen at a constant rate over time, the Poisson distribution gives the probability of X number of events occurring in
time T.
-
Poisson Mean and Variance
Mean: $\mu = \lambda$
Variance and Standard Deviation: $\sigma^2 = \lambda$, $\sigma = \sqrt{\lambda}$
where $\lambda$ = expected number of hits in a given
time period
For a Poisson random variable, the variance and mean are the same!
-
Poisson Distribution, example
The Poisson distribution models counts, such as the number of new cases
of SARS that occur in women in New England next month.
The distribution tells you the probability of all possible numbers of new
cases, from 0 to infinity.
If X = # of new cases next month and X ~ Poisson ($\lambda$), then the probability that X = k (a particular count) is:
$p(X = k) = \frac{\lambda^k e^{-\lambda}}{k!}$
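A minimal sketch of the Poisson probability formula follows. The function name `poisson_pmf` and the rate of 2 events per period are illustrative assumptions; the sum over many counts is included only to show that the probabilities add up to 1.

```python
import math


def poisson_pmf(k, lam):
    """P(X = k) for X ~ Poisson(lam), where lam is the expected count."""
    if k < 0:
        return 0.0
    return lam ** k * math.exp(-lam) / math.factorial(k)


lam = 2.0  # illustrative expected number of events per period

# Probabilities of 0 through 4 events
first_five = [poisson_pmf(k, lam) for k in range(5)]

# Counts can run from 0 to infinity; 50 terms is already enough to reach 1
total = sum(poisson_pmf(k, lam) for k in range(50))
```

With an expected count of 2, the first few probabilities are roughly .135, .27, .27, .18 and .09.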
-
Example
For example, if new cases of West Nile Virus in
New England are occurring at a rate of about 2
per month, then these are the probabilities
that 0, 1, 2, 3, 4, 5, 6, ... up to 1000, up to 1 million cases will occur in New England in the next
month:
-
Poisson Probability table
X    P(X)
0    $\frac{2^0 e^{-2}}{0!}$ = .135
1    $\frac{2^1 e^{-2}}{1!}$ = .27
2    $\frac{2^2 e^{-2}}{2!}$ = .27
3    $\frac{2^3 e^{-2}}{3!}$ = .18
4    = .09
5    ...
-
Example: Poisson distribution
Suppose that a rare disease has an incidence of 1 in 1000 person-
years. Assuming that members of the population are affected
independently, find the probability of k cases in a population of
10,000 (followed over 1 year) for k=0,1,2.
The expected value (mean) = $\lambda$ = .001*10,000 = 10
10 new cases expected in this population per year
$P(X = 0) = \frac{(10)^0 e^{-(10)}}{0!} = .0000454$
$P(X = 1) = \frac{(10)^1 e^{-(10)}}{1!} = .000454$
$P(X = 2) = \frac{(10)^2 e^{-(10)}}{2!} = .00227$
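As a rough check on these numbers, the sketch below computes the three Poisson probabilities and, for comparison, the corresponding exact binomial probabilities for 10,000 independent people each with a .001 chance of disease. The binomial comparison is an added illustration, not part of the original solution.

```python
import math
from math import comb

n, p = 10_000, 0.001   # population size and incidence per person-year
lam = n * p            # expected number of cases (10)

# Poisson probabilities of k = 0, 1, 2 cases
poisson_probs = [lam ** k * math.exp(-lam) / math.factorial(k) for k in range(3)]

# Exact binomial probabilities for the same counts
binom_probs = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(3)]

# Relative difference between the two models for each k
rel_diffs = [abs(a - b) / a for a, b in zip(poisson_probs, binom_probs)]
```

The two sets of probabilities agree to within about 1%, which is why the Poisson model is convenient for rare events in a large population.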
-
more on Poisson
Poisson Process (rates)
Note that the Poisson parameter $\lambda$ can be given as the mean number of events that occur in a defined time
period OR, equivalently, $\lambda$ can be given as a rate, such as $\lambda$ = 2/month (2 events per 1 month), that must be multiplied by t = time (called a "Poisson Process")
X ~ Poisson ($\lambda$)
$P(X = k) = \frac{(\lambda t)^k e^{-\lambda t}}{k!}$
E(X) = $\lambda$t
Var(X) = $\lambda$t
-
Example
For example, if new cases of West Nile in New England are occurring at a rate of about 2 per month, then what's the probability that exactly 4
cases will occur in the next 3 months?
X ~ Poisson ($\lambda$ = 2/month)
$P(X = 4 \text{ in 3 months}) = \frac{(2 \cdot 3)^4 e^{-(2 \cdot 3)}}{4!} = \frac{6^4 e^{-(6)}}{4!} = 13.4\%$
Exactly 6 cases?
$P(X = 6 \text{ in 3 months}) = \frac{(2 \cdot 3)^6 e^{-(2 \cdot 3)}}{6!} = \frac{6^6 e^{-(6)}}{6!} = 16\%$
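The rate-times-time version of the formula can be sketched as follows. The function name `poisson_process_pmf` is illustrative; the rate of 2 per month and the window of 3 months come from the example.

```python
import math


def poisson_process_pmf(k, rate, t):
    """P(exactly k events in time t) when events occur at a constant rate."""
    mean = rate * t  # expected number of events in the window
    return mean ** k * math.exp(-mean) / math.factorial(k)


rate_per_month = 2.0
months = 3.0

p_four_cases = poisson_process_pmf(4, rate_per_month, months)
p_six_cases = poisson_process_pmf(6, rate_per_month, months)
```

These evaluate to about 0.134 and 0.161, in line with the 13.4% and 16% above.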
-
Practice problems
1a. If calls to your cell phone are a Poisson process
with a constant rate $\lambda$ = 2 calls per hour, what's the
probability that, if you forget to turn your phone off
in a 1.5 hour movie, your phone rings during that time?
1b. How many phone calls do you expect to get
during the movie?
-
Answer
1a. If calls to your cell phone are a Poisson process with a constant rate $\lambda$ = 2 calls per hour, what's the probability that, if you forget to turn your phone off in a 1.5 hour movie, your phone rings during that time?
X ~ Poisson ($\lambda$ = 2 calls/hour)
P(X ≥ 1) = 1 - P(X=0)
$P(X = 0) = \frac{(2 \cdot 1.5)^0 e^{-2(1.5)}}{0!} = \frac{(3)^0 e^{-3}}{0!} = e^{-3} = .05$
P(X ≥ 1) = 1 - .05 = 95% chance
1b. How many phone calls do you expect to get during the movie?
E(X) = $\lambda$t = 2(1.5) = 3
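Both answers can be reproduced with the complement rule, P(at least one call) = 1 - P(no calls). This is a small sketch under the problem's stated assumptions (a constant rate of 2 calls per hour and a 1.5 hour movie); the variable names are illustrative.

```python
import math

rate_per_hour = 2.0   # calls per hour
movie_hours = 1.5     # length of the movie

# 1b: expected number of calls during the movie
expected_calls = rate_per_hour * movie_hours

# 1a: the k = 0 term of the Poisson formula reduces to exp(-rate * t)
p_no_calls = math.exp(-expected_calls)
p_at_least_one_call = 1.0 - p_no_calls
```

The phone stays silent with probability of about .05, so it rings at least once with probability of about .95, and 3 calls are expected.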