Probability distributions
• Binomial probability distribution (ASW, section 5.4)
• Using Excel for the binomial (ASW, pp. 222-223)
• Uniform probability distribution (ASW, section 6.1)
• Normal probability distribution (ASW, section 6.2)
Bring the text to class on Monday and Wednesday, Sept. 29 and October 1. We will be using Tables 1 and 5 of Appendix B of ASW.
Notes for September 29, 2008
Variance (ASW, 195)
The variance of a probability distribution is the expected value of the squares of the differences of the random variable x from the mean μ. Symbolically,
Var(x) = σ² = ∑(x – μ)² f(x)
The Greek symbol σ is “sigma.”
The variance can be difficult to calculate and interpret. It is in units that are the square of the units of the random variable x. Partly because of this, in statistical work it is more common to use the square root of the variance, the standard deviation σ. The standard deviation has the same units as x.
Variance of x, number of females selected
x      f(x)          x f(x)   x – μ   (x – μ)²   (x – μ)² f(x)
0      1/8 = 0.125   0.000    -1.5    2.25       0.28125
1      3/8 = 0.375   0.375    -0.5    0.25       0.09375
2      3/8 = 0.375   0.750     0.5    0.25       0.09375
3      1/8 = 0.125   0.375     1.5    2.25       0.28125
Total  8/8 = 1.000   1.500                       0.75000
If a random sample of 3 persons is obtained from a large population composed of half females and half males, the expected number of females selected is μ = 1.5. The variance of the number of females selected is Var(x) = σ² = ∑(x – μ)² f(x) = 0.75. The standard deviation is the square root of 0.75, so that σ = 0.866.
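The calculation in the table can be checked with a few lines of Python. This is only a sketch of the arithmetic; the variable names (`f`, `mu`, `var`, `sigma`) are chosen here for illustration and are not from ASW.

```python
from math import sqrt

# Distribution of x = number of females selected in a sample of 3,
# taken from the table in the notes.
f = {0: 1 / 8, 1: 3 / 8, 2: 3 / 8, 3: 1 / 8}

mu = sum(x * p for x, p in f.items())                # expected value
var = sum((x - mu) ** 2 * p for x, p in f.items())   # variance
sigma = sqrt(var)                                     # standard deviation

print(mu, var, round(sigma, 3))
```

The three printed values correspond to μ, σ² and σ in the paragraph above.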
Sample and population variance
• The variance of a probability distribution is (ASW, 195)
$$\mathrm{Var}(x) = \sigma^2 = \sum (x - \mu)^2 f(x)$$
• This is equivalent to the variance of a population (ASW, 92)
$$\sigma^2 = \frac{\sum (x_i - \mu)^2}{N}$$
Note that the variance of a sample is
$$s^2 = \frac{\sum (x_i - \bar{x})^2}{n - 1}$$
Unbiased estimator
• The expected value of s² is equal to σ², a characteristic that is referred to as an unbiased estimate. That is,
$$E(s^2) = \sigma^2$$
• Using (n-1) in the denominator of s², rather than n, produces this unbiased estimate.
• The concept of biased and unbiased estimators is important in constructing good estimators and is a major consideration in econometric work.
• When using Excel to estimate mean and standard deviation, make sure you use the proper formulae.
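The unbiasedness claim can be illustrated by brute force. The sketch below assumes a small hypothetical population (the four values are invented for illustration, not taken from the notes) and lists every possible sample of size 2 drawn with replacement. Averaging s² over all samples recovers σ², while the version with n in the denominator falls short.

```python
from itertools import product
from statistics import mean, pvariance, variance

population = [1, 2, 3, 6]        # hypothetical population, for illustration only
sigma2 = pvariance(population)   # population variance, divides by N

# Every possible ordered sample of size n = 2, drawn with replacement.
samples = list(product(population, repeat=2))

avg_s2 = mean(variance(s) for s in samples)        # s^2, divides by n - 1
avg_biased = mean(pvariance(s) for s in samples)   # divides by n instead

print(sigma2, avg_s2, avg_biased)
```

`statistics.variance` uses the (n-1) denominator and `statistics.pvariance` uses N, which mirrors the distinction between the sample and population formulas.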
Binomial probability distribution (ASW, 200)
A binomial experiment is a probability experiment with the following characteristics:
– The experiment has n identical trials.
– Two outcomes are possible on each trial; one outcome is termed a success and the other is termed a failure.
– The probability of a success occurring on each trial is p. This probability p is the same on each trial.
– Since the outcome must be either a success or a failure, a failure is the complement of a success and the probability of a failure is 1-p. (Some texts refer to this probability as q, that is, q = 1-p.)
– The trials are independent of each other.
Given the above conditions:
• The binomial probability distribution provides the probability of x successes in n trials, where x = 0, 1, 2, 3, … , n.
• Note that there are only two parameters that determine binomial probabilities:
n = the number of trials.
p = the probability of success.
• Successive trials must be independent of each other. That is, the outcome of any one trial must not affect the probability of success or failure for any other trial.
P(success | failure on any other trial) = p
P(success | success on any other trial) = p
Example – number of females selected in a random sample of size 3 from a large population of half males and half females.

x is the number of females selected and f(x) is the probability of x females being selected.

x      f(x)
0      1/8 = 0.125
1      3/8 = 0.375
2      3/8 = 0.375
3      1/8 = 0.125
Total  8/8 = 1.000

The above distribution is a binomial probability distribution with success defined as selecting a female. There are n = 3 independent trials, the probability of success is p = 0.5, and x is the number of successes. In this experiment, selecting a male is termed a failure, and the probability of selecting a male is 1-p = 1-0.5 = 0.5.
Formula for binomial probability
If n is the number of trials of the binomial experiment and p is the probability of success, then the probability of x successes in n trials of the experiment is given by the probability function f(x), defined as follows:
$$f(x) = \frac{n!}{x!\,(n-x)!}\; p^{x} (1-p)^{(n-x)}$$
where
$$n! = n(n-1)(n-2)\cdots(2)(1), \qquad 0! = 1$$
Using the binomial formula
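$$f(0) = \frac{3!}{0!\,(3-0)!}\; 0.5^{0}\,(1-0.5)^{(3-0)} = \frac{3 \times 2 \times 1}{1 \times (3 \times 2 \times 1)}\,(1)(0.125) = 1 \times 0.125 = 0.125$$
$$f(2) = \frac{3!}{2!\,(3-2)!}\; 0.5^{2}\,(1-0.5)^{(3-2)} = \frac{3 \times 2 \times 1}{(2 \times 1) \times 1}\,(0.25)(0.5) = 3 \times 0.125 = 0.375$$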
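The binomial formula translates directly into a short function. The following is a minimal sketch; the name `binomial_pmf` is chosen here for illustration, and `math.comb` supplies the n!/(x!(n-x)!) term.

```python
from math import comb


def binomial_pmf(x: int, n: int, p: float) -> float:
    """Probability of exactly x successes in n independent trials."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


# Number of females in a sample of 3, with p = 0.5.
for x in range(4):
    print(x, binomial_pmf(x, 3, 0.5))
```

The loop reproduces the f(x) column of the females-selected table, including the f(0) and f(2) values worked out by hand above.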
Combinations and permutations (ASW, 146-147)
Permutations – the number of ways of arranging N objects, taken n at a time, where the order of the objects is taken into account, is:
$$P_n^N = n!\, C_n^N = \frac{N!}{(N-n)!}$$
where $C_n^N$ is the number of possible combinations of N objects, taken n at a time, where the order of the objects does not matter:
$$C_n^N = \frac{N!}{n!\,(N-n)!}$$
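Both counts are available in Python's standard library, which makes the relationship between them easy to check. The values N = 5 and n = 2 below are hypothetical, chosen only to keep the numbers small.

```python
from math import comb, factorial, perm

N, n = 5, 2  # hypothetical values, for illustration only

permutations = perm(N, n)   # N! / (N - n)!, order matters
combinations = comb(N, n)   # N! / (n! (N - n)!), order does not matter

print(permutations, combinations)
# Each combination can be ordered in n! ways, so P = n! * C.
print(permutations == factorial(n) * combinations)
```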
Rationale for the binomial formula
Probability of x successes and (n-x) failures is
$$\underbrace{p \cdot p \cdots p}_{x \text{ times}} \;\; \underbrace{(1-p)(1-p)\cdots(1-p)}_{(n-x) \text{ times}}$$
This is $p^{x}(1-p)^{(n-x)}$ and represents the probability of any particular sequence of x successes and (n-x) failures.
And there are $C_x^n$ ways of arranging these x successes and (n-x) failures. To obtain the probability of x successes in n trials, multiply the probability of any particular sequence by this combination.
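The counting argument can be made concrete by listing the sequences. The sketch below writes each trial as "S" (success) or "F" (failure); the helper name `sequences_with` is an illustrative choice, not something from ASW.

```python
from itertools import product
from math import comb


def sequences_with(n: int, x: int) -> list:
    """All sequences of n trials containing exactly x successes."""
    return [s for s in product("SF", repeat=n) if s.count("S") == x]


n, p = 3, 0.5  # the sample of 3 persons used earlier in the notes

for x in range(n + 1):
    sequences = sequences_with(n, x)
    one_sequence = p**x * (1 - p) ** (n - x)  # probability of any single sequence
    # number of sequences, the combination count, and the resulting f(x)
    print(x, len(sequences), comb(n, x), len(sequences) * one_sequence)
```

For every x the number of listed sequences agrees with the combination count, and multiplying it by the probability of one sequence gives f(x).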
Example – selection of Saskatchewan workers, classified by years of education and wages and salaries
From all these workers, randomly select 13 workers with 14-17 years of education. What is the probability that exactly 8 of these will have incomes of $45,000 or more? Probability of 8 or more?
A random sample from a large population means that successive selections are independent of each other. There are n = 13 workers selected. If success is defined as selecting a worker with an income of $45,000 or more, the probability of success is p = 82/230 = 0.357.
Probability of 8 with $45,000 or more income = 0.0373. See the following slides for the calculation.
Using the formula
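$$f(13) = \frac{13!}{13!\,(13-13)!}\; 0.357^{13}\,(1-0.357)^{(13-13)} = \frac{13!}{13!\;0!}\,(0.00000153)(1) = 0.00000153$$
$$f(12) = \frac{13!}{12!\,(13-12)!}\; 0.357^{12}\,(1-0.357)^{(13-12)} = \frac{13 \times 12!}{12! \times 1}\,(0.000004286)(0.643) = 0.000035827$$
$$f(11) = \frac{13!}{11!\,(13-11)!}\; 0.357^{11}\,(1-0.357)^{(13-11)} = \frac{13 \times 12 \times 11!}{11! \times 2 \times 1}\,(0.000012005)(0.413449) = 0.000387139$$
$$f(10) = \frac{13!}{10!\,(13-10)!}\; 0.357^{10}\,(1-0.357)^{(13-10)} = \frac{13 \times 12 \times 11 \times 10!}{10! \times 3 \times 2 \times 1}\,(0.000033627)(0.265847707) = 0.002556743$$
$$f(9) = \frac{13!}{9!\,(13-9)!}\; 0.357^{9}\,(1-0.357)^{(13-9)} = \frac{13 \times 12 \times 11 \times 10 \times 9!}{9! \times 4 \times 3 \times 2 \times 1}\,(0.000094192)(0.170940076) = 0.011512347$$
$$f(8) = \frac{13!}{8!\,(13-8)!}\; 0.357^{8}\,(1-0.357)^{(13-8)} = \frac{13 \times 12 \times 11 \times 10 \times 9 \times 8!}{8! \times 5 \times 4 \times 3 \times 2 \times 1}\,(0.000263843)(0.109914469) = 0.037323$$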
Probabilities to 4 decimal places

Number of successes (x)   Probability of x or f(x)
8                         0.0373
9                         0.0115
10                        0.0026
11                        0.0004
12                        0.0000
13                        0.0000
The probability of 8 or more successes is the sum of the probabilities of 8, 9, 10, 11, 12, or 13 successes. This is 0.0373 + 0.0115 + 0.0026 + 0.0004 + 0.0000 + 0.0000 = 0.0518.
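The same sum can be obtained without hand calculation. This sketch assumes the values n = 13 and p = 0.357 from the example; the function name `f` simply mirrors the notation of the notes. It also shows the complement route, 1 minus the probability of 7 or fewer successes, which gives the same answer.

```python
from math import comb

n, p = 13, 0.357  # Saskatchewan workers example


def f(x: int) -> float:
    """Binomial probability of exactly x successes."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


exactly_8 = f(8)
at_least_8 = sum(f(x) for x in range(8, n + 1))
# Same probability through the complement: 1 - P(7 or fewer successes).
via_complement = 1 - sum(f(x) for x in range(0, 8))

print(round(exactly_8, 4), round(at_least_8, 4), round(via_complement, 4))
```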
Using an Excel worksheet to obtain the probabilities
n = 13
p = 0.357

x      f(x)
0      0.003211757
1      0.023181591
2      0.077223901
3      0.157210088
4      0.218211514
5      0.218075768
6      0.161437116
7      0.089631494
8      0.037323223
9      0.011512347
10     0.002556708
11     0.000387139
12     3.58239E-05
13     1.52998E-06
Total  1
Formula in Excel
13
0.357

x      f(x)
0      =BINOMDIST(C4,$A$1,$A$2,FALSE)
1      =BINOMDIST(C5,$A$1,$A$2,FALSE)
2      =BINOMDIST(C6,$A$1,$A$2,FALSE)
3      =BINOMDIST(C7,$A$1,$A$2,FALSE)
4      =BINOMDIST(C8,$A$1,$A$2,FALSE)
5      =BINOMDIST(C9,$A$1,$A$2,FALSE)
6      =BINOMDIST(C10,$A$1,$A$2,FALSE)
7      =BINOMDIST(C11,$A$1,$A$2,FALSE)
8      =BINOMDIST(C12,$A$1,$A$2,FALSE)
9      =BINOMDIST(C13,$A$1,$A$2,FALSE)
10     =BINOMDIST(C14,$A$1,$A$2,FALSE)
11     =BINOMDIST(C15,$A$1,$A$2,FALSE)
12     =BINOMDIST(C16,$A$1,$A$2,FALSE)
13     =BINOMDIST(C17,$A$1,$A$2,FALSE)
Total  =SUM(D4:D17)

n = 13 is in cell A1 and p = 0.357 is in cell A2.
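For readers without Excel, the worksheet column can be reproduced in Python as a cross-check. This is a sketch that assumes the same n and p as cells A1 and A2; the final argument FALSE in BINOMDIST requests the probability of exactly x successes rather than a cumulative probability, which is what the expression below computes.

```python
from math import comb

n, p = 13, 0.357  # the values held in cells A1 and A2

# One entry per worksheet row: f(x) for x = 0, 1, ..., 13.
column = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

for x, fx in enumerate(column):
    print(f"{x:2d}  {fx:.9f}")
print("sum", round(sum(column), 9))
```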
Mean and standard deviation
• For a binomial distribution with n trials and p as the probability of success, the mean or expected value and the variance of the random variable x are
$$E(x) = \mu = np$$
$$\mathrm{Var}(x) = \sigma^2 = np(1-p)$$
• For the sex distribution of n = 3 individuals, the expected number of females selected is 3 × 0.5 = 1.5 and the variance is 3 × 0.5 × 0.5 = 0.75, as we previously determined.
• For the experiment of selecting 13 individuals with 14-17 years of education, the mean number of those with incomes of $45,000 or more is 13 × 0.357 = 4.64, the variance is 13 × 0.357 × 0.643 = 2.984, and the standard deviation is 1.727.
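The shortcut formulas np and np(1-p) can be compared with the general definitions of mean and variance applied to the full binomial distribution. The sketch below uses the n = 13, p = 0.357 example; the variable names are illustrative.

```python
from math import comb, sqrt

n, p = 13, 0.357

# Full binomial distribution f(0), ..., f(13).
f = [comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(n + 1)]

# General definitions: sum of x f(x) and sum of (x - mu)^2 f(x).
mean_from_f = sum(x * fx for x, fx in enumerate(f))
var_from_f = sum((x - mean_from_f) ** 2 * fx for x, fx in enumerate(f))

print(round(mean_from_f, 3), round(n * p, 3))
print(round(var_from_f, 3), round(n * p * (1 - p), 3))
print(round(sqrt(var_from_f), 3))
```

Each line prints the value obtained from the definition next to the value from the shortcut formula, so any disagreement would be visible immediately.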
Examples where binomial could be applied
• The probability of ten or more heads when flipping a coin twelve times.
• The probability of 6 threes in 15 rolls of a die.
• The probability of selecting 56 or more unemployed persons in a random sample of 500 workers in the province of Saskatchewan.
• The probability that the tax form has been correctly completed in a random sample of 500 Canadian taxpayers.
• The probability that more than 1/3 of a sample of 1,000 Saskatchewan residents has a university degree.
Why might the binomial not apply in the following?
• The probability that there will be snow on 20 or more days in January.
• The probability of 6 threes and 7 fives in 25 rolls of a die.
• The probability that the UR Rams win all of their remaining football games.
• The probability that the Conservatives win 155 or more seats, among the 308 up for election, in the coming federal election.
• The probability that 10 or more automobiles in a car dealer’s lot in Regina will have defective transmissions.
• The probability that fifty or more clients of the Regina Food Bank, during the month of October, will be unemployed.
Extending the binomial
• When the number of trials of a binomial experiment is large, i.e., if n is large, then it is time-consuming to compute binomial probabilities without a computer.
• In this case, it is possible to use the normal distribution to approximate the binomial probabilities. See ASW, section 6.3.
• In addition, we may not be as interested in the number of successes as in the proportion of successes. In this case, the normal approximation can be used to obtain probabilities for the proportion p of the times that a success occurs. See ASW, section 7.6.
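As a rough preview of the normal approximation, the sketch below compares an exact binomial tail probability with a normal one for 56 or more successes in 500 trials. Several things here are assumptions not found in these notes: the success probability p = 0.10 is hypothetical, and the 0.5 continuity correction is a common refinement applied here without being taken from the slides. The normal distribution is given the binomial mean np and standard deviation √(np(1-p)).

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 500, 0.10  # p is a hypothetical rate, not a figure from the notes
k = 56            # "56 or more" successes

# Exact binomial tail: sum of f(x) for x = 56, ..., 500.
exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

# Normal distribution with the binomial mean and standard deviation.
normal = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
approx = 1 - normal.cdf(k - 0.5)  # continuity correction (an assumption here)

print(round(exact, 4), round(approx, 4))
```

With a computer the exact sum is just as easy to obtain, so the approximation matters mainly for hand calculation with tables.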