estimation. estimators & estimates estimators are the random variables used to estimate...
TRANSCRIPT
![Page 1: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/1.jpg)
Estimation
![Page 2: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/2.jpg)
Estimators & Estimates
Estimators are the random variables used to estimate population parameters, while the specific values of these variables are the estimates.
Example: the estimator of $\mu$ is often
$$\bar{X} = \frac{\sum_{i=1}^{n} X_i}{n},$$
but if the observed values of X are 1, 2, 3, and 6, the estimate is 3.
So the estimator is a formula; the estimate is a number.
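To make the formula-versus-number distinction concrete, here is a minimal Python sketch. The function name `sample_mean` is only an illustrative choice: the function plays the role of the estimator, and the value it returns for one set of observed data is the estimate.

```python
def sample_mean(values):
    """Estimator of the population mean: sum of the observations divided by n."""
    return sum(values) / len(values)

observed = [1, 2, 3, 6]           # the observed values of X from the example
estimate = sample_mean(observed)  # the estimate is a single number
print(estimate)                   # 3.0
```

Calling the same function on a different sample would give a different estimate, which is why the estimator is treated as a random variable.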
![Page 3: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/3.jpg)
Properties of a Good Estimator
1. Unbiasedness
2. Efficiency
3. Sufficiency
4. Consistency
![Page 4: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/4.jpg)
Unbiasedness
An estimator $\hat{\theta}$ (“theta hat”) is unbiased if its expected value equals the value of the parameter $\theta$ (theta) being estimated. That is,
$$E(\hat{\theta}) = \theta$$
In other words, on average the estimator is right on target.
![Page 5: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/5.jpg)
Examples
Since $E(\bar{X}) = \mu$, $\bar{X}$ is an unbiased estimator of $\mu$.
Since $E(X/n) = \pi$, $X/n$ is an unbiased estimator of the binomial proportion $\pi$.
Since $E(s^2) = \sigma^2$, $s^2$ is an unbiased estimator of $\sigma^2$.
Recall that
$$s^2 = \frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}.$$
If we divided by n instead of by n-1, we would not have an unbiased estimator of $\sigma^2$. That is why $s^2$ is defined the way it is.
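The effect of dividing by n-1 rather than n can be checked exactly on a toy case. The sketch below is an illustration rather than part of the slides: it assumes a hypothetical population {1, 2, 3, 6}, lists every possible sample of size 2 drawn with replacement, and averages each candidate variance estimator over all of those samples. The helper name `sum_sq_dev` is made up for this example.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 6]   # hypothetical population, chosen for illustration
n = 2                       # sample size

# Population mean and variance (each value equally likely).
mu = Fraction(sum(population), len(population))
sigma2 = sum((x - mu) ** 2 for x in population) / len(population)

def sum_sq_dev(sample):
    """Sum of squared deviations from the sample mean."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample)

# Every equally likely ordered sample of size n, drawn with replacement.
samples = list(product(population, repeat=n))

# Expected value of each estimator = its average over all possible samples.
mean_s2_unbiased = sum(sum_sq_dev(s) / (n - 1) for s in samples) / len(samples)
mean_s2_biased = sum(sum_sq_dev(s) / n for s in samples) / len(samples)

print(sigma2, mean_s2_unbiased, mean_s2_biased)
```

Under these assumptions the n-1 version averages out to the population variance exactly, while the divide-by-n version comes out too small.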
![Page 6: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/6.jpg)
Bias
$$\text{bias} = E(\hat{\theta}) - \theta$$
The bias of an unbiased estimator is zero.
![Page 7: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/7.jpg)
Mean Squared Error (MSE)
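$$MSE = E[(\hat{\theta} - \theta)^2],$$
which happens to equal $\sigma^2 + \text{bias}^2$, where $\sigma^2$ here is the variance of the estimator.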
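The identity MSE = variance + bias² can be verified exactly for a deliberately biased estimator. This is a sketch under stated assumptions, not something taken from the slides: the population {1, 2, 3, 6}, the sample size of 2, and the function name `var_divide_by_n` are all hypothetical choices for illustration.

```python
from fractions import Fraction
from itertools import product

population = [1, 2, 3, 6]   # hypothetical population, for illustration only
n = 2
mu = Fraction(sum(population), len(population))
# The parameter being estimated: the population variance.
theta = sum((x - mu) ** 2 for x in population) / len(population)

def var_divide_by_n(sample):
    """A deliberately biased estimator: squared deviations divided by n."""
    xbar = Fraction(sum(sample), len(sample))
    return sum((x - xbar) ** 2 for x in sample) / len(sample)

# Values of the estimator over every equally likely sample (with replacement).
estimates = [var_divide_by_n(s) for s in product(population, repeat=n)]
count = len(estimates)

expected = sum(estimates) / count                        # E(theta hat)
bias = expected - theta                                  # E(theta hat) - theta
variance = sum((e - expected) ** 2 for e in estimates) / count
mse = sum((e - theta) ** 2 for e in estimates) / count   # E[(theta hat - theta)^2]

print(bias, variance, mse, variance + bias ** 2)
```

Because exact fractions are used, the last two printed values agree exactly rather than only up to rounding.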
![Page 8: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/8.jpg)
Efficiency
The most efficient estimator is the one with the smallest MSE.
![Page 9: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/9.jpg)
Efficiency, MSE
Since $MSE = \sigma^2 + \text{bias}^2$, for unbiased estimators (where the bias is zero), $MSE = \sigma^2$.
So if you are comparing unbiased estimators, the most efficient one is the one with the smallest variance.
If you have two estimators, one of which has a small bias & a small variance and the other has no bias but a large variance, the more efficient one may be the one that is just slightly off on average, but that is more frequently in the right vicinity.
![Page 10: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/10.jpg)
Example: sample mean & median
As we have found, the sample mean is an unbiased estimator of $\mu$.
It turns out that, for a symmetric population such as the normal, the sample median is also an unbiased estimator of $\mu$.
We know the variance of the sample mean is $\sigma^2/n$.
For a normal population and large n, the variance of the sample median is approximately $(\pi/2)(\sigma^2/n)$.
Since $\pi$ is about 3.14, $\pi/2 > 1$.
So the variance of the sample median is greater than $\sigma^2/n$, the variance of the sample mean.
Since both estimators are unbiased, the one with the smaller variance (the sample mean) is the more efficient one.
In fact, for a normal population, among all unbiased estimators of $\mu$, the sample mean is the one with the smallest variance.
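A quick simulation can illustrate the comparison between the mean and the median. This is only a sketch: the standard normal population, the sample size of 25, the number of repetitions, and the seed are all assumptions chosen for the illustration.

```python
import random
import statistics

random.seed(0)               # fixed seed so the run is repeatable
mu, sigma, n = 0.0, 1.0, 25  # assumed normal population and sample size
reps = 4000                  # number of simulated samples

means, medians = [], []
for _ in range(reps):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    means.append(statistics.mean(sample))
    medians.append(statistics.median(sample))

# Spread of each estimator across the simulated samples.
var_mean = statistics.pvariance(means)
var_median = statistics.pvariance(medians)

print("variance of sample mean:  ", var_mean, "(theory:", sigma ** 2 / n, ")")
print("variance of sample median:", var_median)
print("ratio median/mean:        ", var_median / var_mean)
```

With a moderate n the ratio will not equal π/2 exactly, since that figure is a large-sample approximation, but the median's variance should come out visibly larger than the mean's.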
![Page 11: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/11.jpg)
Sufficiency
An estimator is said to be sufficient if it uses all the information about the population parameter that the sample can provide.
![Page 12: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/12.jpg)
Examples
Example 1: The sample median is not a sufficient estimator because it uses only the ranking of the observations, and not their numerical values [with the exception of the middle one(s)].
Example 2: The sample mean, however, uses all the information, and therefore is a sufficient estimator.
![Page 13: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/13.jpg)
Consistency
An estimator is said to be consistent if it yields estimates that converge in probability to the population parameter being estimated as n approaches infinity.
In other words, as the sample size increases, the estimator spends more and more of its time closer and closer to the parameter value.
One way that an estimator can be consistent is for its bias and its variance to approach zero as the sample size approaches infinity.
![Page 14: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/14.jpg)
Example of a consistent estimator
[Figure: distributions of the estimator when n = 5, n = 50, and n = 500.]
As the sample size increases, the bias & the variance are both shrinking.
![Page 15: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/15.jpg)
Example: Sample Mean $\bar{X}$
We know that the mean of $\bar{X}$ is $\mu$.
So its bias not only goes to zero as n approaches infinity, its bias is always zero.
The variance of the sample mean is $\sigma^2/n$.
As n approaches infinity, that variance approaches zero.
So, since both the bias and the variance go to zero as n approaches infinity, the sample mean is a consistent estimator.
![Page 16: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/16.jpg)
A great estimator: the sample mean $\bar{X}$
We have found that the sample mean is a great estimator of the population mean $\mu$.
It is unbiased,
efficient,
sufficient,
& consistent.
![Page 17: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/17.jpg)
Point Estimators versus Interval Estimators
Up until now we have considered point estimators that provide us with a single value as an estimate of a desired parameter.
It is unlikely, however, that our estimate will precisely equal our parameter.
We, therefore, may prefer to report something like this: We are 95% certain that the parameter is between “a” and “b.”
This statement is a confidence interval.
![Page 18: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/18.jpg)
Building a Confidence Interval
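We know that Pr(0 < Z < 1.96) = 0.4750.
Then Pr(-1.96 < Z < 1.96) = 0.95.
[Figure: standard normal (Z) curve marked at -1.96, 0, and 1.96, with area 0.4750 between 0 and 1.96.]
We also know that $\dfrac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ is distributed as a standard normal (Z).
So there is a 95% probability that
$$-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96.$$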
![Page 19: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/19.jpg)
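Continuing from: with 95% probability,
$$-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96$$
Multiplying through by $\sigma/\sqrt{n}$,
$$-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}$$
Subtracting off $\bar{X}$,
$$-\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < -\mu < -\bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
Multiplying by -1 and flipping the inequalities appropriately,
$$\bar{X} + 1.96\frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96\frac{\sigma}{\sqrt{n}}$$
Flipping the entire expression,
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$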
![Page 20: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/20.jpg)
So we have a 95% Confidence Interval for the Population Mean
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
![Page 21: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/21.jpg)
Example: Suppose a sample of 25 students at a university has a sample mean IQ of 127. If the population standard deviation is 5.4, calculate the 95% confidence interval for the population mean.
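$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
$$127 - 1.96\frac{5.4}{\sqrt{25}} < \mu < 127 + 1.96\frac{5.4}{\sqrt{25}}$$
$$127 - 2.12 < \mu < 127 + 2.12$$
$$124.88 < \mu < 129.12$$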
We are 95% certain that the population mean is between 124.88 & 129.12 .
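The arithmetic in this example is easy to reproduce in code. The following is a minimal sketch; the function name `z_interval` and its parameter names are illustrative choices, not anything defined in the slides.

```python
import math

def z_interval(xbar, sigma, n, z=1.96):
    """Confidence interval for the mean when the population std. dev. is known."""
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin

# IQ example: n = 25, sample mean 127, population standard deviation 5.4.
low, high = z_interval(xbar=127, sigma=5.4, n=25)
print(round(low, 2), round(high, 2))   # 124.88 129.12
```

Passing a different `z` gives an interval at a different confidence level with the same data.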
![Page 22: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/22.jpg)
When we say we are 95% certain that the population mean is between 124.88 & 129.12,
it means this:
The population mean is a fixed number, but we don’t know what it is.
Our confidence intervals, however, vary with the random sample that we take.
Sometimes we get a more typical sample, sometimes a less typical one.
If we took 100 random samples and from them calculated 100 confidence intervals, we would expect about 95 of the intervals to contain the population mean that we are trying to estimate.
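This repeated-sampling interpretation can be illustrated with a small simulation. The sketch below assumes, purely for illustration, that the true population is normal with mean 127 and standard deviation 5.4. In practice the true mean is unknown, which is the whole point of the interval.

```python
import math
import random
import statistics

random.seed(1)                  # fixed seed so the run is repeatable
mu, sigma, n = 127.0, 5.4, 25   # assumed "true" population values
z = 1.96
trials = 2000

covered = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = statistics.mean(sample)
    margin = z * sigma / math.sqrt(n)
    # Does this sample's interval contain the fixed population mean?
    if xbar - margin < mu < xbar + margin:
        covered += 1

coverage = covered / trials
print(coverage)   # typically close to 0.95
```

Each simulated sample produces a different interval, while the population mean stays fixed. The fraction of intervals that capture it should be near the stated confidence level.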
![Page 23: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/23.jpg)
What if we want a confidence level other than 95%?
$$\bar{X} - 1.96\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\sigma}{\sqrt{n}}$$
In our formula, the 1.96 came from the fact that a Z variable will be between -1.96 and 1.96 95% of the time.
To get a different confidence level, all we need to do is find the Z values such that we are between them the desired percent of the time.
Using that Z value, we have the general formula for the confidence interval for the population mean $\mu$:
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
![Page 24: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/24.jpg)
Notice: In our confidence interval formula, we used “less than” symbols:
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
Your textbook uses “less than or equal to” symbols:
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} \le \mu \le \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
Either of these is acceptable. Recall that the formula is built upon the concept of the normal probability distribution. The probability that a continuous variable is exactly equal to any particular number is zero. So it doesn’t matter whether you include the endpoints of the interval or not.
![Page 25: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/25.jpg)
Determining Z values for confidence intervals
Suppose we want a 98% confidence interval.
We need to find 2 values, call them –k and k, such that Z is between them 98% of the time.
Then Z will be between 0 and k with probability half of 0.98, which is 0.49.
Look in the body of the Z table for the value closest to 0.49, which is 0.4901.
The number on the border of the table corresponding to 0.4901 is 2.33.
So that is your value of k, and the number you use for Z in your confidence interval.
[Figure: standard normal (Z) curve with area 0.9800 between -k = -2.33 and k = 2.33, and area 0.4900 between 0 and k.]
![Page 26: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/26.jpg)
Sometimes 2 numbers in the Z table are equally close to the value you want.
For example, if you want a 90% confidence interval, you look for half of 0.90 in the body of the Z table, that is, 0.45.
You find 0.4495 and 0.4505. Both are off by 0.0005.
The number on the border of the table corresponding to 0.4495 is 1.64.
The number corresponding to 0.4505 is 1.65.
Usually in these cases, we use the average of 1.64 and 1.65, which is 1.645.
Similarly for the 99% confidence interval, we usually use 2.575. (Draw your graph & work through the logic of this number.)
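The table lookup can also be done numerically. This sketch uses `statistics.NormalDist` from the Python standard library (available from Python 3.8); the helper name `z_for_confidence` is an illustrative choice.

```python
from statistics import NormalDist

def z_for_confidence(level):
    """Two-sided critical value k with Pr(-k < Z < k) = level."""
    tail = (1 - level) / 2             # area left over in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.98, 0.99):
    print(level, round(z_for_confidence(level), 3))
```

The computed values differ slightly in the third decimal from the table-based 2.33 and 2.575, because the table is rounded to two decimals and the 2.575 comes from averaging neighbouring entries.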
![Page 27: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/27.jpg)
Which interval is wider: One with a higher confidence level (such as 99%) or one with
a lower confidence level (such as 90%)?
Let’s think it through using an unrealistic but slightly entertaining example.
![Page 28: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/28.jpg)
You have the misfortune of being stranded on an island, with a cannibal & a bunch of bears.
It gets worse…
You get captured by the cannibal.
The cannibal, who knows the island well, decides to give you a chance to avoid being dinner. He says if you can correctly estimate the number of bears, he’ll let you go.
To give you a fighting chance, he’ll let you give him an interval estimate.
![Page 29: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/29.jpg)
You think that there are probably about a hundred bears on the island.
Would you be more confident of not being dinner if you gave the cannibal a narrow interval like 90 to 110 bears, or a wider one like 75 to 125 bears?
You would definitely be more confident with the wider interval.
Thus, when the confidence level needs to be very high (such as 99%), the interval needs to be wide.
![Page 30: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/30.jpg)
Let’s redo the IQ example with a different confidence level.
We had a sample of 25 students with a sample mean IQ of 127. The population standard deviation was 5.4. Calculate the 99% confidence interval for the population mean.
Our general formula is:
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
We said that the Z value for 99% confidence is 2.575.
Putting in our values,
$$127 - 2.575\frac{5.4}{\sqrt{25}} < \mu < 127 + 2.575\frac{5.4}{\sqrt{25}}$$
or $124.22 < \mu < 129.78$
![Page 31: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/31.jpg)
We had for the 95% confidence interval:
$124.88 < \mu < 129.12$
We just got for the 99% confidence interval:
$124.22 < \mu < 129.78$
The 99% confidence interval starts a little lower & ends a little higher than the 95% interval.
So the 99% interval is wider than the 95% interval, as we said it should be.
![Page 32: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/32.jpg)
What do we do if we want to compute a confidence interval for $\mu$, but we don’t know the population standard deviation $\sigma$?
We use the next best thing, the sample standard deviation s.
But with s, instead of a Z distribution, we have a t (with n-1 degrees of freedom). So,
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
becomes
$$\bar{X} - t_{n-1}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1}\frac{s}{\sqrt{n}}$$
![Page 33: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/33.jpg)
Example: From a large class of normally distributed grades, sample 4 grades: 64, 66, 89, & 77. Calculate the 95% confidence interval for the class mean grade $\mu$.
$$\bar{X} - t_{n-1}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1}\frac{s}{\sqrt{n}}$$
is the appropriate formula.
So we need to determine the sample mean, sample standard deviation, and the t-value.
![Page 34: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/34.jpg)
4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ |
|---|
| 64 |
| 66 |
| 89 |
| 77 |
| Sum: 296 |

Adding our X values, we get 296.
![Page 35: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/35.jpg)
4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ |
|---|
| 64 |
| 66 |
| 89 |
| 77 |
| Sum: 296 |

$\bar{X} = 74$
Dividing by 4, we find our sample mean is 74.
![Page 36: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/36.jpg)
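4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|
| 64 | -10 | |
| 66 | -8 | |
| 89 | 15 | |
| 77 | 3 | |
| Sum: 296 | | |

$\bar{X} = 74$
Keep in mind that the sample standard deviation is
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
So, next we subtract our sample mean 74 from each of our X values,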
![Page 37: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/37.jpg)
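4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|
| 64 | -10 | 100 |
| 66 | -8 | 64 |
| 89 | 15 | 225 |
| 77 | 3 | 9 |
| Sum: 296 | | Sum: 398 |

$\bar{X} = 74$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
square the differences and add them up.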
![Page 38: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/38.jpg)
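4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|
| 64 | -10 | 100 |
| 66 | -8 | 64 |
| 89 | 15 | 225 |
| 77 | 3 | 9 |
| Sum: 296 | | Sum: 398 |

$\bar{X} = 74$
$s^2 = 398/3 = 132.7$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
Then we divide by n-1 (which is 3) to get the sample variance $s^2$,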
![Page 39: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/39.jpg)
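4 grades: 64, 66, 89, & 77
95% confidence interval for $\mu$

| $X$ | $X - \bar{X}$ | $(X - \bar{X})^2$ |
|---|---|---|
| 64 | -10 | 100 |
| 66 | -8 | 64 |
| 89 | 15 | 225 |
| 77 | 3 | 9 |
| Sum: 296 | | Sum: 398 |

$\bar{X} = 74$
$s^2 = 398/3 = 132.7$
$s = 11.5$
$$s = \sqrt{\frac{\sum_{i=1}^{n} (X_i - \bar{X})^2}{n-1}}$$
and take the square root to get the sample standard deviation s.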
![Page 40: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/40.jpg)
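So we have $\bar{X} = 74$ and s = 11.5.
Since n = 4, dof = n-1 = 3.
Since we want 95% confidence, we want 0.95 as the middle area of our graph, and .025 in each of the 2 tails.
We find the 3.182 in our t table.
[Figure: $t_3$ curve marked at 0 and 3.182, with middle area 0.95 and 0.025 in each tail.]
Our formula is
$$\bar{X} - t_{n-1}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1}\frac{s}{\sqrt{n}}$$
Putting in our numbers we have
$$74 - 3.182\frac{11.5}{\sqrt{4}} < \mu < 74 + 3.182\frac{11.5}{\sqrt{4}}$$
So our 95% confidence interval is $56 < \mu < 92$.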
The interval is very wide, because we only have 4 observations. If we had more information, we’d be able to get a narrower interval.
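The whole grade example can be reproduced in a few lines. This is a sketch only: the t value of 3.182 is typed in from the t table, as in the slides, because the Python standard library does not provide t-distribution quantiles.

```python
import math
import statistics

grades = [64, 66, 89, 77]
t_value = 3.182   # from the t table: 3 degrees of freedom, 0.025 in each tail

n = len(grades)
xbar = statistics.mean(grades)   # sample mean
s = statistics.stdev(grades)     # sample standard deviation (divides by n - 1)

margin = t_value * s / math.sqrt(n)
low, high = xbar - margin, xbar + margin
print(round(low), round(high))   # 56 92
```

The code keeps more decimal places for s than the hand calculation does, but the endpoints round to the same whole numbers.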
![Page 41: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/41.jpg)
From our previous confidence intervals, we can see that we have a basic format, which can be used when the point estimator is roughly normal.
$$\bar{X} - Z\frac{\sigma}{\sqrt{n}} < \mu < \bar{X} + Z\frac{\sigma}{\sqrt{n}}$$
$$\bar{X} - t_{n-1}\frac{s}{\sqrt{n}} < \mu < \bar{X} + t_{n-1}\frac{s}{\sqrt{n}}$$
(point estimate) - (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate) < (desired parameter) < (point estimate) + (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate)
![Page 42: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/42.jpg)
Calculating confidence intervals for the binomial proportion parameter
When the number of events of interest (X) and the number of events not of interest (n-X) are each at least five, the binomial distribution can be approximated by the normal and we can develop a confidence interval for the binomial proportion parameter $\pi$.
That is, we can develop a confidence interval for $\pi$ if X ≥ 5 and n-X ≥ 5.
![Page 43: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/43.jpg)
We need a point estimate for $\pi$, & the standard deviation of our point estimate.
For the point estimate we will use the binomial proportion variable X/n or p.
Its standard deviation was $\sqrt{\dfrac{\pi(1-\pi)}{n}}$.
Since we don’t know $\pi$, we will use our sample proportion p in the standard deviation formula.
![Page 44: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/44.jpg)
Use our format to get the confidence interval for the binomial proportion $\pi$.
(point estimate) - (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate) < (desired parameter) < (point estimate) + (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate)
$$p - z\sqrt{\frac{p(1-p)}{n}} < \pi < p + z\sqrt{\frac{p(1-p)}{n}}$$
![Page 45: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/45.jpg)
We have our confidence interval for the binomial proportion $\pi$.
$$p - z\sqrt{\frac{p(1-p)}{n}} < \pi < p + z\sqrt{\frac{p(1-p)}{n}}$$
![Page 46: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/46.jpg)
Example: Consider a random sample of 144 families; 48 have 2 or more cars. Compute the 95% confidence interval for the population proportion of families with 2 or more cars.
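$p = \dfrac{48}{144} = \dfrac{1}{3}$
$1 - p = \dfrac{2}{3}$
n = 144
Our z value is 1.96.
[Figure: standard normal (Z) curve marked at 0 and 1.96, with middle area 0.95 and area 0.4750 between 0 and 1.96.]
$$p - z\sqrt{\frac{p(1-p)}{n}} < \pi < p + z\sqrt{\frac{p(1-p)}{n}}$$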
![Page 47: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/47.jpg)
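We now have n = 144, z = 1.96, $p = \frac{1}{3}$ and $1 - p = \frac{2}{3}$.
$$p - z\sqrt{\frac{p(1-p)}{n}} < \pi < p + z\sqrt{\frac{p(1-p)}{n}}$$
$$\frac{1}{3} - 1.96\sqrt{\frac{\frac{1}{3}\cdot\frac{2}{3}}{144}} < \pi < \frac{1}{3} + 1.96\sqrt{\frac{\frac{1}{3}\cdot\frac{2}{3}}{144}}$$
$$0.333 - 0.077 < \pi < 0.333 + 0.077$$
So our 95% confidence interval for $\pi$ is:
$0.256 < \pi < 0.410$.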
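The car-ownership calculation can be sketched as a small function. The name `proportion_interval` and its parameters are illustrative. The check on X and n-X mirrors the at-least-five condition stated earlier for the normal approximation.

```python
import math

def proportion_interval(x, n, z=1.96):
    """Normal-approximation confidence interval for a binomial proportion."""
    if x < 5 or n - x < 5:
        raise ValueError("normal approximation needs X >= 5 and n - X >= 5")
    p = x / n                                  # sample proportion
    margin = z * math.sqrt(p * (1 - p) / n)    # z times the estimated std. dev.
    return p - margin, p + margin

# 48 of 144 sampled families have 2 or more cars.
low, high = proportion_interval(x=48, n=144)
print(round(low, 3), round(high, 3))   # 0.256 0.41
```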
![Page 48: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/48.jpg)
Suppose we want a confidence interval not for a mean but for the difference in two means ($\mu_1 - \mu_2$).
For example, we may be interested in
the difference in the mean income for two counties, or
the difference in the mean exam scores for two classes.
![Page 49: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/49.jpg)
We will use the same basic format, but it will be a bit more complicated.
(point estimate) - (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate) < (desired parameter) < (point estimate) + (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate)
Our “desired parameter” is $\mu_1 - \mu_2$.
Our point estimate is $\bar{X}_1 - \bar{X}_2$.
Initially, we will assume that we have the population standard deviations, so we will use a z.
We need the standard deviation of the point estimate, $\bar{X}_1 - \bar{X}_2$.
To get that we will first determine the variance of $\bar{X}_1 - \bar{X}_2$.
![Page 50: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/50.jpg)
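Recall: $V(aX + bY) = a^2 V(X) + b^2 V(Y) + 2ab\,C(X,Y)$
Letting a = 1, b = -1, $X = \bar{X}_1$ and $Y = \bar{X}_2$,
$$V(\bar{X}_1 - \bar{X}_2) = (1)^2 V(\bar{X}_1) + (-1)^2 V(\bar{X}_2) + 2(1)(-1)\,C(\bar{X}_1, \bar{X}_2)$$
If our samples are independent, the covariance term is zero, and the expression becomes
$$V(\bar{X}_1 - \bar{X}_2) = (1)^2 V(\bar{X}_1) + (-1)^2 V(\bar{X}_2)$$
or
$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2)$$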
![Page 51: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/51.jpg)
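$$V(\bar{X}_1 - \bar{X}_2) = V(\bar{X}_1) + V(\bar{X}_2)$$
Recall that $V(\bar{X}) = \dfrac{\sigma^2}{n}$.
Applying subscripts for our samples, we now have
$$V(\bar{X}_1 - \bar{X}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}$$
& the standard deviation of $\bar{X}_1 - \bar{X}_2$ is
$$\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$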
![Page 52: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/52.jpg)
Apply our basic format
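(point estimate) - (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate) < (desired parameter) < (point estimate) + (z or t) × (std. dev. or estimate of the std. dev. of our pt. estimate)
$$(\bar{X}_1 - \bar{X}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$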
![Page 53: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/53.jpg)
Example: From 2 large classes, with normally distributed grades, sample 4 grades (64, 66, 89, & 77) & 3 grades (56, 71, & 53). If the population variances for the 2 classes are both 96, compute the 90% confidence interval for the difference in means of the class grades.
We will use the formula we just developed:
$$(\bar{X}_1 - \bar{X}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
![Page 54: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/54.jpg)
We need the 2 sample means & the z value.
Adding the observations & dividing by the number of observations, our sample means are
(64 + 66 + 89 + 77) / 4 = 74 and
(56 + 71 + 53) / 3 = 60
The z value for 90% confidence, as we found before, is 1.645 .
[Figure: standard normal (Z) curve marked at 0 and 1.645, with middle area 0.90 and area 0.4500 between 0 and 1.645.]
![Page 55: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/55.jpg)
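Assembling our formula:
$$(\bar{X}_1 - \bar{X}_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar{X}_1 - \bar{X}_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
$$(74 - 60) - 1.645\sqrt{\frac{96}{4} + \frac{96}{3}} < \mu_1 - \mu_2 < (74 - 60) + 1.645\sqrt{\frac{96}{4} + \frac{96}{3}}$$
$$14 - 12.31 < \mu_1 - \mu_2 < 14 + 12.31$$
$$1.69 < \mu_1 - \mu_2 < 26.31$$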
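The same computation can be written as a short function. This is a sketch under the assumptions of this example (independent samples, known population variances); the function name `diff_means_z_interval` and its parameter names are illustrative.

```python
import math

def diff_means_z_interval(xbar1, xbar2, var1, var2, n1, n2, z):
    """CI for mu1 - mu2 with known population variances and independent samples."""
    std_err = math.sqrt(var1 / n1 + var2 / n2)   # std. dev. of (xbar1 - xbar2)
    diff = xbar1 - xbar2                         # point estimate
    return diff - z * std_err, diff + z * std_err

# Sample means 74 and 60, both population variances 96, n1 = 4, n2 = 3, 90% level.
low, high = diff_means_z_interval(74, 60, 96, 96, 4, 3, z=1.645)
print(round(low, 2), round(high, 2))   # 1.69 26.31
```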
![Page 56: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/56.jpg)
Interpreting the results
$$1.69 < \mu_1 - \mu_2 < 26.31$$
We are 90% certain that the difference in class mean grades is between 1.69 and 26.31.
Notice that this interval does not include zero.
If $\mu_1 - \mu_2 = 0$, then $\mu_1 = \mu_2$.
So, at the 90% confidence level, the data are not consistent with the class mean grades being equal.
![Page 57: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/57.jpg)
What do we do if we want to compare means, but we don’t know the population variances?
As before, we use the sample variances & the t distribution.
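Our formula was

$$(\bar X_1 - \bar X_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Now the formula is

$$(\bar X_1 - \bar X_2) - t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

For the t, the number of degrees of freedom is determined by a very messy formula.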
![Page 58: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/58.jpg)
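The degrees of freedom for the t for the confidence interval for the difference between means with unknown variances:

$$\text{dof} = \text{the integer part of } \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$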
![Page 59: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/59.jpg)
Let’s do the same example as before, but without knowing the population variances.
From 2 large classes, with normally distributed grades, sample 4 grades (64, 66, 89, & 77) & 3 grades (56, 71, & 53). Compute the 90% confidence interval for the difference in means of the class grades.
This time we need to calculate the sample variances.
![Page 60: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/60.jpg)
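| Class 1: $X_1$ | $X_1 - \bar X_1$ | $(X_1 - \bar X_1)^2$ | Class 2: $X_2$ | $X_2 - \bar X_2$ | $(X_2 - \bar X_2)^2$ |
|---|---|---|---|---|---|
| 64 | | | 56 | | |
| 66 | | | 71 | | |
| 89 | | | 53 | | |
| 77 | | | | | |
| Sum: 296 | | | Sum: 180 | | |

$$\bar X_1 = \frac{296}{4} = 74, \qquad \bar X_2 = \frac{180}{3} = 60$$

Recall $s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n - 1}$.

We calculate the sample means as before.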
![Page 61: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/61.jpg)
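| Class 1: $X_1$ | $X_1 - \bar X_1$ | $(X_1 - \bar X_1)^2$ | Class 2: $X_2$ | $X_2 - \bar X_2$ | $(X_2 - \bar X_2)^2$ |
|---|---|---|---|---|---|
| 64 | -10 | 100 | 56 | -4 | 16 |
| 66 | -8 | 64 | 71 | 11 | 121 |
| 89 | 15 | 225 | 53 | -7 | 49 |
| 77 | 3 | 9 | | | |
| Sum: 296 | | | Sum: 180 | | |

$$\bar X_1 = \frac{296}{4} = 74, \qquad \bar X_2 = \frac{180}{3} = 60$$

Recall $s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n - 1}$.

Then subtract the sample mean from each observation, square that difference,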
![Page 62: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/62.jpg)
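| Class 1: $X_1$ | $X_1 - \bar X_1$ | $(X_1 - \bar X_1)^2$ | Class 2: $X_2$ | $X_2 - \bar X_2$ | $(X_2 - \bar X_2)^2$ |
|---|---|---|---|---|---|
| 64 | -10 | 100 | 56 | -4 | 16 |
| 66 | -8 | 64 | 71 | 11 | 121 |
| 89 | 15 | 225 | 53 | -7 | 49 |
| 77 | 3 | 9 | | | |
| Sum: 296 | | Sum: 398 | Sum: 180 | | Sum: 186 |

$$\bar X_1 = \frac{296}{4} = 74, \qquad \bar X_2 = \frac{180}{3} = 60$$

Recall $s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n - 1}$.

and add up.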
![Page 63: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/63.jpg)
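| Class 1: $X_1$ | $X_1 - \bar X_1$ | $(X_1 - \bar X_1)^2$ | Class 2: $X_2$ | $X_2 - \bar X_2$ | $(X_2 - \bar X_2)^2$ |
|---|---|---|---|---|---|
| 64 | -10 | 100 | 56 | -4 | 16 |
| 66 | -8 | 64 | 71 | 11 | 121 |
| 89 | 15 | 225 | 53 | -7 | 49 |
| 77 | 3 | 9 | | | |
| Sum: 296 | | Sum: 398 | Sum: 180 | | Sum: 186 |

$$\bar X_1 = \frac{296}{4} = 74, \qquad \bar X_2 = \frac{180}{3} = 60$$

Recall $s^2 = \dfrac{\sum_{i=1}^{n}(X_i - \bar X)^2}{n - 1}$.

Dividing by n-1, we have our sample variances.

$$s_1^2 = \frac{398}{3} = 132.67, \qquad s_2^2 = \frac{186}{2} = 93.0$$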
![Page 64: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/64.jpg)
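What are the dof & t value?

$$\text{dof} = \text{the integer part of } \frac{\left(\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}\right)^2}{\dfrac{\left(s_1^2/n_1\right)^2}{n_1 - 1} + \dfrac{\left(s_2^2/n_2\right)^2}{n_2 - 1}}$$

$$\frac{\left(\dfrac{132.67}{4} + \dfrac{93.0}{3}\right)^2}{\dfrac{\left(132.67/4\right)^2}{3} + \dfrac{\left(93.0/3\right)^2}{2}} = 4.860$$

So the degrees of freedom is the integer part of 4.86, or 4.
For 90% confidence & 4 dof, the t value is 2.1318.
[Figure: t distribution with 4 dof; central area 0.90, with area 0.05 to the right of t = 2.1318.]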
![Page 65: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/65.jpg)
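Assemble our formula

$$(\bar X_1 - \bar X_2) - t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + t\sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

$$(74 - 60) - 2.1318\sqrt{\frac{132.67}{4} + \frac{93.0}{3}} < \mu_1 - \mu_2 < (74 - 60) + 2.1318\sqrt{\frac{132.67}{4} + \frac{93.0}{3}}$$

$$14 - 17.08 < \mu_1 - \mu_2 < 14 + 17.08$$

$$-3.08 < \mu_1 - \mu_2 < 31.08$$

Notice here that zero is contained in our 90% confidence interval.
So we can’t rule out the possibility that the class mean grades are equal.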
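The unknown-variance calculation can be sketched in Python as follows. The function names are hypothetical. Because the standard library has no t quantile function, the t value (2.1318 for 4 dof at 90% confidence) is passed in as read from a t table, which is an assumption of this sketch.

```python
from math import floor, sqrt
from statistics import mean, variance

def welch_dof(s1_sq, n1, s2_sq, n2):
    """Unrounded degrees of freedom from the 'messy formula' for unequal variances."""
    a, b = s1_sq / n1, s2_sq / n2
    return (a + b) ** 2 / (a ** 2 / (n1 - 1) + b ** 2 / (n2 - 1))

def diff_means_ci_unknown_var(x1, x2, t_value):
    """Confidence interval for mu1 - mu2 using sample variances and a supplied t value."""
    s1_sq, s2_sq = variance(x1), variance(x2)        # sample variances (divide by n - 1)
    point = mean(x1) - mean(x2)
    half_width = t_value * sqrt(s1_sq / len(x1) + s2_sq / len(x2))
    return point - half_width, point + half_width

class1 = [64, 66, 89, 77]
class2 = [56, 71, 53]
dof = welch_dof(variance(class1), len(class1), variance(class2), len(class2))
# t value for 90% confidence and floor(dof) = 4 degrees of freedom, from a t table.
low, high = diff_means_ci_unknown_var(class1, class2, t_value=2.1318)
print(floor(dof), round(low, 2), round(high, 2))
```

Taking the integer part of the dof, as the slides do, rounds down and so gives a slightly wider, more cautious interval.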
![Page 66: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/66.jpg)
Sometimes we believe the variances of 2 populations are equal, even though we don’t know the actual values.
We have another confidence interval for this situation.
![Page 67: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/67.jpg)
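In our earlier formula

$$(\bar X_1 - \bar X_2) - z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + z\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

we can drop the distinguishing subscripts on our variances.

$$(\bar X_1 - \bar X_2) - z\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + z\sqrt{\frac{\sigma^2}{n_1} + \frac{\sigma^2}{n_2}}$$

Factoring out the variance, we have

$$(\bar X_1 - \bar X_2) - z\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + z\sqrt{\sigma^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

Next we replace the variance by a pooled sample variance, based on information from both samples.

$$(\bar X_1 - \bar X_2) - t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

The dof for the t value is $n_1 + n_2 - 2$.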
![Page 68: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/68.jpg)
The pooled sample variance

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

When the 2 samples are the same size, this estimator gives an estimate that is halfway between the two sample variances.
When the samples are not the same size, the estimate will be closer to the sample variance from the larger sample.
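A short check of the equal-size claim, writing $n$ for the common sample size (a symbol introduced here only for this step):

$$n_1 = n_2 = n \;\Rightarrow\; s_p^2 = \frac{(n - 1)s_1^2 + (n - 1)s_2^2}{2n - 2} = \frac{(n - 1)\left(s_1^2 + s_2^2\right)}{2(n - 1)} = \frac{s_1^2 + s_2^2}{2}$$

More generally, the pooled variance is a weighted average of the two sample variances with weights $\frac{n_1 - 1}{n_1 + n_2 - 2}$ and $\frac{n_2 - 1}{n_1 + n_2 - 2}$, so the larger sample gets the larger weight.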
![Page 69: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/69.jpg)
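So our confidence interval for the difference in the population means, when we don’t know the population variances but we believe that they are equal, is:

$$(\bar X_1 - \bar X_2) - t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

where

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$$

and the number of degrees of freedom is $n_1 + n_2 - 2$.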
![Page 70: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/70.jpg)
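Let’s do the same example as before, assuming that the unknown population variances are believed equal.
We had:

$$\bar X_1 = 74, \quad \bar X_2 = 60, \quad s_1^2 = 132.67, \quad s_2^2 = 93.0$$

$$s_p^2 = \frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \frac{(3)(132.67) + (2)(93.0)}{4 + 3 - 2} = \frac{584}{5} = 116.8$$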
![Page 71: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/71.jpg)
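We have: $\bar X_1 = 74$, $\bar X_2 = 60$, $s_p^2 = 116.8$
We want 90% confidence.
dof = n1 + n2 – 2 = 4 + 3 – 2 = 5
So our t value is 2.015.
[Figure: t distribution with 5 dof; central area 0.90, with area 0.05 to the right of t = 2.015.]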
![Page 72: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/72.jpg)
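We have: $\bar X_1 = 74$, $\bar X_2 = 60$, $s_p^2 = 116.8$, $t = 2.015$

$$(\bar X_1 - \bar X_2) - t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)} < \mu_1 - \mu_2 < (\bar X_1 - \bar X_2) + t\sqrt{s_p^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}$$

$$(74 - 60) - 2.015\sqrt{116.8\left(\frac{1}{4} + \frac{1}{3}\right)} < \mu_1 - \mu_2 < (74 - 60) + 2.015\sqrt{116.8\left(\frac{1}{4} + \frac{1}{3}\right)}$$

$$14 - 16.63 < \mu_1 - \mu_2 < 14 + 16.63$$

$$-2.63 < \mu_1 - \mu_2 < 30.63$$

We are 90% certain that the difference in the population means is between -2.63 & 30.63.
Again, since zero is in this interval, we can’t rule out the possibility that the class mean grades are equal.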
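The pooled-variance version can be sketched the same way. Again the function names are illustrative only, and the t value for 5 dof (2.015) is supplied from a t table rather than computed.

```python
from math import sqrt
from statistics import mean, variance

def pooled_variance(s1_sq, n1, s2_sq, n2):
    """Pooled sample variance: sample variances weighted by their degrees of freedom."""
    return ((n1 - 1) * s1_sq + (n2 - 1) * s2_sq) / (n1 + n2 - 2)

def diff_means_ci_pooled(x1, x2, t_value):
    """Confidence interval for mu1 - mu2 assuming equal but unknown population variances."""
    n1, n2 = len(x1), len(x2)
    sp_sq = pooled_variance(variance(x1), n1, variance(x2), n2)
    point = mean(x1) - mean(x2)
    half_width = t_value * sqrt(sp_sq * (1 / n1 + 1 / n2))
    return point - half_width, point + half_width

class1 = [64, 66, 89, 77]
class2 = [56, 71, 53]
sp_sq = pooled_variance(variance(class1), 4, variance(class2), 3)
# t value for 90% confidence and 4 + 3 - 2 = 5 degrees of freedom, from a t table.
low, high = diff_means_ci_pooled(class1, class2, t_value=2.015)
print(round(sp_sq, 1), round(low, 2), round(high, 2))
```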
![Page 73: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/73.jpg)
We can also develop a confidence interval for the difference in population proportions

$$\pi_1 - \pi_2$$

The point estimate is the difference in the sample proportions

$$p_1 - p_2$$
![Page 74: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/74.jpg)
Next we need the standard deviation of our point estimate.
Similarly to the case of the difference in population means,

$$V(p_1 - p_2) = V(p_1) + V(p_2)$$

Recalling that our previous estimate of $V(p)$ was $\dfrac{p(1 - p)}{n}$, we have

$$V(p_1 - p_2) \approx \frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}$$

The estimated standard deviation of our point estimate becomes

$$\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
![Page 75: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/75.jpg)
Using our basic format,

point estimate - (z or t)(std. dev. or estimate of the std. dev. of our pt. estimate) < desired parameter < point estimate + (z or t)(std. dev. or estimate of the std. dev. of our pt. estimate),

we find the confidence interval for the difference in population proportions.

$$(p_1 - p_2) - z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} < \pi_1 - \pi_2 < (p_1 - p_2) + z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$
![Page 76: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/76.jpg)
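Example: Samples from 2 states show proportions of Democrats 1/3 & 1/5 with sample sizes 100 & 225.
Calculate the 99% confidence interval for the difference in population proportions.
The z value for a 99% confidence interval is 2.575.
[Figure: standard normal curve with central area 0.99 between z = -2.575 and z = 2.575; the area between 0 and 2.575 is 0.4950.]

$$(p_1 - p_2) - z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} < \pi_1 - \pi_2 < (p_1 - p_2) + z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$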
![Page 77: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/77.jpg)
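We have:
$p_1 = 0.33$, $1 - p_1 = 0.67$, $p_2 = 0.20$, $1 - p_2 = 0.80$, $n_1 = 100$, $n_2 = 225$, $z = 2.575$

Applying the formula

$$(p_1 - p_2) - z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}} < \pi_1 - \pi_2 < (p_1 - p_2) + z\sqrt{\frac{p_1(1 - p_1)}{n_1} + \frac{p_2(1 - p_2)}{n_2}}$$

yields

$$(0.33 - 0.20) - 2.575\sqrt{\frac{(0.33)(0.67)}{100} + \frac{(0.20)(0.80)}{225}} < \pi_1 - \pi_2 < (0.33 - 0.20) + 2.575\sqrt{\frac{(0.33)(0.67)}{100} + \frac{(0.20)(0.80)}{225}}$$

or

$$0.13 - 0.14 < \pi_1 - \pi_2 < 0.13 + 0.14$$

So the 99% confidence interval for the difference in population proportions is

$$-0.01 < \pi_1 - \pi_2 < 0.27$$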
![Page 78: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/78.jpg)
Given our confidence interval:

$$-0.01 < \pi_1 - \pi_2 < 0.27$$

Can we conclude that the two population proportions are not equal?
No. Since zero is in the interval, $\pi_1$ may equal $\pi_2$.
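As a cross-check of the proportions example, here is a minimal Python sketch. The function name `diff_proportions_ci` is made up for illustration, and the z value comes from the standard library's normal distribution, so it is about 2.5758 rather than the table value 2.575.

```python
from math import sqrt
from statistics import NormalDist

def diff_proportions_ci(p1, n1, p2, n2, confidence):
    """Confidence interval for pi1 - pi2 from two sample proportions."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided z value
    # Estimated standard deviation of the point estimate p1 - p2.
    std_dev = sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    point = p1 - p2
    return point - z * std_dev, point + z * std_dev

# Rounded sample proportions 0.33 and 0.20, as used in the worked example.
low, high = diff_proportions_ci(0.33, 100, 0.20, 225, 0.99)
print(round(low, 2), round(high, 2))
print(low < 0 < high)   # True means zero lies inside the interval
```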
![Page 79: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/79.jpg)
How do you decide the appropriate sample size for a project?
2 Decisions:
• Desired confidence level
• Maximum difference D between the estimate of the population parameter & the true value of the population parameter (that is, the maximum error you’re willing to accept)
For example, if you’re estimating the population mean $\mu$ using the sample mean $\bar X$, D is the maximum difference between $\bar X$ and $\mu$.
![Page 80: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/80.jpg)
Suppose you have chosen 95% as your desired confidence level.
You know that there is a z value (call it $z_0$) such that $-z_0 < Z < z_0$ 95% of the time.
You also know that $\dfrac{\bar X - \mu}{\sigma/\sqrt{n}}$ is distributed as a Z.
So, with 95% probability,

$$-z_0 < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < z_0$$
![Page 81: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/81.jpg)
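With 95% probability, we have

$$-z_0 < \frac{\bar X - \mu}{\sigma/\sqrt{n}} < z_0$$

Multiplying by $\sigma/\sqrt{n}$, we have

$$-z_0\frac{\sigma}{\sqrt{n}} < \bar X - \mu < z_0\frac{\sigma}{\sqrt{n}}$$

We see here that the largest value of the difference between $\bar X$ and $\mu$, which we called D, is $z_0\dfrac{\sigma}{\sqrt{n}}$.
So,

$$D = z_0\frac{\sigma}{\sqrt{n}}$$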
![Page 82: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/82.jpg)
We have now $D = z_0\dfrac{\sigma}{\sqrt{n}}$, and we can solve for the sample size n.

First, square both sides of the equation:

$$D^2 = \frac{z_0^2\sigma^2}{n}$$

Multiply through by n:

$$nD^2 = z_0^2\sigma^2$$

Divide through by $D^2$:

$$n = \frac{z_0^2\sigma^2}{D^2}$$

Dropping the subscript on z for convenience, we have the formula:

$$n = \frac{z^2\sigma^2}{D^2}$$
![Page 83: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/83.jpg)
So we have a formula for determining the appropriate sample size n when we want to estimate the population mean:

$$n = \frac{z^2\sigma^2}{D^2}$$
![Page 84: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/84.jpg)
Example: Suppose you’re trying to estimate the mean monthly rent of 2-bedroom apartments in towns of 100,000 people or fewer. The population standard deviation is 20. You want to be 95% sure that your estimate is within $3 of the true mean. How large a sample should you take?

$$n = \frac{z^2\sigma^2}{D^2} = \frac{(1.96)^2(20)^2}{(3)^2} = 170.7$$

You need to sample 171 observations.
We round up to 171 rather than down to 170, because any sample size smaller than 170.7 provides less information & therefore less than the desired level of confidence.
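The sample-size rule for a mean, including the round-up step, can be sketched in a few lines of Python. The function name `sample_size_for_mean` is hypothetical, and the z value is computed from the standard library's normal distribution rather than taken as 1.96.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_mean(sigma, max_error, confidence):
    """Smallest n with z * sigma / sqrt(n) <= max_error, i.e. n >= (z * sigma / D)^2."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided z value
    # Always round up: a smaller sample would miss the desired confidence level.
    return ceil((z * sigma / max_error) ** 2)

# Rent example: sigma = 20, D = 3, 95% confidence.
n = sample_size_for_mean(sigma=20, max_error=3, confidence=0.95)
print(n)
```

Using `ceil` rather than ordinary rounding encodes the point made above: the required n is a lower bound, so any fractional result is rounded up.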
![Page 85: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/85.jpg)
Our formula for n has the population standard deviation $\sigma$ in it.
What do we do if we don’t know $\sigma$?
In the past, we used the sample standard deviation s. Why can’t we do that here?
s came from the sample. We haven’t taken the sample yet. We’re still trying to figure out how many observations our sample should have.
If previous researchers have done related work, you may be able to use their estimate for the standard deviation.
Alternatively, you can do a small preliminary sample, & based on that information, estimate the standard deviation.
![Page 86: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/86.jpg)
Determining the appropriate sample size n for estimating the population proportion $\pi$.
Again we will use $-z_0 < Z < z_0$ with the desired confidence level as our starting point.
We know that $\dfrac{p - \pi}{\sqrt{\pi(1 - \pi)/n}}$ is approximately standard normal.
So with the desired level of confidence,

$$-z_0 < \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} < z_0$$
![Page 87: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/87.jpg)
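Starting from

$$-z_0 < \frac{p - \pi}{\sqrt{\dfrac{\pi(1 - \pi)}{n}}} < z_0$$

Multiply through by $\sqrt{\dfrac{\pi(1 - \pi)}{n}}$:

$$-z_0\sqrt{\frac{\pi(1 - \pi)}{n}} < p - \pi < z_0\sqrt{\frac{\pi(1 - \pi)}{n}}$$

We see here that the maximum difference D between our estimator p & our parameter $\pi$ is:

$$z_0\sqrt{\frac{\pi(1 - \pi)}{n}}$$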
![Page 88: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/88.jpg)
We have now $D = z_0\sqrt{\dfrac{\pi(1 - \pi)}{n}}$, and we can solve for the sample size n.

First, square both sides of the equation:

$$D^2 = z_0^2\frac{\pi(1 - \pi)}{n}$$

Multiply through by n:

$$nD^2 = z_0^2\,\pi(1 - \pi)$$

Divide through by $D^2$:

$$n = z_0^2\frac{\pi(1 - \pi)}{D^2}$$

Dropping the subscript on z for convenience, we have the formula:

$$n = z^2\frac{\pi(1 - \pi)}{D^2}$$
![Page 89: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/89.jpg)
$$n = z^2\frac{\pi(1 - \pi)}{D^2}$$

There’s one big problem with this formula. What is it?
We want to collect a sample in order to estimate $\pi$, but we have the unknown $\pi$ in our equation for determining the sample size!
We can’t use the sample proportion as we did before, because we haven’t taken the sample yet.
As it happens, we can resolve this problem fairly easily.
![Page 90: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/90.jpg)
The largest possible value for $\pi(1 - \pi)$ occurs when $\pi$ is ½, and that largest value is ¼.
Play with some values for $\pi$ & $1 - \pi$, and convince yourself that this is true. For example,
(1/3)(2/3) = 2/9 < 1/4
(3/10)(7/10) = 21/100 < 1/4
(1/100)(99/100) = 99/10,000< 1/4
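Beyond trying values, the claim can be confirmed by completing the square:

$$\pi(1 - \pi) = \pi - \pi^2 = \frac{1}{4} - \left(\pi - \frac{1}{2}\right)^2 \le \frac{1}{4}$$

The squared term is never negative and is zero only at $\pi = \frac{1}{2}$, so ¼ is the maximum and it is reached only there.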
![Page 91: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/91.jpg)
$$n = z^2\frac{\pi(1 - \pi)}{D^2}$$

If we know the largest possible value for $\pi(1 - \pi)$, we can determine the largest sample size we should need.

Plugging in the maximum value of ¼ for $\pi(1 - \pi)$, we have

$$n = z^2\,\frac{1/4}{D^2} = z^2 \cdot \frac{1}{4} \cdot \frac{1}{D^2} = \frac{z^2}{4D^2}$$

So our formula for n is:

$$n = \frac{z^2}{4D^2}$$
![Page 92: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/92.jpg)
Sometimes you have a rough idea of what $\pi$ is, but you’re trying to get a more precise value.
You can use your rough idea to determine the sample size.
If $\pi_1$ is your rough idea, then the sample size formula becomes

$$n = z^2\frac{\pi_1(1 - \pi_1)}{D^2}$$
![Page 93: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/93.jpg)
So we have 2 formulae for determining the appropriate sample size for estimating the population proportion.
If you have no idea at all what $\pi$ is, you use:

$$n = \frac{z^2}{4D^2}$$

If you have a rough idea of $\pi_1$ for the value of $\pi$, you use:

$$n = z^2\frac{\pi_1(1 - \pi_1)}{D^2}$$
![Page 94: Estimation. Estimators & Estimates Estimators are the random variables used to estimate population parameters, while the specific values of these variables](https://reader036.vdocuments.site/reader036/viewer/2022062516/56649e0f5503460f94af9ed2/html5/thumbnails/94.jpg)
Example: We are estimating the proportion of families with 2 or more cars. We want to be 95% certain that the estimate is within 3% (0.03) of the correct percentage. What is the necessary sample size?
We’re clueless on the proportion $\pi$, so we use the formula

$$n = \frac{z^2}{4D^2}$$

The z value for 95% confidence is 1.96.
[Figure: standard normal curve with central area 0.95 between z = -1.96 and z = 1.96; the area between 0 and 1.96 is 0.4750.]
Filling in our values, we get

$$n = \frac{(1.96)^2}{4(0.03)^2} = 1067.11$$

So the needed sample size is 1068.
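Both sample-size formulae for a proportion fit in one small Python sketch. The function name and the `rough_pi` parameter are hypothetical. Leaving `rough_pi` at its default of 0.5 gives the worst-case formula, since 0.5 × 0.5 = ¼.

```python
from math import ceil
from statistics import NormalDist

def sample_size_for_proportion(max_error, confidence, rough_pi=0.5):
    """Sample size n = z^2 * pi(1 - pi) / D^2, rounded up.

    rough_pi = 0.5 is the 'no idea' case, where pi(1 - pi) takes its maximum of 1/4.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # two-sided z value
    return ceil(z ** 2 * rough_pi * (1 - rough_pi) / max_error ** 2)

# Car-ownership example: D = 0.03, 95% confidence, no prior idea of the proportion.
n_worst_case = sample_size_for_proportion(max_error=0.03, confidence=0.95)
# Illustrative only: a rough idea of 0.2 (not a value from the slides) needs fewer observations.
n_rough = sample_size_for_proportion(max_error=0.03, confidence=0.95, rough_pi=0.2)
print(n_worst_case, n_rough)
```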