Chapter 7
Frequency Distributions Probability Distributions
© Ray Panko
[Figure: frequency distributions with the same mean but different standard deviations, and distributions with different means]
Frequency distribution for a variable
Sampling distribution to find the mean of the variable
[Figure: repeated samples drawn from a population with mean µ, each giving a sample mean Xbar]
Event is the estimation of the mean (X bar) from a sample of size n.
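Population Distribution: mean µ, standard deviation σ
Sampling Distribution for µ: the distribution of Xbar over samples of size n, with
$\mu_{\bar{X}} = \mu$
$\sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$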
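The relationship between the population distribution and the sampling distribution of the mean can be checked with a small simulation. This is only a sketch: the population values (mean 50, standard deviation 10), the sample size of 25, and the function name `simulate_sample_means` are illustrative assumptions, not figures from the slides.

```python
import random
import statistics

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population
    and return the list of sample means (one Xbar per sample)."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    means = []
    for _ in range(num_samples):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        means.append(statistics.fmean(sample))
    return means

# Assumed population: mu = 50, sigma = 10; assumed sample size n = 25
means = simulate_sample_means(mu=50.0, sigma=10.0, n=25, num_samples=5000)
mean_of_means = statistics.fmean(means)   # should be close to mu
sd_of_means = statistics.stdev(means)     # should be close to sigma / sqrt(n)
theoretical_sd = 10.0 / 25 ** 0.5         # sigma / sqrt(n) = 2.0

print(mean_of_means, sd_of_means, theoretical_sd)
```

With enough samples, the mean of the sample means settles near µ and their standard deviation settles near σ divided by the square root of n.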
Forty percent of voters call themselves independents.
◦ 40% is a proportion (π)
◦ Take a sample to estimate π
◦ The sample proportion, p, is an unbiased estimator of π
◦ The sampling standard deviation, σp, is given by:
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
“Based on a sample of 1,500 households, the percentage of voters in favor of Proposition X is 40%, with a sampling error of plus or minus 3%.”
The sample mean (Xbar) or proportion (p) is not likely to be exactly the population mean (µ) or proportion (π).
However, they should be close.
Confidence intervals allow us to estimate how close.
Example: “It is estimated that the proportion of independent voters is 49%, with a sampling error of plus or minus 3%.”
A confidence interval is a range around the sample mean Xbar that is expected to contain the true population mean µ with a stated degree of confidence.
[Figure: 95% confidence interval centered on Xbar]
If the confidence level is 95%, then the area outside the confidence interval, which we call α, is 0.05.
The upper and lower tails are each α/2, or 0.025.
[Figure: normal curve centered on Xbar with α/2 = 0.025 in each tail; 1 − α = 95%, so α = 0.05]
Find the Z values for α/2.
For a cumulative probability of 1 − 0.025 = 0.975, Z is 1.96.
So the Z values are -1.96 and 1.96.
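The table lookup for Zα/2 can be reproduced with Python's standard library. This is a minimal sketch; the function name `z_critical` is an illustrative choice and not part of the slides.

```python
from statistics import NormalDist

def z_critical(confidence_level):
    """Return Z(alpha/2) for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence_level
    # The critical value cuts off alpha/2 in the upper tail,
    # so the cumulative probability below it is 1 - alpha/2.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

z_95 = z_critical(0.95)
print(round(z_95, 2))  # 1.96
```

Because the standard normal curve is symmetric, the lower critical value is simply the negative of the value returned.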
[Figure: normal curve with α/2 = 0.025 in each tail. Z units: -Zα/2 = -1.96, 0, Zα/2 = 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit]
Confidence Level   Confidence Coefficient (1 − α)   Zα/2 value
80%                0.80                             1.28
90%                0.90                             1.645
95%                0.95                             1.96
98%                0.98                             2.33
99%                0.99                             2.58
99.8%              0.998                            3.09
99.9%              0.999                            3.29
A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.
95% confidence interval for the true mean:
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \, (0.35/\sqrt{11}) = 2.20 \pm 0.2068$
$1.9932 \le \mu \le 2.4068$
For the same sample of 11 circuits (mean resistance 2.20 ohms, population standard deviation 0.35 ohms):
90% confidence interval for the true mean:
$\bar{X} \pm Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.645 \, (0.35/\sqrt{11}) = 2.20 \pm 0.1736$
$2.0264 \le \mu \le 2.3736$
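The two circuit-resistance intervals can be recomputed with a short sketch. The function name `mean_ci_sigma_known` is an illustrative assumption. The critical value is computed rather than read from a table, so intermediate digits may differ slightly from the rounded table values.

```python
from math import sqrt
from statistics import NormalDist

def mean_ci_sigma_known(sample_mean, sigma, n, confidence_level):
    """Confidence interval for the mean when the population sigma is known."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z(alpha/2)
    margin = z * sigma / sqrt(n)                 # sampling error
    return sample_mean - margin, sample_mean + margin

# Circuit example: n = 11, Xbar = 2.20 ohms, sigma = 0.35 ohms
lower_95, upper_95 = mean_ci_sigma_known(2.20, 0.35, 11, 0.95)
lower_90, upper_90 = mean_ci_sigma_known(2.20, 0.35, 11, 0.90)
print(round(lower_95, 4), round(upper_95, 4))
print(round(lower_90, 4), round(upper_90, 4))
```

Lowering the confidence level from 95% to 90% shrinks the critical value, which narrows the interval around the same point estimate.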
Confidence Intervals
◦ Population Mean, σ Known: Use Normal Distribution with σ
◦ Population Mean, σ Unknown: Use t Distribution based on the sample standard deviation S, computed from the sample instead of σ
◦ Population Proportion
Assumptions
◦ Population standard deviation is unknown
◦ Population is normally distributed
◦ If population is not normal, use large sample
Use Student’s t Distribution instead of the normal distribution
Confidence Interval Estimate:
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}}$
(where tα/2 is the critical value of the t distribution with n - 1 degrees of freedom and an area of α/2 in each tail)
Idea: Number of observations that are free to vary after the sample mean has been calculated
Example: Suppose the mean of 3 numbers is 8.0
◦ X1 = 7
◦ X2 = 8
◦ X3 = ?
If the mean of these three values is 8.0, their sum must be 3 × 8.0 = 24, so X3 must be 24 – 7 – 8 = 9 (i.e., X3 is not free to vary)
Here, the sample size (n) = 3
So degrees of freedom = n – 1 = 3 – 1 = 2
For confidence intervals based on sample standard deviations,
d.f. = n-1
Where n is the sample size
[Figure: Standard Normal curve (t with df = ∞) compared with t (df = 13) and t (df = 5), all centered at t = 0]
t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal
Note: t → Z as n increases, so (n-1 → n)
Upper Tail Area
df    .25     .10     .05
1     1.000   3.078   6.314
2     0.817   1.886   2.920
3     0.765   1.638   2.353
The body of the table contains t values, not probabilities
Example: Sample Size = 3, so df = n-1 = 2
90% confidence level, α = 0.10, so α/2 = 0.05
The table value for df = 2 and an upper tail area of .05 is t = 2.920
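The table lookup can also be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and then searches for the critical value by bisection. This numerical approach and the function names are illustrative assumptions, not the method used in the slides. In practice a statistics library (for example SciPy) would normally supply the inverse t distribution directly.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coeff = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coeff * (1 + x * x / df) ** (-(df + 1) / 2)

def t_upper_tail_area(t_value, df, steps=2000):
    """Area to the right of t_value (t_value > 0), via Simpson's rule on [0, t_value]."""
    h = t_value / steps
    total = t_pdf(0.0, df) + t_pdf(t_value, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    # Half the area lies to the right of 0; subtract the part between 0 and t_value.
    return 0.5 - total * h / 3

def t_critical(upper_tail_area, df):
    """Find t whose upper tail area equals upper_tail_area, by bisection."""
    low, high = 0.0, 100.0  # assumed search range; wide enough for the table rows above
    for _ in range(60):
        mid = (low + high) / 2
        if t_upper_tail_area(mid, df) > upper_tail_area:
            low = mid   # tail still too large: move right
        else:
            high = mid  # tail too small: move left
    return (low + high) / 2

# 90% confidence level, sample size 3: alpha/2 = 0.05, df = 2
print(round(t_critical(0.05, 2), 3))
```

The result agrees with the df = 2, .05 entry in the table to three decimal places. Very small tail areas with df = 1 can fall outside the assumed search range.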
Confidence Level   t (10 d.f.)   t (20 d.f.)   t (30 d.f.)   z
.90                1.812         1.725         1.697         1.645
.95                2.228         2.086         2.042         1.96
.99                3.169         2.845         2.750         2.58
As sample size n increases, df (n-1) increases.
As df increases, t approaches z.
So at large sample sizes, t and z are nearly the same.
A random sample of n = 25 has Xbar = 50 and S = 8. Form a 95% confidence interval for μ.
◦ d.f. = n – 1 = 24, and α/2 = .025
◦ From Table E.1, tα/2 = 2.0639
◦ So the confidence interval is
$\bar{X} \pm t_{\alpha/2} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$
46.698 ≤ μ ≤ 53.302
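The arithmetic in this example can be checked with a few lines of Python. This is a sketch that takes the critical value 2.0639 as given from the table lookup. The function name `mean_ci_sigma_unknown` is an illustrative assumption.

```python
from math import sqrt

def mean_ci_sigma_unknown(sample_mean, sample_sd, n, t_crit):
    """Confidence interval for the mean using the sample standard deviation S.

    t_crit is the t(alpha/2) critical value for n - 1 degrees of freedom,
    supplied by the caller (here, taken from the t table).
    """
    margin = t_crit * sample_sd / sqrt(n)
    return sample_mean - margin, sample_mean + margin

# n = 25, Xbar = 50, S = 8, t(.025, 24 d.f.) = 2.0639
lower, upper = mean_ci_sigma_unknown(50.0, 8.0, 25, t_crit=2.0639)
print(round(lower, 3), round(upper, 3))  # 46.698 53.302
```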
TINV(Probability, df)
For a 95% confidence level, sample size of 25, and a standard deviation S of 8
◦ df is 24 (n-1)
◦ Probability is α (.05), not α/2 (.025)
◦ Equation is =TINV(.05,24)
◦ Its value is 2.063899
◦ This is the same value found with the table lookup
Confidence Intervals: Population Proportion
Based on a sample of 70, 95% of our faculty members have PhDs.
Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation
$\sigma_p = \sqrt{\frac{\pi(1-\pi)}{n}}$
We will estimate this with sample data:
$\sqrt{\frac{p(1-p)}{n}}$
Upper and lower confidence limits for the population proportion are calculated with the formula
$p \pm Z_{\alpha/2} \sqrt{\frac{p(1-p)}{n}}$
where
◦ Zα/2 is the standard normal value for the level of confidence desired
◦ p is the sample proportion
◦ n is the sample size
Note: must have np > 5 and n(1-p) > 5
A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.
$p \pm Z_{\alpha/2} \sqrt{p(1-p)/n} = 25/100 \pm 1.96 \sqrt{0.25(0.75)/100} = 0.25 \pm 1.96 \, (0.0433)$
$0.1651 \le \pi \le 0.3349$
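The left-handedness interval can be reproduced with a short sketch. The function name `proportion_ci` is an illustrative assumption. The check on np and n(1-p) mirrors the rule of thumb stated for the proportion formula.

```python
from math import sqrt
from statistics import NormalDist

def proportion_ci(successes, n, confidence_level):
    """Confidence interval for a population proportion (normal approximation)."""
    p = successes / n
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("normal approximation needs np > 5 and n(1-p) > 5")
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)  # Z(alpha/2)
    margin = z * sqrt(p * (1 - p) / n)           # Z times estimated sigma_p
    return p - margin, p + margin

# 25 left-handers in a sample of 100, 95% confidence
lower, upper = proportion_ci(25, 100, 0.95)
print(round(lower, 4), round(upper, 4))  # 0.1651 0.3349
```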
Determining sample size for a desired error size and confidence level
To determine the required sample size for the mean, you must know:
◦ The desired level of confidence (1 - α), which determines the critical value, Zα/2
◦ The acceptable sampling error, e (the plus or minus in the estimate)
◦ The population standard deviation, σ
If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?
$n = \frac{Z_{\alpha/2}^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$
(Always round up)
So the required sample size is n = 220
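The sample size calculation, including the round-up step, can be sketched as follows. The function name `required_sample_size` is an illustrative assumption. The critical value is computed rather than read from a table, so the unrounded result differs slightly from 219.19 but rounds up to the same n.

```python
from math import ceil
from statistics import NormalDist

def required_sample_size(sigma, error, confidence_level):
    """Smallest n so that Z(alpha/2) * sigma / sqrt(n) is at most the desired error."""
    alpha = 1.0 - confidence_level
    z = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    n_exact = (z * sigma / error) ** 2
    return ceil(n_exact)  # always round up to guarantee the error bound

# sigma = 45, desired error e = 5, 90% confidence
print(required_sample_size(45, 5, 0.90))  # 220
```

Halving the acceptable error e roughly quadruples the required sample size, because e enters the formula squared.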
If unknown, σ can be estimated when using the required sample size formula
◦ Use a value for σ that is expected to be at least as large as the true σ
◦ Select a pilot sample and estimate σ with the sample standard deviation, S
A confidence interval estimate (reflecting sampling error) should always be included when reporting a point estimate
The level of confidence should always be reported
The sample size should be reported
An interpretation of the confidence interval estimate should also be provided