overview - sam houston state university
TRANSCRIPT
Overview
7.1 Introduction to Sampling Distributions
7.2 Central Limit Theorem for Means
7.3 Central Limit Theorem for Proportions
7.1 Introduction to Sampling Distributions
Objectives:
By the end of this section, I will be
able to…
1) Compute point estimates and sampling error.
2) Explain the sampling distribution of the sample mean
3) Describe the sampling distribution of the sample mean when the population is normal.
4) Use normal probability plots to assess normality.
5) Find probabilities and percentiles for the sample mean when the population is normal.
x
x
Point Estimates
Use known statistics to estimate unknown
parameters and report a single number
as the estimate
The value of the statistic is called the point estimate
Table 7.1 Point estimation: Use statistics to estimate unknown population parameters
Sampling error
The distance between the point estimate and its target parameter
Table 7.3 Sampling error for common characteristics
Page 349
Example
Do problems 10 and 12
Solutions:
10)
12)
Example
1.00.19.0s
05.075.070.0ˆ pp
Example 7.3 - Commuting times for student government members
We are interested in how long it takes the five
members of the student government to
commute to school. The times (in minutes)
are given in Table 7.4. Since these five
people are all the members of the student
government, we can consider them to
constitute a population.
Example 7.3 continued
a. Calculate the population mean commuting time μ.
b. Take a sample of the following student
government members: Amber, Brandon, and
Chantal. Find the sample mean commuting
time x and the sampling error |x - μ|.
Table 7.4 Commuting times for the five members of the student government
Example 7.3 continued Solution
The mean commuting time of this population is
For Amber, Brandon, and Chantal, the sample mean commuting time is
The sampling error of this sample mean is
|x1 – | = |11.67 – 16|= 4.33 minutes.
10 20 5 30 1516 minutes
5
x
N
1
10 20 511.67 minutes
3
xx
N
Table 7.5 All possible samples of size 3 from population of student government members
Consider previous example and find all possible
samples of student government members,
the sample means, population mean,
and sampling errors (page 339):
Sampling distribution of the sample mean x
Collection of the sample means of all
possible samples of size n
Calculate the mean of the sample means as the average of the sample means:
Note: the mean of the sample means equals the population mean in this example.
Mean of the Sample Means
Fact 1
The mean of the sampling distribution of the sample mean x is the value of the population mean .
Denoted as
Read as “the mean of the sampling distribution of x is ”
x
For the previous example
Mean of the Sample Means
x
Standard Deviation of the Population
=3.5119
Standard Deviation of the Sample Means
Fact 2
The standard deviation of the sampling distribution of the sample mean x is
is the population standard deviation
n is the sample size where n is assumed to be very large (if n is not large, see the note on page 340)
/x
n
Previous example
x
Page 350
Example
Solutions
a)
Example
inches 68x
inches 95.010/inches 3/ nx
Solutions
b)
Example
inches 68x
inches 30.0100/inches 3/ nx
Solutions
c)
Example
inches 68x
inches 09.01000/inches 3/ nx
Sampling Distribution of the Sample Mean for a Normal Population
Fact 3
Is itself normal, regardless of sample size
Sampling Distribution of the Sample Mean for a Normal Population
Fact 4
Distributed as normal with mean:
And standard deviation:
x
nx /
Fact 5: Standardizing a Normal Sampling Distribution for Means
When the sampling distribution of
is normal, we may standardize to produce
the standard normal random variable Z as follows:
where is the population mean, is the
population standard deviation, and n
is the sample size.
/
x
x
x xZ
n
x
Page 350
Example
Solutions:
Example
000,50$x
1000$25/5000$/ nx
Solutions:
a) Z-score for sample mean of 52,000 is
Example
21000
5000052000
/ n
xZ
Solutions:
a)
Example
0.0228
9772.01
)2(1
)2()000,50$(
ZP
ZPxP
Look up in Table C
(b) calculator
Example
(c) calculator
Example
0013.0
000)00,50000,1-10^99,470normalcdf(
)000,47$(xP
(c) calculator
Example
(c) calculator
Example
0215.0
00)0,50000,1052000,5300normalcdf(
)000,53$000,52($ xP
Page 350
Example
Do part (a)
Solution:
a) $50,000 (note: for a normal distribution, the mean and the median are equal)
Example
Page 350
Example
Do part (b)
Solutions:
b) Find so that
for a normal distribution with
Example
95.0)( CXXP
CX
000,50$x
1000$25/5000$/ nx
First find so that
using the standard normal distribution. From Table C
we get that:
Example
95.0)( CZZP
CZ
655.1CZ
Use to get by using the z-score formula:
Example
CZ
CC
XZ
CX
1000$
000,50$655.1 CX
655,51$CX
Solutions (directly with calculator):
b)
Example
85.644,51$000)95,50000,1invNorm(0.CX
Page 350
Example
Do parts (c) and (d)
Solutions:
c) $48,355
d) $48,355 and $51,645
Example
15.355,48$000)05,50000,1invNorm(0.CX
Normal Probability Plots
A normal probability plot is a scatterplot of the estimated cumulative normal probabilities (expressed as percents) against the corresponding data values in a data set.
Normal Probability Plot
FIGURE 7.4 Normal probability plot of normal data.
Analyzing Normal Probability Plots
If the points in the normal probability plot either cluster around a straight line or nearly all fall within the curved bounds, then it is likely that the data set is normal.
Systematic deviations off the straight line are evidence against the claim that the data set is normal.
Normal Probability Plot
FIGURE 7.5 Normal probability plot of right-skewed data.
Summary
We can use sample statistics as point estimates of the unknown population parameters.
For each statistic, sampling error is the distance between the point estimate and its target parameter.
The sampling distribution of the sample mean for a given sample size n consists of the collection of the sample means of all possible samples of size n from the population.
x
Summary
The mean of the sampling distribution of the sample mean is the value of the population mean μ (Fact 1).
The standard deviation of the sampling
distribution of the sample mean is
, σ is the population standard
deviation (Fact 2).
The sampling distribution of the sample mean for a normal population is itself normal, regardless of sample size (Fact 3).
x
x
/x n
Summary
For a normal population, the sampling distribution of the sample mean is distributed as normal (μ, ), where μ is the population mean and σ is the population standard deviation (Fact 4).
Normal probability plots are used to assess the normality of a data set.
We can use Fact 4 to find probabilities and percentiles using sample means.
x
/ n
7.2 Central Limit Theorem for Means Objectives:
By the end of this section, I will be
able to…
1) Describe the sampling distribution of x for skewed and symmetric populations as the sample size increases.
2) Apply the Central Limit Theorem for Means to solve probability questions about the sample mean.
Main Idea
In this section we want to be able to describe the shape of the distribution of the sample means.
Symmetric Populations
For a symmetric distribution, at n=20 the sampling distribution of the mean is approximately normal.
Roll a fair die once. Make up a table that represents the probability distribution of X=number on the die. Also, plot the probability distribution in a bar chart.
Example
Distribution (table format) is:
x P(x)
1 0.1667
2 0.1667
3 0.1667
4 0.1667
5 0.1667
6 0.1667
Example
5.3)(xxP
FIGURE 7.11 Distribution of a single fair die roll is symmetric.
Roll a fair die one hundred times. Take random samples of size 10 from these 100 rolls. For each sample of size 10, calculate the sample mean. Plot the distribution of the means as a histogram.
Example
FIGURE 7.12 Sample means of size n = 10: already approaching normality.
Roll a fair die one hundred times. Take random samples of size 20 from these 100 rolls. For each sample of size 20, calculate the sample mean. Plot the distribution of the means as a histogram.
Example
FIGURE 7.13 Sample means of size n = 20: approximately normal.
FIGURE 7.14 Normal probability plot for n = 20: acceptable normality.
Skewed Populations
For a skewed population, sampling distribution of the mean becomes approximately normal as the sample size approaches 30.
FIGURE 7.10 Sampling distribution of x and normal probability plots for n = 10, 20, and 30.
¯
Central Limit Theorem for Means
Population with mean μ
Standard deviation σ
The sampling distribution of the sample mean x becomes approximately normal with mean μ and standard deviation as the sample size gets larger
Regardless of the shape of the population.
/ n
Rule of Thumb
We consider n ≥ 30 as large enough to apply the Central Limit Theorem for any population.
Three Cases for the Sampling Distribution of the Sample Mean x
Case 1
The population is normal.
Then the sampling distribution of x is normal (Fact 3 from 7.1).
Three Cases for the Sampling Distribution of the Sample Mean x continued
Case 2
The population is either non-normal or of unknown distribution and the sample size is at least 30.
Apply Central Limit Theorem for Means:
The sampling distribution of the sample mean is approximately normal
Three Cases for the Sampling Distribution of the Sample Mean x continued
Case 3
The population is either non-normal or of unknown distribution and the sample size is less than 30.
Insufficient information to conclude that the sampling distribution of the sample mean x is either normal or approximately normal
Page 362-363
Example
Solutions
6) Case 2
8) Case 3
10) Case 1
Example
Page 363
Example
Solutions
16(a)
16(b)
16(c) unknown
Example
000,60$x
500,2$16/000,10$/ nx
Page 363
Example
Solutions
20(a)
20(b)
20(c) approximately normal
Example
gallonper miles 50x
mpg 75.064/6/ nx
Page 363
Example
Solution:
first notice that it is possible to find the
probability since the systolic blood pressure readings are normally distributed so the distribution of the sample mean is also normal (case 1)
Example
80x
6.125/8/ nx
Solution:
Example
7887.0.6)78,82,80,1normalcdf()8278( xP
Page 363
Example
Solution
Not possible: since the pollen count distribution is not normally distributed and the sample size is smaller than 30, the sampling distribution of the mean of x is unknown.
Example
Page 363
Example
Solution:
first notice that it is possible to find the
probability since even though the pollen count distribution is not normal, the sample size is at least 30, so the distribution of the sample mean is also normal (case)
Example
8x
125.064/1/ nx
Solution (directly with calculator):
Find so that
Example
1.8)75,8,0.125invNorm(0.CX
75.0)( CXXPCX
Page 364
Example
Solution:
(a) yes- case 1 applies
Example
o
x 6.38
oo
x n 225/10/
0.2420
7580.01
)40(1)40( oo xPxP
Solution:
(b) case 2 does not apply since the sample
size is less than 30.
Example
Summary
In this section, we examine the behavior of the sample mean when the population is not normal.
The approximate normality of the sampling distribution of the sample mean kicks in much quicker when the original population is symmetric rather than skewed.
The Central Limit Theorem is one of the most important results in statistics and is stated as follows:
Summary
Given a population with mean μ and standard deviation σ, the sampling distribution of the sample mean x becomes approximately normal(μ, ) as the sample size gets larger, regardless of the shape of the population.
This approximation applies for smaller sample sizes when the original distribution is more symmetric.
/ n
7.3 Central Limit Theorem for Proportions Objectives:
By the end of this section, I will be
able to…
1) Explain the sampling distribution of the sample proportion p.
2) Describe the sampling distribution of the sample proportion p for extreme and moderate values of p.
3) Apply the Central Limit Theorem for Proportions to solve probability questions about the sample proportion.
ˆ
ˆ
Sample Proportion p
Suppose each individual in a population either has or does not have a particular characteristic
For sample of size n sample proportion p (read “p-hat”) is
x represents the number of individuals in the sample that have the particular characteristic
Use p to estimate the unknown value of the population proportion p
ˆx
pn
ˆ
ˆ
ˆ
Example
Page 367, Example 7.14
Table 7.6 All possible samples of size 3 from population of student government members
Example 7.15 - Mean of sample proportions
Calculate the mean of the ten sample
proportions p from Table 7.6 page 367.
ˆ
Example 7.15 continued
Solution
Mean is
p equals the population proportion of
females for the original population,
p = 3/5 = 0.6.
2 1 2 2 3 2 1 2 1 2
3 3 3 3 3 3 3 3 3 30.6
10
ˆ
Fact 6: Mean of the Sampling
Distribution of the Sample Proportion p
The value of the population proportion p
Denoted as
where
Read as “the mean of the sampling distribution of p is p”
ˆ
ˆ
p̂
pp̂
Fact 7 - Standard Deviation of the Sampling Distribution of the Sample Proportion p
where
p is the population proportion
n is the sample size
ˆ
n
pq
n
ppp
)1(ˆ
pq 1
Example
Page 379
Do 8 and 10 parts (a) and (b)
Solutions
8(a)
8(b)
Example
5.0ˆ pp
2236.05
)5.0()5.0(ˆ
n
pqp
5.01 pq
Solutions
10(a)
10(b)
Example
01.0ˆ pp
0044.0500
)99.0()01.0(ˆ
n
pqp
99.01 pq
Fact 8 - Conditions for Approximate Normality for the Sampling Distribution of the Sample Proportion p
May be considered approximately normal only if both the following conditions hold:
(1) np ≥ 5 and (2) n(1 - p) ≥ 5
ˆ
Example
Page 379
Do part (c)
Solutions
8(c)
Unknown (we cannot conclude that the sampling distribution of the proportion is normal in this case)
Example
55.2)5.0(5np
55.2)5.0(5)1( nqpn
Solutions
10(c)
We conclude that the sampling distribution of the proportion is approximately normal in this case
Example
5)01.0(500np
5495)99.0(500)1( nqpn
Fact 8 continued
Given a value for p, the minimum sample size required to produce approximate normality in the sampling distribution of the proportion can be found by solving each of these for n:
and choosing the next largest integer value that is at least as large as both of these values for n.
5 and 5 nqnp
Example
Page 379
Do problem 16
Solutions
16)
The minimum sample size is 100
Example
100 5)05.0( nnnp
26.5 5)95.0( nnnq
Fact 9 - Standardizing a Normal Sampling Distribution for Proportions
When the sampling distribution of p is normal (or approximately normal), we
can standardize to produce the standard normal Z:
where p is the population proportion of successes and n is the sample size.
ˆ
ˆ
ˆ ˆ
1
p
p
p p pZ
p p
n
Example
Page 379
Do problem 30
Solutions
30) First check that we can assume a normal distribution:
We cannot conclude that the sampling distribution of the proportion is normal in this case and we cannot find the probability.
Example
55.2)5.0(5np
55.2)5.0(5)1( nqpn
Example
Page 379
Do problem 32
Solutions
First check that we can assume a
normal distribution for :
Example
5)01.0(500np
5495)99.0(500)1( nqpn
p̂
Solutions
For normal distribution use
mean:
Standard deviation:
Example
01.0ˆ pp
0044.0500
)99.0()01.0(ˆ
n
pqp
Solutions
z-score method (Table)
Example
23.00044.0
01.0011.0ˆ
ˆ
ˆ
p
ppZ
Solutions
using standard normal distribution and table lookup:
Example
0.4090
5910.01
)23.0(1)23.0( ZPZP
From Table T-10
Solutions
It is more accurate to compute Z as
so that:
Example
0.4129
5871.01
)22.0(1)22.0( ZPZP
22.0500/)99.0()01.0(
01.0011.0ˆ
ˆ
ˆ
p
ppZ
Example
Page 379
Do problem 36
Solutions
First check that we can assume a normal distribution for :
Yes- both are greater than 5
Example
200)5.0(400np
200)50.0(400)1( nqpn
p̂
Solutions
Find the value:
So that
is the 90th percentile of values of
Example
90.0)ˆˆ( cppP
cp̂ p̂
cp̂
Solutions
For normal distribution use
mean:
standard deviation:
Example
5.0ˆ pp
025.0400
)5.0()5.0(ˆ
n
pqp
Solutions
Using calculator gives:
Example
532.025)90,0.5,0.0invNorm(0.ˆCp
Central Limit Theorem for Proportions
the sampling distribution of the sample proportion follows an approximately normal distribution with
mean
standard deviation
when the following conditions are satisfied:
np ≥ 5 and n(1 - p) ≥ 5
p̂
pp̂
n
pq
n
ppp
)1(ˆ
Example
Page 379
Solutions
(a) Take p=0.87 and choose smallest integer value of n so that both of the following are true:
Answer:
Example
5)87.0(nnp
5)13.0()1( nnqpn
39*nn
Solutions
(b) Check that we can assume a normal distribution for with a sample size of n=39
Yes- both larger than 5
Example
93.33)87.0(39np
07.5)13.0(39)1( nqpn
p̂
Solutions
(c) The Central Limit Theorem tells us that the sampling distribution of
is approximately normal when
n=39
Example
p̂
Use n=50
(d) The Central Limit Theorem gives that the sampling distribution of
is approximately normal with
Example
p̂
87.0ˆ pp
0476.050
)13.0(87.0)1(ˆ
n
ppp
(d) For n=50, find
Example
90.0ˆ50
45ˆ pPpP
0.264376),0.87,0.040.90,10normalcdf( 99
Summary
The sampling distribution of the sample proportion p for a given sample size n consists of the collection of the sample proportions of all possible samples of size n from the population.
The approximate normality of the sampling distribution of the sample proportion kicks in much quicker when the population proportion is moderate rather than extreme.
ˆ
Summary
According to the Central Limit Theorem for Proportions, the sampling distribution of the sample proportion p follows an approximately normal distribution with mean μp = p and standard deviation
the following conditions are satisfied: (1) np ≥ 5 and (2) n(1 - p) ≥ 5.
We can use Fact 9 to find probabilities and percentiles for sample proportions.
1 /p
p p n
ˆ
ˆ