overview - sam houston state university

127
Overview 7.1 Introduction to Sampling Distributions 7.2 Central Limit Theorem for Means 7.3 Central Limit Theorem for Proportions

Upload: others

Post on 24-Oct-2021

3 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Overview - Sam Houston State University

Overview

7.1 Introduction to Sampling Distributions

7.2 Central Limit Theorem for Means

7.3 Central Limit Theorem for Proportions

Page 2: Overview - Sam Houston State University

7.1 Introduction to Sampling Distributions

Objectives:

By the end of this section, I will be

able to…

1) Compute point estimates and sampling error.

2) Explain the sampling distribution of the sample mean

3) Describe the sampling distribution of the sample mean when the population is normal.

4) Use normal probability plots to assess normality.

5) Find probabilities and percentiles for the sample mean when the population is normal.

x

x

Page 3: Overview - Sam Houston State University

Point Estimates

Use known statistics to estimate unknown

parameters and report a single number

as the estimate

The value of the statistic is called the point estimate

Table 7.1 Point estimation: Use statistics to estimate unknown population parameters

Page 4: Overview - Sam Houston State University

Sampling error

The distance between the point estimate and its target parameter

Table 7.3 Sampling error for common characteristics

Page 5: Overview - Sam Houston State University

Page 349

Example

Do problems 10 and 12

Page 6: Overview - Sam Houston State University

Solutions:

10)

12)

Example

1.00.19.0s

05.075.070.0ˆ pp

Page 7: Overview - Sam Houston State University

Example 7.3 - Commuting times for student government members

We are interested in how long it takes the five

members of the student government to

commute to school. The times (in minutes)

are given in Table 7.4. Since these five

people are all the members of the student

government, we can consider them to

constitute a population.

Page 8: Overview - Sam Houston State University

Example 7.3 continued

a. Calculate the population mean commuting time μ.

b. Take a sample of the following student

government members: Amber, Brandon, and

Chantal. Find the sample mean commuting

time x and the sampling error |x - μ|.

Table 7.4 Commuting times for the five members of the student government

Page 9: Overview - Sam Houston State University

Example 7.3 continued Solution

The mean commuting time of this population is

For Amber, Brandon, and Chantal, the sample mean commuting time is

The sampling error of this sample mean is

|x1 – | = |11.67 – 16|= 4.33 minutes.

10 20 5 30 1516 minutes

5

x

N

1

10 20 511.67 minutes

3

xx

N

Page 10: Overview - Sam Houston State University

Table 7.5 All possible samples of size 3 from population of student government members

Consider previous example and find all possible

samples of student government members,

the sample means, population mean,

and sampling errors (page 339):

Page 11: Overview - Sam Houston State University

Sampling distribution of the sample mean x

Collection of the sample means of all

possible samples of size n

Page 12: Overview - Sam Houston State University

Calculate the mean of the sample means as the average of the sample means:

Note: the mean of the sample means equals the population mean in this example.

Mean of the Sample Means

Page 13: Overview - Sam Houston State University

Fact 1

The mean of the sampling distribution of the sample mean x is the value of the population mean .

Denoted as

Read as “the mean of the sampling distribution of x is ”

x

Page 14: Overview - Sam Houston State University

For the previous example

Mean of the Sample Means

x

Page 15: Overview - Sam Houston State University

Standard Deviation of the Population

Page 16: Overview - Sam Houston State University

=3.5119

Standard Deviation of the Sample Means

Page 17: Overview - Sam Houston State University

Fact 2

The standard deviation of the sampling distribution of the sample mean x is

is the population standard deviation

n is the sample size where n is assumed to be very large (if n is not large, see the note on page 340)

/x

n

Page 18: Overview - Sam Houston State University

Previous example

x

Page 19: Overview - Sam Houston State University

Page 350

Example

Page 20: Overview - Sam Houston State University

Solutions

a)

Example

inches 68x

inches 95.010/inches 3/ nx

Page 21: Overview - Sam Houston State University

Solutions

b)

Example

inches 68x

inches 30.0100/inches 3/ nx

Page 22: Overview - Sam Houston State University

Solutions

c)

Example

inches 68x

inches 09.01000/inches 3/ nx

Page 23: Overview - Sam Houston State University

Sampling Distribution of the Sample Mean for a Normal Population

Fact 3

Is itself normal, regardless of sample size

Page 24: Overview - Sam Houston State University

Sampling Distribution of the Sample Mean for a Normal Population

Fact 4

Distributed as normal with mean:

And standard deviation:

x

nx /

Page 25: Overview - Sam Houston State University

Fact 5: Standardizing a Normal Sampling Distribution for Means

When the sampling distribution of

is normal, we may standardize to produce

the standard normal random variable Z as follows:

where is the population mean, is the

population standard deviation, and n

is the sample size.

/

x

x

x xZ

n

x

Page 26: Overview - Sam Houston State University

Page 350

Example

Page 27: Overview - Sam Houston State University

Solutions:

Example

000,50$x

1000$25/5000$/ nx

Page 28: Overview - Sam Houston State University

Solutions:

a) Z-score for sample mean of 52,000 is

Example

21000

5000052000

/ n

xZ

Page 29: Overview - Sam Houston State University

Solutions:

a)

Example

0.0228

9772.01

)2(1

)2()000,50$(

ZP

ZPxP

Look up in Table C

Page 30: Overview - Sam Houston State University

(b) calculator

Example

Page 31: Overview - Sam Houston State University

(c) calculator

Example

0013.0

000)00,50000,1-10^99,470normalcdf(

)000,47$(xP

Page 32: Overview - Sam Houston State University

(c) calculator

Example

Page 33: Overview - Sam Houston State University

(c) calculator

Example

0215.0

00)0,50000,1052000,5300normalcdf(

)000,53$000,52($ xP

Page 34: Overview - Sam Houston State University

Page 350

Example

Do part (a)

Page 35: Overview - Sam Houston State University

Solution:

a) $50,000 (note: for a normal distribution, the mean and the median are equal)

Example

Page 36: Overview - Sam Houston State University

Page 350

Example

Do part (b)

Page 37: Overview - Sam Houston State University

Solutions:

b) Find so that

for a normal distribution with

Example

95.0)( CXXP

CX

000,50$x

1000$25/5000$/ nx

Page 38: Overview - Sam Houston State University

First find so that

using the standard normal distribution. From Table C

we get that:

Example

95.0)( CZZP

CZ

655.1CZ

Page 39: Overview - Sam Houston State University

Use to get by using the z-score formula:

Example

CZ

CC

XZ

CX

1000$

000,50$655.1 CX

655,51$CX

Page 40: Overview - Sam Houston State University

Solutions (directly with calculator):

b)

Example

85.644,51$000)95,50000,1invNorm(0.CX

Page 41: Overview - Sam Houston State University

Page 350

Example

Do parts (c) and (d)

Page 42: Overview - Sam Houston State University

Solutions:

c) $48,355

d) $48,355 and $51,645

Example

15.355,48$000)05,50000,1invNorm(0.CX

Page 43: Overview - Sam Houston State University

Normal Probability Plots

A normal probability plot is a scatterplot of the estimated cumulative normal probabilities (expressed as percents) against the corresponding data values in a data set.

Page 44: Overview - Sam Houston State University

Normal Probability Plot

FIGURE 7.4 Normal probability plot of normal data.

Page 45: Overview - Sam Houston State University

Analyzing Normal Probability Plots

If the points in the normal probability plot either cluster around a straight line or nearly all fall within the curved bounds, then it is likely that the data set is normal.

Systematic deviations off the straight line are evidence against the claim that the data set is normal.

Page 46: Overview - Sam Houston State University

Normal Probability Plot

FIGURE 7.5 Normal probability plot of right-skewed data.

Page 47: Overview - Sam Houston State University

Summary

We can use sample statistics as point estimates of the unknown population parameters.

For each statistic, sampling error is the distance between the point estimate and its target parameter.

The sampling distribution of the sample mean for a given sample size n consists of the collection of the sample means of all possible samples of size n from the population.

x

Page 48: Overview - Sam Houston State University

Summary

The mean of the sampling distribution of the sample mean is the value of the population mean μ (Fact 1).

The standard deviation of the sampling

distribution of the sample mean is

, σ is the population standard

deviation (Fact 2).

The sampling distribution of the sample mean for a normal population is itself normal, regardless of sample size (Fact 3).

x

x

/x n

Page 49: Overview - Sam Houston State University

Summary

For a normal population, the sampling distribution of the sample mean is distributed as normal (μ, ), where μ is the population mean and σ is the population standard deviation (Fact 4).

Normal probability plots are used to assess the normality of a data set.

We can use Fact 4 to find probabilities and percentiles using sample means.

x

/ n

Page 50: Overview - Sam Houston State University

7.2 Central Limit Theorem for Means Objectives:

By the end of this section, I will be

able to…

1) Describe the sampling distribution of x for skewed and symmetric populations as the sample size increases.

2) Apply the Central Limit Theorem for Means to solve probability questions about the sample mean.

Page 51: Overview - Sam Houston State University

Main Idea

In this section we want to be able to describe the shape of the distribution of the sample means.

Page 52: Overview - Sam Houston State University

Symmetric Populations

For a symmetric distribution, at n=20 the sampling distribution of the mean is approximately normal.

Page 53: Overview - Sam Houston State University

Roll a fair die once. Make up a table that represents the probability distribution of X=number on the die. Also, plot the probability distribution in a bar chart.

Example

Page 54: Overview - Sam Houston State University

Distribution (table format) is:

x P(x)

1 0.1667

2 0.1667

3 0.1667

4 0.1667

5 0.1667

6 0.1667

Example

5.3)(xxP

Page 55: Overview - Sam Houston State University

FIGURE 7.11 Distribution of a single fair die roll is symmetric.

Page 56: Overview - Sam Houston State University

Roll a fair die one hundred times. Take random samples of size 10 from these 100 rolls. For each sample of size 10, calculate the sample mean. Plot the distribution of the means as a histogram.

Example

Page 57: Overview - Sam Houston State University

FIGURE 7.12 Sample means of size n = 10: already approaching normality.

Page 58: Overview - Sam Houston State University

Roll a fair die one hundred times. Take random samples of size 20 from these 100 rolls. For each sample of size 20, calculate the sample mean. Plot the distribution of the means as a histogram.

Example

Page 59: Overview - Sam Houston State University

FIGURE 7.13 Sample means of size n = 20: approximately normal.

Page 60: Overview - Sam Houston State University

FIGURE 7.14 Normal probability plot for n = 20: acceptable normality.

Page 61: Overview - Sam Houston State University

Skewed Populations

For a skewed population, sampling distribution of the mean becomes approximately normal as the sample size approaches 30.

Page 62: Overview - Sam Houston State University

FIGURE 7.10 Sampling distribution of x and normal probability plots for n = 10, 20, and 30.

¯

Page 63: Overview - Sam Houston State University

Central Limit Theorem for Means

Population with mean μ

Standard deviation σ

The sampling distribution of the sample mean x becomes approximately normal with mean μ and standard deviation as the sample size gets larger

Regardless of the shape of the population.

/ n

Page 64: Overview - Sam Houston State University

Rule of Thumb

We consider n ≥ 30 as large enough to apply the Central Limit Theorem for any population.

Page 65: Overview - Sam Houston State University

Three Cases for the Sampling Distribution of the Sample Mean x

Case 1

The population is normal.

Then the sampling distribution of x is normal (Fact 3 from 7.1).

Page 66: Overview - Sam Houston State University

Three Cases for the Sampling Distribution of the Sample Mean x continued

Case 2

The population is either non-normal or of unknown distribution and the sample size is at least 30.

Apply Central Limit Theorem for Means:

The sampling distribution of the sample mean is approximately normal

Page 67: Overview - Sam Houston State University

Three Cases for the Sampling Distribution of the Sample Mean x continued

Case 3

The population is either non-normal or of unknown distribution and the sample size is less than 30.

Insufficient information to conclude that the sampling distribution of the sample mean x is either normal or approximately normal

Page 68: Overview - Sam Houston State University

Page 362-363

Example

Page 69: Overview - Sam Houston State University

Solutions

6) Case 2

8) Case 3

10) Case 1

Example

Page 70: Overview - Sam Houston State University

Page 363

Example

Page 71: Overview - Sam Houston State University

Solutions

16(a)

16(b)

16(c) unknown

Example

000,60$x

500,2$16/000,10$/ nx

Page 72: Overview - Sam Houston State University

Page 363

Example

Page 73: Overview - Sam Houston State University

Solutions

20(a)

20(b)

20(c) approximately normal

Example

gallonper miles 50x

mpg 75.064/6/ nx

Page 74: Overview - Sam Houston State University

Page 363

Example

Page 75: Overview - Sam Houston State University

Solution:

first notice that it is possible to find the

probability since the systolic blood pressure readings are normally distributed so the distribution of the sample mean is also normal (case 1)

Example

80x

6.125/8/ nx

Page 76: Overview - Sam Houston State University

Solution:

Example

7887.0.6)78,82,80,1normalcdf()8278( xP

Page 77: Overview - Sam Houston State University

Page 363

Example

Page 78: Overview - Sam Houston State University

Solution

Not possible: since the pollen count distribution is not normally distributed and the sample size is smaller than 30, the sampling distribution of the mean of x is unknown.

Example

Page 79: Overview - Sam Houston State University

Page 363

Example

Page 80: Overview - Sam Houston State University

Solution:

first notice that it is possible to find the

probability since even though the pollen count distribution is not normal, the sample size is at least 30, so the distribution of the sample mean is also normal (case)

Example

8x

125.064/1/ nx

Page 81: Overview - Sam Houston State University

Solution (directly with calculator):

Find so that

Example

1.8)75,8,0.125invNorm(0.CX

75.0)( CXXPCX

Page 82: Overview - Sam Houston State University

Page 364

Example

Page 83: Overview - Sam Houston State University

Solution:

(a) yes- case 1 applies

Example

o

x 6.38

oo

x n 225/10/

0.2420

7580.01

)40(1)40( oo xPxP

Page 84: Overview - Sam Houston State University

Solution:

(b) case 2 does not apply since the sample

size is less than 30.

Example

Page 85: Overview - Sam Houston State University

Summary

In this section, we examine the behavior of the sample mean when the population is not normal.

The approximate normality of the sampling distribution of the sample mean kicks in much quicker when the original population is symmetric rather than skewed.

The Central Limit Theorem is one of the most important results in statistics and is stated as follows:

Page 86: Overview - Sam Houston State University

Summary

Given a population with mean μ and standard deviation σ, the sampling distribution of the sample mean x becomes approximately normal(μ, ) as the sample size gets larger, regardless of the shape of the population.

This approximation applies for smaller sample sizes when the original distribution is more symmetric.

/ n

Page 87: Overview - Sam Houston State University

7.3 Central Limit Theorem for Proportions Objectives:

By the end of this section, I will be

able to…

1) Explain the sampling distribution of the sample proportion p.

2) Describe the sampling distribution of the sample proportion p for extreme and moderate values of p.

3) Apply the Central Limit Theorem for Proportions to solve probability questions about the sample proportion.

ˆ

ˆ

Page 88: Overview - Sam Houston State University

Sample Proportion p

Suppose each individual in a population either has or does not have a particular characteristic

For sample of size n sample proportion p (read “p-hat”) is

x represents the number of individuals in the sample that have the particular characteristic

Use p to estimate the unknown value of the population proportion p

ˆx

pn

ˆ

ˆ

ˆ

Page 89: Overview - Sam Houston State University

Example

Page 367, Example 7.14

Page 90: Overview - Sam Houston State University

Table 7.6 All possible samples of size 3 from population of student government members

Page 91: Overview - Sam Houston State University

Example 7.15 - Mean of sample proportions

Calculate the mean of the ten sample

proportions p from Table 7.6 page 367.

ˆ

Page 92: Overview - Sam Houston State University

Example 7.15 continued

Solution

Mean is

p equals the population proportion of

females for the original population,

p = 3/5 = 0.6.

2 1 2 2 3 2 1 2 1 2

3 3 3 3 3 3 3 3 3 30.6

10

ˆ

Page 93: Overview - Sam Houston State University

Fact 6: Mean of the Sampling

Distribution of the Sample Proportion p

The value of the population proportion p

Denoted as

where

Read as “the mean of the sampling distribution of p is p”

ˆ

ˆ

pp̂

Page 94: Overview - Sam Houston State University

Fact 7 - Standard Deviation of the Sampling Distribution of the Sample Proportion p

where

p is the population proportion

n is the sample size

ˆ

n

pq

n

ppp

)1(ˆ

pq 1

Page 95: Overview - Sam Houston State University

Example

Page 379

Do 8 and 10 parts (a) and (b)

Page 96: Overview - Sam Houston State University

Solutions

8(a)

8(b)

Example

5.0ˆ pp

2236.05

)5.0()5.0(ˆ

n

pqp

5.01 pq

Page 97: Overview - Sam Houston State University

Solutions

10(a)

10(b)

Example

01.0ˆ pp

0044.0500

)99.0()01.0(ˆ

n

pqp

99.01 pq

Page 98: Overview - Sam Houston State University

Fact 8 - Conditions for Approximate Normality for the Sampling Distribution of the Sample Proportion p

May be considered approximately normal only if both the following conditions hold:

(1) np ≥ 5 and (2) n(1 - p) ≥ 5

ˆ

Page 99: Overview - Sam Houston State University

Example

Page 379

Do part (c)

Page 100: Overview - Sam Houston State University

Solutions

8(c)

Unknown (we cannot conclude that the sampling distribution of the proportion is normal in this case)

Example

55.2)5.0(5np

55.2)5.0(5)1( nqpn

Page 101: Overview - Sam Houston State University

Solutions

10(c)

We conclude that the sampling distribution of the proportion is approximately normal in this case

Example

5)01.0(500np

5495)99.0(500)1( nqpn

Page 102: Overview - Sam Houston State University

Fact 8 continued

Given a value for p, the minimum sample size required to produce approximate normality in the sampling distribution of the proportion can be found by solving each of these for n:

and choosing the next largest integer value that is at least as large as both of these values for n.

5 and 5 nqnp

Page 103: Overview - Sam Houston State University

Example

Page 379

Do problem 16

Page 104: Overview - Sam Houston State University

Solutions

16)

The minimum sample size is 100

Example

100 5)05.0( nnnp

26.5 5)95.0( nnnq

Page 105: Overview - Sam Houston State University

Fact 9 - Standardizing a Normal Sampling Distribution for Proportions

When the sampling distribution of p is normal (or approximately normal), we

can standardize to produce the standard normal Z:

where p is the population proportion of successes and n is the sample size.

ˆ

ˆ

ˆ ˆ

1

p

p

p p pZ

p p

n

Page 106: Overview - Sam Houston State University

Example

Page 379

Do problem 30

Page 107: Overview - Sam Houston State University

Solutions

30) First check that we can assume a normal distribution:

We cannot conclude that the sampling distribution of the proportion is normal in this case and we cannot find the probability.

Example

55.2)5.0(5np

55.2)5.0(5)1( nqpn

Page 108: Overview - Sam Houston State University

Example

Page 379

Do problem 32

Page 109: Overview - Sam Houston State University

Solutions

First check that we can assume a

normal distribution for :

Example

5)01.0(500np

5495)99.0(500)1( nqpn

Page 110: Overview - Sam Houston State University

Solutions

For normal distribution use

mean:

Standard deviation:

Example

01.0ˆ pp

0044.0500

)99.0()01.0(ˆ

n

pqp

Page 111: Overview - Sam Houston State University

Solutions

z-score method (Table)

Example

23.00044.0

01.0011.0ˆ

ˆ

ˆ

p

ppZ

Page 112: Overview - Sam Houston State University

Solutions

using standard normal distribution and table lookup:

Example

0.4090

5910.01

)23.0(1)23.0( ZPZP

From Table T-10

Page 113: Overview - Sam Houston State University

Solutions

It is more accurate to compute Z as

so that:

Example

0.4129

5871.01

)22.0(1)22.0( ZPZP

22.0500/)99.0()01.0(

01.0011.0ˆ

ˆ

ˆ

p

ppZ

Page 114: Overview - Sam Houston State University

Example

Page 379

Do problem 36

Page 115: Overview - Sam Houston State University

Solutions

First check that we can assume a normal distribution for :

Yes- both are greater than 5

Example

200)5.0(400np

200)50.0(400)1( nqpn

Page 116: Overview - Sam Houston State University

Solutions

Find the value:

So that

is the 90th percentile of values of

Example

90.0)ˆˆ( cppP

cp̂ p̂

cp̂

Page 117: Overview - Sam Houston State University

Solutions

For normal distribution use

mean:

standard deviation:

Example

5.0ˆ pp

025.0400

)5.0()5.0(ˆ

n

pqp

Page 118: Overview - Sam Houston State University

Solutions

Using calculator gives:

Example

532.025)90,0.5,0.0invNorm(0.ˆCp

Page 119: Overview - Sam Houston State University

Central Limit Theorem for Proportions

the sampling distribution of the sample proportion follows an approximately normal distribution with

mean

standard deviation

when the following conditions are satisfied:

np ≥ 5 and n(1 - p) ≥ 5

pp̂

n

pq

n

ppp

)1(ˆ

Page 120: Overview - Sam Houston State University

Example

Page 379

Page 121: Overview - Sam Houston State University

Solutions

(a) Take p=0.87 and choose smallest integer value of n so that both of the following are true:

Answer:

Example

5)87.0(nnp

5)13.0()1( nnqpn

39*nn

Page 122: Overview - Sam Houston State University

Solutions

(b) Check that we can assume a normal distribution for with a sample size of n=39

Yes- both larger than 5

Example

93.33)87.0(39np

07.5)13.0(39)1( nqpn

Page 123: Overview - Sam Houston State University

Solutions

(c) The Central Limit Theorem tells us that the sampling distribution of

is approximately normal when

n=39

Example

Page 124: Overview - Sam Houston State University

Use n=50

(d) The Central Limit Theorem gives that the sampling distribution of

is approximately normal with

Example

87.0ˆ pp

0476.050

)13.0(87.0)1(ˆ

n

ppp

Page 125: Overview - Sam Houston State University

(d) For n=50, find

Example

90.0ˆ50

45ˆ pPpP

0.264376),0.87,0.040.90,10normalcdf( 99

Page 126: Overview - Sam Houston State University

Summary

The sampling distribution of the sample proportion p for a given sample size n consists of the collection of the sample proportions of all possible samples of size n from the population.

The approximate normality of the sampling distribution of the sample proportion kicks in much quicker when the population proportion is moderate rather than extreme.

ˆ

Page 127: Overview - Sam Houston State University

Summary

According to the Central Limit Theorem for Proportions, the sampling distribution of the sample proportion p follows an approximately normal distribution with mean μp = p and standard deviation

the following conditions are satisfied: (1) np ≥ 5 and (2) n(1 - p) ≥ 5.

We can use Fact 9 to find probabilities and percentiles for sample proportions.

1 /p

p p n

ˆ

ˆ