Sampling Distributions: Chapter 9

Post on 20-Dec-2015


TRANSCRIPT

Page 1: 1 Sampling Distributions Chapter 9. 2 9.1 Introduction  In real life calculating parameters of populations is prohibitive because populations are very

Sampling Distributions

Chapter 9

Page 2

9.1 Introduction

In real life, calculating the parameters of a population is usually prohibitive because populations are very large.

Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.

The sampling distribution of the statistic is the tool that tells us how close the statistic is to the parameter.

Page 3

9.2 Sampling Distribution of the Mean

An example: A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.

The probability distribution of X is

x       1     2     3     4     5     6
p(x)    1/6   1/6   1/6   1/6   1/6   1/6

\( E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \dots + 6(1/6) = 3.5 \)

\( V(X) = (1 - 3.5)^2(1/6) + (2 - 3.5)^2(1/6) + \dots + (6 - 3.5)^2(1/6) = 2.92 \)
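The two sums above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative, and exact fractions are used so that no rounding enters until the final print.

```python
from fractions import Fraction

# Fair die: each face 1..6 has probability 1/6.
faces = range(1, 7)
p = Fraction(1, 6)

# E(X) = sum of x * p(x)
mean_x = sum(x * p for x in faces)

# V(X) = sum of (x - E(X))^2 * p(x)
var_x = sum((x - mean_x) ** 2 * p for x in faces)

print(mean_x)                  # 7/2
print(var_x)                   # 35/12
print(round(float(var_x), 2))  # 2.92
```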

Page 4

Throwing a die twice: the sample mean

Suppose we want to estimate \( \mu \) from the mean \( \bar{x} \) of a sample of size n = 2.

What is the distribution of \( \bar{x} \)?

Page 5

Throwing a die twice: the sample mean

No.  Sample  Mean     No.  Sample  Mean     No.  Sample  Mean
1    1,1     1        13   3,1     2        25   5,1     3
2    1,2     1.5      14   3,2     2.5      26   5,2     3.5
3    1,3     2        15   3,3     3        27   5,3     4
4    1,4     2.5      16   3,4     3.5      28   5,4     4.5
5    1,5     3        17   3,5     4        29   5,5     5
6    1,6     3.5      18   3,6     4.5      30   5,6     5.5
7    2,1     1.5      19   4,1     2.5      31   6,1     3.5
8    2,2     2        20   4,2     3        32   6,2     4
9    2,3     2.5      21   4,3     3.5      33   6,3     4.5
10   2,4     3        22   4,4     4        34   6,4     5
11   2,5     3.5      23   4,5     4.5      35   6,5     5.5
12   2,6     4        24   4,6     5        36   6,6     6

Page 6

The distribution of \( \bar{x} \) when n = 2

Sample mean   1.0   1.5   2.0   2.5   3.0   3.5   4.0   4.5   5.0   5.5   6.0
Probability   1/36  2/36  3/36  4/36  5/36  6/36  5/36  4/36  3/36  2/36  1/36

\( E(\bar{x}) = 1.0(1/36) + 1.5(2/36) + \dots = 3.5 \)

\( V(\bar{x}) = (1.0 - 3.5)^2(1/36) + (1.5 - 3.5)^2(2/36) + \dots = 1.46 \)

Note: \( \mu_{\bar{x}} = \mu_x \) and \( \sigma^2_{\bar{x}} = \sigma^2_x / 2 \).
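The distribution of the sample mean for n = 2 can be rebuilt by listing all 36 equally likely outcomes of two throws. The sketch below does this with exact fractions; the names are illustrative only.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# All 36 ordered outcomes of two throws, reduced to their sample mean.
counts = Counter(Fraction(a + b, 2) for a, b in product(range(1, 7), repeat=2))

# Probability of each possible sample mean.
dist = {m: Fraction(c, 36) for m, c in sorted(counts.items())}

mean_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mean_xbar) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), p)

print(mean_xbar)  # 7/2, the same as the population mean
print(var_xbar)   # 35/24, half the population variance of 35/12
```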

Page 7

Sampling Distribution of the Mean

n = 5: \( \mu_{\bar{x}} = 3.5 \), \( \sigma^2_{\bar{x}} = 0.5833 \; (= \sigma^2_x / 5) \)

n = 10: \( \mu_{\bar{x}} = 3.5 \), \( \sigma^2_{\bar{x}} = 0.2917 \; (= \sigma^2_x / 10) \)

n = 25: \( \mu_{\bar{x}} = 3.5 \), \( \sigma^2_{\bar{x}} = 0.1167 \; (= \sigma^2_x / 25) \)

Page 8

Sampling Distribution of the Mean

Notice that \( \sigma^2_{\bar{x}} \) is smaller than \( \sigma^2_x \). The larger the sample size, the smaller \( \sigma^2_{\bar{x}} \). Therefore, \( \bar{x} \) tends to fall closer to \( \mu \) as the sample size increases.

Page 9

Sampling Distribution of the Mean

Demonstration: The variance of the sample mean is smaller than the variance of the population.

Population: 1, 2, 3

Let us take samples of two observations. The sample means are 1.5, 2 and 2.5.

Compare the variability of the population to the variability of the sample mean.

Page 10

Sampling Distribution of the Mean

Also,

Expected value of the population = (1 + 2 + 3)/3 = 2

Expected value of the sample mean = (1.5 + 2 + 2.5)/3 = 2
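The demonstration can be made concrete with a short sketch. It assumes the three samples of two observations are the distinct pairs drawn without replacement, which matches the sample means 1.5, 2 and 2.5 shown on the slides; the helper names are illustrative.

```python
from fractions import Fraction
from itertools import combinations

population = [1, 2, 3]

def mean(values):
    """Arithmetic mean as an exact fraction."""
    return sum(values, Fraction(0)) / len(values)

def variance(values):
    """Variance of equally likely values (divides by the count)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Assumption: each pair of distinct values is one equally likely sample.
sample_means = [mean(pair) for pair in combinations(population, 2)]

print(mean(population), mean(sample_means))          # both equal 2
print(variance(population), variance(sample_means))  # 2/3 versus 1/6
```

Both expected values equal 2, while the sample means are spread less widely than the population values.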

Page 11

The Central Limit Theorem

If a random sample is drawn from any population (with finite variance), the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size.

The larger the sample size, the more closely the sampling distribution of \( \bar{x} \) will resemble a normal distribution.
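One way to see the theorem at work is to reuse the die example: compute the exact distribution of the mean of n throws and measure how far its cumulative distribution lies from that of a normal distribution with the same mean and variance. The sketch below is an illustration under that setup, not a proof; the function names are hypothetical.

```python
from fractions import Fraction
from statistics import NormalDist

def mean_distribution(n, faces=6):
    """Exact distribution of the mean of n fair-die throws, as {mean: probability}."""
    sums = {0: Fraction(1)}
    for _ in range(n):
        nxt = {}
        for s, p in sums.items():
            for face in range(1, faces + 1):
                nxt[s + face] = nxt.get(s + face, Fraction(0)) + p / faces
        sums = nxt
    return {Fraction(s, n): p for s, p in sums.items()}

def largest_cdf_gap(n):
    """Largest gap, over the possible sample means, between the exact CDF
    and the CDF of a normal with mean 3.5 and variance (35/12)/n."""
    dist = mean_distribution(n)
    normal = NormalDist(3.5, (35 / 12 / n) ** 0.5)
    cum, gap = Fraction(0), 0.0
    for value in sorted(dist):
        cum += dist[value]
        gap = max(gap, abs(float(cum) - normal.cdf(float(value))))
    return gap

for n in (2, 5, 10, 30):
    print(n, round(largest_cdf_gap(n), 4))
```

The printed gap shrinks as n grows, which is the behaviour the theorem describes.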

Page 12

Sampling Distribution of the Sample Mean

1. \( \mu_{\bar{x}} = \mu_x \)

2. \( \sigma^2_{\bar{x}} = \sigma^2_x / n \)

3. If x is normal, \( \bar{x} \) is normal. If x is nonnormal, \( \bar{x} \) is approximately normally distributed for a sufficiently large sample size.

Page 13

Sampling Distribution of the Sample Mean

Example 9.1

The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of 0.3 ounces.

Find the probability that a bottle bought by a customer will contain more than 32 ounces.

Solution

The random variable X is the amount of soda in a bottle.

\( P(x > 32) = P\left( \frac{x - \mu}{\sigma_x} > \frac{32 - 32.2}{0.3} \right) = P(z > -0.67) = 0.7486 \)

[Figure: normal curve centered at \( \mu = 32.2 \); the area to the right of x = 32 is 0.7486.]

Page 14

Sampling Distribution of the Sample Mean

Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.

Solution

Define the random variable as the mean amount of soda per bottle.

\( P(\bar{x} > 32) = P\left( \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{0.3/\sqrt{4}} \right) = P(z > -1.33) = 0.9082 \)

[Figure: two normal curves. For a single bottle, centered at \( \mu = 32.2 \), the area to the right of x = 32 is 0.7486. For the carton mean, centered at \( \mu_{\bar{x}} = 32.2 \), the area to the right of \( \bar{x} = 32 \) is 0.9082.]
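Both parts of Example 9.1 can be reproduced with Python's standard-library `statistics.NormalDist`. This is a sketch with illustrative variable names. Because it does not round z to two decimals, its results differ slightly in the third decimal from the table-based 0.7486 and 0.9082.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma = 32.2, 0.3  # ounces, from Example 9.1

# One bottle: X is normal with mean mu and standard deviation sigma.
bottle = NormalDist(mu, sigma)
p_bottle = 1 - bottle.cdf(32)

# Carton of four: the sample mean has standard deviation sigma / sqrt(n).
n = 4
carton = NormalDist(mu, sigma / sqrt(n))
p_carton = 1 - carton.cdf(32)

print(round(p_bottle, 4))
print(round(p_carton, 4))
```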

Page 15

Sampling Distribution of the Sample Mean

Example 9.2

Dean's claim: The average weekly income of B.B.A. graduates one year after graduation is $600.

Suppose the distribution of weekly income has a standard deviation of $100. What is the probability that 25 randomly selected graduates have an average weekly income of less than $550?

Solution

\( P(\bar{x} < 550) = P\left( \frac{\bar{x} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} < \frac{550 - 600}{100/\sqrt{25}} \right) = P(z < -2.5) = 0.0062 \)
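The calculation in Example 9.2 can be sketched as follows, assuming (as the solution does) that the mean of 25 incomes is approximately normal. Variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25  # claimed mean, standard deviation, sample size

# Standard error of the sample mean: sigma / sqrt(n) = 20.
std_error = sigma / sqrt(n)

# Probability that the sample mean falls below 550 if the claim is true.
z = (550 - mu) / std_error
p_below = NormalDist().cdf(z)

print(z)                  # -2.5
print(round(p_below, 4))  # 0.0062
```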

Page 16

Sampling Distribution of the Sample Mean

Example 9.2 (continued)

If a random sample of 25 graduates actually had an average weekly income of $550, what would you conclude about the validity of the claim that the average weekly income is $600?

Solution

With \( \mu = 600 \) the probability of observing a sample mean as low as 550 is very small (0.0062). The claim that the mean weekly income is $600 is probably unjustified.

It will be more reasonable to assume that \( \mu \) is smaller than $600, because then a sample mean of $550 becomes more probable.

Page 17

Using Sampling Distributions for Inference

To make inferences about population parameters we use sampling distributions (as in Example 9.2).

The symmetry of the normal distribution, along with the sampling distribution of the mean, leads to:

\( P(-1.96 \le z \le 1.96) = 0.95 \), or \( P\left( -1.96 \le \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \le 1.96 \right) = 0.95 \)

This can be written as

\( P\left( -1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} - \mu \le 1.96 \frac{\sigma}{\sqrt{n}} \right) = 0.95 \)

which becomes

\( P\left( \mu - 1.96 \frac{\sigma}{\sqrt{n}} \le \bar{x} \le \mu + 1.96 \frac{\sigma}{\sqrt{n}} \right) = 0.95 \)

Here \( -z_{.025} = -1.96 \) and \( z_{.025} = 1.96 \).

Page 18

Using Sampling Distributions for Inference

[Figure: two curves, each with central area .95 and tail areas .025. Left: standard normal distribution of Z, with cut-offs at -1.96 and 1.96. Right: normal distribution of \( \bar{x} \), with cut-offs at \( \mu - 1.96\,\sigma/\sqrt{n} \) and \( \mu + 1.96\,\sigma/\sqrt{n} \).]

\( P\left( 600 - 1.96 \frac{100}{\sqrt{25}} \le \bar{x} \le 600 + 1.96 \frac{100}{\sqrt{25}} \right) = 0.95 \)

Page 19

Using Sampling Distributions for Inference

\( P\left( 600 - 1.96 \frac{100}{\sqrt{25}} \le \bar{x} \le 600 + 1.96 \frac{100}{\sqrt{25}} \right) = 0.95 \)

which reduces to \( P(560.8 \le \bar{x} \le 639.2) = 0.95 \).

Conclusion

There is a 95% chance that the sample mean falls within the interval [560.8, 639.2] if the population mean is 600.

Since the sample mean was 550, the population mean is probably not 600.
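The interval can be recomputed in a few lines. The sketch below takes the 1.96 cut-off from `NormalDist().inv_cdf(0.975)` rather than from a table; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 600, 100, 25
observed_mean = 550

# z_.025: the point with 2.5% of the standard normal area to its right.
z = NormalDist().inv_cdf(0.975)

half_width = z * sigma / sqrt(n)
lower, upper = mu - half_width, mu + half_width

print(round(z, 2))                        # 1.96
print(round(lower, 1), round(upper, 1))   # 560.8 639.2
print(lower <= observed_mean <= upper)    # False: 550 lies outside the interval
```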

Page 20

9.3 Sampling Distribution of a Proportion

The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.

To estimate the population proportion p we use the sample proportion.

The estimate of p is \( \hat{p} = X/n \), where X is the number of successes.

Page 21

9.3 Sampling Distribution of a Proportion

Since X is binomial, probabilities about \( \hat{p} \) can be calculated from the binomial distribution.

Yet, for inference about \( \hat{p} \) we prefer to use the normal approximation to the binomial.

Page 22

Normal approximation to the Binomial

The normal approximation to the binomial works best when the number of experiments (sample size) is large, and the probability of success, p, is close to 0.5.

For the approximation to provide good results two conditions should be met:

\( np \ge 5 \); \( n(1 - p) \ge 5 \)

Page 23

Normal approximation to the Binomial

Example: Approximate the binomial probability P(x = 10) when n = 20 and p = 0.5.

The parameters of the normal distribution used to approximate the binomial are:

\( \mu = np \); \( \sigma^2 = np(1 - p) \)

Page 24

Normal approximation to the Binomial

Let us build a normal distribution to approximate the binomial P(X = 10).

\( \mu = np = 20(0.5) = 10 \); \( \sigma^2 = np(1 - p) = 20(0.5)(1 - 0.5) = 5 \); \( \sigma = 5^{1/2} = 2.24 \)

\( P(X_{\text{Binomial}} = 10) = 0.176 \)

The approximation:

\( P(X_{\text{Binomial}} = 10) \approx P(9.5 < Y_{\text{Normal}} < 10.5) = P\left( \frac{9.5 - 10}{2.24} < Z < \frac{10.5 - 10}{2.24} \right) = 0.1742 \)
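The comparison between the exact binomial probability and its normal approximation can be sketched as below. The names are illustrative. Because z is not rounded to two decimals here, the approximation comes out slightly different from the table-based 0.1742.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p, k = 20, 0.5, 10

# Exact binomial probability P(X = 10).
exact = comb(n, k) * p**k * (1 - p) ** (n - k)

# Normal with the same mean and variance as the binomial.
approx_dist = NormalDist(n * p, sqrt(n * p * (1 - p)))

# Continuity correction: P(X = 10) is approximated by P(9.5 < Y < 10.5).
approx = approx_dist.cdf(k + 0.5) - approx_dist.cdf(k - 0.5)

print(round(exact, 3))   # 0.176
print(round(approx, 4))
```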

Page 25

Normal approximation to the Binomial

More examples of normal approximation to the binomial

\( P(X \le 4) \approx P(Y < 4.5) \)

\( P(X \ge 14) \approx P(Y > 13.5) \)

Page 26

Approximate Sampling Distribution of a Sample Proportion

From the laws of expected value and variance, it can be shown that \( E(\hat{p}) = p \) and \( V(\hat{p}) = p(1 - p)/n \).

If both \( np > 5 \) and \( n(1 - p) > 5 \), then

\( z = \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} \)

is approximately standard normally distributed.

Page 27

Example 9.3

A state representative received 52% of the votes in the last election.

One year later the representative wanted to study his popularity.

If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?

Page 28

Example 9.3: Solution

The number of respondents who prefer the representative is binomial with n = 300 and p = 0.52. Thus, np = 300(0.52) = 156 and n(1 - p) = 300(1 - 0.52) = 144 (both greater than 5).

\( P(\hat{p} > 0.50) = P\left( \frac{\hat{p} - p}{\sqrt{p(1 - p)/n}} > \frac{0.50 - 0.52}{\sqrt{(0.52)(1 - 0.52)/300}} \right) = P(z > -0.69) = 0.7549 \)
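Example 9.3 can be checked numerically with the normal approximation for the sample proportion. This is a sketch with illustrative names. It does not round z, so the result differs slightly from the table-based 0.7549.

```python
from math import sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Conditions for the normal approximation.
conditions_ok = n * p > 5 and n * (1 - p) > 5

# Standard error of the sample proportion: sqrt(p(1 - p)/n).
std_error = sqrt(p * (1 - p) / n)

# P(p_hat > 0.50) when the true proportion is still 0.52.
z = (0.50 - p) / std_error
prob_majority = 1 - NormalDist().cdf(z)

print(conditions_ok)
print(round(z, 2))
print(round(prob_majority, 4))
```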

Page 29

9.4 Sampling Distribution of the Difference Between Two Means

Independent samples are drawn from each of two normal populations.

We're interested in the sampling distribution of the difference between the two sample means, \( \bar{x}_1 - \bar{x}_2 \).

Page 30

Sampling Distribution of the Difference Between Two Means

The distribution of \( \bar{x}_1 - \bar{x}_2 \) is normal if the two samples are independent, and the parent populations are normally distributed.

If the two populations are not both normally distributed, but the sample sizes are 30 or more, the distribution of \( \bar{x}_1 - \bar{x}_2 \) is approximately normal.

Page 31

Sampling Distribution of the Difference Between Two Means

Applying the laws of expected value and variance we have:

\( E(\bar{x}_1 - \bar{x}_2) = E(\bar{x}_1) - E(\bar{x}_2) = \mu_1 - \mu_2 \)

\( V(\bar{x}_1 - \bar{x}_2) = V(\bar{x}_1) + V(\bar{x}_2) = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \)

We can define:

\( Z = \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} \)

Page 32

Sampling Distribution of the Difference Between Two Means

Example 9.4

The mean starting salaries of MBA students from two universities (WLU and UWO) are $62,000 (stand. dev. = $14,500) and $60,000 (stand. dev. = $18,300).

What is the probability that a sample mean of WLU students will exceed the sample mean of UWO students? (\( n_{WLU} = 50 \); \( n_{UWO} = 60 \))

Page 33

Sampling Distribution of the Difference Between Two Means

Example 9.4: Solution

We need to determine \( P(\bar{x}_1 - \bar{x}_2 > 0) \).

\( \mu_1 - \mu_2 = 62{,}000 - 60{,}000 = \$2{,}000 \)

\( \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sqrt{\frac{14{,}500^2}{50} + \frac{18{,}300^2}{60}} = \$3{,}128 \)

\( P(\bar{x}_1 - \bar{x}_2 > 0) = P\left( \frac{(\bar{x}_1 - \bar{x}_2) - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} > \frac{0 - 2000}{3128} \right) = P(z > -0.64) = 0.5 + 0.2389 = 0.7389 \)
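The steps of Example 9.4 can be sketched in Python as follows. Variable names are illustrative, and the result differs slightly from the table-based 0.7389 because z is not rounded.

```python
from math import sqrt
from statistics import NormalDist

# WLU (population 1) and UWO (population 2), from Example 9.4.
mu1, sigma1, n1 = 62_000, 14_500, 50
mu2, sigma2, n2 = 60_000, 18_300, 60

# Mean and standard deviation of the difference between the sample means.
mean_diff = mu1 - mu2
std_diff = sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# P(x1_bar - x2_bar > 0)
prob_wlu_higher = 1 - NormalDist(mean_diff, std_diff).cdf(0)

print(mean_diff)                  # 2000
print(round(std_diff))            # 3128
print(round(prob_wlu_higher, 4))
```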