SAMPLING DISTRIBUTION

Upload: bathsheba-anderson

Post on 24-Dec-2015


Introduction

• In real life, calculating the parameters of a population is usually impossible because populations are very large.

• Rather than investigating the whole population, we take a sample, calculate a statistic related to the parameter of interest, and make an inference.


STATISTIC

• Let $X_1, X_2, \ldots, X_n$ be a random sample (r.s.) of size n from a population and let $T(x_1, x_2, \ldots, x_n)$ be a function which does not depend on any unknown parameters. Then the random variable (r.v.) or random vector $Y = T(X_1, X_2, \ldots, X_n)$ is called a statistic.


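STATISTIC

• The sample mean is the arithmetic average of the values in a r.s.:

$$\bar{X} = \frac{X_1 + X_2 + \cdots + X_n}{n} = \frac{1}{n}\sum_{i=1}^{n} X_i$$

• The sample variance is the statistic defined by

$$S^2 = \frac{1}{n-1}\sum_{i=1}^{n} \left(X_i - \bar{X}\right)^2$$

• The sample standard deviation is the statistic defined by $S = \sqrt{S^2}$.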
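The three statistics above can be sketched in a few lines of Python. The function name `sample_stats` and the data values are hypothetical and chosen only for illustration; the divisor n − 1 follows the definition of the sample variance given above.

```python
import math
import statistics


def sample_stats(values):
    """Return (sample mean, sample variance, sample standard deviation)."""
    n = len(values)
    mean = sum(values) / n
    # Sample variance divides by n - 1, as in the definition of S^2.
    var = sum((x - mean) ** 2 for x in values) / (n - 1)
    return mean, var, math.sqrt(var)


data = [2.0, 4.0, 4.0, 5.0, 7.0]  # hypothetical observations
mean, var, sd = sample_stats(data)
print(mean, var, sd)

# Cross-check against the standard library, which also uses n - 1.
print(statistics.variance(data))
```

Each call on a new sample gives different values, which is why these quantities are random variables with their own distributions.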


SAMPLING DISTRIBUTION

• A statistic is also a random variable. Its distribution depends on the distribution of the random sample and the form of the function Y=T(X1, X2,…,Xn). The probability distribution of a statistic Y is called the sampling distribution of Y.


Sampling Distribution of the Mean

• An example

– A die is thrown infinitely many times. Let X represent the number of spots showing on any throw.

– The probability distribution of X is

x      1     2     3     4     5     6
p(x)   1/6   1/6   1/6   1/6   1/6   1/6

$$E(X) = 1(1/6) + 2(1/6) + 3(1/6) + \cdots + 6(1/6) = 3.5$$

$$V(X) = (1-3.5)^2(1/6) + (2-3.5)^2(1/6) + \cdots + (6-3.5)^2(1/6) = 2.92$$
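As a quick check of the two sums above, the following sketch computes the mean and variance of a fair die with exact fractions. The variable names are illustrative only.

```python
from fractions import Fraction

faces = range(1, 7)
p = Fraction(1, 6)  # each face of a fair die is equally likely

# E(X) = sum of x * p(x); V(X) = sum of (x - E(X))^2 * p(x)
mean_x = sum(x * p for x in faces)
var_x = sum((x - mean_x) ** 2 * p for x in faces)

print(mean_x, var_x, round(float(var_x), 2))
```

The exact variance is 35/12, which rounds to the 2.92 quoted above.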


Throwing a die twice – sample mean

• Suppose we want to estimate the mean of a population, $\mu$, from the mean of a sample, $\bar{X}$, of size n = 2.

• What is the distribution of $\bar{X}$?


Throwing a die twice – sample mean

No.  Sample  Mean     No.  Sample  Mean     No.  Sample  Mean
 1   1,1     1        13   3,1     2        25   5,1     3
 2   1,2     1.5      14   3,2     2.5      26   5,2     3.5
 3   1,3     2        15   3,3     3        27   5,3     4
 4   1,4     2.5      16   3,4     3.5      28   5,4     4.5
 5   1,5     3        17   3,5     4        29   5,5     5
 6   1,6     3.5      18   3,6     4.5      30   5,6     5.5
 7   2,1     1.5      19   4,1     2.5      31   6,1     3.5
 8   2,2     2        20   4,2     3        32   6,2     4
 9   2,3     2.5      21   4,3     3.5      33   6,3     4.5
10   2,4     3        22   4,4     4        34   6,4     5
11   2,5     3.5      23   4,5     4.5      35   6,5     5.5
12   2,6     4        24   4,6     5        36   6,6     6


The distribution of $\bar{X}$ when n = 2

[Figure: histogram of the sampling distribution of $\bar{x}$ for n = 2. Horizontal axis: $\bar{x}$ = 1, 1.5, 2.0, …, 6.0. Vertical axis: probability, from 1/36 to 6/36.]

$$E(\bar{X}) = 1.0(1/36) + 1.5(2/36) + \cdots = 3.5$$

$$V(\bar{X}) = (1.0-3.5)^2(1/36) + (1.5-3.5)^2(2/36) + \cdots = 1.46$$

Note: $\mu_{\bar{x}} = \mu_x$ and $\sigma_{\bar{x}}^2 = \sigma_x^2 / 2$.
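The sampling distribution of the mean of two throws can be reproduced by enumerating all 36 equally likely outcomes. This is a minimal sketch; the helper name `mean_distribution` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def mean_distribution(n_throws):
    """Exact distribution of the sample mean of n_throws fair-die throws."""
    outcomes = product(range(1, 7), repeat=n_throws)
    counts = Counter(Fraction(sum(t), n_throws) for t in outcomes)
    total = 6 ** n_throws
    return {m: Fraction(c, total) for m, c in sorted(counts.items())}


dist = mean_distribution(2)
mean_xbar = sum(m * p for m, p in dist.items())
var_xbar = sum((m - mean_xbar) ** 2 * p for m, p in dist.items())

for m, p in dist.items():
    print(float(m), p)
print(mean_xbar, var_xbar, round(float(var_xbar), 2))
```

The variance comes out as 35/24, half the single-throw variance of 35/12, in line with the note above.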


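Sampling Distribution of the Mean

[Figure: three panels showing the sampling distribution of $\bar{x}$ for n = 5, n = 10 and n = 25.]

n = 5: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \sigma_x^2 / 5 = 0.5833$

n = 10: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \sigma_x^2 / 10 = 0.2917$

n = 25: $\mu_{\bar{x}} = 3.5$, $\sigma_{\bar{x}}^2 = \sigma_x^2 / 25 = 0.1167$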


Notice that $\sigma_{\bar{x}}^2$ is smaller than $\sigma_x^2$. The larger the sample size, the smaller $\sigma_{\bar{x}}^2$. Therefore, $\bar{x}$ tends to fall closer to $\mu$ as the sample size increases.
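The shrinking variance of the sample mean can be illustrated with a small simulation of die throws. This is a sketch under stated assumptions: the function name, the number of replications and the seed are arbitrary choices, and simulated values only approximate the exact ones.

```python
import random
import statistics

POP_VAR = 35 / 12  # variance of a single fair-die throw


def simulate_mean_variance(n, reps=10000, seed=0):
    """Estimate V(x-bar) by simulating reps samples of n die throws."""
    rng = random.Random(seed)
    means = [sum(rng.randint(1, 6) for _ in range(n)) / n for _ in range(reps)]
    return statistics.variance(means)


for n in (5, 10, 25):
    simulated = simulate_mean_variance(n)
    exact = POP_VAR / n
    print(n, round(simulated, 4), round(exact, 4))
```

The exact column reproduces 0.5833, 0.2917 and 0.1167, and the simulated column should land close to it.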


SAMPLING FROM THE NORMAL DISTRIBUTION

Properties of the Sample Mean and Sample Variance

• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,

a) $\bar{X}$ and $S^2$ are independent r.v.s.

b) $\bar{X} \sim N(\mu, \sigma^2/n)$

c) $\dfrac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}$
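Properties (b) and (c) can be checked informally by simulation. The sketch below is not a proof: it only compares simulated moments with the values the properties imply (mean μ and variance σ²/n for the sample mean, and mean n − 1 for the scaled sample variance). The function name, parameter values and seed are assumptions made for illustration.

```python
import random


def simulate_normal_samples(n, mu, sigma, reps=20000, seed=1):
    """Return lists of x-bar and (n-1)S^2/sigma^2 over reps normal samples."""
    rng = random.Random(seed)
    means, scaled_vars = [], []
    for _ in range(reps):
        sample = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(sample) / n
        s2 = sum((x - xbar) ** 2 for x in sample) / (n - 1)
        means.append(xbar)
        scaled_vars.append((n - 1) * s2 / sigma ** 2)
    return means, scaled_vars


def average(values):
    return sum(values) / len(values)


means, scaled = simulate_normal_samples(n=5, mu=10.0, sigma=2.0)
mean_of_means = average(means)
var_of_means = average([(m - mean_of_means) ** 2 for m in means])

print(round(mean_of_means, 3))    # should be near mu = 10
print(round(var_of_means, 3))     # should be near sigma^2 / n = 0.8
print(round(average(scaled), 3))  # should be near n - 1 = 4
```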


SAMPLING FROM THE NORMAL DISTRIBUTION

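• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu, \sigma^2)$ distribution. Then,

$$\frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$

• Most of the time $\sigma$ is unknown, so we use:

$$\frac{\bar{X} - \mu}{S/\sqrt{n}}$$

This statistic follows Student's t distribution with n − 1 degrees of freedom.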
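The two standardized statistics differ only in which standard deviation appears in the denominator. A minimal sketch, with hypothetical function names and made-up data:

```python
import math
import statistics


def z_statistic(sample, mu, sigma):
    """(x-bar - mu) / (sigma / sqrt(n)), usable when sigma is known."""
    n = len(sample)
    return (statistics.fmean(sample) - mu) / (sigma / math.sqrt(n))


def t_statistic(sample, mu):
    """(x-bar - mu) / (S / sqrt(n)), with S the sample standard deviation."""
    n = len(sample)
    s = statistics.stdev(sample)  # uses the n - 1 divisor
    return (statistics.fmean(sample) - mu) / (s / math.sqrt(n))


data = [32.1, 32.4, 31.9, 32.3, 32.2]  # hypothetical measurements
z = z_statistic(data, mu=32.2, sigma=0.3)
t = t_statistic(data, mu=32.2)
print(round(z, 4), round(t, 4))
```

The first value would be compared with N(0, 1) and the second with a t distribution on n − 1 = 4 degrees of freedom.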


SAMPLING FROM THE NORMAL DISTRIBUTION

In statistical inference, Student’s t distribution is very important.


SAMPLING FROM THE NORMAL DISTRIBUTION

• Let $X_1, X_2, \ldots, X_n$ be a r.s. of size n from a $N(\mu_X, \sigma_X^2)$ distribution and let $Y_1, Y_2, \ldots, Y_m$ be a r.s. of size m from an independent $N(\mu_Y, \sigma_Y^2)$.

• If we are interested in comparing the variability of the populations, one quantity of interest would be the ratio $\sigma_X^2 / \sigma_Y^2$, with sample counterpart $S_X^2 / S_Y^2$.


SAMPLING FROM THE NORMAL DISTRIBUTION

• The F distribution allows us to compare these quantities by giving the distribution of

$$\frac{S_X^2 / S_Y^2}{\sigma_X^2 / \sigma_Y^2} = \frac{S_X^2 / \sigma_X^2}{S_Y^2 / \sigma_Y^2} \sim F_{n-1,\,m-1}$$

• If $X \sim F_{p,q}$, then $1/X \sim F_{q,p}$.

• If $X \sim t_q$, then $X^2 \sim F_{1,q}$.
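The ratio in the F statistic is straightforward to compute from two samples. In this sketch the function name `variance_ratio`, the data, and the default of equal population variances are assumptions for illustration only.

```python
import statistics


def variance_ratio(xs, ys, var_x=1.0, var_y=1.0):
    """(S_X^2 / var_x) / (S_Y^2 / var_y) for two independent samples.

    With var_x == var_y this reduces to S_X^2 / S_Y^2. Under normality the
    result is compared with an F distribution on (n - 1, m - 1) degrees
    of freedom.
    """
    s2_x = statistics.variance(xs)  # n - 1 divisor
    s2_y = statistics.variance(ys)  # m - 1 divisor
    return (s2_x / var_x) / (s2_y / var_y)


xs = [1.0, 2.0, 3.0, 4.0, 5.0]  # hypothetical sample, n = 5
ys = [2.0, 4.0, 6.0, 8.0]       # hypothetical sample, m = 4

f_equal = variance_ratio(xs, ys)
f_scaled = variance_ratio(xs, ys, var_x=1.0, var_y=4.0)
print(f_equal, f_scaled)
print(len(xs) - 1, len(ys) - 1)  # degrees of freedom (n - 1, m - 1)
```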


CENTRAL LIMIT THEOREM

If a random sample is drawn from any population (with finite variance), the sampling distribution of the sample mean is approximately normal for a sufficiently large sample size. The larger the sample size, the more closely the sampling distribution of $\bar{X}$ will resemble a normal distribution.

[Figure: a random sample $(X_1, X_2, X_3, \ldots, X_n)$ is drawn from the random variable (population) distribution of X; the sample mean distribution of $\bar{X}$ approaches a normal shape as $n \to \infty$.]
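One way to see the theorem at work is to sample from a clearly skewed population and watch the skewness of the sample mean shrink as n grows. The sketch below uses an exponential population with rate 1 as an assumed example; the function name, replication count and seed are arbitrary.

```python
import random


def skewness_of_sample_mean(n, reps=5000, seed=2):
    """Estimate the skewness of x-bar for samples of size n from Exp(1)."""
    rng = random.Random(seed)
    means = [sum(rng.expovariate(1.0) for _ in range(n)) / n for _ in range(reps)]
    center = sum(means) / reps
    m2 = sum((m - center) ** 2 for m in means) / reps  # second central moment
    m3 = sum((m - center) ** 3 for m in means) / reps  # third central moment
    return m3 / m2 ** 1.5


for n in (2, 10, 50):
    print(n, round(skewness_of_sample_mean(n), 3))
```

A normal distribution has skewness 0, so values moving toward 0 as n increases are consistent with the sample mean becoming more nearly normal.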


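Sampling Distribution of the Sample Mean

$$\mu_{\bar{X}} = \mu$$

$$\sigma_{\bar{X}}^2 = \frac{\sigma^2}{n} \quad \text{or} \quad \sigma_{\bar{X}} = \frac{\sigma}{\sqrt{n}}$$

If X is normal, $\bar{X}$ is normal.

If X is non-normal, $\bar{X}$ is approximately normally distributed for sample size greater than or equal to 30.

$$\bar{X} \sim N(\mu, \sigma^2/n) \;\Rightarrow\; Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} \sim N(0, 1)$$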


EXAMPLE 1

• The amount of soda pop in each bottle is normally distributed with a mean of 32.2 ounces and a standard deviation of 0.3 ounces.

– Find the probability that a bottle bought by a customer will contain more than 32 ounces.

– Solution

• The random variable X is the amount of soda in a bottle.

$$P(X > 32) = P\left(\frac{X - \mu}{\sigma} > \frac{32 - 32.2}{0.3}\right) = P(Z > -0.67) = 0.7486$$

[Figure: normal curve with $\mu = 32.2$; the area to the right of x = 32 is 0.7486.]


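EXAMPLE 1 (contd.)

• Find the probability that a carton of four bottles will have a mean of more than 32 ounces of soda per bottle.

• Solution

– Define the random variable $\bar{X}$ as the mean amount of soda per bottle.

$$P(\bar{X} > 32) = P\left(\frac{\bar{X} - \mu_{\bar{x}}}{\sigma_{\bar{x}}} > \frac{32 - 32.2}{0.3/\sqrt{4}}\right) = P(Z > -1.33) = 0.9082$$

[Figure: two normal curves. For a single bottle ($\mu = 32.2$) the area to the right of x = 32 is 0.7486; for the carton mean ($\mu_{\bar{x}} = 32.2$) the area to the right of $\bar{x} = 32$ is 0.9082.]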
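Both probabilities in Example 1 can be checked with `statistics.NormalDist` from the Python standard library. The variable names are illustrative. Because the code does not round z to two decimals, its results differ slightly from the table-based values 0.7486 and 0.9082.

```python
import math
from statistics import NormalDist

mu, sigma = 32.2, 0.3  # population mean and standard deviation (ounces)

# Single bottle: X ~ N(mu, sigma^2)
p_bottle = 1 - NormalDist(mu, sigma).cdf(32)

# Carton of four: x-bar ~ N(mu, sigma^2 / n), so the sd is sigma / sqrt(n)
n = 4
p_carton = 1 - NormalDist(mu, sigma / math.sqrt(n)).cdf(32)

print(round(p_bottle, 4), round(p_carton, 4))
```

The carton probability is larger because the mean of four bottles varies less than a single bottle does.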


Sampling Distribution of a Proportion

• The parameter of interest for nominal data is the proportion of times a particular outcome (success) occurs.

• To estimate the population proportion 'p' we use the sample proportion.

$$\text{The estimate of } p = \hat{p} = \frac{X}{n}, \quad \text{where } X \text{ is the number of successes.}$$


Sampling Distribution of a Proportion

• Since X is binomial, probabilities about $\hat{p}$ can be calculated from the binomial distribution.

• Yet, for inference about $\hat{p}$ we prefer to use the normal approximation to the binomial whenever this approximation is appropriate.


Approximate Sampling Distribution of a Sample Proportion

• From the laws of expected value and variance, it can be shown that $E(\hat{p}) = p$ and $V(\hat{p}) = p(1-p)/n$.

• If both np ≥ 5 and n(1-p) ≥ 5, then

$$z = \frac{\hat{p} - p}{\sqrt{\dfrac{p(1-p)}{n}}}$$

is approximately standard normally distributed.
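The expressions for the mean and variance of the sample proportion can be verified numerically from the binomial probabilities. This is a minimal sketch; the function name `phat_moments` and the values n = 20, p = 0.3 are assumptions chosen for illustration.

```python
from math import comb


def phat_moments(n, p):
    """Exact mean and variance of p-hat = X / n when X ~ Binomial(n, p)."""
    pmf = [comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * w for k, w in enumerate(pmf))
    var = sum((k / n - mean) ** 2 * w for k, w in enumerate(pmf))
    return mean, var


n, p = 20, 0.3
mean_phat, var_phat = phat_moments(n, p)
print(mean_phat, var_phat)
print(p, p * (1 - p) / n)  # values given by the formulas
print(n * p >= 5 and n * (1 - p) >= 5)  # rule-of-thumb check for normality
```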


EXAMPLE

– A state representative received 52% of the votes in the last election.

– One year later the representative wanted to study his popularity.

– If his popularity has not changed, what is the probability that more than half of a sample of 300 voters would vote for him?


EXAMPLE (contd.)

Solution

• The number of respondents who prefer the representative is binomial with n = 300 and p = .52. Thus, np = 300(.52) = 156 and n(1-p) = 300(1-.52) = 144 (both greater than 5).

$$P(\hat{p} > .50) = P\left(\frac{\hat{p} - p}{\sqrt{p(1-p)/n}} > \frac{.50 - .52}{\sqrt{(.52)(1-.52)/300}}\right) \approx P(Z > -0.69) = .7549$$
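The calculation can be reproduced in Python, together with the exact binomial probability for comparison. The variable names are illustrative. The normal-approximation result is computed without rounding z, so it differs slightly from the table-based .7549, and no continuity correction is applied, matching the solution above.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 300, 0.52

# Normal approximation: p-hat is approximately N(p, p(1-p)/n)
se = sqrt(p * (1 - p) / n)
z = (0.50 - p) / se
p_normal = 1 - NormalDist().cdf(z)

# Exact binomial: more than half of 300 means X >= 151
p_exact = sum(comb(n, k) * p ** k * (1 - p) ** (n - k) for k in range(151, n + 1))

print(round(z, 2), round(p_normal, 4), round(p_exact, 4))
```

The gap between the two probabilities reflects the approximation error of treating a discrete count as continuous.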