12.5 Differences between Means ($\sigma$'s known) Two populations: ($\mu_1$, $\sigma_1$) & ($\mu_2$, $\sigma_2$) Two...

Post on 20-Dec-2015

Page 1

12.5 Differences between Means ($\sigma$'s known)

Two populations: ($\mu_1$, $\sigma_1$) & ($\mu_2$, $\sigma_2$)

Two samples: one from each population

Two sample means, $\bar{x}_1$ and $\bar{x}_2$, and sample sizes: $n_1$ & $n_2$

Compare two population means: H0: $\mu_1-\mu_2=\delta$ ($\delta=0$ in most cases)

Alternatives: $\mu_1-\mu_2>\delta$; $\mu_1-\mu_2<\delta$; $\mu_1-\mu_2\neq\delta$

Page 2

Let's go through a two-sided alternative

H0: $\mu_1-\mu_2=0$ vs HA: $\mu_1-\mu_2\neq 0$

Reject H0 if $(\bar{x}_1-\bar{x}_2)$ is too far from zero in either direction.

How far from zero might $(\bar{x}_1-\bar{x}_2)$ be if $\mu_1-\mu_2=0$?

The sampling distribution of $(\bar{x}_1-\bar{x}_2)$ is asymptotically normal with mean 0 and standard deviation $\sigma_{\bar{x}_1-\bar{x}_2}$.

We need to know $\sigma_{\bar{x}_1-\bar{x}_2}$.

Page 3

Fact: If the sample means are from independent samples, then

$$\sigma^2_{\bar{x}_1-\bar{x}_2} = \sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2}$$

$$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\sigma^2_{\bar{x}_1} + \sigma^2_{\bar{x}_2}} = \sqrt{SE_1^2 + SE_2^2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$

Page 4

Thus under certain assumptions:

$$z = \frac{(\bar{x}_1-\bar{x}_2) - 0}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

Correspondingly, a confidence interval for $\mu_1-\mu_2$ is

$$(\bar{x}_1-\bar{x}_2) \pm z_{\alpha/2}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$$
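
The z statistic and confidence interval on this slide can be sketched in a few lines of Python. This is only an illustration, not part of the original slides: the function names `two_sample_z` and `z_confidence_interval` are made up here, and the normal quantile comes from the standard library's `statistics.NormalDist`.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(xbar1, xbar2, sigma1, sigma2, n1, n2, delta=0.0):
    """z statistic for H0: mu1 - mu2 = delta when both sigmas are known."""
    # Standard error of (xbar1 - xbar2) for independent samples
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    return (xbar1 - xbar2 - delta) / se


def z_confidence_interval(xbar1, xbar2, sigma1, sigma2, n1, n2, alpha=0.05):
    """100(1 - alpha)% confidence interval for mu1 - mu2 (sigmas known)."""
    se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    diff = xbar1 - xbar2
    return diff - z_crit * se, diff + z_crit * se
```

Both helpers take the known population standard deviations as inputs, so they apply only under the assumptions of this section.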

Page 5

Assumptions

$\sigma_1$ & $\sigma_2$ are known

Normal populations or large sample sizes

Under the null hypothesis,

$$z = \frac{(\bar{x}_1-\bar{x}_2) - \delta}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$

is (asymptotically) standard normal

Page 6

Rejection Regions:

Alternative Hypotheses | Rejection Regions
$\mu_1-\mu_2>\delta$ | $z>z_\alpha$
$\mu_1-\mu_2<\delta$ | $z<-z_\alpha$
$\mu_1-\mu_2\neq\delta$ | $z>z_{\alpha/2}$ or $z<-z_{\alpha/2}$
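
The three rejection regions can be written as a small decision helper. The sketch below is illustrative only: `reject_h0` and its `alternative` labels ("greater", "less", "two-sided") are names invented here, and critical values are taken from `statistics.NormalDist`.

```python
from statistics import NormalDist


def reject_h0(z, alpha, alternative):
    """Return True if z falls in the rejection region for the given alternative.

    alternative: "greater"   for HA: mu1 - mu2 > delta
                 "less"      for HA: mu1 - mu2 < delta
                 "two-sided" for HA: mu1 - mu2 != delta
    """
    std_normal = NormalDist()
    if alternative == "greater":
        return z > std_normal.inv_cdf(1 - alpha)        # z > z_alpha
    if alternative == "less":
        return z < -std_normal.inv_cdf(1 - alpha)       # z < -z_alpha
    if alternative == "two-sided":
        return abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |z| > z_{alpha/2}
    raise ValueError("alternative must be 'greater', 'less', or 'two-sided'")
```

Note that the same z value can be significant for a one-sided alternative and not for the two-sided one, because the two-sided test splits $\alpha$ between both tails.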

Page 7

Example 12.4

Two labs measure the specific gravity of metal. On average, do the two labs give the same answer?

$\mu_1$ -- Population mean by lab 1

$\mu_2$ -- Population mean by lab 2

H0: $\mu_1=\mu_2$ vs HA: $\mu_1\neq\mu_2$

$\sigma_1=0.02$, $n_1=20$, $\bar{x}_1=2.032$; $\sigma_2=0.03$, $n_2=25$, $\bar{x}_2=2.020$

Page 8

95% Confidence Interval

$$(\bar{x}_1-\bar{x}_2) \pm z_{0.025}\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = (2.032-2.020) \pm 1.96\sqrt{\frac{0.02^2}{20} + \frac{0.03^2}{25}} = 0.012 \pm 1.96\,(0.0075)$$

that is, $0.012 \pm 0.015$: from about -0.003 to 0.027. The interval contains 0.

Page 9

Two-tailed Hypotheses Test

Two sample test

$$z = \frac{\bar{x}_1-\bar{x}_2}{\sigma_{\bar{x}_1-\bar{x}_2}} = \frac{0.012}{0.0075} = 1.6$$

Rejection region: $|Z|>z_{0.025}=1.96$

Conclusion: Don't reject H0.
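
As a quick numerical check of Example 12.4, the standard error, z statistic, and 95% interval can be recomputed from the numbers on the slides. The variable names are chosen here for illustration; only the data values come from the example.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 12.4 (sigmas are treated as known)
sigma1, n1, xbar1 = 0.02, 20, 2.032
sigma2, n2, xbar2 = 0.03, 25, 2.020

se = sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # standard error of xbar1 - xbar2
z = (xbar1 - xbar2) / se                     # test statistic under H0: mu1 = mu2
z_crit = NormalDist().inv_cdf(0.975)         # z_{0.025}, about 1.96

reject = abs(z) > z_crit                     # two-tailed rejection rule
ci_low = (xbar1 - xbar2) - z_crit * se
ci_high = (xbar1 - xbar2) + z_crit * se

print(f"SE = {se:.4f}, z = {z:.2f}, reject H0: {reject}")
print(f"95% CI for mu1 - mu2: ({ci_low:.4f}, {ci_high:.4f})")
```

The test and the interval agree, as they must: z is inside $\pm 1.96$ exactly when the 95% interval contains 0.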

Page 10

Rejection Regions

Alternative Hypotheses | Rejection Regions
HA: $\mu_1>\mu_2$ | $z>z_\alpha$
HA: $\mu_1<\mu_2$ | $z<-z_\alpha$
HA: $\mu_1\neq\mu_2$ | $z>z_{\alpha/2}$ or $z<-z_{\alpha/2}$

Page 11

Exercise

An investigation of two kinds of photocopying equipment showed that a random sample of 60 failures of one kind of equipment took on the average 84.2 minutes to repair, while a random sample of 60 failures of another kind of equipment took on the average 91.6 minutes to repair. If, on the basis of collateral information, it can be assumed that $\sigma_1=\sigma_2=19.0$ minutes for such data, test at the 0.02 level of significance whether the difference between these two sample means is significant.
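
One possible way to set up this exercise numerically is sketched below. It assumes a two-sided alternative (the exercise asks only whether the difference is "significant") and uses variable names invented for the illustration; it is a sketch, not an official solution.

```python
from math import sqrt
from statistics import NormalDist

# Data from the exercise; sigma1 = sigma2 = 19.0 minutes is assumed known
n1, xbar1 = 60, 84.2
n2, xbar2 = 60, 91.6
sigma = 19.0
alpha = 0.02

se = sqrt(sigma**2 / n1 + sigma**2 / n2)       # standard error of xbar1 - xbar2
z = (xbar1 - xbar2) / se                       # H0: mu1 - mu2 = 0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{0.01} for a two-sided test

reject = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

With a one-sided alternative the critical value would be $z_{0.02}$ instead of $z_{0.01}$, which could change the conclusion, so the choice of alternative matters here.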

Page 12

12.6 Differences Between Means (unknown equal variances)

Large samples: $n_1\geq 30$; $n_2\geq 30$

Small samples:

1. $\sigma_1=\sigma_2$

2. $\sigma_1\neq\sigma_2$

Page 13

Large Samples

$n_1\geq 30$; $n_2\geq 30$

Estimate $\sigma_1$ and $\sigma_2$ by $s_1$ and $s_2$

Set

$$z = \frac{(\bar{x}_1-\bar{x}_2) - \delta}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

Page 14

Rejection Regions

Alternative Hypotheses | Rejection Regions
HA: $\mu_1>\mu_2$ | $z>z_\alpha$
HA: $\mu_1<\mu_2$ | $z<-z_\alpha$
HA: $\mu_1\neq\mu_2$ | $z>z_{\alpha/2}$ or $z<-z_{\alpha/2}$

Page 15

Small Samples

$\sigma_1=\sigma_2=\sigma$, unknown

Two populations are normal

Standard error:

$$\sigma_{\bar{x}_1-\bar{x}_2} = \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}} = \sigma\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Estimate the common variance

Page 16

Pooled standard deviation

Using both $s_1^2$ and $s_2^2$ to estimate $\sigma^2$, we combine these estimates, weighting each by its d.f. The combined estimate of $\sigma^2$ is $s_p^2$, the pooled estimate:

$$s_p^2 = \frac{(n_1-1)s_1^2 + (n_2-1)s_2^2}{(n_1-1)+(n_2-1)}$$

Estimate $\sigma$ by $s_p$
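
The pooled estimate is simply a weighted average of the two sample variances, with weights equal to their degrees of freedom. A minimal sketch follows; the function names `pooled_variance` and `pooled_sd` are chosen here for illustration.

```python
from math import sqrt


def pooled_variance(s1, n1, s2, n2):
    """Pooled estimate s_p^2 of a common variance from two sample SDs."""
    df1, df2 = n1 - 1, n2 - 1              # degrees of freedom of each sample
    return (df1 * s1**2 + df2 * s2**2) / (df1 + df2)


def pooled_sd(s1, n1, s2, n2):
    """Pooled standard deviation s_p, the square root of s_p^2."""
    return sqrt(pooled_variance(s1, n1, s2, n2))
```

When the two sample sizes are equal, the pooled variance reduces to the plain average of $s_1^2$ and $s_2^2$; otherwise the larger sample gets more weight.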

Page 17

Two-Sample T-test

T-test (t distribution with df $=n_1+n_2-2$)

$$t = \frac{(\bar{x}_1-\bar{x}_2) - \delta}{s_p\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$$

where $\delta$ is the hypothesized $\mu_1-\mu_2$

$100(1-\alpha)\%$ CI

$$(\bar{x}_1-\bar{x}_2) \pm t_{\alpha/2}\, s_p\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$$

Page 18

Example 12.5

Compare blood pressures

Two populations: common variance

$\alpha=0.05$

$n_1=10$, $s_1=16.2$, $\bar{x}_1=125$; $n_2=12$, $s_2=14.3$, $\bar{x}_2=137$

$$s_p^2 = \frac{(10-1)(16.2)^2 + (12-1)(14.3)^2}{10+12-2} = 230.6$$

Page 19

CI & test

$s_p=15.2$

df $=10+12-2=20$

Critical value $t_{0.025}=2.086$: reject H0 if $|t|>2.086$

t statistic:

$$t = \frac{125-137}{15.2\sqrt{\dfrac{1}{10} + \dfrac{1}{12}}} = \frac{-12}{6.51} = -1.84$$

Conclusion? Don't reject, since $|t|=1.84<2.086$.

CI for $\mu_1-\mu_2$: $-12 \pm 2.086\,(6.51) = -12 \pm 13.6$, from -25.6 to 1.6 (equivalently, -1.6 to 25.6 for $\mu_2-\mu_1$)
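
The arithmetic of Example 12.5 can be reproduced as follows. This is an illustrative sketch: the variable names are invented here, and the critical value 2.086 is copied from the slide rather than computed, because the Python standard library has no t-distribution quantile function.

```python
from math import sqrt

# Data from Example 12.5 (equal but unknown variances assumed)
n1, s1, xbar1 = 10, 16.2, 125
n2, s2, xbar2 = 12, 14.3, 137
t_crit = 2.086                      # t_{0.025} at 20 d.f., taken from the slide

df = n1 + n2 - 2
sp2 = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df   # pooled variance
sp = sqrt(sp2)                                     # pooled standard deviation
se = sp * sqrt(1 / n1 + 1 / n2)                    # standard error of xbar1 - xbar2

t = (xbar1 - xbar2) / se                           # H0: mu1 - mu2 = 0
reject = abs(t) > t_crit

diff = xbar1 - xbar2
ci = (diff - t_crit * se, diff + t_crit * se)      # 95% CI for mu1 - mu2

print(f"sp^2 = {sp2:.1f}, sp = {sp:.1f}, SE = {se:.2f}, t = {t:.2f}")
print(f"reject H0: {reject}, 95% CI: ({ci[0]:.1f}, {ci[1]:.1f})")
```

Small differences from the slide values in the second decimal place are expected, since the slide rounds $s_p$ to 15.2 before computing the standard error.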

Page 20

What happens when variances are not equal?

Testing: H0: $\mu_1-\mu_2=\delta$

Normal populations

$\sigma_1$ and $\sigma_2$ are not necessarily equal

$\sigma_1$ and $\sigma_2$ unknown

$$\sigma^2_{\bar{x}_1-\bar{x}_2} = \frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2} \quad\text{estimated by}\quad s^2_{\bar{x}_1-\bar{x}_2} = \frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}$$

Page 21

Two sample t-test with unequal variances

$$t = \frac{(\bar{x}_1-\bar{x}_2) - \delta}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$$

d.f. $=\min(n_1-1,\ n_2-1)$
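
A minimal sketch of this statistic with the degrees-of-freedom rule from the slide is given below. The function name `unequal_variance_t` is invented for the illustration. The rule $\min(n_1-1, n_2-1)$ is a conservative choice; many software packages instead use the Welch-Satterthwaite approximation for the degrees of freedom, which is not what this sketch implements.

```python
from math import sqrt


def unequal_variance_t(xbar1, xbar2, s1, s2, n1, n2, delta=0.0):
    """t statistic and conservative d.f. for H0: mu1 - mu2 = delta.

    Variances are estimated separately (no pooling), and the degrees of
    freedom follow the slide's rule: min(n1 - 1, n2 - 1).
    """
    se = sqrt(s1**2 / n1 + s2**2 / n2)     # estimated SE of xbar1 - xbar2
    t = (xbar1 - xbar2 - delta) / se
    df = min(n1 - 1, n2 - 1)
    return t, df
```

The returned d.f. is what should be used to look up the critical value in a t table.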

Page 22

Exercise

In a department store's study designed to test whether or not the mean balance outstanding on 30-day charge accounts is the same in its two suburban branch stores, random samples yielded the following results:

$n_1=80$, $\bar{x}_1=\$64.20$, $s_1=\$16.00$

$n_2=100$, $\bar{x}_2=\$71.41$, $s_2=\$22.13$

Use the 0.05 level of significance to test the null hypothesis $\mu_1-\mu_2=0$.
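
One way to approach this exercise is sketched below. It assumes the large-sample procedure of this section is intended (both samples have at least 30 observations, so $s_1$ and $s_2$ stand in for $\sigma_1$ and $\sigma_2$) and a two-sided alternative; the variable names are chosen here for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Data from the exercise (balances in dollars)
n1, xbar1, s1 = 80, 64.20, 16.00
n2, xbar2, s2 = 100, 71.41, 22.13
alpha = 0.05

se = sqrt(s1**2 / n1 + s2**2 / n2)             # estimated SE of xbar1 - xbar2
z = (xbar1 - xbar2) / se                       # H0: mu1 - mu2 = 0
z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # z_{0.025}

reject = abs(z) > z_crit
print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject}")
```

Because both samples are large, the normal approximation is used here instead of a t distribution.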

Page 23

12.7 Paired Data

1982 study of trace metals in South Indian River. 6 random locations.

T = top water zinc concentration (mg/L)

B = bottom water zinc concentration (mg/L)

Location | 1 | 2 | 3 | 4 | 5 | 6
Top | 0.415 | 0.238 | 0.390 | 0.410 | 0.605 | 0.609
Bottom | 0.430 | 0.266 | 0.567 | 0.531 | 0.707 | 0.716

Page 24

One of the first things to do when analyzing data is to PLOT the data

[Figure: zinc concentration (axis from 0 to 0.8) plotted for Top and Bottom]

This is not a useful way to plot the data. There is not a clear distinction between bottom water and top water zinc, even though Bottom>Top at all 6 locations.

Page 25

A better way

[Figure: zinc concentration (axis from 0.2 to 0.7) for Top and Bottom, with the two points of each location joined by a line]

Connect points in the same pair.

Page 26

A better way

[Figure: Top and Bottom zinc plotted against each other, both axes from 0 to 0.8, with the reference line Bottom=Top]

The plot suggests that Bottom>Top. Is it true?

Page 27

That is equivalent to asking: is it true that the mean difference is greater than 0?

Location | 1 | 2 | 3 | 4 | 5 | 6
Top | 0.415 | 0.238 | 0.390 | 0.410 | 0.605 | 0.609
Bottom | 0.430 | 0.266 | 0.567 | 0.531 | 0.707 | 0.716
D=B-T | 0.015 | 0.028 | 0.177 | 0.121 | 0.102 | 0.107

H0: $\mu_D=0$ vs HA: $\mu_D>0$

Page 28

First check the assumption that the population is normal

[Figure: Normal Plot of the ordered differences (0 to 0.2) against Expected Z (-2 to 2)]

Page 29

Doing a one-sided test

H0: $\mu_D=0$ vs HA: $\mu_D>0$

$$t = \frac{\bar{D}-\mu_D}{S_D/\sqrt{n}} = \frac{0.092-0}{0.061/\sqrt{6}} = \frac{0.092}{0.025} = 3.68$$

$t_{0.05}$ at 5 d.f. is 2.015, so any value greater than 2.015 is evidence against H0. We reject H0: $\mu_{B-T}=0$ in favor of HA: $\mu_{B-T}>0$.
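
The paired test on the zinc data can be recomputed directly from the six Top/Bottom pairs. The sketch below uses only the standard library; the variable names are invented for the illustration and the critical value 2.015 is the one quoted on the slide.

```python
from math import sqrt
from statistics import mean, stdev

# Zinc concentrations (mg/L) at the 6 locations
top = [0.415, 0.238, 0.390, 0.410, 0.605, 0.609]
bottom = [0.430, 0.266, 0.567, 0.531, 0.707, 0.716]
t_crit = 2.015                       # t_{0.05} at 5 d.f., taken from the slide

# Work with the within-pair differences D = B - T
d = [b - t_ for b, t_ in zip(bottom, top)]
n = len(d)
d_bar = mean(d)                      # sample mean of the differences
s_d = stdev(d)                       # sample SD (n - 1 in the denominator)

t = (d_bar - 0) / (s_d / sqrt(n))    # H0: mu_D = 0
reject = t > t_crit                  # one-sided alternative mu_D > 0

print(f"D-bar = {d_bar:.3f}, S_D = {s_d:.3f}, t = {t:.2f}, reject H0: {reject}")
```

The unrounded statistic is slightly larger than the slide's 3.68, which was computed from rounded intermediate values; the conclusion is the same.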

Page 30

Another example

The average weekly losses of man-hours due to accidents in 10 industrial plants before and after installation of an elaborate safety program:

Plant | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10
Before | 45 | 73 | 46 | 124 | 33 | 57 | 83 | 34 | 26 | 17
After | 36 | 60 | 44 | 119 | 35 | 51 | 77 | 29 | 24 | 11
Diff (B-A) | 9 | 13 | 2 | 5 | -2 | 6 | 6 | 5 | 2 | 6

Is the safety program effective? (level = 0.05)

Page 31

Two Populations: Before and After

Normal? Independent?

No, No

Page 32

Normal Probability Plots

Small sample sizes

Somewhat skewed to the right

[Figure: normal probability plots of "before" and "after" (20 to 120) against Quantiles of Standard Normal (-1 to 1)]

Page 33

Normal Probability Plot for Difference

Looks better

[Figure: normal probability plot of "diff" (0 to 10) against Quantiles of Standard Normal (-1 to 1)]

Page 34

Consider the Differences

Paired observations: the before and after measurements around the installation of the safety program come from the same plants (dependent)

Data from different plants may be independent

Diff: 9 13 2 5 -2 6 6 5 2 6

Page 35

Set up a Test: Paired T-Test

"Effective" means the program reduces the accidents, i.e., before > after ($\mu_D>0$)

$\mu_D$ = difference of average accidents

H0: $\mu_D=0$ vs HA: $\mu_D>0$

The procedure is the same as the one-sample t-test:

$$t = \frac{\bar{x}_D}{s_D/\sqrt{n}}, \qquad \text{df} = n-1$$

Page 36

Rejection Regions for Paired T-test

Alternative Hypotheses | Rejection Regions
$\mu_D>\delta$ | $t>t_\alpha$
$\mu_D<\delta$ | $t<-t_\alpha$
$\mu_D\neq\delta$ | $t>t_{\alpha/2}$ or $t<-t_{\alpha/2}$

Page 37

Paired t-test

One-tailed test

Critical value: df $=9$, $t_{0.05}=1.833$

Sample mean & standard deviation: $\bar{x}_D=5.2$; $s_D=4.08$

t-statistic:

$$t = \frac{\bar{x}_D-0}{s_D/\sqrt{n}} = \frac{5.2-0}{4.08/\sqrt{10}} = 4.03$$

Conclusion: reject H0 since $t=4.03>1.833$
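
The safety-program example can be checked the same way, starting from the before and after values for the 10 plants. This is a sketch with illustrative variable names; the critical value 1.833 is the one given on the slide.

```python
from math import sqrt
from statistics import mean, stdev

# Average weekly losses of man-hours in the 10 plants
before = [45, 73, 46, 124, 33, 57, 83, 34, 26, 17]
after = [36, 60, 44, 119, 35, 51, 77, 29, 24, 11]
t_crit = 1.833                       # t_{0.05} at 9 d.f., taken from the slide

# Paired differences D = before - after
d = [b - a for b, a in zip(before, after)]
n = len(d)
d_bar = mean(d)                      # sample mean of the differences
s_d = stdev(d)                       # sample SD (n - 1 in the denominator)

t = (d_bar - 0) / (s_d / sqrt(n))    # H0: mu_D = 0
reject = t > t_crit                  # one-tailed alternative mu_D > 0

print(f"differences = {d}")
print(f"x-bar_D = {d_bar:.1f}, s_D = {s_d:.2f}, t = {t:.2f}, reject H0: {reject}")
```

Working with the differences reduces the two dependent samples to a single sample, which is why the one-sample t procedure applies.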