12 two tail test

8/11/2019 12 Two Tail Test

1/50

Two-Sample Tests


2/50

Learning Objectives

In this session, you learn how to use hypothesistesting for comparing the difference between:

The means of two independent populations

The means of two related populations The proportions of two independent

populations

The variances of two independentpopulations


3/50

Two-Sample Tests Overview

Two Sample Tests

IndependentPopulation

Means

Means,Related

Populations


Variances

Group 1 vs.Group 2

Same groupbefore vs. aftertreatment

Variance 1 vs.Variance 2

Examples


Proportions

Proportion 1vs.Proportion 2


4/50

Two-Sample Tests

Independent

Population Means

1 and 2 known

1 and 2 unknown

Goal: Test hypothesis or forma confidence interval for the

difference between two

population means, 12

The point estimate for the

difference between samplemeans:

X1 X2


5/50

Two-Sample TestsIndependent Populations

Independent

Population Means

1 and 2 known

1 and 2 unknown

Different data sources

Independent: Sample selected

from one population has no

effect on the sample selected

from the other population

Use the difference between 2

sample means

Use Z test, pooled variance ttest, or separate-variance t test


6/50


7/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

Assumptions:

Samples are randomly and

independently drawn

population distributions are

normal


8/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

When 1 and 2 are known and bothpopulations are normal, the test

statistic is a Z-value and the

standard error of X1 X2 is

2

2

2

1

2

1

XX n

n

21


9/50


10/50


Lower-tail test:

H0: 1 2H1: 1 < 2

i.e.,H0: 12 0

H1: 12 < 0

Upper-tail test:

H0: 1 2H1: 1 >2

i.e.,

H0: 12 0

H1: 12 > 0

Two-tail test:

H0: 1 = 2H1: 1 2

i.e.,

H0: 12 = 0

H1: 12 0

Two Independent Populations, Comparing Means


11/50


Two Independent Populations, Comparing MeansLower-tail test:

H0: 12 0

H1: 12 < 0

Upper-tail test:

H0: 12 0

H1: 12 > 0

Two-tail test:

H0: 12 = 0

H1: 12 0

/2

/2

-z -z/2z z/2

Reject H0 if Z < -Za Reject H0 if Z > Za Reject H0 if Z < -Za/2or Z > Za/2


12/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

Assumptions:

Samples are randomly andindependently drawn

Populations are normally

distributed

Population variances are

unknown but assumed equal


13/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

Forming interval estimates:

The population variances

are assumed equal, so use

the two sample standarddeviations and pool them to

estimate

the test statistic is a t value

with (n1 + n2 2) degrees

of freedom


14/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

The pooled standard

deviation is:

1)n()1(n

S1nS1n

S 21

2

22

2

11

p


15/50


Where t has (n1

+ n2 2) d.f., and

21

2p

2121

n1

n1S

XXt

The test statistic is:

1)n()1(n

S1nS1nS

21

2

22

2

112

p

Independent

Population Means

1 and 2 known

1 and 2 unknown


16/50


You are a financial analyst for a brokerage firm. Is there a

difference in dividend yield between stocks listed on theNYSE & NASDAQ? You collect the following data:

NYSE NASDAQNumber 21 25

Sample mean 3.27 2.53

Sample std dev 1.30 1.16

Assuming both populations are approximately normal with equal

variances, is there a difference in average yield ( = 0.05)?


17/50


1.5021

1)25(1)-(21

1.161251.30121

1)n()1(n

S1nS1nS

22

21

2

22

2

112

p

2.040

251

2115021.1

02.533.27

n1

n1S

XXt

21

2p

2121



18/50


H0: 1 -2 = 0 i.e. (1 = 2) H1: 1 -20 i.e. ( 1 2)

= 0.05

df = 21 + 25 - 2 = 44 Critical Values: t = 2.0154

Test Statistic: 2.040

t0 2.0154-2.0154

.025

Reject H0 Reject H0

.025

Decision: Reject H0 at = 0.05

2.040

Conclusion: There is evidence

of a difference in the means.


19/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

2

2

2

1

2

121

n

n

XX Z

The confidence interval for12 is:


20/50


Independent

Population Means

1 and 2 known

1 and 2 unknown

21

2p2-nn21

n

1

n

1SXX 21t

The confidence interval for12 is:

Where

1)n()1(n

S1nS1nS

21

2

22

2

112

p


21/50

Two-Sample TestsRelated Populations

D = X1 - X2

Tests Means of 2 Related Populations

Paired or matched samples

Repeated measures (before/after)

Use difference between paired values:

Eliminates Variation Among Subjects

Assumptions: Both Populations Are Normally Distributed


22/50


The ith paired difference is Di , where

n

D

D

n

1i

i

Di = X1i - X2i

The point estimate for the population meanpaired difference is D :

Suppose the population standard deviation of

the difference scores, D, is known.


23/50


The test statistic for the mean difference is a Zvalue:

n

DZ

D

D

Where

D = hypothesized mean differenceD = population standard deviation of differences

n = the sample size (number of pairs)


24/50


If D is unknown, you can estimate theunknown population standard deviation with a

sample standard deviation:

1n

)D(D

S

n

1i

2

i

D


25/50


1n

)D(D

S

n

1i

2i

D

n

S

Dt

D

D

The test statistic for D is now a t statistic:

Where t has n - 1 d.f.

and SD is:


26/50


Lower-tail test:

H0: D 0

H1: D < 0

Upper-tail test:

H0: D 0

H1: D > 0

Two-tail test:

H0: D = 0

H1: D 0

/2 /2

-t -t/2t t/2

Reject H0 if t < -ta Reject H0 if t > ta Reject H0 if t < -ta/2or t > ta/2


27/50

Two-Sample TestsRelated Populations ExampleAssume you send your salespeople to a customer

service training workshop. Has the training made adifference in the number of complaints? You collect the

following data:

Salesperson Number of Complaints Difference, Di

(2-1)Before (1) After (2)

C.B. 6 4 -2

T.F. 20 6 -14

M.H. 3 2 -1

R.K. 0 0 0

M.O 4 0 -4


28/50

Two-Sample TestsRelated Populations Example

2.4n

D

D

n

1i

i

5.67

1n

)D(DS

2

i

D

Salesperson Number of Complaints Difference, Di

(2-1)Before (1) After (2)

C.B. 6 4 -2

T.F. 20 6 -14

M.H. 3 2 -1

R.K. 0 0 0

M.O 4 0 -4


29/50


Has the training made a difference in the number of

complaints (at the = 0.01 level)?

H0: D = 0

H1: D 0

Critical Value = 4.604

d.f. = n - 1 = 4

Test Statistic:

1.6655.67/

04.2

n/S

t

D

D

D


30/50


Reject

- 4.604 4.604

Reject

/2

- 1.66

Decision: Do not reject H0(t statistic is not in the rejectregion)

Conclusion: There is no

evidence of a significant change

in the number of complaints

/2


31/50


The confidence interval for D ( known) is:

n

DZD

Where

n = the sample size (number of pairs in the paired sample)


32/50


The confidence interval for D ( unknown) is:

1n

)D(D

S

n

1i

2i

D

n

StD D1n

where


33/50

Two Population Proportions

Goal: Test a hypothesis or form a confidence

interval for the difference between two

independent population proportions, 12

Assumptions:

n11 5 , n1(1-1) 5

n22 5 , n2(1-2) 5

The point estimate for the difference is p1 - p2


34/50


Since you begin by assuming the nullhypothesis is true, you assume 1 = 2 and pool

the two sample (p) estimates.

21

21

nn

XXp

The pooled estimate for

the overall proportion is:

where X1 and X2 are the number of

successes in samples 1 and 2


35/50


21

2121

11)1(

nnpp

ppZ

The test statistic for p1 p2 is a Z statistic:

2

22

1

11

21

21

n

X,

n

X,

nn

XXp

PPwhere


36/50


Hypothesis for Population Proportions

Lower-tail test:

H0: 1 2H1: 1 < 2

i.e.,

H0: 12 0

H1: 12 < 0

Upper-tail test:

H0: 1 2H1: 1 >2

i.e.,

H0: 12 0

H1: 12 > 0

Two-tail test:

H0: 1 = 2H1: 1 2

i.e.,

H0: 12 = 0

H1: 12 0


37/50


Hypothesis for Population Proportions

Lower-tail test:

H0: 12 0

H1: 12 < 0

Upper-tail test:

H0: 12 0

H1: 12 > 0

Two-tail test:

H0: 12 = 0

H1: 12 0

/2

/2

-z -z/2z z/2

Reject H0 if Z < -Z Reject H0 if Z > Z Reject H0 if Z < -Zor Z > Z


38/50

Two Independent Population

Proportions: Example

Is there a significant difference between theproportion of men and the proportion ofwomen who will vote Yes on Proposition A?

In a random sample of 72 men, 36 indicatedthey would vote Yes and, in a sample of 50women, 31 indicated they would vote Yes

Test at the .05 level of significance


39/50



H0: 12 = 0 (the two proportions are equal)

H1: 120 (there is a significant differencebetween proportions)

The sample proportions are:

Men: p1 = 36/72 = .50

Women: p2 = 31/50 = .62

The pooled estimate for the overall proportion is:

.549122

67

5072

3136

nn

XXp

21

21


40/50



The test statistic for 12

is:

1.31

50

1

72

1.549)(1.549

0.62.50

n

1

n

1)p(1p

z

21

2121

pp

Critical Values = 1.96For = .05

.025

-1.96 1.96

.025

-1.31

Decision: Do not reject H0

Conclusion: There is no evidence of asignificant difference in proportions who

will vote yes between men and women.

Reject H0 Reject H0


41/50


Proportions

2

22

1

1121

n)(1

n)(1 ppppZpp

The confidence interval for 12 is:


42/50

Testing Population Variances

Purpose: To determine if two independentpopulations have the same variability.

H0: 12 = 2

2

H1: 12 2

2

H0: 12 2

2

H1: 12 < 2

2

H0: 12 2

2

H1: 12 > 2

2

Two-tail test Lower-tail test Upper-tail test


43/50


2

2

2

1

S

SF

The F test statistic is:

= Variance of Sample 1

n1 - 1 = numerator degrees of freedom

n2 - 1 = denominator degrees of freedom

= Variance of Sample 2

2

1S

2

2S


44/50


The F critical value is found from the F table There are two appropriate degrees of

freedom: numerator and denominator.

In the F table, numerator degrees of freedom determine the

column

denominator degrees of freedom determine therow


45/50


0

FLReject

H0

Do notreject H0

H0: 12 2

2

H1: 12 < 2

2

Reject H0 if F < FL

0

FUReject H0Do not

reject H0

H0: 12 2

2

H1: 12 > 2

2

Reject H0 if F > FU

Lower-tail test Upper-tail test


46/50


L2

2

2

1

U22

2

1

FS

SF

FS

S

F

rejection regionfor a two-tail test is:F

0

/2

Reject H0Do notreject H0 FU

H0: 12 = 2

2

H1: 12 22

FL

/2

Two-tail test


47/50


To find the critical F values:

1. Find FU from the F table for n1 1

numerator and n2 1 denominator degrees

of freedom.

*U

LF

1F

2. Find FL using the formula:

Where FU* is from the F table with n2 1numerator and n1 1 denominator degrees of

freedom (i.e., switch the d.f. from FU)


48/50


You are a financial analyst for a brokerage firm. You

want to compare dividend yields between stocks listedon the NYSE & NASDAQ. You collect the followingdata:

NYSE NASDAQNumber 21 25

Mean 3.27 2.53

Std dev 1.30 1.16

Is there a difference in the variances between theNYSE & NASDAQ at the = 0.05 level?


49/50


Form the hypothesis test:

H0: 2

12

2 = 0 (there is no difference between variances)

H1: 2

12

20 (there is a difference between variances)

Numerator:

n1 1 = 21 1 = 20 d.f.

Denominator: n2 1 = 25 1 = 24 d.f.

FU = F.025, 20, 24 = 2.33

Numerator:

n2 1 = 25 1 = 24 d.f.

Denominator: n1 1 = 21 1 = 20 d.f.

FL = 1/F.025, 24, 20 = 0.41

FU: FL:


50/50



256.116.1

30.12

2

2

2

2

1 S

SF

0

/2 = .025

FU=2.33Reject H0Do not

reject H0

FL=0.41

/2 = .025

Reject H0F

F = 1.256 is not in the

rejection region, so we do

not reject H0

Conclusion: There is insufficient

evidence of a difference in

variances at = .05

12 two tail test

Documents