Ka-fu Wong © 2003 Chap 11- 1
Dr. Ka-fu Wong
ECON1003 Analysis of Economic Data
Overview
To test the effect of an herbal treatment on improvement of memory, you randomly select two samples, one to receive the treatment and one to receive a placebo. Results of a memory test taken one month later are given.
Sample 1 (Experimental Group, Treatment): $\bar{x}_1 = 77$, $s_1 = 15$, $n_1 = 95$
Sample 2 (Control Group, Placebo): $\bar{x}_2 = 73$, $s_2 = 12$, $n_2 = 105$
The resulting test statistic is $\bar{x}_1 - \bar{x}_2 = 77 - 73 = 4$. Is this difference significant, or is it due to chance (sampling error)?
GOALS
1. Understand the difference between dependent and independent samples.
2. Conduct a test of hypothesis about the difference between two independent population means when both samples have 30 or more observations.
3. Conduct a test of hypothesis about the difference between two independent population means when at least one sample has fewer than 30 observations.
4. Conduct a test of hypothesis about the mean difference between paired or dependent observations.
5. Conduct a test of hypothesis regarding the difference in two population proportions.
Chapter Eleven: Two Sample Tests of Hypothesis
Two Sample Tests
[Figure: two panels, TEST FOR EQUAL VARIANCES and TEST FOR EQUAL MEANS. Each panel sketches the distributions of Population 1 and Population 2 under H0 and under H1.]
The formula of general test statistic
Suppose we are interested in testing whether the population parameter ($\theta$) is equal to k. H0: $\theta = k$ ; H1: $\theta \neq k$
First, we need to get a sample estimate (q) of the population parameter ($\theta$).
Second, in most cases the test statistic takes the following form: $t = (q - k)/\sigma_q$, where $\sigma_q$ is the standard deviation (standard error) of q.
The form of $\sigma_q$ depends on what q is. Sample size and the null at hand determine the distribution of the statistic.
If $\theta$ is the population mean and the sample size is larger than 30, t is approximately normal.
Comparing two populations
We wish to know whether the distribution of the differences in sample means has a mean of 0.
If both samples contain at least 30 observations we use the z distribution as the test statistic.
Hypothesis Tests for Two Population Means
Format 1
Two-Tailed Test: H0: µ1 - µ2 = 0.0 ; HA: µ1 - µ2 ≠ 0.0
Upper One-Tailed Test: H0: µ1 - µ2 ≤ 0.0 ; HA: µ1 - µ2 > 0.0
Lower One-Tailed Test: H0: µ1 - µ2 ≥ 0.0 ; HA: µ1 - µ2 < 0.0
Format 2
Two-Tailed Test: H0: µ1 = µ2 ; HA: µ1 ≠ µ2
Upper One-Tailed Test: H0: µ1 ≤ µ2 ; HA: µ1 > µ2
Lower One-Tailed Test: H0: µ1 ≥ µ2 ; HA: µ1 < µ2
Preferred
Two Independent Populations: Examples
1. An economist wishes to determine whether there is a difference in mean family income for households in two socioeconomic groups.
Do HKU students come from families with higher income than CUHK students?
2. An admissions officer of a small liberal arts college wants to compare the mean SAT scores of applicants educated in rural high schools & in urban high schools.
Do students from rural high schools have lower A-level exam scores than students from urban high schools?
Two Dependent Populations: Examples
1. An analyst for Educational Testing Service wants to compare the mean GMAT scores of students before & after taking a GMAT review course.
Ask HKU graduates to take the A-Level English and Chinese exams again. Do they get higher A-Level English and Chinese exam scores than at the time they entered HKU?
2. Nike wants to see if there is a difference in durability of 2 sole materials. One type is placed on one shoe, the other type on the other shoe of the same pair.
Thinking Challenge
Are they independent or dependent?
1. Miles per gallon ratings of cars before & after mounting radial tires: dependent
2. The life expectancies of light bulbs made in two different factories: independent
3. Difference in hardness between 2 metals: one contains an alloy, one doesn’t: independent
4. Tread life of two different motorcycle tires: one on the front, the other on the back: dependent
Comparing two populations
No assumptions about the shape of the populations are required.
The samples are from independent populations. Values in one sample have no influence on the values in the other sample(s).
Variance formula for independent random variables A and B: V(A-B) = V(A) + V(B)
The formula for computing the value of z is:
$z = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{\dfrac{s_1^2}{n_1} + \dfrac{s_2^2}{n_2}}}$
EXAMPLE 1
Two cities, Bradford and Kane, are separated only by the Conewango River. There is competition between the two cities. The local paper recently reported that the mean household income in Bradford is $38,000 with a standard deviation of $6,000 for a sample of 40 households. The same article reported the mean income in Kane is $35,000 with a standard deviation of $7,000 for a sample of 35 households. At the .01 significance level, can we conclude that the mean income in Bradford is higher?
EXAMPLE 1 continued
Step 1: State the null and alternate hypotheses.
H0: µB ≤ µK ; H1: µB > µK
Step 2: State the level of significance. The .01 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Because both samples contain more than 30 observations, we can use z as the test statistic.
Example 1 continued
Step 4: State the decision rule. The null hypothesis is rejected if z is greater than 2.33.
[Figure: probability density of the z statistic, N(0,1), for H0: µB ≤ µK ; H1: µB > µK. The rejection region is the area of 0.01 to the right of z = 2.33.]
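The critical value in the decision rule can be recovered from the standard normal distribution. The following is a minimal sketch using Python's standard library; the variable names `alpha` and `z_critical` are illustrative and not part of the slides.

```python
from statistics import NormalDist

alpha = 0.01  # significance level stated in Example 1

# Upper-tail critical value: the z with area alpha to its right
# under the standard normal distribution N(0, 1).
z_critical = NormalDist().inv_cdf(1 - alpha)

print(f"Reject H0 if z > {z_critical:.2f}")
```

Rounded to two decimals, this agrees with the 2.33 used in the decision rule.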
Example 1 continued
Step 5: Compute the value of z and make a decision.
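$z = \dfrac{\$38,000 - \$35,000}{\sqrt{\dfrac{(\$6,000)^2}{40} + \dfrac{(\$7,000)^2}{35}}} = 1.98$
[Figure: N(0,1) density for H0: µB ≤ µK ; H1: µB > µK. The computed z = 1.98 lies to the left of the critical value 2.33, outside the rejection region of area 0.01.]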
Example 1 continued
The decision is to not reject the null hypothesis. We cannot conclude that the mean household income in Bradford is larger.
Example 1 continued
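The p-value is: P(z > 1.98) = .5000 - .4761 = .0239
[Figure: N(0,1) density for H0: µB ≤ µK ; H1: µB > µK. The area to the right of z = 1.98 is the p-value, 0.0239, which is larger than the 0.01 area of the rejection region beyond z = 2.33.]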
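The arithmetic in Example 1 can be checked with a short script. This is a sketch using only Python's standard library; the helper name `two_sample_z` is hypothetical, and the p-value is computed from the unrounded z, so it may differ from the table-based .0239 in the fourth decimal place.

```python
from math import sqrt
from statistics import NormalDist


def two_sample_z(mean1, sd1, n1, mean2, sd2, n2):
    """z statistic for the difference between two independent sample means."""
    standard_error = sqrt(sd1**2 / n1 + sd2**2 / n2)
    return (mean1 - mean2) / standard_error


# Example 1 figures: Bradford is sample 1, Kane is sample 2.
z = two_sample_z(38_000, 6_000, 40, 35_000, 7_000, 35)

# Upper-tail p-value, because H1 is mu_B > mu_K.
p_value = 1 - NormalDist().cdf(z)

# Decision at the .01 significance level.
reject_h0 = p_value < 0.01

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_h0}")
```

Comparing the p-value with .01 gives the same decision as comparing z with the critical value 2.33: the null hypothesis is not rejected.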
Small Sample Tests of Means
The t distribution is used as the test statistic if one or more of the samples have fewer than 30 observations.
The required assumptions are:
1. Both populations must follow the normal distribution.
2. The populations must have equal standard deviations.
3. The samples are from independent populations.
Small sample test of means continued
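Finding the value of the test statistic requires two steps.
Step 1: Pool the sample standard deviations into a pooled variance.
$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$
Step 2: Determine the value of t from the following formula.
$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$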
EXAMPLE 2
A recent EPA study compared the highway fuel economy of domestic and imported passenger cars. A sample of 15 domestic cars revealed a mean of 33.7 mpg with a standard deviation of 2.4 mpg. A sample of 12 imported cars revealed a mean of 35.7 mpg with a standard deviation of 3.9.
At the .05 significance level can the EPA conclude that the mpg is higher on the imported cars?
Example 2 continued
Step 1: State the null and alternate hypotheses.
H0: µD ≥ µI ; H1: µD < µI
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. Both samples have fewer than 30 observations, so we use the t distribution.
EXAMPLE 2 continued
Step 4: The decision rule is to reject H0 if t<-1.708. There are 25 degrees of freedom.
[Figure: probability density of the t statistic, t (df = 25), for H0: µD ≥ µI ; HA: µD < µI. The rejection region is the area of 0.05 to the left of t = -1.708.]
EXAMPLE 2 continued
Step 5: We compute the pooled variance:
$s_p^2 = \dfrac{(n_1 - 1)(s_1^2) + (n_2 - 1)(s_2^2)}{n_1 + n_2 - 2} = \dfrac{(15 - 1)(2.4)^2 + (12 - 1)(3.9)^2}{15 + 12 - 2} = 9.918$
Example 2 continued
We compute the value of t as follows.
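$t = \dfrac{\bar{X}_1 - \bar{X}_2}{\sqrt{s_p^2\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} = \dfrac{33.7 - 35.7}{\sqrt{9.918\left(\dfrac{1}{15} + \dfrac{1}{12}\right)}} = -1.640$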
Example 2 continued
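[Figure: probability density of the t statistic for H0: µD ≥ µI ; HA: µD < µI. The computed t = -1.640 lies to the right of the critical value -1.708, outside the rejection region of area 0.05.]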
H0 is not rejected. There is insufficient sample evidence to claim a higher mpg on the imported cars.
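The pooled-variance calculation in Example 2 can be reproduced with a few lines of code. This is a sketch under the equal-variance assumption stated for small samples; the function name `pooled_t` is hypothetical, and the critical value -1.708 is taken from the decision rule rather than computed.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Pooled variance, t statistic and degrees of freedom for two
    independent samples, assuming equal population variances."""
    pooled_var = ((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2)
    t = (mean1 - mean2) / sqrt(pooled_var * (1 / n1 + 1 / n2))
    return pooled_var, t, n1 + n2 - 2


# Example 2 figures: domestic cars are sample 1, imported cars are sample 2.
pooled_var, t, df = pooled_t(33.7, 2.4, 15, 35.7, 3.9, 12)

# Lower-tail test: reject H0 only if t falls below the critical value.
critical_value = -1.708  # t table value for df = 25, alpha = .05 (from the slides)
reject_h0 = t < critical_value

print(f"pooled variance = {pooled_var:.3f}, t = {t:.3f}, df = {df}")
print(f"reject H0: {reject_h0}")
```

Because the computed t is not below -1.708, the script reaches the same decision as Step 5.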
Hypothesis Testing Involving Paired Observations
Independent samples are samples that are not related in any way.
Dependent samples are samples that are paired or related in some fashion. For example:
If you wished to buy a car you would look at the same car at two (or more) different dealerships and compare the prices.
If you wished to measure the effectiveness of a new diet you would weigh the dieters at the start and at the finish of the program.
Hypothesis Testing Involving Paired Observations
Use the following test when the samples are dependent:
$t = \dfrac{\bar{d}}{s_d/\sqrt{n}}$
where $\bar{d}$ is the mean of the differences, $s_d$ is the standard deviation of the differences, and n is the number of pairs (differences).
EXAMPLE 3
An independent testing agency is comparing the daily rental cost for renting a compact car from Hertz and Avis. A random sample of eight cities revealed the following information. At the .05 significance level, can the testing agency conclude that there is a difference in the rental charge?
EXAMPLE 3 continued
City Hertz ($) Avis ($)
Atlanta 42 40
Chicago 56 52
Cleveland 45 43
Denver 48 48
Honolulu 37 32
Kansas City 45 48
Miami 41 39
Seattle 46 50
EXAMPLE 3 continued
Step 1: State the null and alternate hypotheses.
H0: µd = 0 ; H1: µd ≠ 0
Step 2: State the level of significance. The .05 significance level is stated in the problem.
Step 3: Find the appropriate test statistic. We can use t as the test statistic.
EXAMPLE 3 continued
Step 4: State the decision rule. H0 is rejected if t < -2.365 or t > 2.365. We use the t distribution with 7 degrees of freedom.
[Figure: probability density of the t statistic, t (df = 7), for H0: µd = 0 ; H1: µd ≠ 0. Rejection Region I (probability = 0.025) lies below -2.365 and Rejection Region II (probability = 0.025) lies above 2.365.]
Example 3 continued
City Hertz ($) Avis ($) d d²
Atlanta 42 40 2 4
Chicago 56 52 4 16
Cleveland 45 43 2 4
Denver 48 48 0 0
Honolulu 37 32 5 25
Kansas City 45 48 -3 9
Miami 41 39 2 4
Seattle 46 50 -4 16
Example 3 continued
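$\bar{d} = \dfrac{\Sigma d}{n} = \dfrac{8.0}{8} = 1.00$
$s_d = \sqrt{\dfrac{\Sigma d^2 - \dfrac{(\Sigma d)^2}{n}}{n - 1}} = \sqrt{\dfrac{78 - \dfrac{8^2}{8}}{8 - 1}} = 3.1623$
$t = \dfrac{\bar{d}}{s_d/\sqrt{n}} = \dfrac{1.00}{3.1623/\sqrt{8}} = 0.894$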
Example 3 continued
Step 5: Because the computed t of 0.894 lies between -2.365 and 2.365, do not reject the null hypothesis. We cannot conclude that there is a difference in the mean amount charged by Hertz and Avis.
[Figure: probability density of the t statistic, t (df = 7), with rejection regions of probability 0.025 below -2.365 and above 2.365. The computed t = 0.894 falls between the two critical values.]
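The paired calculation in Example 3 can be reproduced directly from the city-by-city prices. This is a sketch using Python's standard library; the variable names are illustrative, and the critical value 2.365 is taken from the decision rule rather than computed.

```python
from math import sqrt
from statistics import mean, stdev

# Daily rental cost ($) in the eight sampled cities, in the table's order.
hertz = [42, 56, 45, 48, 37, 45, 41, 46]
avis = [40, 52, 43, 48, 32, 48, 39, 50]

# Paired test: work with the within-city differences d = Hertz - Avis.
differences = [h - a for h, a in zip(hertz, avis)]
n = len(differences)

d_bar = mean(differences)   # mean of the differences
s_d = stdev(differences)    # sample standard deviation (n - 1 divisor)
t = d_bar / (s_d / sqrt(n))

# Two-tailed test with n - 1 = 7 degrees of freedom.
critical_value = 2.365  # t table value for df = 7, alpha = .05 (from the slides)
reject_h0 = abs(t) > critical_value

print(f"d_bar = {d_bar:.2f}, s_d = {s_d:.4f}, t = {t:.3f}, reject H0: {reject_h0}")
```

`statistics.stdev` uses the n - 1 divisor, which matches the formula for the standard deviation of the differences.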
Two Sample Tests of Proportions
We investigate whether two samples came from populations with an equal proportion of successes.
The two samples are pooled using the following formula.
$p_c = \dfrac{X_1 + X_2}{n_1 + n_2}$
where X1 and X2 refer to the number of successes in the respective samples of n1 and n2.
Two Sample Tests of Proportions continued
The value of the test statistic is computed from the following formula.
$z = \dfrac{p_1 - p_2}{\sqrt{\dfrac{p_c(1 - p_c)}{n_1} + \dfrac{p_c(1 - p_c)}{n_2}}}$
Example 4
Are unmarried workers more likely to be absent from work than married workers? A sample of 250 married workers showed 22 missed more than 5 days last year, while a sample of 300 unmarried workers showed 35 missed more than five days. Use a .05 significance level.
Example 4 continued
The null and the alternate hypothesis are:
H0: πU ≤ πM ; H1: πU > πM
The null hypothesis is rejected if the computed value of z is greater than 1.65.
Example 4 continued
The pooled proportion is
$p_c = \dfrac{35 + 22}{300 + 250} = .1036$
The value of the test statistic is
$z = \dfrac{\dfrac{35}{300} - \dfrac{22}{250}}{\sqrt{\dfrac{.1036(1 - .1036)}{300} + \dfrac{.1036(1 - .1036)}{250}}} = 1.10$
Example 4 continued
The null hypothesis is not rejected. We cannot conclude that a higher proportion of unmarried workers than married workers miss more than five days in a year.
The p-value is: P(z > 1.10) = .5000 - .3643 = .1357
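The Example 4 figures can be checked the same way. This is a sketch using Python's standard library; the function name `two_proportion_z` is hypothetical, and the p-value is computed from the unrounded z, so it may differ slightly from a table lookup at z = 1.10.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z(x1, n1, x2, n2):
    """Pooled proportion and z statistic for the difference between
    two sample proportions x1/n1 and x2/n2."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    standard_error = sqrt(pooled * (1 - pooled) / n1 + pooled * (1 - pooled) / n2)
    return pooled, (p1 - p2) / standard_error


# Example 4 figures: unmarried workers are sample 1, married workers are sample 2.
pooled, z = two_proportion_z(35, 300, 22, 250)

# Upper-tail p-value, because H1 says the unmarried proportion is larger.
p_value = 1 - NormalDist().cdf(z)

# Decision rule from the slides: reject H0 if z > 1.65.
reject_h0 = z > 1.65

print(f"pooled proportion = {pooled:.4f}, z = {z:.2f}, p-value = {p_value:.4f}")
print(f"reject H0: {reject_h0}")
```

The p-value is well above .05, consistent with not rejecting the null hypothesis.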
- END -