9.1 Basic concepts of hypothesis testing
A hypothesis is an assumption – a statement made to explain a set of facts and to form a basis
for further investigation. It is understood that the statement is subject to proof or checking. The
testing of hypotheses is the second major part of statistical inference. It is of great importance
because it is used as the basis for decision-making in industry, business and government. Decisions
could involve:
• determining whether a certain management structure will reduce costs and improve efficiency;
• determining whether a manufacturing process is in control, i.e. whether deviations in the
products are significant;
• sampling soil from a building site to see whether the radioactive levels are significantly different
from those of ‘normal’ sites;
• determining whether a new additive in a batch of paint significantly affects shelf life.
Statistical testing begins with a hypothesis - an assumption about the value of a population
parameter. A sample is chosen from the population, and the value of the sample mean is calcu-
lated. A decision then has to be made. If there is no significant difference between the values, the
hypothesis may be accepted; if there is a difference, it may be rejected. These decisions are made
on the significance (size) of the difference.
9.2 Formulating and using hypotheses
The various consumer ‘watchdog’ organisations regularly check the mass of items being sold to
ensure that advertised data matches reality. Consider the following problem, which will elucidate
some of the steps in the testing process.
Example 1 The 2 kg bags of sugar from the Citizen Kane Sugar Co. are under scrutiny and
we assume that the bags are correctly labelled - i.e. that they contain exactly 2 kg of sugar. A
hypothesis test must be designed to check whether the evidence supports this assumption.
Solution: The mean mass for the population of 2 kg bags can be:
• greater than 2 kg;
• less than 2 kg;
• equal to 2 kg.
If µ represents the average bag mass of the population, then the following possibilities exist:
Possible value    Action
µ > 2             The company is exceeding the specifications; the customers are getting a bargain. Take no action.
µ = 2             Customers are being treated fairly.
µ < 2             The company is not meeting the label specifications; the consumers are being sold short. Consumers should seek to complain.
Although there are 3 possibilities for µ, 2 of them amount to the same thing. No action will be
taken against the company for µ > 2, since this is giving customers a value for money deal. No
action will be taken for µ = 2 since this is fair dealing. So we combine these into µ ≥ 2. But µ < 2
will produce action!
• The null hypothesis is set up formally:
It is appropriate to assume that this company is meeting its obligations and so the null hypothesis
is that there is no disadvantage to the consumer.
• The alternative hypothesis is set up formally:
The company is not meeting its obligations and consumers are being disadvantaged.
• In summary:
Hypothesis Action
H0 : µ ≥ 2 The company is exceeding or meeting the label specification. Take no action.
Ha : µ < 2 The company is not meeting the label specifications. Take appropriate action.
The problem of Citizen Kane Sugar can be used to give a generalised picture where we use the
symbol µ0 to stand for the hypothesised mean. In the problem above it took on the value 2 kg.
The three forms of hypothesis test concerning the population mean are:
                          Form 1          Form 2          Form 3
Null hypothesis           H0 : µ ≥ µ0     H0 : µ ≤ µ0     H0 : µ = µ0
Alternative hypothesis    Ha : µ < µ0     Ha : µ > µ0     Ha : µ ≠ µ0
Example 2 Prior to the institution of a new safety program, the average number of on-the-job
accidents per day at a factory was 4.5. To determine whether the safety program has been effective,
the factory will test
H0 : µ = 4.5
where µ is the true mean number of on-the-job accidents per day at the factory. Formulate the
appropriate alternative hypothesis for the factory.
Solution: The factory is interested in detecting whether the true mean number of on-the-job
accidents per day is less than 4.5, for if it is the case that µ < 4.5, then the new safety program
has been effective in reducing the average number of on-the-job accidents. Thus, the alternative
hypothesis of interest to the factory is
Ha : µ < 4.5
Note that the null hypothesis
H0 : µ = 4.5
actually stands for all situations in which the safety program has not reduced the true
mean number of on-the-job accidents, i.e., H0 : µ ≥ 4.5.
Example 3 A metal lathe is checked periodically by quality control inspectors to determine whether
it is producing machine bearings with a mean diameter of 5 mm. If the mean diameter of the bearings
is larger or smaller than 5 mm, then the process is out of control and must be adjusted. Formulate the
null and alternative hypotheses that could be used to test whether the bearing production process is
out of control.
Solution: The hypotheses must be stated in terms of a population parameter. Thus, we define
µ = True mean diameter (in mm) of all bearings produced by the lathe. If either µ > 5 or µ < 5,
then the metal lathe’s production process is out of control.
Since we wish to be able to detect either possibility, the null and alternative hypotheses would
be
H0 : µ = 5 (i.e., the process is in control)
Ha : µ ≠ 5 (i.e., the process is out of control)
Example 4 Since 1970, cigarette advertisements have been required by law to carry the following
statement: “Warning: The surgeon general has determined that cigarette smoking is dangerous to
your health.” However, this warning is often located in inconspicuous corners of the advertisements
and printed in small type. Consequently, a spokesperson for the Department of Health (DoH) believes
that over 80% of those who read cigarette advertisements fail to see the warning. Specify the null
and alternative hypotheses that would be used in testing the spokesperson’s theory.
Solution: The DoH spokesperson wants to make an inference about p, the true proportion of all
readers of cigarette advertisements who fail to see the surgeon general’s warning. In particular, the
DoH spokesperson wishes to collect evidence to support the claim that p is greater than 0.80; thus,
the null and alternative hypotheses are
H0 : p ≤ 0.8
Ha : p > 0.8
The sign in H0 is “≤” because we want to cover all situations for which Ha does not occur. In
other words, the event that H0 occurs is the complement of the event that Ha occurs.
9.3 Errors involved in Hypothesis Testing
Ideally, the hypothesis procedure would always lead us to accept H0 when it is true and reject H0
when it is false. This is not always the case, for hypothesis testing errors can occur. For example,
a medical test can give a patient a clean bill of health when a disease is present.
The following table shows the situation for Citizen Kane Sugar Co., where hypotheses are:
H0 : µ ≥ 2 and Ha : µ < 2.
State of Nature
Conclusion H0 true (µ ≥ 2) H0 False (µ < 2)
Do not reject H0 (conclude µ ≥ 2) Correct Conclusion Type II Error
Reject H0 (conclude µ < 2) Type I Error Correct Conclusion
Definition 1 A Type I error occurs if we reject a null hypothesis when it is true. The probability
of committing a Type I error is usually denoted by α. This is called the level of significance (or
significance level) for a hypothesis test.
In the case of Citizen Kane Sugar, this would mean that the company is meeting its obligations,
but a sample leads us to conclude the opposite, and a complaint is filed (wrongly).
Definition 2 A Type II error occurs if we accept a null hypothesis when it is false. The proba-
bility of making a Type II error is usually denoted by β.
In the case of Citizen Kane Sugar, this could mean that the company was underfilling the bags,
but a sample did not detect this. No complaint would be filed when one should, in fact, be filed.
State of Nature
Conclusion H0 true H0 False
Do not reject H0 Correct Conclusion Type II Error
Reject H0 Type I Error Correct Conclusion
Example 5 Refer to Example 2. Specify what Type I and Type II errors would represent, in terms
of the problem.
Solution: A Type I error is that of incorrectly rejecting the null hypothesis. In the example,
this would occur if we conclude that the new safety program is effective in reducing µ, the mean
number of on-the-job accidents, when, in fact, it is ineffective. The consequence of making such an
error would be that unnecessary time, effort, and money would be invested in an ineffectual safety
program.
A Type II error, that of incorrectly accepting the null hypothesis, would occur if we conclude that
µ is equal to 4.5 accidents per day when in fact µ is less than 4.5 accidents per day. The practical
significance of making a Type II error is that the new safety program (thought to be ineffective) will
be discontinued, when in fact it was effective in reducing the mean number of on-the-job accidents.
The probabilities of making such errors can be computed and controlled. We control the value
of α directly: it is set before the experiment by choosing the level of significance. It is equal to the
probability of rejecting H0 when H0 is actually true, i.e. the probability of a Type I error, because
α defines the size of the rejection region. When H0 is true, the probability of correctly not
rejecting it is 1 − α.
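As a rough illustration of how α and β relate, the sketch below computes β for the Citizen Kane test H0 : µ ≥ 2 against Ha : µ < 2. The values σ = 0.05 kg, n = 25 and a true mean of 1.98 kg are assumptions made up for this illustration; they do not come from the example. The function name is also hypothetical.

```python
from math import sqrt
from statistics import NormalDist

def type_ii_error_left_tailed(mu0, mu_true, sigma, n, alpha):
    """Return beta = P(do not reject H0 | true mean is mu_true) for H0: mu >= mu0."""
    se = sigma / sqrt(n)  # standard error of the sample mean
    # Reject H0 when the sample mean falls below this cutoff (area alpha in the left tail).
    cutoff = mu0 + NormalDist().inv_cdf(alpha) * se
    # Probability that the sample mean stays at or above the cutoff when mu = mu_true.
    return 1.0 - NormalDist(mu_true, se).cdf(cutoff)

# Assumed (hypothetical) values: sigma = 0.05 kg, n = 25 bags, true mean 1.98 kg.
beta = type_ii_error_left_tailed(mu0=2.0, mu_true=1.98, sigma=0.05, n=25, alpha=0.05)
print(f"alpha = 0.05, beta = {beta:.3f}")
```

Under these assumed values, lowering α moves the cutoff further into the left tail and so raises β; the two error probabilities trade off against each other for a fixed sample size.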
9.4 Tails of Test
In statistical hypothesis testing, the range of the sampling distribution of a sample statistic is
divided into rejection and nonrejection regions. The size of rejection regions depends on the value
assigned to α (Type I error). As mentioned earlier, α is also called the significance level of the
test.
In hypothesis testing we do not speak of the level of confidence, but of the level of significance.
The significance levels chosen are usually 1%, 2.5% or 5%, but any others can be used. Having
chosen this significance level, we have determined the critical value which separates the rejection
and nonrejection regions. It is called a critical value because it is the value against which the actual
sample value will be compared, and is written as Zα or Zα/2.
The rejection region for a hypothesis-testing problem can be on both sides with the nonrejection
region in the middle, or it can be on the left side or on the right side of the nonrejection region.
These possibilities are explained in the next three parts of this section. A test with two rejection
regions is called a two-tailed test, and a test with one rejection region is called a one-tailed
test. The one-tailed test is called a left-tailed test if the rejection region is in the left tail of the
distribution curve, and it is called a right-tailed test if the rejection region is in the right tail of
the distribution curve.
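The critical values for the three kinds of test can be looked up in a normal table or computed. The following is a minimal sketch using Python's standard-library NormalDist; the function name critical_values and its tail argument are illustrative choices, not notation from the text.

```python
from statistics import NormalDist

def critical_values(alpha, tail):
    """Return the critical Z value(s) for a test at significance level alpha.

    tail: "two" (rejection regions in both tails), "left" or "right".
    """
    std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1
    if tail == "two":
        c = std_normal.inv_cdf(1 - alpha / 2)  # area alpha/2 in each tail
        return (-c, c)
    if tail == "left":
        return (std_normal.inv_cdf(alpha),)      # area alpha in the left tail
    if tail == "right":
        return (std_normal.inv_cdf(1 - alpha),)  # area alpha in the right tail
    raise ValueError("tail must be 'two', 'left' or 'right'")

for tail in ("two", "left", "right"):
    values = ", ".join(f"{c:.3f}" for c in critical_values(0.05, tail))
    print(f"alpha = 0.05, {tail}-tailed: {values}")
```

A two-tailed test splits α between the tails, so its critical values lie further from zero than the single critical value of a one-tailed test at the same α.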
9.4.1 A Two-tailed Test
According to the Census and Statistics Department of Hong Kong, the mean family size in Hong
Kong was 3.19 in 1995. A researcher wants to check whether or not this mean has changed since
1995. The key word here is changed. The mean family size has changed if it has either increased
or decreased during the period since 1995. This is an example of a two-tailed test. Let µ be the
current mean family size for all families. The two possible decisions are
1. The mean family size has not changed, that is, µ = 3.19.
2. The mean family size has changed, that is, µ ≠ 3.19.
We write the null and alternative hypotheses for this test as
H0 : µ = 3.19 (The mean family size has not changed)
H1 : µ ≠ 3.19 (The mean family size has changed)
Whether a test is two-tailed or one-tailed is determined by the sign in the alternative hypothesis.
If the alternative hypothesis has a not equal to (≠) sign, as in this example, it is a two-tailed test.
As shown in the following figure, a two-tailed test has two rejection regions, one in each tail of the
distribution curve. The following figure shows the sampling distribution of X̄ for a large sample.
Assuming H0 is true, X̄ has a normal distribution with its mean equal to 3.19 (the value of µ in H0).
In the following figure, the area of each of the two rejection regions is α/2 and the total area of both
rejection regions is α (the significance level). As shown in this figure, a two-tailed test of hypothesis
has two critical values that separate the two rejection regions from the nonrejection region. We will
reject H0 if the value of X̄ obtained from the sample falls in either of the two rejection regions. We
will not reject H0 if the value of X̄ lies in the nonrejection region. By rejecting H0, we are saying
that the difference between the value of µ stated in H0 and the value of X̄ obtained from the sample
is too large to have occurred because of the sampling error alone. Consequently, this difference is
real. By not rejecting H0, we are saying that the difference between the value of µ stated in H0 and
the value of X̄ obtained from the sample is small and it may have occurred because of the sampling
error alone.
9.4.2 A Left-tailed Test
A soft-drink company claims that, on average, its cans contain 12 ounces of soda. However, if these
cans contain less than the claimed amount of soda, then the company can be accused of cheating.
Suppose a consumer agency wants to test whether the mean amount of soda per can is less than 12
ounces. Note that the key phrase this time is less than, which indicates a left-tailed test. Let µ be
the mean amount of soda in all cans. The two possible decisions are
1. The mean amount of soda in all cans is not less than 12 ounces, that is, µ = 12 ounces.
2. The mean amount of soda in all cans is less than 12 ounces, that is, µ < 12 ounces.
The null and alternative hypotheses for this test are written as
H0 : µ = 12 ounces (The mean is not less than 12 ounces)
H1 : µ < 12 ounces (The mean is less than 12 ounces)
In this case, we can also write the null hypothesis as H0 : µ ≥ 12. This will not affect the result
of the test as long as the sign in H1 is less than (<).
When the alternative hypothesis has a less than (<) sign, as in this case, the test is always left-
tailed. In a left-tailed test, the rejection region is always in the left tail of the distribution curve,
as shown in the following figure, and the area of this rejection region is equal to α (the significance
level). We can observe from this figure that there is only one critical value in a left-tailed test.
Assuming H0 is true, X̄ has a normal distribution for a large sample with its mean equal to 12
ounces (the value of µ in H0). We will reject H0 if the value of X̄ obtained from the sample falls in
the rejection region; we will not reject H0 otherwise.
9.4.3 A Right-tailed Test
To illustrate the third case, according to a study by the Organization for Economic Cooperation
and Development released in 1996, the mean starting salary of secondary school teachers in Hong
Kong was $22,753. Suppose we want to test if the current mean starting salary of all secondary
school teachers in Hong Kong is higher than $22,753. The key phrase in this case is higher than,
which indicates a right-tailed test. Let µ be the current mean starting salary of secondary school
teachers in Hong Kong. The two possible decisions this time are
1. The current mean starting salary of all secondary school teachers in Hong Kong is not higher
than $22,753, that is, µ = $22,753.
2. The current mean starting salary of all secondary school teachers in Hong Kong is higher than
$22,753, that is, µ > $22,753.
We write the null and alternative hypotheses for this test as
H0 : µ = $22,753 (The current mean starting salary is not higher than $22,753)
H1 : µ > $22,753 (The current mean starting salary is higher than $22,753)
In this case, we can also write the null hypothesis as H0 : µ ≤ $22,753, which states that the
current mean starting salary of all secondary school teachers in Hong Kong is either equal to or less
than $22,753. Again, the result of the test will not be affected whether we use an equal to (=) or
a less than or equal to (≤) sign in H0 as long as the alternative hypothesis has a greater than (>)
sign.
When the alternative hypothesis has a greater than (>) sign, the test is always right-tailed.
As shown in the following figure, in a right-tailed test, the rejection region is in the right tail of
the distribution curve. The area of this rejection region is equal to α, the significance level. Like a
left-tailed test, a right-tailed test has only one critical value.
Again, assuming H0 is true, X̄ has a normal distribution for a large sample with its mean equal
to $22,753 (the value of µ in H0). We will reject H0 if the value of X̄ obtained from the sample
falls in the rejection region. Otherwise, we will not reject H0.
Example 6 Refer to Example 2. Since the alternative hypothesis is directional–that is, since the
factory is interested in detecting a departure from H0 in the direction of values of µ smaller than
4.5–a one-tailed test is to be performed.
Example 7 The test in Example 3 is a two-tailed test. The test in Example 4 is a one-tailed test.
9.5 Steps of Hypothesis Testing
1. Determine the null and alternative hypotheses that are appropriate for the application.
2. Select the distribution to use.
3. Determine the rejection and nonrejection regions.
4. Calculate the value of the test statistic.
5. Make a decision.
9.6 Hypothesis Tests about a population mean: Known Population Variance
Large-Sample Hypothesis Test about a Population Mean
Hypothesis        One-Tailed Test         One-Tailed Test        Two-Tailed Test
Null              H0 : µ ≥ µ0             H0 : µ ≤ µ0            H0 : µ = µ0
Alternative       Ha : µ < µ0             Ha : µ > µ0            Ha : µ ≠ µ0
Rejection Rule    Reject H0 if Z < −Zα    Reject H0 if Z > Zα    Reject H0 if Z > Zα/2 or Z < −Zα/2
Test statistic: Z = (X̄ − µ0) / (σ/√n)
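The rejection rules in the table can be sketched as one small function. This is an illustrative sketch only: the name z_test, its tail argument and the sample numbers in the usage line are assumptions, not part of the text.

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alpha, tail):
    """Return (z, reject) for a large-sample test with known sigma.

    tail is "left" (Ha: mu < mu0), "right" (Ha: mu > mu0) or "two" (Ha: mu != mu0).
    """
    z = (xbar - mu0) / (sigma / sqrt(n))  # test statistic
    std_normal = NormalDist()
    if tail == "left":
        reject = z < std_normal.inv_cdf(alpha)               # Z < -Z_alpha
    elif tail == "right":
        reject = z > std_normal.inv_cdf(1 - alpha)           # Z > Z_alpha
    elif tail == "two":
        reject = abs(z) > std_normal.inv_cdf(1 - alpha / 2)  # |Z| > Z_(alpha/2)
    else:
        raise ValueError("tail must be 'left', 'right' or 'two'")
    return z, reject

# Hypothetical data, chosen only to exercise the function.
z, reject = z_test(xbar=103.0, mu0=100.0, sigma=15.0, n=100, alpha=0.05, tail="right")
print(f"Z = {z:.2f}, reject H0: {reject}")
```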
Example 8 The XTI Telephone Company provides long-distance telephone service in an area.
According to the company’s records, the average length of all long distance calls placed through this
company in 1993 was 12.44 minutes. The company’s management wanted to check if the mean
length of the current long distance calls is different from 12.44 minutes. A sample of 150 such
calls placed through this company produced a mean length of 13.71 minutes. Assume the population
standard deviation is 2.65 minutes. Using the 5% significance level, can you conclude that the mean
length of all current long distance calls is different from 12.44 minutes?
Solution: Let µ be the mean length of all current long distance calls placed through this company
and X̄ be the corresponding mean for the sample. From the given information, n = 150, X̄ = 13.71
minutes, and σ = 2.65 minutes
We are to test whether or not the mean length of all current long distance calls is different
from 12.44 minutes. The significance level α is 0.05. That is, the probability of rejecting the null
hypothesis when it actually is true should not exceed 0.05. This is the probability of making a Type
I error. We perform the test of hypothesis using the five steps as follows.
Step 1. State the null and alternative hypotheses
Notice that we are testing to find whether or not the mean length of all current long distance
calls is different from 12.44 minutes. We write the null and alternative hypotheses as follows.
H0 : µ = 12.44 (The mean length of all current long distance calls is 12.44 minutes)
Ha : µ 6= 12.44 (The mean length of all current long distance calls is different from 12.44 minutes)
Step 2. Select the distribution to use
Because the sample size is large (n ≥ 30), the sampling distribution of X̄ is (approximately)
normal. Consequently, we use the normal distribution to make the test.
Step 3. Determine the rejection and nonrejection regions
The significance level is 0.05. The 6= sign in the alternative hypothesis indicates that the test is
two-tailed with two rejection regions, one in each tail of the normal distribution curve of X̄. Because
the total area of both rejection regions is 0.05 (the significance level), the area of the rejection region
in each tail is 0.025, that is,
Area in each tail = α/2 = 0.05/2 = 0.025
Two critical points in this figure separate the rejection regions from the nonrejection region. Next
we find the Z values for the two points using the area of the rejection region. To find the Z values
for these critical points, first find the area between the mean and one of the critical points. From
the standard normal distribution, the Z values of the two critical points are −1.96 and 1.96.
Step 4. Calculate the value of the test statistic
The decision to reject or not to reject the null hypothesis will depend on whether the evidence
from the sample falls in the rejection or nonrejection region. If the value of the sample mean X̄ falls
in either of the two rejection regions, we reject H0. Otherwise, we do not reject H0. The value of X̄
obtained from the sample is called the observed value of X̄. To locate the position of X̄ = 13.71
on the sampling distribution curve of X̄ in the figure, we first calculate the Z value for X̄ = 13.71.
This is called the value of the test statistic. Then, we compare the value of the test statistic
with the two critical values of Z, −1.96 and 1.96, shown in the figure. If the value of the test
statistic is between −1.96 and 1.96, we do not reject H0. If the value of the test statistic is either
greater than 1.96 or less than −1.96, we reject H0.
The value of X̄ from the sample is 13.71. We calculate the Z value as follows.
Z = (X̄ − µ0) / (σ/√n) = (13.71 − 12.44) / (2.65/√150) = 5.8695 ≈ 5.87
The value of µ in the calculation of the Z value is substituted from the null hypothesis. The value
of Z = 5.87 calculated for X̄ is called the computed value of the test statistic Z. This is
the value of Z that corresponds to the value of X̄ observed from the sample. It is also called the
observed value of Z.
Step 5. Make a decision
In the final step we make a decision based on the location of the value of the test statistic Z
computed for X̄ in Step 4. This value of Z = 5.87 is greater than the critical value of Zα/2 = 1.96,
and it falls in the rejection region in the right tail. Hence, we reject H0 and conclude that based
on the sample information, it appears that the mean length of all such calls is not equal to 12.44
minutes.
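The arithmetic of Example 8 can be checked with a few lines of Python. This is only a verification sketch; the variable names are illustrative, and the critical value is computed from the standard normal distribution rather than read from a table.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 8.
n, xbar, sigma, mu0, alpha = 150, 13.71, 2.65, 12.44, 0.05

z = (xbar - mu0) / (sigma / sqrt(n))         # observed value of the test statistic
z_crit = NormalDist().inv_cdf(1 - alpha / 2)  # upper critical value for a two-tailed test
reject = abs(z) > z_crit

print(f"Z = {z:.2f}, critical values = ±{z_crit:.2f}, reject H0: {reject}")
```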
Example 9 Because couples are deciding to have fewer children, the family size in Hong Kong has
declined continuously during the past few decades. According to the Census and Statistics Department,
the mean family size was 3.16 in 1992. A researcher wanted to check if the current mean family
size is less than 3.16. A sample of 900 families taken this year by this researcher produced a mean
family size of 3.13. Assume the population standard deviation is 0.70. Using the 0.025 significance
level, can we conclude that the mean family size has declined since 1992?
Solution: Let µ be the current mean size of all families and X̄ be the mean family size for the
sample. From the given information, n = 900, X̄ = 3.13, and σ = 0.70. The mean family size for
1992 is given to be 3.16. The significance level α is 0.025.
Step 1. State the null and alternative hypotheses
Notice that we are testing for a decline in the mean family size. The null and alternative
hypotheses are written as follows.
H0 : µ = 3.16 (The mean family size has not declined)
Ha : µ < 3.16 (The mean family size has declined)
Step 2. Select the distribution to use
Because the sample size is large (n > 30), the sampling distribution of X̄ is (approximately)
normal. Consequently, we use the normal distribution to make the test.
Step 3. Determine the rejection and nonrejection regions
The significance level is 0.025. The < sign in the alternative hypothesis indicates that the test
is left-tailed with the rejection region in the left tail of the sampling distribution curve of X̄. The
critical value of Zα obtained from the normal table is −1.96.
Step 4. Calculate the value of the test statistic
The value of the test statistic Z for X̄ = 3.13 is calculated as follows.
Z = (X̄ − µ0) / (σ/√n) = (3.13 − 3.16) / (0.70/√900) = −1.29
Step 5. Make a decision
The value of the test statistic Z = −1.29 is greater than the critical value of Zα = −1.96 and it
does not fall in the rejection region. As a result, we fail to reject H0. Consequently, we can state
that based on the sample information, it appears that the mean family size has not declined since
1992. Note that we are not concluding that the mean family size has definitely not declined. By not
rejecting the null hypothesis, we are saying that the information obtained from the sample is not
strong enough to reject the null hypothesis and to conclude that the family size has declined
since 1992.
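One way to see how far the sample in Example 9 is from the rejection region is to express the critical value in the units of X̄. The sketch below does this with the example's data; the name xbar_cutoff is an illustrative choice.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 9.
n, xbar, sigma, mu0, alpha = 900, 3.13, 0.70, 3.16, 0.025

se = sigma / sqrt(n)                  # standard error of the sample mean
z = (xbar - mu0) / se                 # observed value of the test statistic
z_crit = NormalDist().inv_cdf(alpha)  # left-tail critical value (negative)
xbar_cutoff = mu0 + z_crit * se       # sample means below this value lead to rejection
reject = z < z_crit

print(f"Z = {z:.2f}, critical value = {z_crit:.2f}, reject H0: {reject}")
print(f"H0 would be rejected only for a sample mean below {xbar_cutoff:.3f}")
```

The observed mean of 3.13 lies above this cutoff, which is the same statement as Z = −1.29 lying above −1.96.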
9.7 Hypothesis Tests using p-value approach
9.7.1 p-value for One-Tailed Tests
Definition 3 The p-value is the smallest level of significance α for which the sample data indicate
that the null hypothesis should be rejected.
If H0 : µ ≤ µ0 and Ha : µ > µ0, then
p-value = P( Z > (X̄ − µ0) / (σ/√n) )
If H0 : µ ≥ µ0 and Ha : µ < µ0, then
p-value = P( Z < (X̄ − µ0) / (σ/√n) )
The p-value can be used to make the decision for hypothesis test by noting that if the p-value is
less than the level of significance, α, the value of the test statistic must be in the rejection region.
Similarly, if the p-value is greater than or equal to α, the value of the test statistic is not in the
rejection region.
p-Value Criterion for Hypothesis Testing:
Reject H0 if the p-value < α.
Example 10 The management of Priority Health Club claims that its members lose an average of
10 pounds or more within the first month after joining the club. A consumer agency that wanted to
check this claim took a random sample of 36 members of this health club and found that they lost
an average of 9.2 pounds within the first month of membership. Assume the population standard
deviation is 2.4 pounds. Find the p-value for this test.
Solution: Let µ be the mean weight lost during the first month of membership by all members
of this health club and X̄ be the corresponding mean for the sample. From the given information,
n = 36, X̄ = 9.2 pounds, and σ = 2.4 pounds.
The claim of the club is that its members lose, on average, 10 pounds or more within the first
month of membership. To calculate the p-value and make a decision, we apply the following four steps.
Step 1. State the null and alternative hypotheses
H0 : µ ≥ 10 (The mean weight lost is 10 pounds or more)
Ha : µ < 10 (The mean weight lost is less than 10 pounds)
Step 2. Select the distribution to use
Because the sample size is large, we use the normal distribution to make the test and to calculate
the p-value.
Step 3. Calculate the p-value
The < sign in the alternative hypothesis indicates that the test is left-tailed. The p-value is
given by the area to the left of X̄ = 9.2 under the sampling distribution curve of X̄, as shown in
the figure. To find this area, we first find the Z value for X̄ = 9.2 as follows.
Z = (X̄ − µ0) / (σ/√n) = (9.2 − 10) / (2.4/√36) = −2.0
The area to the left of X̄ = 9.2 under the sampling distribution of X̄ is equal to the area under the
standard normal curve to the left of Z = −2.00. From the normal distribution table, the area to
the left of Z = −2.00 is 0.0228 (= 1 − 0.9772). Consequently, p-value = 0.0228.
Step 4. Make a decision
Based on the p-value of 0.0228 we can state that for any α (significance level) greater than 0.0228
we will reject the null hypothesis stated in Step 1 and for any α less than 0.0228 we will not reject
the null hypothesis. Suppose we make the test for this example at α = 0.01. Because α = 0.01 is
less than the p-value of 0.0228, we will not reject the null hypothesis. Now, suppose we make the
test at α = 0.05. This time, because α = 0.05 is greater than the p-value of 0.0228, we will reject
the null hypothesis.
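The p-value of Example 10 and the two decisions discussed in Step 4 can be reproduced with the short sketch below. It uses the example's data; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

# Data from Example 10.
n, xbar, sigma, mu0 = 36, 9.2, 2.4, 10.0

z = (xbar - mu0) / (sigma / sqrt(n))
p_value = NormalDist().cdf(z)  # left-tailed test: area to the left of z

print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
for alpha in (0.01, 0.05):
    decision = "reject H0" if p_value < alpha else "do not reject H0"
    print(f"alpha = {alpha}: {decision}")
```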
9.7.2 p-value for Two-Tailed Tests
To compute the p-value for a two-tailed test we must first calculate the area in the tail of the
sampling distribution corresponding to the observed value of X̄. The p-value for the two-tailed test
is then computed by doubling this area or probability.
Example 11 At Canon Food Corporation, it used to take an average of 50 minutes for new workers
to learn a food processing job. Recently the company installed a new food processing machine. The
supervisor at the company wants to find if the mean time taken by new workers to learn the food
processing procedure on this new machine is different from 50 minutes. A sample of 40 workers
showed that it took, on average, 47 minutes for them to learn the food processing procedure on the
new machine. Assume the population standard deviation is 7 minutes. Find the p-value for the test
that the mean learning time for the food processing procedure on the new machine is different from
50 minutes.
Solution: Let µ be the mean time (in minutes) taken to learn the food processing procedure
on the new machine by all workers and X̄ be the corresponding sample mean. From the given
information, n = 40, X̄ = 47 minutes and σ = 7 minutes.
To calculate the p-value and make a decision, we apply the following four steps.
Step 1. State the null and alternative hypotheses
H0 : µ = 50 minutes
Ha : µ ≠ 50 minutes
Note that the null hypothesis states that the mean time for learning the food processing pro-
cedure on the new machine is 50 minutes and the alternative hypothesis states that this time is
different from 50 minutes.
Step 2. Select the distribution to use
Because the sample size is large, we use the normal distribution to make the test and to calculate
the p-value.
Step 3. Calculate the p-value
The ≠ sign in the alternative hypothesis indicates that the test is two-tailed. The p-value is
equal to twice the area in the tail of the sampling distribution curve of X̄ to the left of X̄ = 47, as
shown in the figure. To find this area, we first find the Z value for X̄ = 47 as follows.
Z = (X̄ − µ0) / (σ/√n) = (47 − 50) / (7/√40) = −2.71
The area to the left of X̄ = 47 is equal to the area under the standard normal curve to the left
of Z = −2.71. From the normal distribution table, the area to the left of Z = −2.71 is 0.0034
(= 1 − 0.9966). Consequently, the p-value = 2 × 0.0034 = 0.0068.
Step 4. Make a decision
Based on the p-value of 0.0068, we conclude that for any α (significance level) greater than
0.0068 we will reject the null hypothesis and for any α less than 0.0068 we will not reject the null
hypothesis.
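The doubling rule for a two-tailed p-value can be sketched as follows, using the data of Example 11. The function name is an illustrative choice. Because the code does not round Z to −2.71 or the tail area to four decimals, its result is expected to be marginally smaller than the table-based 0.0068.

```python
from math import sqrt
from statistics import NormalDist

def two_tailed_p_value(xbar, mu0, sigma, n):
    """Return (z, p_value) for H0: mu = mu0 against Ha: mu != mu0, sigma known."""
    z = (xbar - mu0) / (sigma / sqrt(n))
    tail_area = NormalDist().cdf(-abs(z))  # area in one tail beyond the observed z
    return z, 2 * tail_area                # double it for a two-tailed test

# Data from Example 11.
z, p_value = two_tailed_p_value(xbar=47, mu0=50, sigma=7, n=40)
print(f"Z = {z:.2f}, p-value = {p_value:.4f}")
```

Taking −|z| makes the function symmetric: a sample mean of 53 minutes would give the same p-value as 47 minutes.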
9.8 The Relationship Between Interval Estimation and Hypothesis Testing
In the case of interval estimation, the population mean µ was unknown. Once the sample was
selected and the sample mean X̄ computed, we developed an interval around the value of X̄ that
had a good chance of including the value of the parameter µ. The interval estimate computed
was referred to as a confidence interval with 1 − α defined as the confidence coefficient. In the
large-sample case, the formula for interval estimation of a population mean was given by
X̄ ± Zα/2 · σ/√n    (1)
Conducting a hypothesis test requires us first to make an assumption about the value of a population
parameter. In the case of the population mean, the two-tailed hypothesis test has the form
H0 : µ = µ0
Ha : µ ≠ µ0
where µ0 is the hypothesized value for the population mean. Using the decision rule, we see that
the region over which we do not reject H0 includes all values of the sample mean X̄ that are within
−Zα/2 and +Zα/2 standard errors of µ0. Thus the following expression provides the do-not-reject
region for the sample mean X̄ in a two-tailed hypothesis test with a level of significance of α:
µ0 ± Zα/2 · σ/√n    (2)
Note in particular that both procedures require the computation of the values Zα/2 and σ/√n.
Focusing on α, we see that a confidence coefficient of (1 − α) for interval estimation corresponds
to a level of significance of α in hypothesis testing. If X̄ falls in the do-not-reject region defined by
(2), the hypothesized value µ0 will be in the confidence interval defined by (1). Conversely, if the
hypothesized value µ0 falls in the confidence interval defined by (1), the sample mean X̄ will be in
the do-not-reject region for the hypothesis H0 : µ = µ0.
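The equivalence stated in this paragraph follows from rearranging one inequality. As a short worked derivation, using only the symbols already defined in (1) and (2):

```latex
\begin{align*}
\mu_0 - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \bar{X} \le \mu_0 + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
  &\iff \left|\bar{X} - \mu_0\right| \le Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \\
  &\iff \bar{X} - Z_{\alpha/2}\frac{\sigma}{\sqrt{n}} \le \mu_0 \le \bar{X} + Z_{\alpha/2}\frac{\sigma}{\sqrt{n}}
\end{align*}
```

The first line says that X̄ lies in the do-not-reject region (2); the last line says that µ0 lies in the confidence interval (1). Both are the same condition on the distance between X̄ and µ0.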
• A Confidence Interval Approach to Testing a Hypothesis of the Form
H0 : µ = µ0
Ha : µ ≠ µ0
1. Select a simple random sample from the population and use the value of the sample mean X̄
to develop the confidence interval
$$\bar{X} \pm Z_{\alpha/2}\,\frac{\sigma}{\sqrt{n}}$$
2. If the confidence interval contains the hypothesized value µ0, do not reject H0. Otherwise,
reject H0.
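The two steps above can be sketched in Python as follows. This is an illustration under stated assumptions, not part of the original notes: the function names `ci_test` and `z_test` are hypothetical, σ is assumed known, and the numerical inputs are made up purely to exercise the code. The second function runs the ordinary Z test so that the two decisions can be compared.

```python
from math import sqrt
from statistics import NormalDist


def ci_test(x_bar, sigma, n, mu0, alpha):
    """Two-tailed test of H0: mu = mu0 via the confidence interval approach.

    Returns the (lower, upper) interval and True if H0 is rejected.
    """
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # Z_{alpha/2}
    margin = z_crit * sigma / sqrt(n)              # Z_{alpha/2} * sigma / sqrt(n)
    lower, upper = x_bar - margin, x_bar + margin
    reject = not (lower <= mu0 <= upper)           # reject if mu0 is outside
    return (lower, upper), reject


def z_test(x_bar, sigma, n, mu0, alpha):
    """Same test via the Z statistic, for comparison. Returns True if H0 is rejected."""
    z = (x_bar - mu0) / (sigma / sqrt(n))
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(z) > z_crit


# Hypothetical inputs, chosen only to exercise the functions
interval, reject = ci_test(x_bar=50.8, sigma=2.0, n=36, mu0=50.0, alpha=0.05)
print(interval, reject)
```

For any inputs, `ci_test` and `z_test` should return the same decision, which is the relationship described in this section.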
9.9 Hypothesis Tests about a Population Mean: Unknown Population Variance
Suppose that the population standard deviation is unknown. If we can assume that the population
has a normal distribution, the t distribution can be used to make inferences about the value of a
population mean.
Small-Sample Hypothesis Test about a Population Mean
Hypothesis          One-Tailed Test        One-Tailed Test       Two-Tailed Test
Null                H0 : µ ≥ µ0            H0 : µ ≤ µ0           H0 : µ = µ0
Alternative         Ha : µ < µ0            Ha : µ > µ0           Ha : µ ≠ µ0
Rejection Rule      Reject H0 if t < −tα   Reject H0 if t > tα   Reject H0 if t > tα/2 or t < −tα/2
Degrees of freedom  ν = n − 1              ν = n − 1             ν = n − 1
Test statistic:
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}}$$
Example 12 A psychologist claims that the mean age at which children start walking is 12.5
months. Carol wanted to check if this claim is true. She took a random sample of 18 children
and found that the mean age at which these children started walking was 12.9 months with a stan-
dard deviation of 0.80 months. Using the 1% significance level, can you conclude that the mean age
at which all children start walking is different from 12.5 months? Assume that the ages at which all
children start walking have an approximate normal distribution.
Solution: Let µ be the mean age at which all children start walking and X̄ be the corresponding
mean for the sample. Then, from the given information, n = 18, X̄ = 12.9 months, s = 0.80
months, and α = 0.01.
Step 1. State the null and alternative hypotheses
We are to test if the mean age at which all children start walking is different from 12.5 months.
The null and alternative hypotheses are
H0 : µ = 12.5 (The mean walking age is 12.5 months)
Ha : µ ≠ 12.5 (The mean walking age is different from 12.5 months)
Step 2. Select the distribution to use
The sample size is small, and the population is approximately normally distributed. However,
we do not know the population standard deviation σ. Hence, we use the t distribution to make the
test.
Step 3. Determine the rejection and nonrejection regions
The significance level is 0.01. The ≠ sign in the alternative hypothesis indicates that the test is
two-tailed and the rejection region lies in both tails. The area of the rejection region in each tail of
the t distribution curve is
Area in each tail = α/2 = 0.01/2 = 0.005 and ν = n − 1 = 18 − 1 = 17
From the t distribution table, the critical values of tα/2 for 17 degrees of freedom and 0.005 area in
each tail of the t distribution curve are −2.8982 and 2.8982.
Step 4. Calculate the value of the test statistic
We calculate the value of the test statistic t for X̄ = 12.9 as follows.
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{12.9 - 12.5}{0.80/\sqrt{18}} = 2.121$$
Step 5. Make a decision
The value of the test statistic t = 2.121 falls between the two critical points, −2.8982 and
2.8982. Consequently, we fail to reject H0. As a result, we can state that the difference between the
hypothesized population mean and the sample mean is so small that it may have occurred because
of sampling error. The sample does not provide sufficient evidence, at the 1% significance level,
that the mean age at which children start walking is different from 12.5 months.
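The arithmetic of Example 12 can be reproduced in a few lines of Python. This is a minimal sketch: the variable names are illustrative, and the critical value is copied from the t table as quoted in Step 3 rather than computed, since the standard library has no t distribution.

```python
from math import sqrt

# Values given in Example 12
n, x_bar, s, mu0 = 18, 12.9, 0.80, 12.5

# Critical value t_{alpha/2} for alpha = 0.01 and 17 degrees of freedom,
# copied from the t table as quoted in Step 3
t_crit = 2.8982

t_stat = (x_bar - mu0) / (s / sqrt(n))

# Two-tailed rule: reject H0 if t > t_{alpha/2} or t < -t_{alpha/2}
reject_h0 = abs(t_stat) > t_crit

print(f"t = {t_stat:.3f}, reject H0: {reject_h0}")
```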
Example 13 The management at ABC Savings Bank is always concerned about the quality of ser-
vice provided to its customers. With the old computer system, a teller at this bank could serve, on
average, 22 customers per hour. The management noticed that with this service rate, the waiting
time for customers was too long. Recently the management of the bank installed a new computer
system in the bank expecting that it would increase the service rate and consequently make the cus-
tomers happier by reducing the waiting time. To check if the new computer system is more efficient
than the old system, the management of the bank took a random sample of 20 hours and found that
during these hours the mean number of customers served by tellers was 28 per hour with a standard
deviation of 2.5. Testing at the 1% significance level, would you conclude that the new computer
system is more efficient than the old computer system? Assume that the number of customers served
per hour by a teller on this computer system has an approximate normal distribution.
Solution: Let µ be the mean number of customers served per hour by a teller and X̄ be the
corresponding mean for the sample. Then, from the given information, n = 20 hours, X̄ = 28
customers, s = 2.5 customers, and α = 0.01.
Step 1. State the null and alternative hypotheses
We are to test whether or not the new computer system is more efficient than the old system. The
new computer system will be more efficient than the old system if the mean number of customers
served per hour by using the new computer system is significantly more than 22; otherwise, it will
not be more efficient. The null and alternative hypotheses are
H0 : µ = 22 (The new computer system is not more efficient)
Ha : µ > 22 (The new computer system is more efficient)
Step 2. Select the distribution to use
The sample size is small, and the population is approximately normally distributed. However,
we do not know the population standard deviation σ. Hence, we use the t distribution to make the
test.
Step 3. Determine the rejection and nonrejection regions
The significance level is 0.01. The > sign in the alternative hypothesis indicates that the test is
right-tailed and the rejection region lies in the right tail of the t distribution curve.
Area in the right tail = α = 0.01 and ν = n − 1 = 20 − 1 = 19
From the t distribution table, the critical value of tα for 19 degrees of freedom and 0.01 area in the
right tail is 2.5395.
Step 4. Calculate the value of the test statistic
The value of the test statistic t for X̄ = 28 is calculated as follows.
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{28 - 22}{2.5/\sqrt{20}} = 10.733$$
Step 5. Make a decision
The value of the test statistic t = 10.733 is larger than the critical value of tα = 2.5395, and it
falls in the rejection region. Consequently, we reject H0. As a result, we conclude that the value of
the sample mean is too large compared to the hypothesized value of the population mean and the
difference between the two may not be attributed to chance alone. The mean number of customers
served per hour using the new computer system is more than 22. The new computer system is more
efficient than the old computer system.
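The three rejection rules from the small-sample table can be collected into one helper and applied to Example 13. The sketch below is illustrative only: `t_statistic` and `reject_h0` are hypothetical names, and the critical value is taken from the t table as quoted in Step 3.

```python
from math import sqrt


def t_statistic(x_bar, mu0, s, n):
    """t = (x_bar - mu0) / (s / sqrt(n)), with n - 1 degrees of freedom."""
    return (x_bar - mu0) / (s / sqrt(n))


def reject_h0(t, t_crit, tail):
    """Apply the rejection rule for the given tail.

    t_crit is the positive table value: t_alpha for a one-tailed test,
    t_{alpha/2} for a two-tailed test.
    """
    if tail == "left":
        return t < -t_crit
    if tail == "right":
        return t > t_crit
    if tail == "two":
        return abs(t) > t_crit
    raise ValueError("tail must be 'left', 'right' or 'two'")


# Example 13: n = 20, x_bar = 28, s = 2.5, mu0 = 22, t_0.01 (19 df) = 2.5395
t = t_statistic(x_bar=28, mu0=22, s=2.5, n=20)
decision = reject_h0(t, t_crit=2.5395, tail="right")
print(f"t = {t:.3f}, reject H0: {decision}")
```

Passing the tail as an argument keeps the critical value positive in every case, which matches how the t table is read.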
Example 14 A light bulb manufacturer states that the mean lifetime of its light bulbs is at least
724 hours. A sample of 10 light bulbs is inspected, and their lifetimes (in hours) are recorded as follows:
670 690 800 720 660 650 560 540 600 730.
Use a significance level of 0.05 to test the manufacturer’s claim. Assume that the lifetimes of the
light bulbs are normally distributed.
Solution: First of all, we calculate the sample mean and sample standard deviation
$$\sum x = 670 + 690 + 800 + 720 + 660 + 650 + 560 + 540 + 600 + 730 = 6620$$
$$\bar{X} = \frac{\sum x}{n} = \frac{6620}{10} = 662$$
$$\sum x^2 = 670^2 + 690^2 + 800^2 + 720^2 + 660^2 + 650^2 + 560^2 + 540^2 + 600^2 + 730^2 = 4\,439\,600$$
$$s = \sqrt{\frac{\sum x^2 - n\bar{X}^2}{n - 1}} = \sqrt{\frac{4\,439\,600 - 10 \times 662^2}{9}} = 79.69$$
Step 1. The null and alternative hypotheses are
H0 : µ ≥ 724
Ha : µ < 724
Step 2. The sample size is small, and the population is approximately normally distributed.
However, we do not know the population standard deviation σ. Hence, we use the t distribution to
make the test.
Step 3. Note that α = 0.05 and the degrees of freedom of the t distribution are n − 1 = 9. Because
the test is left-tailed, the whole of α lies in one tail. From the t table, tα = t0.05 = 1.8331.
Rejection rule: reject H0 if t < −1.8331.
Step 4. The test statistic is
$$t = \frac{\bar{X} - \mu_0}{s/\sqrt{n}} = \frac{662 - 724}{79.69/\sqrt{10}} = -2.4603$$
Step 5. Conclusion: Since t = −2.4603 is less than −1.8331, we reject H0 and conclude that
the manufacturer’s claim is not true.
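Example 14 starts from raw data, so the sample mean and standard deviation can be computed directly. The sketch below uses Python's standard-library `statistics` module; the variable names are illustrative, and the critical value is the table value quoted in Step 3.

```python
from math import sqrt
from statistics import mean, stdev

lifetimes = [670, 690, 800, 720, 660, 650, 560, 540, 600, 730]  # hours
mu0 = 724
t_crit = 1.8331  # t_0.05 for 9 degrees of freedom, from the t table

n = len(lifetimes)
x_bar = mean(lifetimes)
s = stdev(lifetimes)            # sample standard deviation (divisor n - 1)

t_stat = (x_bar - mu0) / (s / sqrt(n))
reject_h0 = t_stat < -t_crit    # left-tailed rejection rule

print(f"mean = {x_bar}, s = {s:.2f}, t = {t_stat:.4f}, reject H0: {reject_h0}")
```

Because `s` is not rounded to 79.69 before the division, the computed statistic differs from −2.4603 in the fourth decimal place; the decision is the same.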
9.10 Hypothesis Tests about a Population Proportion: Large Samples
We can use the following hypothesis tests if np ≥ 5 and n(1 − p) ≥ 5. In this case, we can assume
the distribution of p̄ is a normal distribution.
Hypothesis Test about a Population Proportion
Hypothesis       One-Tailed Test        One-Tailed Test       Two-Tailed Test
Null             H0 : p ≥ p0            H0 : p ≤ p0           H0 : p = p0
Alternative      Ha : p < p0            Ha : p > p0           Ha : p ≠ p0
Rejection Rule   Reject H0 if Z < −Zα   Reject H0 if Z > Zα   Reject H0 if Z > Zα/2 or Z < −Zα/2
Test statistic:
$$Z = \frac{\bar{p} - p_0}{\sqrt{p_0(1 - p_0)/n}}$$
Example 15 Based on the sales of teeth-cleaning products in supermarkets and drugstores during
the period from October 1991 to September 1992, ABC toothpaste controlled a 31.2% share of the
market. For convenience, assume that a 31.2% share of the market means that 31.2% of all people
in the H. K. used ABC toothpaste. A researcher from a rival company wants to find whether or
not the current market share controlled by ABC is different from 31.2%. She took a sample of 400
persons and found that 29% of them use ABC toothpaste. Using the 0.01 significance level, can you
conclude that the current market share of ABC toothpaste is different from that for 1991–1992?
Solution: Let p be the proportion of all people in the H. K. who currently use ABC toothpaste
and p̄ be the corresponding sample proportion. Then, from the given information, n = 400, p̄ = 0.29,
and α = 0.01.
Based on 1991–1992 sales data, 31.2% of all people use ABC toothpaste. Assuming this claim
is true, p = 0.312 and 1 − p = 1 − 0.312 = 0.688.
Step 1. State the null and alternative hypotheses
ABC toothpaste still controls the same market share if p = 0.312 and the current market share
is different if p ≠ 0.312. The null and alternative hypotheses are as follows.
H0 : p = 0.312 (The current market share of ABC is the same)
Ha : p ≠ 0.312 (The current market share of ABC is different)
Step 2. Select the distribution to use
The values of np and n (1 − p) are np = 400(0.312) = 124.80 and n (1 − p) = 400(0.688) =
275.20. Because both np and n (1 − p) are greater than 5, the sample size is large. Consequently,
we use the normal distribution to make the hypothesis test about p.
Step 3. Determine the rejection and nonrejection regions
The ≠ sign in the alternative hypothesis indicates that the test is two-tailed. The significance
level is 0.01. Therefore, the total area of the two rejection regions is 0.01 and the rejection region
in each tail of the sampling distribution of p̄ is α/2 = 0.01/2 = 0.005. The critical values of Zα/2
obtained from the standard normal distribution table for 0.005 are −2.58 and 2.58.
Step 4. Calculate the value of the test statistic
The value of the test statistic Z for p̄ = 0.29 is calculated as follows.
$$Z = \frac{\bar{p} - p}{\sqrt{p(1 - p)/n}} = \frac{0.29 - 0.312}{\sqrt{(0.312)(0.688)/400}} = -0.94969 \approx -0.95$$
Step 5. Make a decision
The value of the test statistic Z = −0.95 for p̄ lies between −2.58 and 2.58, and it falls in the
nonrejection region. Consequently, we fail to reject H0. Therefore, we can state that the sample
proportion is not too far from the hypothesized value of the population proportion and the difference
between the two can be attributed to chance. We conclude that there is not sufficient evidence, at
the 1% significance level, that the current market share of ABC toothpaste is different from 31.2%.
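The steps of Example 15 can be checked numerically. The sketch below is illustrative: the variable names are assumptions, `p_hat` stands for the sample proportion p̄, and the critical value is computed with the standard-library `statistics.NormalDist` instead of being read from the table. The p-value line is an addition that follows the p-value approach used earlier in the chapter.

```python
from math import sqrt
from statistics import NormalDist

n, p_hat, p0, alpha = 400, 0.29, 0.312, 0.01

# Large-sample condition: n*p0 >= 5 and n*(1 - p0) >= 5
large_sample = n * p0 >= 5 and n * (1 - p0) >= 5

z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(1 - alpha / 2)     # Z_{alpha/2}, about 2.58
p_value = 2 * std_normal.cdf(-abs(z))          # two-tailed p-value

reject_h0 = abs(z) > z_crit
print(f"z = {z:.2f}, z_crit = {z_crit:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The p-value is well above 0.01, so both approaches lead to the same decision not to reject H0.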
Example 16 Direct Mailing Company sells computers and computer parts by mail. The company
claims that at least 90% of all orders are mailed within 72 hours after they are received. The quality
control department at the company often takes samples to check if this claim is valid. A recently
taken sample of 150 orders showed that 129 of them were mailed within 72 hours. Do you think the
company’s claim is true? Use a 2.5% significance level.
Solution: Let p be the proportion of all orders that are mailed by the company within 72 hours
and p̄ be the corresponding sample proportion. Then, from the given information, n = 150,
p̄ = 129/150 = 0.86, and α = 0.025. The company claims that at least 90% of all orders are mailed
within 72 hours. Assuming that this claim is true, the value of p is 0.90.
Step 1. State the null and alternative hypotheses
The null and alternative hypotheses are
H0 : p ≥ 0.90 (The company’s claim is true)
Ha : p < 0.90 (The company’s claim is false)
Step 2. Select the distribution to use
We first check whether both np and n (1 − p) are greater than 5.
np = 150(0.90) = 135 > 5 and n (1 − p) = 150(0.10) = 15 > 5
Consequently, the sample size is large. Therefore, we use the normal distribution to make the
hypothesis test about p.
Step 3. Determine the rejection and nonrejection regions
The significance level is 0.025. The < sign in the alternative hypothesis indicates that the test
is left-tailed and the rejection region lies in the left tail of the sampling distribution of p̄ with
its area equal to 0.025. The critical value of Zα obtained from the normal distribution table is
(approximately) −1.96.
Step 4. Calculate the value of the test statistic
The value of the test statistic Z for p̄ = 129/150 = 0.86 is calculated as follows.
$$Z = \frac{\bar{p} - p}{\sqrt{p(1 - p)/n}} = \frac{0.86 - 0.90}{\sqrt{(0.90)(0.10)/150}} = -1.63$$
Step 5. Make a decision
The value of the test statistic Z = −1.63 is greater than the critical value of Zα = −1.96, and it
does not fall in the rejection region. Therefore, we fail to reject H0. We can state that the difference
between the sample proportion and the hypothesized value of the population proportion is small
and this difference may have occurred owing to chance alone. Therefore, the sample does not provide
sufficient evidence, at the 2.5% significance level, to reject the company’s claim that at least 90%
of all orders are mailed within 72 hours.
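Example 16 is a left-tailed test that starts from counts rather than a proportion. The sketch below reproduces it under the same assumptions as the example; the variable names are illustrative, `p_hat` stands for p̄, and the left-tailed p-value is an extra check that does not appear in the worked solution.

```python
from math import sqrt
from statistics import NormalDist

orders, on_time = 150, 129
p0, alpha = 0.90, 0.025

p_hat = on_time / orders
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / orders)

std_normal = NormalDist()
z_crit = std_normal.inv_cdf(alpha)   # left-tail critical value, about -1.96
p_value = std_normal.cdf(z)          # left-tailed p-value: P(Z <= z)

reject_h0 = z < z_crit
print(f"p_hat = {p_hat:.2f}, z = {z:.2f}, p-value = {p_value:.3f}, reject H0: {reject_h0}")
```

The left-tailed p-value is roughly 0.05, which exceeds α = 0.025, so the p-value approach agrees with the critical-value decision.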