Section 9.3b Tests about a Population Mean
So far, most of our statistical tests for the mean have used data from a single sample. It is sometimes necessary
to compare the means of two treatment groups to see whether there is a significant difference between them.
Comparative studies are more convincing than single-sample investigations. For that
reason, one-sample inference is less common than comparative inference. Study designs that involve making
two observations on the same individual, or one observation on each of two similar individuals, result in
paired data. We saw an example of paired data in the job satisfaction study in Section 9.1. In that matched
pairs experiment, each worker's satisfaction was recorded twice: once after self-paced assembly work and once
after machine-paced assembly work.
When paired data result from measuring the same quantitative variable twice, as in the job satisfaction study,
we can make comparisons by analyzing the differences in each pairing. If the conditions for inference are met,
we can use one-sample t procedures to perform inference about the mean of the differences μd. (These methods
are sometimes called paired t procedures.) An example will help illustrate what is meant.
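Before the worked example, the mechanics can be sketched in a few lines of Python: reduce each pair to a single difference, then compute the ordinary one-sample t statistic on the list of differences. This is only an illustrative sketch. The scores and the function name paired_t_statistic are hypothetical and are not data from any study in this section.

```python
import math
import statistics

def paired_t_statistic(first, second, mu0=0.0):
    """One-sample t statistic computed on the paired differences (first - second)."""
    if len(first) != len(second):
        raise ValueError("paired data need two lists of equal length")
    diffs = [a - b for a, b in zip(first, second)]
    n = len(diffs)
    mean_d = statistics.mean(diffs)        # sample mean of the differences
    sd_d = statistics.stdev(diffs)         # sample standard deviation (divisor n - 1)
    se = sd_d / math.sqrt(n)               # standard error of the mean difference
    return (mean_d - mu0) / se, n - 1      # t statistic and degrees of freedom

# Hypothetical scores for 5 individuals, each measured under two conditions.
condition_a = [12, 15, 9, 14, 11]
condition_b = [10, 12, 9, 11, 10]

t_stat, df = paired_t_statistic(condition_a, condition_b)
print(f"t = {t_stat:.3f} with df = {df}")
```

Note that the two original lists are never analyzed separately; once the differences are formed, everything is a one-sample problem with n equal to the number of pairs.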
Inference for Means: Paired Data
Researchers designed an experiment to study the effects of caffeine withdrawal. They recruited 11 volunteers
who were diagnosed as being caffeine dependent to serve as subjects. Each subject was barred from caffeinated
drinks for the duration of the experiment. During a single two-day period, subjects took capsules containing
their normal caffeine intake. During a different two-day period, they took placebo capsules. The order in which
subjects took caffeine and the placebo was randomized. At the end of each two-day period, a test for depression
was given to all 11 subjects. Researchers wanted to know whether being deprived of caffeine would lead to an
increase in depression.
The table below shows the results of the depression test. Higher scores indicate more symptoms of depression.
a. Carry out an appropriate test to investigate the researchers' claim. (We will discuss each step.)
b. Construct and interpret a 95% confidence interval for this problem. Can we use the confidence interval to
conduct the significance test? What are the advantages of the confidence interval?
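As a rough sketch of how such an interval is built, the Python below computes (mean difference) ± t* · s/√n and checks whether 0 lies inside. The 11 differences are hypothetical placeholders, not the study's results, and the function name mean_diff_interval is made up for illustration. The value t* = 2.228 is the standard table value for 95% confidence with df = 10. Keep in mind that a 95% interval corresponds to a two-sided test at α = 0.05, while the researchers' question here is one-sided, so the match between interval and test is not exact.

```python
import math
import statistics

# Hypothetical differences (placebo - caffeine) for 11 subjects. NOT the study's data.
diffs = [4, 6, -1, 8, 2, 5, 10, 1, 0, 3, 6]

T_STAR_95_DF10 = 2.228  # t critical value from a standard t table: 95% confidence, df = 10

def mean_diff_interval(d, t_star):
    """Confidence interval for the mean difference: mean +/- t* times the standard error."""
    n = len(d)
    mean_d = statistics.mean(d)
    se = statistics.stdev(d) / math.sqrt(n)
    margin = t_star * se
    return mean_d - margin, mean_d + margin

low, high = mean_diff_interval(diffs, T_STAR_95_DF10)
print(f"95% CI for the mean difference: ({low:.2f}, {high:.2f})")
print("0 is inside the interval" if low <= 0 <= high else "0 is not inside the interval")
```

Unlike a P-value, the interval reports a range of plausible values for the mean difference, which speaks to the size of the effect and not only to whether one exists.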
2. Most automobile tires are inflated with compressed air, which consists of about 78%
nitrogen. Aircraft tires are filled entirely with nitrogen, which is safer than air in case of fire.
Could filling automobile tires with nitrogen improve safety, performance, or both?
Consumers Union designed a study to test whether nitrogen-filled tires would maintain
pressure better than air-filled tires. They obtained two tires from each of several brands and
then filled one tire in each pair with air and one with nitrogen. (The article didn’t say, but assume that these
assignments were made at random.) All tires were inflated to a pressure of 30 pounds per square inch and then
placed outside for a year. At the end of the year, Consumers Union measured the pressure in each tire.
The amount of pressure lost (in pounds per square inch) during the year for the air-filled and nitrogen-filled tires
of each brand is shown in the table below along with the difference (Air – Nitrogen).
Brand                                Air  Nitrogen  Air – Nitrogen
BF Goodrich Traction                 7.6       7.2             0.4
Bridgestone HPSO (Sears)             3.8       2.5             1.3
Bridgestone Potenza EL400            2.1       1.0             1.1
Bridgestone Potenza G009             3.7       1.6             2.1
Bridgestone Potenza RE9S0            4.7       1.5             3.2
Continental Premier Contact H        4.9       3.1             1.8
Cooper Lifeliner Touring SLE         5.2       3.5             1.7
Dayton Daytona HR                    3.4       3.2             0.2
Falken Ziex ZE-S12                   4.1       3.3             0.8
Fuzion Hrl                           2.7       2.2             0.5
General Exclaim                      3.1       3.4            -0.3
Goodyear Assurance Tripletred        3.8       3.2             0.6
Hankook Optimo H418                  3.0       0.9             2.1
Kumho Solus KH16                     6.2       3.4             2.8
Michelin Energy MXV4                 2.0       1.8             0.2
Michelin Pilot XGT H4                1.1       0.7             0.4
Pirelli P6 Four Seasons              4.4       4.2             0.2
Sumitomo HTR H4                      1.4       2.1            -0.7
Yokohama Avid H4S                    4.3       3.0             1.3
BF Goodrich Traction T/A V           5.5       3.4             2.1
Bridgestone Potenza RE9S0            4.1       2.8             1.3
Continental Conti Extreme Contact    5.0       3.4             1.6
Continental ContiProContact          4.8       3.3             1.5
Cooper Lifeliner Touring SLE         3.2       2.5             0.7
General Exclaim UHP                  6.8       2.7             4.1
Hankook Ventus V4 H10S               3.1       1.4             1.7
Michelin Energy MXV4 Plus            2.5       1.5             1.0
Michelin Pilot Exalto NS             6.6       2.2             4.4
Michelin Pilot HX MXM4               2.2       2.0             0.2
Pirelli P6 Four Seasons              2.5       2.7            -0.2
Sumitomo HTR+                        4.4       3.7             0.7
a. A scatterplot of the data using Air pressure loss as the explanatory variable and Nitrogen pressure loss as the
response variable is shown below. Discuss how the graph shows the relationship between these paired data.
b. Add the line y = x to your scatterplot.
What is true for any point on the line?
Points above this line?
Points below this line?
What do you notice? Discuss what this pattern implies.
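The comparison against the line y = x can also be checked numerically. The sketch below, one possible approach and not part of the original study, takes the Air and Nitrogen columns from the table (in row order) and counts how many tires fall above, on, or below the line when Air is plotted on the x-axis and Nitrogen on the y-axis. The variable names are chosen here for illustration.

```python
# Pressure loss (psi) copied from the table, in row order.
air = [7.6, 3.8, 2.1, 3.7, 4.7, 4.9, 5.2, 3.4, 4.1, 2.7, 3.1, 3.8, 3.0, 6.2, 2.0, 1.1,
       4.4, 1.4, 4.3, 5.5, 4.1, 5.0, 4.8, 3.2, 6.8, 3.1, 2.5, 6.6, 2.2, 2.5, 4.4]
nitrogen = [7.2, 2.5, 1.0, 1.6, 1.5, 3.1, 3.5, 3.2, 3.3, 2.2, 3.4, 3.2, 0.9, 3.4, 1.8, 0.7,
            4.2, 2.1, 3.0, 3.4, 2.8, 3.4, 3.3, 2.5, 2.7, 1.4, 1.5, 2.2, 2.0, 2.7, 3.7]

# With x = air loss and y = nitrogen loss:
#   above the line y = x  -> the nitrogen-filled tire lost MORE pressure
#   on the line           -> both tires lost the same amount
#   below the line        -> the nitrogen-filled tire lost LESS pressure
above = sum(1 for x, y in zip(air, nitrogen) if y > x)
on = sum(1 for x, y in zip(air, nitrogen) if y == x)
below = sum(1 for x, y in zip(air, nitrogen) if y < x)

print(f"above: {above}, on: {on}, below: {below}")
```

A point's vertical distance below the line equals that pair's Air – Nitrogen difference, which is why the analysis can move from the scatterplot to the single column of differences.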
c. State an appropriate null and the alternative hypothesis.
d. Now let’s analyze the differences. Is this a matched pairs design?
Check the conditions for performing inference about the true mean difference μd in pressure loss (Air –
Nitrogen). Again, the 10% condition is not necessary because we are not sampling without replacement from a
finite population.
Although the sample size is large (n = 31), there are two high outliers that will affect any results we get using t
procedures. For that reason, you need to do Question e twice: once with the outliers included and once with
the outliers removed.
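One common way to locate the two high outliers is the 1.5 × IQR rule. The sketch below applies it to the Air – Nitrogen column of the table. It assumes the quartiles are computed with Python's statistics.quantiles (default "exclusive" method); other software may use slightly different quartile conventions, so treat this as an illustration and not as the only acceptable check.

```python
import statistics

# Differences (Air - Nitrogen) in psi, copied from the table in row order.
diffs = [0.4, 1.3, 1.1, 2.1, 3.2, 1.8, 1.7, 0.2, 0.8, 0.5, -0.3, 0.6, 2.1, 2.8, 0.2, 0.4,
         0.2, -0.7, 1.3, 2.1, 1.3, 1.6, 1.5, 0.7, 4.1, 1.7, 1.0, 4.4, 0.2, -0.2, 0.7]

# Quartiles of the differences; n=4 splits the data into four equal-probability groups.
q1, median, q3 = statistics.quantiles(diffs, n=4)
iqr = q3 - q1

# 1.5 x IQR rule: values beyond these fences are flagged as outliers.
low_fence = q1 - 1.5 * iqr
high_fence = q3 + 1.5 * iqr
outliers = [d for d in diffs if d < low_fence or d > high_fence]

print(f"Q1 = {q1:.1f}, median = {median:.1f}, Q3 = {q3:.1f}, IQR = {iqr:.1f}")
print(f"fences: ({low_fence:.1f}, {high_fence:.1f}); flagged values: {outliers}")
```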
e. Carry out a significance test at the α = 0.05 level. Do these data provide convincing evidence that filling tires
with nitrogen instead of air decreases pressure loss?
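The calculation for part e can be sketched as follows, assuming the hypotheses H0: μd = 0 and Ha: μd > 0 for the Air – Nitrogen differences. The code computes the t statistic twice, once with all 31 pairs and once with the two largest differences removed, and compares each to the one-sided critical value t* for α = 0.05 taken from a standard t table (1.697 for df = 30, 1.701 for df = 28). The function name t_for_mean_difference is made up for illustration; a calculator or statistical software would report a P-value instead of using a table cutoff.

```python
import math
import statistics

# Differences (Air - Nitrogen) in psi, copied from the table in row order.
diffs = [0.4, 1.3, 1.1, 2.1, 3.2, 1.8, 1.7, 0.2, 0.8, 0.5, -0.3, 0.6, 2.1, 2.8, 0.2, 0.4,
         0.2, -0.7, 1.3, 2.1, 1.3, 1.6, 1.5, 0.7, 4.1, 1.7, 1.0, 4.4, 0.2, -0.2, 0.7]

# One-sided critical values t* for alpha = 0.05 from a standard t table, keyed by df.
T_CRIT_05 = {30: 1.697, 28: 1.701}

def t_for_mean_difference(d):
    """t statistic and df for H0: mu_d = 0, computed from a list of paired differences."""
    n = len(d)
    se = statistics.stdev(d) / math.sqrt(n)
    return statistics.mean(d) / se, n - 1

# Drop the two largest differences (4.1 and 4.4), the two high outliers.
trimmed = sorted(diffs)[:-2]

for label, data in [("all 31 pairs", diffs), ("two high outliers removed", trimmed)]:
    t_stat, df = t_for_mean_difference(data)
    verdict = "reject H0" if t_stat > T_CRIT_05[df] else "fail to reject H0"
    print(f"{label}: mean = {statistics.mean(data):.3f}, t = {t_stat:.2f}, df = {df}, {verdict}")
```

Running the test both ways shows whether the conclusion depends on the two unusual tires; if both versions lead to the same decision, the outliers are not driving the result.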
Using Tests Wisely
Your text gives several examples on pages 589 to 594 of cautions to keep in mind when interpreting the results
of statistical tests. Read these examples and try to avoid these pitfalls.