Post on 28-Dec-2015
Statistics: Tests of Hypotheses for a Single Sample
Contents, figures, and exercises come from the textbook: Applied Statistics and Probability for Engineers, 5th Edition, by Douglas C. Montgomery, John Wiley & Sons, Inc., 2011.
Hypothesis Testing
Statistical hypothesis
A statistical hypothesis is a statement about the parameters of one or more populations.
For example, $H_0: \mu = 50$ centimeters per second is the null hypothesis and $H_1: \mu \neq 50$ centimeters per second is a two-sided alternative hypothesis.
Type I error
Rejecting the null hypothesis $H_0$ when it is true is defined as a type I error.
Type II error
Failing to reject the null hypothesis $H_0$ when it is false is defined as a type II error.
Probability of type I error: $\alpha = P(\text{type I error}) = P(\text{reject } H_0 \text{ when } H_0 \text{ is true})$
Probability of type II error: $\beta = P(\text{type II error}) = P(\text{fail to reject } H_0 \text{ when } H_0 \text{ is false})$
                                 Null hypothesis (H0) is true       Null hypothesis (H0) is false
Reject null hypothesis           Type I error (false positive)      Correct outcome (true positive)
Fail to reject null hypothesis   Correct outcome (true negative)    Type II error (false negative)
From Wikipedia, http://www.wikipedia.org.
Properties
The size of the critical region, and consequently $\alpha$, can be reduced by appropriate selection of the critical values.
Type I and type II errors are related. For a fixed sample size, decreasing one increases the other.
An increase in sample size reduces $\beta$, provided that $\alpha$ is held constant.
When the null hypothesis is false, $\beta$ increases as the true value of the parameter approaches the value hypothesized in the null hypothesis.
$\alpha = 0.05$ is widely used.
Power
Power $= 1 - \beta$: the probability of correctly rejecting a false null hypothesis.
Sensitivity: the ability to detect differences.
Formulating one-sided hypotheses
$H_0: \mu = 1.5$ MPa, $H_1: \mu > 1.5$ MPa (the statement we want to support)
or
$H_0: \mu = 1.5$ MPa, $H_1: \mu < 1.5$ MPa (the statement we want to support)
P-value
The P-value is the smallest level of significance that would lead to rejection of the null hypothesis $H_0$ with the given data.
General procedure for hypothesis tests
Specify the test statistic to be used (such as $Z_0$).
Specify the location of the critical region (two-tailed, upper-tailed, or lower-tailed).
Specify the criteria for rejection (typically, the value of $\alpha$, or the P-value at which rejection should occur).
Practical significance
Be careful when interpreting the results from hypothesis testing when the sample size is large, because any small departure from the hypothesized value $\mu_0$ will probably be detected, even when the difference is of little or no practical significance.
Example 9-1 Propellant Burning Rate
Suppose that if the burning rate is less than 50 centimeters per second, we wish to show this with a strong conclusion.
$H_0: \mu = 50$ centimeters per second
$H_1: \mu < 50$ centimeters per second
Since the rejection of $H_0$ is always a strong conclusion, this statement of the hypotheses will produce the desired outcome if $H_0$ is rejected.
Exercise 9-27
A random sample of 500 registered voters in Phoenix is asked if they favor the use of oxygenated fuels year-round to reduce air pollution. If more than 400 voters respond positively, we will conclude that more than 60% of the voters favor the use of these fuels.
(a) Find the probability of type I error if exactly 60% of the voters favor the use of these fuels.
(b) What is the type II error probability if 75% of the voters favor this action?
Hint: use the normal approximation to the binomial.
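The hint can be sketched in a few lines of Python. This is only an illustration of the normal approximation to the binomial for a rule of the form "reject if the count exceeds a cutoff"; the function name `normal_approx_errors` is made up for this sketch, and no continuity correction is applied (an assumption, since the exercise does not say whether to use one).

```python
from math import sqrt
from statistics import NormalDist


def normal_approx_errors(n, cutoff, p_null, p_alt):
    """Approximate alpha and beta for the rule 'reject H0 if X > cutoff'.

    X ~ Binomial(n, p) is approximated by N(np, np(1 - p)).
    No continuity correction is used (assumption of this sketch).
    """
    std_normal = NormalDist()

    def prob_exceeds_cutoff(p):
        mean = n * p
        sd = sqrt(n * p * (1 - p))
        return 1 - std_normal.cdf((cutoff - mean) / sd)

    alpha = prob_exceeds_cutoff(p_null)    # reject although H0 is true
    beta = 1 - prob_exceeds_cutoff(p_alt)  # fail to reject although H0 is false
    return alpha, beta


# Numbers taken from Exercise 9-27: n = 500, cutoff 400, p = 0.60 and p = 0.75.
alpha, beta = normal_approx_errors(n=500, cutoff=400, p_null=0.60, p_alt=0.75)
print(f"alpha = {alpha:.3g}, beta = {beta:.4f}")
```

With a cutoff this far above $np = 300$, the approximate type I error is essentially zero, while the type II error at $p = 0.75$ is close to one.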
Tests on the Mean of a Normal Distribution, Variance Known
Hypothesis tests on the mean
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Test statistic: $z_0 = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$
P-value: $P = 2[1 - \Phi(|z_0|)]$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$
Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$
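A minimal sketch of the three z-tests, using only the Python standard library. The function name `z_test` and the `alternative` labels are choices made for this illustration; the numbers in the usage line are the propellant values from Example 9-2 ($\mu_0 = 50$, $\sigma = 2$, $n = 25$, $\bar{x} = 51.3$).

```python
from math import sqrt
from statistics import NormalDist


def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z0, P-value) for a test on the mean with known sigma."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    phi = NormalDist().cdf  # standard normal CDF, Phi
    if alternative == "two-sided":   # H1: mu != mu0
        p_value = 2 * (1 - phi(abs(z0)))
    elif alternative == "greater":   # H1: mu > mu0
        p_value = 1 - phi(z0)
    elif alternative == "less":      # H1: mu < mu0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z0, p_value


z0, p_value = z_test(xbar=51.3, mu0=50, sigma=2, n=25)
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```

Rejecting when the P-value is below $\alpha$ is equivalent to comparing $z_0$ with the critical values listed above.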
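Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Suppose the true value of the mean under $H_1$ is $\mu = \mu_0 + \delta$.
Test statistic: $z_0 = \dfrac{\bar{x} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{\bar{x} - (\mu_0 + \delta)}{\sigma/\sqrt{n}} + \dfrac{\delta\sqrt{n}}{\sigma}$
Under $H_1$: $z_0 \sim N\!\left(\dfrac{\delta\sqrt{n}}{\sigma},\, 1\right)$
$\beta = \Phi\!\left(z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right)$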
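Type II error and choice of sample size
Sample size formulas
If $\delta > 0$, $\beta = \Phi\!\left(z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right) \approx \Phi\!\left(z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right)$
Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then $\beta = \Phi(-z_\beta)$, so
$-z_\beta \approx z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}$
Note: since $\Phi(x) = 1 - \Phi(-x)$,
$\beta = \Phi\!\left(z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right)$
$= \left[1 - \Phi\!\left(-z_{\alpha/2} + \dfrac{\delta\sqrt{n}}{\sigma}\right)\right] - \left[1 - \Phi\!\left(z_{\alpha/2} + \dfrac{\delta\sqrt{n}}{\sigma}\right)\right]$
$= \Phi\!\left(z_{\alpha/2} + \dfrac{\delta\sqrt{n}}{\sigma}\right) - \Phi\!\left(-z_{\alpha/2} + \dfrac{\delta\sqrt{n}}{\sigma}\right)$
so $\beta$ takes the same value for $\delta$ and $-\delta$.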
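Sample size for a two-sided test on the mean, variance known
$n \approx \dfrac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2}$, where $\delta = \mu - \mu_0$
Sample size for a one-sided test on the mean, variance known
$n = \dfrac{(z_\alpha + z_\beta)^2 \sigma^2}{\delta^2}$, where $\delta = \mu - \mu_0$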
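The type II error and sample size formulas for the two-sided z-test can be checked numerically. The sketch below assumes the forms $\beta = \Phi(z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-z_{\alpha/2} - \delta\sqrt{n}/\sigma)$ and $n \approx (z_{\alpha/2} + z_\beta)^2\sigma^2/\delta^2$; the function names are invented for the illustration, and rounding $n$ up to the next integer is a choice of this sketch. The inputs are the propellant values used in this section ($\sigma = 2$, $\alpha = 0.05$, $n = 25$, true mean 49 against $\mu_0 = 50$).

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_two_sided(delta, sigma, n, alpha):
    """Type II error probability of the two-sided z-test when mu = mu0 + delta."""
    z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    shift = delta * sqrt(n) / sigma
    return STD_NORMAL.cdf(z_half - shift) - STD_NORMAL.cdf(-z_half - shift)


def sample_size_two_sided(delta, sigma, alpha, beta):
    """Approximate n for a two-sided test, rounded up to an integer."""
    z_half = STD_NORMAL.inv_cdf(1 - alpha / 2)  # z_{alpha/2}
    z_beta = STD_NORMAL.inv_cdf(1 - beta)       # z_beta
    return ceil((z_half + z_beta) ** 2 * sigma ** 2 / delta ** 2)


beta = beta_two_sided(delta=49 - 50, sigma=2, n=25, alpha=0.05)
n_required = sample_size_two_sided(delta=1, sigma=2, alpha=0.05, beta=0.10)
print(f"beta = {beta:.4f}, required n = {n_required}")
```

Because the approximation drops one of the two $\Phi$ terms, the computed $n$ is a starting point; the exact $\beta$ for that $n$ can be confirmed with `beta_two_sided`.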
Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d$ for various sample sizes $n$, where $d = \dfrac{|\mu - \mu_0|}{\sigma} = \dfrac{|\delta|}{\sigma}$
See Appendix VII.
For a given $n$ and $d$, find $\beta$.
For a given $\beta$ and $d$, find $n$.
Large-sample test
If $n > 40$, the sample standard deviation $s$ can be substituted for $\sigma$ in the test procedures with little effect.
Example 9-2 Propellant Burning Rate
$\sigma = 2$, $\alpha = 0.05$, $n = 25$, $\bar{x} = 51.3$
Specifications require that the mean burning rate must be 50 centimeters per second. What conclusions should be drawn?
Example 9-3 Propellant Burning Rate Type II Error
Suppose that the true burning rate is 49 centimeters per second. What is $\beta$ for the two-sided test with $\sigma = 2$, $\alpha = 0.05$, and $n = 25$?
Example 9-4 Propellant Burning Rate Type II Error from OC Curve
Suppose the true mean burning rate is $\mu = 51$ centimeters per second.
$d = \dfrac{|\mu - \mu_0|}{\sigma} = \dfrac{|\delta|}{\sigma} = \dfrac{1}{2}$
Example 9-4 Propellant Burning Rate Sample Size from OC Curve
Design the test so that if the true mean burning rate differs from 50 centimeters per second by as much as 1 centimeter per second, the test will detect this with a high probability of 0.90.
Exercise 9-47
Medical researchers have developed a new artificial heart constructed primarily of titanium and plastic. The heart will last and operate almost indefinitely once it is implanted in the patient’s body, but the battery pack needs to be recharged about every four hours. A random sample of 50 battery packs is selected and subjected to a life test. The average life of these batteries is 4.05 hours. Assume that battery life is normally distributed with standard deviation $\sigma = 0.2$ hour.
(a) Is there evidence to support the claim that mean battery life exceeds 4 hours? Use $\alpha = 0.05$.
(b) What is the P-value for the test in part (a)?
(c) Compute the power of the test if the true mean battery life is 4.05 hours.
(d) What sample size would be required to detect a true mean battery life of 4.5 hours if we wanted the power of the test to be at least 0.9?
(e) Explain how the question in part (a) could be answered by constructing a one-sided confidence bound on the mean life.
Tests on the Mean of a Normal Distribution, Variance Unknown
Hypothesis tests on the mean
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Test statistic: $T_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$
P-value: $P = 2P(T_{n-1} > |t_0|)$
Reject $H_0$ if $t_0 > t_{\alpha/2,\,n-1}$ or $t_0 < -t_{\alpha/2,\,n-1}$
Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
P-value: $P = P(T_{n-1} > t_0)$
Reject $H_0$ if $t_0 > t_{\alpha,\,n-1}$
Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
P-value: $P = P(T_{n-1} < t_0)$
Reject $H_0$ if $t_0 < -t_{\alpha,\,n-1}$
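Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Suppose the true value of the mean under $H_1$ is $\mu = \mu_0 + \delta$.
Test statistic: $t_0 = \dfrac{\bar{x} - \mu_0}{S/\sqrt{n}} = \dfrac{\dfrac{[\bar{x} - (\mu_0 + \delta)]\sqrt{n}}{\sigma} + \dfrac{\delta\sqrt{n}}{\sigma}}{\sqrt{\dfrac{(n-1)S^2/\sigma^2}{n-1}}}$
Under $H_1$, $t_0$ follows the noncentral $t$ distribution with $n-1$ degrees of freedom and noncentrality parameter $\delta\sqrt{n}/\sigma$.
[Figure: PDF of the noncentral $t$ distribution. From Wikipedia, http://www.wikipedia.org.]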
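Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative
$\beta = P\{-t_{\alpha/2,\,n-1} \le T_0 \le t_{\alpha/2,\,n-1} \mid \delta \neq 0\} = P\{-t_{\alpha/2,\,n-1} \le T_0' \le t_{\alpha/2,\,n-1}\}$
where $T_0'$ denotes the noncentral $t$ random variable.
Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d$ for various sample sizes $n$, where $d = \dfrac{|\mu - \mu_0|}{\sigma} = \dfrac{|\delta|}{\sigma}$
See Appendix VII.
Note that $d$ depends on the unknown parameter $\sigma^2$.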
Example 9-6 Golf Club Design
It is of interest to determine if there is evidence (with $\alpha = 0.05$) to support a claim that the mean coefficient of restitution exceeds 0.82.
Data: 0.8411, …, $n = 15$, $\bar{x} = 0.83725$, and $s = 0.02456$
Example 9-7 Golf Club Design Sample Size
If the mean coefficient of restitution exceeds 0.82 by as much as 0.02, is the sample size $n = 15$ adequate to ensure that $H_0: \mu = 0.82$ will be rejected with probability at least 0.8?
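As a rough check of the upper-tailed t-test, the statistic for the golf club summary values ($n = 15$, $\bar{x} = 0.83725$, $s = 0.02456$, $\mu_0 = 0.82$) can be computed directly. The Python standard library has no t distribution, so this sketch compares $t_0$ with the tabled critical value $t_{0.05,14} = 1.761$ entered by hand instead of computing a P-value; the function and constant names are illustrative.

```python
from math import sqrt


def t_statistic(xbar, mu0, s, n):
    """One-sample t statistic t0 = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))


# Summary values from the golf club example in this section.
t0 = t_statistic(xbar=0.83725, mu0=0.82, s=0.02456, n=15)

# Upper 5% point of the t distribution with 14 degrees of freedom,
# copied from a standard t table (not computed here).
T_CRITICAL_05_14 = 1.761

reject_h0 = t0 > T_CRITICAL_05_14  # upper-tailed test, H1: mu > 0.82
print(f"t0 = {t0:.2f}, reject H0: {reject_h0}")
```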
Exercise 9-59
A 1992 article in the Journal of the American Medical Association (“A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich”) reported body temperature, gender, and heart rate for a number of subjects. The body temperatures for 25 female subjects follow: 97.8, …
(a) Test the hypothesis $H_0: \mu = 98.6$ versus $H_1: \mu \neq 98.6$ using $\alpha = 0.05$. Find the P-value.
(b) Check the assumption that female body temperature is normally distributed.
(c) Compute the power of the test if the true mean female body temperature is as low as 98.0.
(d) What sample size would be required to detect a true mean female body temperature as low as 98.2 if we wanted the power of the test to be at least 0.9?
(e) Explain how the question in part (a) could be answered by constructing a two-sided confidence interval on the mean female body temperature.
[Figure: Exercise 9-59 normality plot]
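Tests on the Variance and Standard Deviation of a Normal Distribution
Hypothesis tests on the variance
Hypotheses, two-sided alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$
Test statistic: $X_0^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$, with computed value $\chi_0^2$
P-value: $P = 2\min\{P(X_{n-1}^2 > \chi_0^2),\ P(X_{n-1}^2 < \chi_0^2)\}$
Reject $H_0$ if $\chi_0^2 > \chi_{\alpha/2,\,n-1}^2$ or $\chi_0^2 < \chi_{1-\alpha/2,\,n-1}^2$
Hypotheses, upper-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$
P-value: $P = P(X_{n-1}^2 > \chi_0^2)$
Reject $H_0$ if $\chi_0^2 > \chi_{\alpha,\,n-1}^2$
Hypotheses, lower-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$
P-value: $P = P(X_{n-1}^2 < \chi_0^2)$
Reject $H_0$ if $\chi_0^2 < \chi_{1-\alpha,\,n-1}^2$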
Type II error and choice of sample size
Finding the probability of type II error
Suppose the true value of the variance under $H_1$ is $\sigma^2$.
Hypotheses, two-sided alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$
$\beta = P\left\{\chi_{1-\alpha/2,\,n-1}^2 \le \dfrac{(n-1)s^2}{\sigma_0^2} \le \chi_{\alpha/2,\,n-1}^2 \,\Big|\, H_1\right\} = P\left\{\dfrac{\sigma_0^2}{\sigma^2}\chi_{1-\alpha/2,\,n-1}^2 \le \dfrac{(n-1)s^2}{\sigma^2} \le \dfrac{\sigma_0^2}{\sigma^2}\chi_{\alpha/2,\,n-1}^2 \,\Big|\, H_1\right\}$
Hypotheses, upper-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$
$\beta = P\left\{\dfrac{(n-1)s^2}{\sigma_0^2} \le \chi_{\alpha,\,n-1}^2 \,\Big|\, H_1\right\} = P\left\{\dfrac{(n-1)s^2}{\sigma^2} \le \dfrac{\sigma_0^2}{\sigma^2}\chi_{\alpha,\,n-1}^2 \,\Big|\, H_1\right\}$
Hypotheses, lower-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$
$\beta = P\left\{\dfrac{(n-1)s^2}{\sigma_0^2} \ge \chi_{1-\alpha,\,n-1}^2 \,\Big|\, H_1\right\} = P\left\{\dfrac{(n-1)s^2}{\sigma^2} \ge \dfrac{\sigma_0^2}{\sigma^2}\chi_{1-\alpha,\,n-1}^2 \,\Big|\, H_1\right\}$
Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $\lambda = \sigma/\sigma_0$ for various sample sizes $n$
See Appendix VII.
Example 9-8 Automated Filling
$n = 20$, $s^2 = 0.0153$, $\alpha = 0.05$.
Is there evidence in the sample data to suggest that the manufacturer has a problem with underfilled or overfilled bottles? ($\sigma^2 > 0.01$)
Example 9-9 Automated Filling Sample Size
$\sigma_0 = 0.10$, $\sigma = 0.125$
Find $\beta$.
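The upper-tailed variance test for the filling example can be sketched as follows, using $n = 20$, $s^2 = 0.0153$ and $\sigma_0^2 = 0.01$. The standard library has no chi-square distribution, so the critical value $\chi_{0.05,19}^2 = 30.14$ is taken from a chi-square table and typed in as a constant; that constant and the function name are assumptions of this illustration.

```python
def chi_square_variance_statistic(s_squared, sigma0_squared, n):
    """Test statistic (n - 1) s^2 / sigma0^2 for a test on the variance."""
    return (n - 1) * s_squared / sigma0_squared


chi0_sq = chi_square_variance_statistic(s_squared=0.0153, sigma0_squared=0.01, n=20)

# Upper 5% point of the chi-square distribution with 19 degrees of freedom,
# copied from a standard chi-square table (not computed here).
CHI_SQ_CRITICAL_05_19 = 30.14

reject_h0 = chi0_sq > CHI_SQ_CRITICAL_05_19  # upper-tailed test, H1: sigma^2 > 0.01
print(f"chi0^2 = {chi0_sq:.2f}, reject H0: {reject_h0}")
```

The statistic falls just below the critical value, so at $\alpha = 0.05$ this sketch does not reject $H_0$.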
Exercise 9-83
Recall the sugar content of the syrup in canned peaches from Exercise 8-46. Suppose that the variance is thought to be $\sigma^2 = 18$ (milligrams)$^2$. Recall that a random sample of $n = 10$ cans yields a sample standard deviation of $s = 4.8$ milligrams.
(a) Test the hypothesis $H_0: \sigma^2 = 18$ versus $H_1: \sigma^2 \neq 18$ using $\alpha = 0.05$. Find the P-value for this test.
(b) Suppose that the actual standard deviation is twice as large as the hypothesized value. What is the probability that this difference will be detected by the test described in part (a)?
(c) Suppose that the true variance is $\sigma^2 = 40$. How large a sample would be required to detect this difference with probability at least 0.90?
Tests on a Population Proportion
Large-sample tests on a proportion
Hypotheses, two-sided alternative: $H_0: p = p_0$, $H_1: p \neq p_0$
Test statistic: $z_0 = \dfrac{X - np_0}{\sqrt{np_0(1-p_0)}}$
P-value: $P = 2[1 - \Phi(|z_0|)]$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$
Hypotheses, upper-tailed alternative: $H_0: p = p_0$, $H_1: p > p_0$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$
Hypotheses, lower-tailed alternative: $H_0: p = p_0$, $H_1: p < p_0$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$
Type II error and choice of sample size
Finding the probability of type II error
Suppose the true value of the proportion under $H_1$ is $p$.
Hypotheses, two-sided alternative: $H_0: p = p_0$, $H_1: p \neq p_0$
$\beta = P\left\{p_0 - z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \le \hat{p} \le p_0 + z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \,\Big|\, H_1\right\}$
$= \Phi\!\left(\dfrac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right) - \Phi\!\left(\dfrac{p_0 - p - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$
Hypotheses, upper-tailed alternative: $H_0: p = p_0$, $H_1: p > p_0$
$\beta = P\left\{\hat{p} \le p_0 + z_\alpha\sqrt{p_0(1-p_0)/n} \,\Big|\, H_1\right\} = \Phi\!\left(\dfrac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$
Hypotheses, lower-tailed alternative: $H_0: p = p_0$, $H_1: p < p_0$
$\beta = P\left\{\hat{p} \ge p_0 - z_\alpha\sqrt{p_0(1-p_0)/n} \,\Big|\, H_1\right\} = 1 - \Phi\!\left(\dfrac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$
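Type II error and choice of sample size
Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then $\beta = \Phi(-z_\beta)$.
Two-sided alternative
$\beta = \Phi\!\left(\dfrac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right) - \Phi\!\left(\dfrac{p_0 - p - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right) \approx \Phi\!\left(\dfrac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$ (for $p > p_0$)
$-z_\beta \approx \dfrac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$
$n = \left[\dfrac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$
Upper-tailed alternative
$\beta = \Phi\!\left(\dfrac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$
$-z_\beta = \dfrac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$
$n = \left[\dfrac{z_\alpha\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$
Lower-tailed alternative
$\beta = 1 - \Phi\!\left(\dfrac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right) = \Phi\!\left(-\dfrac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$
$z_\beta = \dfrac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$
$n = \left[\dfrac{z_\alpha\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$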
Example 9-10 Automobile Engine Controller
$H_0: p = 0.05$, $H_1: p < 0.05$, $n = 200$
The semiconductor manufacturer takes a random sample of 200 devices and finds that four of them are defective. Can the manufacturer demonstrate process capability for the customer? ($\alpha = 0.05$)
Example 9-11 Automobile Engine Controller Type II Error
Suppose that its process fallout is really $p = 0.03$. What is the $\beta$-error for a test of process capability that uses $n = 200$ and $\alpha = 0.05$?
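The lower-tailed type II error and the one-sided sample size formula for a proportion can be evaluated with the standard library. This sketch plugs in the engine controller values ($p_0 = 0.05$, true $p = 0.03$, $n = 200$, $\alpha = 0.05$); the target $\beta = 0.10$ in the sample size call is an assumed value chosen for illustration, and the function names are invented here.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()


def beta_lower_tailed(p0, p, n, alpha):
    """Type II error of the lower-tailed test H1: p < p0 when the true value is p."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    numerator = p0 - p - z_alpha * sqrt(p0 * (1 - p0) / n)
    return 1 - STD_NORMAL.cdf(numerator / sqrt(p * (1 - p) / n))


def sample_size_one_sided(p0, p, alpha, beta):
    """Sample size for a one-sided test on a proportion, rounded up."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha)
    z_beta = STD_NORMAL.inv_cdf(1 - beta)
    root_n = (z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p * (1 - p))) / (p - p0)
    return ceil(root_n ** 2)


beta = beta_lower_tailed(p0=0.05, p=0.03, n=200, alpha=0.05)
# beta = 0.10 below is an assumed target, not a value given in this section.
n_required = sample_size_one_sided(p0=0.05, p=0.03, alpha=0.05, beta=0.10)
print(f"beta = {beta:.3f}, required n = {n_required}")
```

A type II error near two thirds suggests that $n = 200$ gives little chance of demonstrating capability when the true fallout is 0.03.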
Exercise 9-95
In a random sample of 85 automobile engine crankshaft bearings, 10 have a surface finish roughness that exceeds the specifications. Do these data present strong evidence that the proportion of crankshaft bearings exhibiting excess surface roughness exceeds 0.10?
(a) State and test the appropriate hypotheses using $\alpha = 0.05$.
(b) If it is really the situation that $p = 0.15$, how likely is it that the test procedure in part (a) will not reject the null hypothesis?
(c) If $p = 0.15$, how large would the sample size have to be for us to have a probability of correctly rejecting the null hypothesis of 0.9?
Testing for Goodness of Fit
Test the hypothesis that a particular distribution will be satisfactory as a population model.
Based on the chi-square distribution
$n$ observations; $p$ is the number of parameters of the hypothesized distribution estimated by sample statistics
$O_i$: the observed frequency in the $i$th class interval ($i = 1, \dots, k$)
$E_i$: the expected frequency in the $i$th class interval
Test statistic: $X_0^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}$
P-value: $P = P(\chi_{k-p-1}^2 > \chi_0^2)$
Reject the hypothesis if $\chi_0^2 > \chi_{\alpha,\,k-p-1}^2$
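The goodness-of-fit statistic can be illustrated with the printed circuit board defect counts given in this section (0, 1, 2, 3 defects observed 32, 15, 9, 4 times). The sketch fits a Poisson model with the mean estimated from the data, and pools the last two classes into "2 or more" so that no expected frequency is very small; the pooling rule is an assumption of this sketch. With $k = 3$ classes and $p = 1$ estimated parameter there is 1 degree of freedom, and in that special case the chi-square tail probability equals $2[1 - \Phi(\sqrt{\chi_0^2})]$, which lets the standard library supply a P-value.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

# Observed frequencies by number of defects (from the example in this section).
observed_counts = {0: 32, 1: 15, 2: 9, 3: 4}
n = sum(observed_counts.values())
lam = sum(k * count for k, count in observed_counts.items()) / n  # estimated Poisson mean


def poisson_pmf(k, lam):
    """Poisson probability mass function."""
    return exp(-lam) * lam ** k / factorial(k)


# Classes: 0 defects, 1 defect, 2 or more defects (pooled; assumption of this sketch).
observed = [observed_counts[0], observed_counts[1], observed_counts[2] + observed_counts[3]]
p0, p1 = poisson_pmf(0, lam), poisson_pmf(1, lam)
expected = [n * p0, n * p1, n * (1 - p0 - p1)]

chi0_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1 - 1  # k - p - 1, with p = 1 estimated parameter

# Valid only for df = 1: a chi-square variable with 1 df is a squared standard normal.
p_value = 2 * (1 - NormalDist().cdf(sqrt(chi0_sq)))
print(f"chi0^2 = {chi0_sq:.2f}, df = {df}, P-value = {p_value:.3f}")
```

For more than one degree of freedom, the P-value would need a chi-square table or a statistics library.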
Example 9-12 Printed Circuit Board Defects, Poisson Distribution
Number of defects: 0, observed frequency: 32
Number of defects: 1, observed frequency: 15
Number of defects: 2, observed frequency: 9
Number of defects: 3, observed frequency: 4
Example 9-13 Power Supply Distribution, Continuous Distribution
$n = 100$, $\bar{x} = 5.04$, $s = 0.08$
A manufacturing engineer is testing a power supply used in a notebook computer and, using $\alpha = 0.05$, wishes to determine whether output voltage is adequately described by a normal distribution.
Exercise 9-101
The number of cars passing eastbound through the intersection of Mill and University Avenues has been tabulated by a group of civil engineering students. They have obtained the data in the adjacent table:
Data: (40, 14), (41, 24), …
(a) Does the assumption of a Poisson distribution seem appropriate as a probability model for this process? Use $\alpha = 0.05$.
(b) Calculate the P-value for this test.
Contingency Table Tests
Test the hypothesis that two methods of classification are statistically independent.
Based on the chi-square distribution
$n$ observations, $r \times c$ contingency table
$O_{ij}$: the observed frequency for level $i$ of the first classification and level $j$ for the second classification
$\hat{u}_i = \dfrac{1}{n}\sum_{j=1}^{c} O_{ij}$, $\hat{v}_j = \dfrac{1}{n}\sum_{i=1}^{r} O_{ij}$, $E_{ij} = n\hat{u}_i\hat{v}_j$
Test statistic: $X_0^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$
P-value: $P = P(\chi_{(r-1)(c-1)}^2 > \chi_0^2)$
Reject the hypothesis if $\chi_0^2 > \chi_{\alpha,\,(r-1)(c-1)}^2$
Example 9-14 Health Insurance Plan Preference
A company has to choose among three health insurance plans. Management wishes to know whether the preference for plans is independent of job classification and wants to use $\alpha = 0.05$.
$n = 500$, data: …
Exercise 9-107
A study is being made of the failure of an electronic component. There are four types of failures possible and two mounting positions for the device.
                      Failure type
Mounting position     A     B     C     D
1                     20    48    20    7
2                     4     17    6     12
Would you conclude that the type of failure is independent of the mounting position? Use $\alpha = 0.01$. Find the P-value for this test.
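The contingency table statistic is straightforward to compute from the marginal totals, since $E_{ij} = n\hat{u}_i\hat{v}_j$ equals (row total)(column total)/$n$. The sketch below applies it to the 2 by 4 failure table shown in the exercise; the function name is made up for this illustration, and the comparison with a chi-square critical value or the P-value is left to a table or a statistics library.

```python
def chi_square_independence(table):
    """Return (chi0_sq, df) for an r x c table of observed frequencies."""
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    chi0_sq = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n  # E_ij = n * u_i * v_j
            chi0_sq += (observed - expected) ** 2 / expected

    df = (len(table) - 1) * (len(table[0]) - 1)  # (r - 1)(c - 1)
    return chi0_sq, df


# Rows: mounting positions 1 and 2; columns: failure types A, B, C, D.
failures = [
    [20, 48, 20, 7],
    [4, 17, 6, 12],
]
chi0_sq, df = chi_square_independence(failures)
print(f"chi0^2 = {chi0_sq:.2f} with {df} degrees of freedom")
```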
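Nonparametric Procedures
The sign test
Test hypotheses about the median $\tilde{\mu}$ of a continuous distribution.
$r^+$: the observed number of plus signs ($X_i - \tilde{\mu}_0 > 0$)
Hypotheses, two-sided alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} \neq \tilde{\mu}_0$
P-value: $P = 2P(R^+ \le r^+ \text{ when } p = \tfrac{1}{2})$ if $r^+ < n/2$, or $P = 2P(R^+ \ge r^+ \text{ when } p = \tfrac{1}{2})$ if $r^+ > n/2$
Reject $H_0$ if $P \le \alpha$
Hypotheses, upper-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} > \tilde{\mu}_0$
P-value: $P = P(R^+ \ge r^+ \text{ when } p = \tfrac{1}{2})$
Reject $H_0$ if $P \le \alpha$
Hypotheses, lower-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} < \tilde{\mu}_0$
P-value: $P = P(R^+ \le r^+ \text{ when } p = \tfrac{1}{2})$
Reject $H_0$ if $P \le \alpha$
Appendix Table VIII ($r_\alpha^*$)
Hypotheses, two-sided alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} \neq \tilde{\mu}_0$
Reject $H_0$ if $r = \min(r^+, r^-) \le r_\alpha^*$
Hypotheses, upper-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} > \tilde{\mu}_0$
Reject $H_0$ if $r^- \le r_\alpha^*$
Hypotheses, lower-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} < \tilde{\mu}_0$
Reject $H_0$ if $r^+ \le r_\alpha^*$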
Ties in the sign test
Values of $X_i$ exactly equal to $\tilde{\mu}_0$ should be set aside and the sign test applied to the remaining data.
Normal approximation for sign test statistic
$Z_0 = \dfrac{R^+ - 0.5n}{0.5\sqrt{n}}$
Reject $H_0$ if $|z_0| > z_{\alpha/2}$ for $H_1: \tilde{\mu} \neq \tilde{\mu}_0$, or if $z_0 > z_\alpha$ for $H_1: \tilde{\mu} > \tilde{\mu}_0$, or if $z_0 < -z_\alpha$ for $H_1: \tilde{\mu} < \tilde{\mu}_0$
Type II error for the sign test
Finding the probability of type II error
Not only must a particular value of $\tilde{\mu}$, say $\tilde{\mu}_0 + \Delta$, be used, but the form of the underlying distribution will also affect the calculations.
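The exact two-sided P-value of the sign test is a binomial tail probability with $p = 1/2$, which the standard library can compute with `math.comb`. In the sketch below, `r_plus = 14` out of `n = 20` is a purely illustrative count, not a value taken from this transcript, and returning 1.0 when $r^+ = n/2$ is a convention chosen for the sketch because the slide rule does not cover that case.

```python
from math import comb


def sign_test_two_sided_p(r_plus, n):
    """Exact two-sided sign test P-value; R+ ~ Binomial(n, 1/2) under H0."""

    def pmf(k):
        return comb(n, k) / 2 ** n

    if r_plus < n / 2:
        tail = sum(pmf(k) for k in range(0, r_plus + 1))  # P(R+ <= r+)
    elif r_plus > n / 2:
        tail = sum(pmf(k) for k in range(r_plus, n + 1))  # P(R+ >= r+)
    else:
        return 1.0  # r+ exactly n/2: convention of this sketch
    return min(1.0, 2 * tail)


# Illustrative values only: 14 plus signs among 20 non-tied observations.
p_value = sign_test_two_sided_p(r_plus=14, n=20)
print(f"P-value = {p_value:.4f}")
```

Tied observations should be removed before counting, so `n` here is the number of observations that differ from $\tilde{\mu}_0$.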
Wilcoxon signed-rank test
Appendix Table IX ($w_\alpha^*$)
Rank the absolute differences $|X_i - \mu_0|$ in ascending order, and then give the ranks the signs of their corresponding differences.
$w^+$: the sum of the positive ranks
$w^-$: the absolute value of the sum of negative ranks
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Reject $H_0$ if $w = \min(w^+, w^-) \le w_\alpha^*$
Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
Reject $H_0$ if $w^- \le w_\alpha^*$
Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
Reject $H_0$ if $w^+ \le w_\alpha^*$
Ties in the Wilcoxon signed-rank test
If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.
Normal approximation for Wilcoxon signed-rank test statistic
$Z_0 = \dfrac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$
Reject $H_0$ if $|z_0| > z_{\alpha/2}$ for $H_1: \mu \neq \mu_0$, or if $z_0 > z_\alpha$ for $H_1: \mu > \mu_0$, or if $z_0 < -z_\alpha$ for $H_1: \mu < \mu_0$
Example 9-15 Propellant Shear Strength Sign Test
We would like to test the hypothesis that the median shear strength is 13790 kN/m², using $\alpha = 0.05$ ($n = 20$).
Example 9-16 Propellant Shear Strength Wilcoxon Signed-Rank Test
We would like to test the hypothesis that the median shear strength is 13790 kN/m², using $\alpha = 0.05$ ($n = 20$).
Exercise 9-117
A primer paint can be used on aluminum panels. The drying time of the primer is an important consideration in the manufacturing process. Twenty panels are selected and the drying times are as follows: 1.6, …
Is there evidence that the mean drying time of the primer exceeds 1.5 hr?