
Statistics: Tests of Hypotheses for a Single Sample

Contents, figures, and exercises come from the textbook: Applied Statistics and Probability for Engineers, 5th Edition, by Douglas C. Montgomery, John Wiley & Sons, Inc., 2011.

Hypothesis Testing

Statistical hypothesis
A statistical hypothesis is a statement about the parameters of one or more populations.

For example, $H_0: \mu = 50$ centimeters per second is the null hypothesis and $H_1: \mu \neq 50$ centimeters per second is a two-sided alternative hypothesis.

Type I error
Rejecting the null hypothesis $H_0$ when it is true is defined as a type I error.

Type II error
Failing to reject the null hypothesis $H_0$ when it is false is defined as a type II error.

Probability of type I error: $\alpha$ = P(type I error) = P(reject $H_0$ when $H_0$ is true)

Probability of type II error: $\beta$ = P(type II error) = P(fail to reject $H_0$ when $H_0$ is false)

                                Null hypothesis (H0) is true       Null hypothesis (H0) is false
Reject null hypothesis          Type I error (false positive)      Correct outcome (true positive)
Fail to reject null hypothesis  Correct outcome (true negative)    Type II error (false negative)

From Wikipedia, http://www.wikipedia.org.

Properties
The size of the critical region, and hence $\alpha$, can be reduced by appropriate selection of the critical values.

Type I and type II errors are related. For a fixed sample size, decreasing one will increase the other.

An increase in sample size reduces $\beta$ (with $\alpha$ held constant).

$\beta$ increases as the true value of the parameter approaches the value hypothesized in the null hypothesis.

$\alpha = 0.05$ is widely used.

Power
The probability of correctly rejecting a false null hypothesis, $1 - \beta$.

Sensitivity: the ability to detect differences.

Formulating one-sided hypotheses
$H_0: \mu = 1.5$ MPa, $H_1: \mu > 1.5$ MPa (what we want to show)
or
$H_0: \mu = 1.5$ MPa, $H_1: \mu < 1.5$ MPa (what we want to show)

P-value
The P-value is the smallest level of significance that would lead to rejection of the null hypothesis $H_0$ with the given data.

General procedure for hypothesis tests
Specify the test statistic to be used (such as $Z_0$).

Specify the location of the critical region (two-tailed, upper-tailed, or lower-tailed).

Specify the criteria for rejection (typically, the value of $\alpha$, or the P-value at which rejection should occur).

Practical significance
Be careful when interpreting the results from hypothesis testing when the sample size is large, because any small departure from the hypothesized value will probably be detected, even when the difference is of little or no practical significance.

Example 9-1 Propellant Burning Rate
Suppose that if the burning rate is less than 50 centimeters per second, we wish to show this with a strong conclusion.

$H_0: \mu = 50$ centimeters per second
$H_1: \mu < 50$ centimeters per second

Since the rejection of $H_0$ is always a strong conclusion, this statement of the hypotheses will produce the desired outcome if $H_0$ is rejected.

Exercise 9-27
A random sample of 500 registered voters in Phoenix is asked if they favor the use of oxygenated fuels year-round to reduce air pollution. If more than 400 voters respond positively, we will conclude that more than 60% of the voters favor the use of these fuels.

(a) Find the probability of type I error if exactly 60% of the voters favor the use of these fuels.

(b) What is the type II error probability if 75% of the voters favor this action?

Hint: use the normal approximation to the binomial.
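The hint can be sketched numerically as follows. This is a minimal illustration, assuming the plain normal approximation to the binomial without a continuity correction; the helper name `reject_probability` is hypothetical and not part of the exercise.

```python
from statistics import NormalDist

def reject_probability(n, p, cutoff):
    """Approximate P(X > cutoff) for X ~ Binomial(n, p) with a normal curve.

    Assumption: no continuity correction is applied.
    """
    mean = n * p
    sd = (n * p * (1 - p)) ** 0.5
    return 1.0 - NormalDist().cdf((cutoff - mean) / sd)

n, cutoff = 500, 400

# (a) Type I error: we reject although exactly 60% favor the fuels.
alpha = reject_probability(n, 0.60, cutoff)

# (b) Type II error: we fail to reject although 75% favor the fuels.
beta = 1.0 - reject_probability(n, 0.75, cutoff)

print(f"alpha = {alpha:.4g}, beta = {beta:.4f}")
```

With a cutoff this far above the hypothesized mean count, the approximation gives a type I error probability that is essentially zero and a type II error probability close to one.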

Tests on the Mean of a Normal Distribution, Variance Known

Hypothesis tests on the mean
Test statistic: $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}}$

Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
P-value: $P = 2[1 - \Phi(|z_0|)]$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$

Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$
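The three variants of the z-test can be sketched in a few lines. This is an illustrative helper, not code from the textbook; the function name `z_test` and the `alternative` labels are assumptions, and the numbers are those of the propellant burning rate example ($\bar{x} = 51.3$, $\mu_0 = 50$, $\sigma = 2$, $n = 25$).

```python
from math import sqrt
from statistics import NormalDist

def z_test(xbar, mu0, sigma, n, alternative="two-sided"):
    """Return (z0, P-value) for a test on the mean with known variance."""
    z0 = (xbar - mu0) / (sigma / sqrt(n))
    phi = NormalDist().cdf  # standard normal CDF
    if alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z0)))
    elif alternative == "greater":   # H1: mu > mu0
        p_value = 1 - phi(z0)
    elif alternative == "less":      # H1: mu < mu0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z0, p_value

z0, p_value = z_test(xbar=51.3, mu0=50.0, sigma=2.0, n=25)
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```

Comparing the P-value with $\alpha$ gives the same decision as comparing $z_0$ with the critical values.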

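Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

Suppose the true value of the mean under $H_1$ is $\mu = \mu_0 + \delta$.

Test statistic: $Z_0 = \dfrac{\bar{X} - \mu_0}{\sigma/\sqrt{n}} = \dfrac{\bar{X} - (\mu_0 + \delta)}{\sigma/\sqrt{n}} + \dfrac{\delta\sqrt{n}}{\sigma}$

Under $H_1$, $Z_0 \sim N(\delta\sqrt{n}/\sigma,\ 1)$, so

$$\beta = \Phi\left(z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right) - \Phi\left(-z_{\alpha/2} - \frac{\delta\sqrt{n}}{\sigma}\right)$$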
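The type II error probability for the two-sided z-test can be evaluated directly from the difference of two normal CDF values. The sketch below assumes the standard formula $\beta = \Phi(z_{\alpha/2} - \delta\sqrt{n}/\sigma) - \Phi(-z_{\alpha/2} - \delta\sqrt{n}/\sigma)$; the function name is hypothetical, and the inputs correspond to a true mean of 49 against $\mu_0 = 50$ with $\sigma = 2$, $n = 25$, $\alpha = 0.05$.

```python
from math import sqrt
from statistics import NormalDist

def beta_two_sided(delta, sigma, n, alpha):
    """Type II error probability of the two-sided z-test when mu = mu0 + delta."""
    nd = NormalDist()
    z_half_alpha = nd.inv_cdf(1 - alpha / 2)   # upper alpha/2 percentage point
    shift = delta * sqrt(n) / sigma            # mean of Z0 under H1
    return nd.cdf(z_half_alpha - shift) - nd.cdf(-z_half_alpha - shift)

beta = beta_two_sided(delta=49.0 - 50.0, sigma=2.0, n=25, alpha=0.05)
power = 1 - beta
print(f"beta = {beta:.3f}, power = {power:.3f}")
```

Because the formula is symmetric in the sign of $\delta$, a true mean of 51 gives the same $\beta$ as a true mean of 49.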

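Type II error and choice of sample size
Sample size formulas
If $\delta > 0$, $\beta \approx \Phi\left(z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}\right)$.

Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then $\beta = \Phi(-z_\beta)$, so $-z_\beta \approx z_{\alpha/2} - \dfrac{\delta\sqrt{n}}{\sigma}$.

Sample size for a two-sided test on the mean, variance known:
$$n \approx \frac{(z_{\alpha/2} + z_\beta)^2 \sigma^2}{\delta^2}, \quad \text{where } \delta = \mu - \mu_0$$

Sample size for a one-sided test on the mean, variance known:
$$n = \frac{(z_\alpha + z_\beta)^2 \sigma^2}{\delta^2}, \quad \text{where } \delta = \mu - \mu_0$$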
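The sample size formulas can be sketched as a small helper. This is an illustration under the stated formulas, with exact normal percentage points instead of rounded table values; the function name and the example inputs ($\delta = 1$, $\sigma = 2$, $\alpha = 0.05$, $\beta = 0.10$) are assumptions chosen to match the propellant setting.

```python
from math import ceil
from statistics import NormalDist

def sample_size_mean(delta, sigma, alpha, beta, two_sided=True):
    """Approximate n for a z-test on the mean to reach type II error beta."""
    nd = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = nd.inv_cdf(1 - tail)   # z_{alpha/2} or z_alpha
    z_beta = nd.inv_cdf(1 - beta)    # z_beta
    return ((z_alpha + z_beta) * sigma / delta) ** 2

n_raw = sample_size_mean(delta=1.0, sigma=2.0, alpha=0.05, beta=0.10)
n_needed = ceil(n_raw)  # round up to a whole number of observations
print(f"n is approximately {n_raw:.2f}; use n = {n_needed}")
```

Hand calculations with rounded table values ($z_{0.025} = 1.96$, $z_{0.10} = 1.28$) give a slightly smaller figure, so the rounded-up result can differ by one observation.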

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d = |\mu - \mu_0|/\sigma = |\delta|/\sigma$ for various sample sizes $n$.

See Appendix VII.
For a given $n$ and $d$, find $\beta$.
For a given $\beta$ and $d$, find $n$.

Large-sample test
If $n > 40$, the sample standard deviation $s$ can be substituted for $\sigma$ in the test procedures with little effect.

Example 9-2 Propellant Burning Rate
$\sigma = 2$, $\alpha = 0.05$, $n = 25$, $\bar{x} = 51.3$
Specifications require that the mean burning rate must be 50 centimeters per second. What conclusions should be drawn?

Example 9-3 Propellant Burning Rate Type II Error
Suppose that the true burning rate is 49 centimeters per second. What is $\beta$ for the two-sided test with $\sigma = 2$, $\alpha = 0.05$, and $n = 25$?

Example 9-4 Propellant Burning Rate Type II Error from OC Curve
Suppose the true mean burning rate is $\mu = 51$ centimeters per second.

$d = \dfrac{|\mu - \mu_0|}{\sigma} = \dfrac{|\delta|}{\sigma} = \dfrac{1}{2}$

Example 9-4 Propellant Burning Rate Sample Size from OC Curve
Design the test so that if the true mean burning rate differs from 50 centimeters per second by as much as 1 centimeter per second, the test will detect this with a high probability, 0.90.

Exercise 9-47
Medical researchers have developed a new artificial heart constructed primarily of titanium and plastic. The heart will last and operate almost indefinitely once it is implanted in the patient’s body, but the battery pack needs to be recharged about every four hours. A random sample of 50 battery packs is selected and subjected to a life test. The average life of these batteries is 4.05 hours. Assume that battery life is normally distributed with standard deviation $\sigma = 0.2$ hour.

(a) Is there evidence to support the claim that mean battery life exceeds 4 hours? Use $\alpha = 0.05$.

(b) What is the P-value for the test in part (a)?

Exercise 9-47 (continued)
(c) Compute the power of the test if the true mean battery life is 4.05 hours.

(d) What sample size would be required to detect a true mean battery life of 4.5 hours if we wanted the power of the test to be at least 0.9?

(e) Explain how the question in part (a) could be answered by constructing a one-sided confidence bound on the mean life.

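Tests on the Mean of a Normal Distribution, Variance Unknown

Hypothesis tests on the mean
Test statistic: $T_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
P-value: $P = 2P(T_{n-1} > |t_0|)$
Reject $H_0$ if $t_0 > t_{\alpha/2,n-1}$ or $t_0 < -t_{\alpha/2,n-1}$

Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
P-value: $P = P(T_{n-1} > t_0)$
Reject $H_0$ if $t_0 > t_{\alpha,n-1}$

Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
P-value: $P = P(T_{n-1} < t_0)$
Reject $H_0$ if $t_0 < -t_{\alpha,n-1}$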

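Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$

Suppose the true value of the mean under $H_1$ is $\mu = \mu_0 + \delta$.

Test statistic: $T_0 = \dfrac{\bar{X} - \mu_0}{S/\sqrt{n}}$

Under $H_1$, $T_0$ is of the noncentral $t$ distribution with $n - 1$ degrees of freedom and noncentrality parameter $\delta\sqrt{n}/\sigma$.

[Figure: PDF of the noncentral $t$ distribution. From Wikipedia, http://www.wikipedia.org.]

$$\beta = P\{-t_{\alpha/2,n-1} \le T_0 \le t_{\alpha/2,n-1} \mid \delta \neq 0\} = P\{-t_{\alpha/2,n-1} \le T_0' \le t_{\alpha/2,n-1}\}$$

where $T_0'$ denotes the noncentral $t$ random variable.

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $d$ for various sample sizes $n$.

See Appendix VII.
Note that $d = |\delta|/\sigma$ depends on the unknown parameter $\sigma$.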

Example 9-6 Golf Club Design
It is of interest to determine if there is evidence (with $\alpha = 0.05$) to support a claim that the mean coefficient of restitution exceeds 0.82.

Data: 0.8411, …; $n = 15$, $\bar{x} = 0.83725$, and $s = 0.02456$

Example 9-7 Golf Club Design Sample Size
If the mean coefficient of restitution exceeds 0.82 by as much as 0.02, is the sample size $n = 15$ adequate to ensure that $H_0: \mu = 0.82$ will be rejected with probability at least 0.8?
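The upper-tailed t-test for the golf club data can be sketched with the standard library alone. The P-value is obtained here by numerically integrating the Student $t$ density with Simpson's rule, which is an implementation choice made for this sketch (a statistics package would normally supply the $t$ CDF); the function names are hypothetical, and $\alpha = 0.05$ is assumed.

```python
from math import gamma, pi, sqrt

def t_statistic(xbar, mu0, s, n):
    """Test statistic t0 = (xbar - mu0) / (s / sqrt(n))."""
    return (xbar - mu0) / (s / sqrt(n))

def t_upper_tail(t0, df, steps=2000):
    """P(T_df > t0), via Simpson's rule on the t density from 0 to t0."""
    c = gamma((df + 1) / 2) / (sqrt(df * pi) * gamma(df / 2))
    pdf = lambda t: c * (1 + t * t / df) ** (-(df + 1) / 2)
    h = t0 / steps                      # steps must be even for Simpson's rule
    total = pdf(0.0) + pdf(t0)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 - total * h / 3          # the density is symmetric about 0

n = 15
t0 = t_statistic(xbar=0.83725, mu0=0.82, s=0.02456, n=n)
p_value = t_upper_tail(t0, df=n - 1)
print(f"t0 = {t0:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For a two-sided alternative the same helper can be used as `2 * t_upper_tail(abs(t0), df)`.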

Exercise 9-59
A 1992 article in the Journal of the American Medical Association (“A Critical Appraisal of 98.6 Degrees F, the Upper Limit of the Normal Body Temperature, and Other Legacies of Carl Reinhold August Wunderlich”) reported body temperature, gender, and heart rate for a number of subjects. The body temperatures for 25 female subjects follow: 97.8, …

(a) Test the hypothesis $H_0: \mu = 98.6$ versus $H_1: \mu \neq 98.6$ using $\alpha = 0.05$. Find the P-value.

(b) Check the assumption that female body temperature is normally distributed.

(c) Compute the power of the test if the true mean female body temperature is as low as 98.0.

(d) What sample size would be required to detect a true mean female body temperature as low as 98.2 if we wanted the power of the test to be at least 0.9?

(e) Explain how the question in part (a) could be answered by constructing a two-sided confidence interval on the mean female body temperature.

Exercise 9-59
[Figure: normality plot]

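Tests on the Variance and Standard Deviation of a Normal Distribution

Hypothesis tests on the variance
Test statistic: $X_0^2 = \dfrac{(n-1)S^2}{\sigma_0^2}$

Hypotheses, two-sided alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$
P-value: $P = 2\min\{P(X_{n-1}^2 > \chi_0^2),\ P(X_{n-1}^2 < \chi_0^2)\}$
Reject $H_0$ if $\chi_0^2 > \chi_{\alpha/2,n-1}^2$ or $\chi_0^2 < \chi_{1-\alpha/2,n-1}^2$

Hypotheses, upper-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$
P-value: $P = P(X_{n-1}^2 > \chi_0^2)$
Reject $H_0$ if $\chi_0^2 > \chi_{\alpha,n-1}^2$

Hypotheses, lower-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$
P-value: $P = P(X_{n-1}^2 < \chi_0^2)$
Reject $H_0$ if $\chi_0^2 < \chi_{1-\alpha,n-1}^2$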

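Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 \neq \sigma_0^2$

Suppose the true value of the variance under $H_1$ is $\sigma^2$. Since $(n-1)S^2/\sigma^2$ has a chi-square distribution with $n - 1$ degrees of freedom,

$$\beta = P\left\{\frac{\sigma_0^2}{\sigma^2}\chi_{1-\alpha/2,n-1}^2 \le X_{n-1}^2 \le \frac{\sigma_0^2}{\sigma^2}\chi_{\alpha/2,n-1}^2\right\}$$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, upper-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 > \sigma_0^2$

Suppose the true value of the variance under $H_1$ is $\sigma^2$.

$$\beta = P\left\{X_{n-1}^2 \le \frac{\sigma_0^2}{\sigma^2}\chi_{\alpha,n-1}^2\right\}$$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, lower-tailed alternative: $H_0: \sigma^2 = \sigma_0^2$, $H_1: \sigma^2 < \sigma_0^2$

Suppose the true value of the variance under $H_1$ is $\sigma^2$.

$$\beta = P\left\{X_{n-1}^2 \ge \frac{\sigma_0^2}{\sigma^2}\chi_{1-\alpha,n-1}^2\right\}$$

Operating characteristic (OC) curves
Curves plotting $\beta$ against a parameter $\lambda = \sigma/\sigma_0$ for various sample sizes $n$.

See Appendix VII.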

Example 9-8 Automated Filling
$n = 20$, $s^2 = 0.0153$ (fluid ounces)$^2$, $\alpha = 0.05$.
Is there evidence in the sample data to suggest that the manufacturer has a problem with underfilled or overfilled bottles? ($\sigma_0^2 = 0.01$)

Example 9-8 Automated Filling Sample Size
$\sigma_0 = 0.10$, $\sigma = 0.125$. Find $\beta$.
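The chi-square test on the variance for the filling example can be sketched as follows. The upper-tail probability of the chi-square distribution is computed with the standard recurrence for integer degrees of freedom, an implementation choice made so the sketch needs only the standard library; the function names are hypothetical, and an upper-tailed alternative ($H_1: \sigma^2 > 0.01$) with $\alpha = 0.05$ is assumed.

```python
from math import erfc, exp, gamma, sqrt

def chi2_upper_tail(x, df):
    """P(X^2_df > x) for integer df, using Q(k+2) = Q(k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)."""
    half = x / 2.0
    if df % 2:                      # odd df: start from df = 1
        q, k = erfc(sqrt(half)), 1
    else:                           # even df: start from df = 2
        q, k = exp(-half), 2
    while k < df:
        q += half ** (k / 2) * exp(-half) / gamma(k / 2 + 1)
        k += 2
    return q

def variance_test_statistic(s2, sigma0_sq, n):
    """Test statistic chi0^2 = (n - 1) s^2 / sigma0^2."""
    return (n - 1) * s2 / sigma0_sq

n = 20
chi2_0 = variance_test_statistic(s2=0.0153, sigma0_sq=0.01, n=n)
p_value = chi2_upper_tail(chi2_0, df=n - 1)
print(f"chi0^2 = {chi2_0:.2f}, P-value = {p_value:.4f}")
print("reject H0" if p_value < 0.05 else "fail to reject H0")
```

For a lower-tailed alternative the P-value would be `1 - chi2_upper_tail(chi2_0, df)`.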

Exercise 9-83
Recall the sugar content of the syrup in canned peaches from Exercise 8-46. Suppose that the variance is thought to be $\sigma^2 = 18$ (milligrams)$^2$. Recall that a random sample of $n = 10$ cans yields a sample standard deviation of $s = 4.8$ milligrams.

(a) Test the hypothesis $H_0: \sigma^2 = 18$ versus $H_1: \sigma^2 \neq 18$ using $\alpha = 0.05$. Find the P-value for this test.

(b) Suppose that the actual standard deviation is twice as large as the hypothesized value. What is the probability that this difference will be detected by the test described in part (a)?

(c) Suppose that the true variance is $\sigma^2 = 40$. How large a sample would be required to detect this difference with probability at least 0.90?

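Tests on a Population Proportion

Large-sample tests on a proportion
Test statistic: $Z_0 = \dfrac{X - np_0}{\sqrt{np_0(1 - p_0)}}$

Hypotheses, two-sided alternative: $H_0: p = p_0$, $H_1: p \neq p_0$
P-value: $P = 2[1 - \Phi(|z_0|)]$
Reject $H_0$ if $z_0 > z_{\alpha/2}$ or $z_0 < -z_{\alpha/2}$

Hypotheses, upper-tailed alternative: $H_0: p = p_0$, $H_1: p > p_0$
P-value: $P = 1 - \Phi(z_0)$
Reject $H_0$ if $z_0 > z_\alpha$

Hypotheses, lower-tailed alternative: $H_0: p = p_0$, $H_1: p < p_0$
P-value: $P = \Phi(z_0)$
Reject $H_0$ if $z_0 < -z_\alpha$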
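The large-sample test on a proportion can be sketched in the same way as the z-test on a mean. The function name and the `alternative` labels are assumptions; the numbers follow the engine controller example (4 defectives in a sample of 200), with $p_0 = 0.05$ and a lower-tailed alternative assumed.

```python
from math import sqrt
from statistics import NormalDist

def proportion_z_test(x, n, p0, alternative="two-sided"):
    """Return (z0, P-value) for the large-sample test of H0: p = p0."""
    z0 = (x - n * p0) / sqrt(n * p0 * (1 - p0))
    phi = NormalDist().cdf
    if alternative == "two-sided":
        p_value = 2 * (1 - phi(abs(z0)))
    elif alternative == "greater":   # H1: p > p0
        p_value = 1 - phi(z0)
    elif alternative == "less":      # H1: p < p0
        p_value = phi(z0)
    else:
        raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")
    return z0, p_value

z0, p_value = proportion_z_test(x=4, n=200, p0=0.05, alternative="less")
print(f"z0 = {z0:.2f}, P-value = {p_value:.4f}")
```

The normal approximation behind this test is usually considered adequate only when both $np_0$ and $n(1 - p_0)$ are reasonably large.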

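Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, two-sided alternative: $H_0: p = p_0$, $H_1: p \neq p_0$

Suppose the true value of the proportion under $H_1$ is $p$.

$$\beta = P\left\{p_0 - z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \le \hat{P} \le p_0 + z_{\alpha/2}\sqrt{p_0(1-p_0)/n} \;\middle|\; H_1\right\}$$

$$= \Phi\left(\frac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right) - \Phi\left(\frac{p_0 - p - z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, upper-tailed alternative: $H_0: p = p_0$, $H_1: p > p_0$

Suppose the true value of the proportion under $H_1$ is $p$.

$$\beta = P\left\{\hat{P} \le p_0 + z_\alpha\sqrt{p_0(1-p_0)/n} \;\middle|\; H_1\right\} = \Phi\left(\frac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$

Type II error and choice of sample size
Finding the probability of type II error
Hypotheses, lower-tailed alternative: $H_0: p = p_0$, $H_1: p < p_0$

Suppose the true value of the proportion under $H_1$ is $p$.

$$\beta = P\left\{\hat{P} \ge p_0 - z_\alpha\sqrt{p_0(1-p_0)/n} \;\middle|\; H_1\right\} = 1 - \Phi\left(\frac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$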
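The type II error probabilities for the proportion test can be evaluated numerically. The sketch below assumes the standard normal-approximation formulas for the two-sided, upper-tailed, and lower-tailed cases; the function name is hypothetical, and the example call assumes a lower-tailed test with $p_0 = 0.05$, true $p = 0.03$, $n = 200$, and $\alpha = 0.05$.

```python
from math import sqrt
from statistics import NormalDist

def beta_proportion(p, p0, n, alpha, alternative):
    """Type II error probability of the large-sample proportion test at true p."""
    nd = NormalDist()
    se0 = sqrt(p0 * (1 - p0) / n)   # standard error under H0
    se1 = sqrt(p * (1 - p) / n)     # standard error under the true p
    if alternative == "two-sided":
        z = nd.inv_cdf(1 - alpha / 2)
        return nd.cdf((p0 - p + z * se0) / se1) - nd.cdf((p0 - p - z * se0) / se1)
    z = nd.inv_cdf(1 - alpha)
    if alternative == "greater":    # H1: p > p0
        return nd.cdf((p0 - p + z * se0) / se1)
    if alternative == "less":       # H1: p < p0
        return 1 - nd.cdf((p0 - p - z * se0) / se1)
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

beta = beta_proportion(p=0.03, p0=0.05, n=200, alpha=0.05, alternative="less")
print(f"beta = {beta:.3f}")
```

A value of $\beta$ this large indicates that a sample of 200 has little chance of detecting a shift of this size.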

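Type II error and choice of sample size
Two-sided alternative
Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then $\beta = \Phi(-z_\beta)$.

If $p > p_0$,
$$\beta \approx \Phi\left(\frac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$

$$-z_\beta \approx \frac{p_0 - p + z_{\alpha/2}\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$$

$$n = \left[\frac{z_{\alpha/2}\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$$

Type II error and choice of sample size
Upper-tailed alternative
Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then

$$\beta = \Phi\left(\frac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$

$$-z_\beta = \frac{p_0 - p + z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$$

$$n = \left[\frac{z_\alpha\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$$

Type II error and choice of sample size
Lower-tailed alternative
Let $z_\beta$ be the $100\beta$ upper percentile of the standard normal distribution. Then

$$\beta = 1 - \Phi\left(\frac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}\right)$$

$$z_\beta = \frac{p_0 - p - z_\alpha\sqrt{p_0(1-p_0)/n}}{\sqrt{p(1-p)/n}}$$

$$n = \left[\frac{z_\alpha\sqrt{p_0(1-p_0)} + z_\beta\sqrt{p(1-p)}}{p - p_0}\right]^2$$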
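The sample size formulas for the proportion test share one form, differing only in whether $z_{\alpha/2}$ or $z_\alpha$ is used. A minimal sketch follows; the function name is hypothetical, and the example inputs ($p_0 = 0.05$, $p = 0.03$, $\alpha = 0.05$, $\beta = 0.10$, one-sided) are assumed for illustration.

```python
from math import ceil, sqrt
from statistics import NormalDist

def sample_size_proportion(p, p0, alpha, beta, two_sided=False):
    """Approximate n so the proportion test has type II error beta at true p."""
    nd = NormalDist()
    tail = alpha / 2 if two_sided else alpha
    z_alpha = nd.inv_cdf(1 - tail)   # z_{alpha/2} or z_alpha
    z_beta = nd.inv_cdf(1 - beta)    # z_beta
    numerator = z_alpha * sqrt(p0 * (1 - p0)) + z_beta * sqrt(p * (1 - p))
    return (numerator / (p - p0)) ** 2   # squaring removes the sign of p - p0

n_raw = sample_size_proportion(p=0.03, p0=0.05, alpha=0.05, beta=0.10)
print(f"n is approximately {n_raw:.1f}; use n = {ceil(n_raw)}")
```

Because the result depends on the assumed true proportion $p$, it is worth recomputing for a few plausible values of $p$ before fixing the sample size.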

Example 9-10 Automobile Engine Controller
$H_0: p = 0.05$, $H_1: p < 0.05$, $n = 200$
The semiconductor manufacturer takes a random sample of 200 devices and finds that four of them are defective. Can the manufacturer demonstrate process capability for the customer? ($\alpha = 0.05$)

Example 9-11 Automobile Engine Controller Type II Error
Suppose that its process fallout is really $p = 0.03$. What is the $\beta$-error for a test of process capability that uses $n = 200$ and $\alpha = 0.05$?

Exercise 9-95
In a random sample of 85 automobile engine crankshaft bearings, 10 have a surface finish roughness that exceeds the specifications. Do these data present strong evidence that the proportion of crankshaft bearings exhibiting excess surface roughness exceeds 0.10?

(a) State and test the appropriate hypotheses using $\alpha = 0.05$.

(b) If it is really the situation that $p = 0.15$, how likely is it that the test procedure in part (a) will not reject the null hypothesis?

(c) If $p = 0.15$, how large would the sample size have to be for us to have a probability of correctly rejecting the null hypothesis of 0.9?

Testing for Goodness of Fit

Test the hypothesis that a particular distribution will be satisfactory as a population model.

Based on the chi-square distribution.
$n$ observations; $p$ is the number of parameters of the hypothesized distribution estimated by sample statistics.

$O_i$: the observed frequency in the $i$th class interval

$E_i$: the expected frequency in the $i$th class interval

Test statistic: $X_0^2 = \sum_{i=1}^{k} \dfrac{(O_i - E_i)^2}{E_i}$

P-value: $P = P(X_{k-p-1}^2 > \chi_0^2)$
Reject the hypothesis if $\chi_0^2 > \chi_{\alpha,k-p-1}^2$

Example 9-12 Printed Circuit Board Defects, Poisson Distribution
Number of defects: 0, observed frequency: 32
Number of defects: 1, observed frequency: 4

Example 9-13 Power Supply Distribution, Continuous Distribution
$n = 100$, $\bar{x} = 5.04$, $s = 0.08$
A manufacturing engineer is testing a power supply used in a notebook computer and, using $\alpha = 0.05$, wishes to determine whether output voltage is adequately described by a normal distribution.

Exercise 9-101
The number of cars passing eastbound through the intersection of Mill and University Avenues has been tabulated by a group of civil engineering students. They have obtained the data in the adjacent table:

(a) Does the assumption of a Poisson distribution seem appropriate as a probability model for this process? Use $\alpha = 0.05$.

(b) Calculate the P-value for this test.

Data: (40, 14), (41, 24), …

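Contingency Table Tests

Test the hypothesis that two methods of classification are statistically independent.

Based on the chi-square distribution.
$n$ observations, $r \times c$ contingency table
$O_{ij}$: the observed frequency for level $i$ of the first classification and level $j$ of the second classification

Expected frequencies: $E_{ij} = n\hat{u}_i\hat{v}_j$, where $\hat{u}_i = \dfrac{1}{n}\sum_{j=1}^{c} O_{ij}$ and $\hat{v}_j = \dfrac{1}{n}\sum_{i=1}^{r} O_{ij}$

Test statistic: $X_0^2 = \sum_{i=1}^{r}\sum_{j=1}^{c} \dfrac{(O_{ij} - E_{ij})^2}{E_{ij}}$

P-value: $P = P(X_{(r-1)(c-1)}^2 > \chi_0^2)$
Reject the hypothesis if $\chi_0^2 > \chi_{\alpha,(r-1)(c-1)}^2$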

Example 9-14 Health Insurance Plan Preference
A company has to choose among three health insurance plans. Management wishes to know whether the preference for plans is independent of job classification and wants to use $\alpha = 0.05$.

$n = 500$, data: …

Exercise 9-107
A study is being made of the failure of an electronic component. There are four types of failures possible and two mounting positions for the device.

Would you conclude that the type of failure is independent of the mounting position? Use the specified significance level $\alpha$, and find the P-value for this test.

                     Failure type
Mounting position    A     B     C     D
1                    20    48    20    7
2                    4     17    6     12
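The independence test for the failure-type table can be sketched as below. Expected counts are computed from the row and column totals, and the chi-square upper tail uses the standard recurrence for integer degrees of freedom so that only the standard library is needed; the function names are hypothetical, and the code illustrates the calculation rather than reproducing the textbook's solution.

```python
from math import erfc, exp, gamma, sqrt

def chi2_upper_tail(x, df):
    """P(X^2_df > x) for integer df, via Q(k+2) = Q(k) + (x/2)^(k/2) e^(-x/2) / Gamma(k/2 + 1)."""
    half = x / 2.0
    q, k = (erfc(sqrt(half)), 1) if df % 2 else (exp(-half), 2)
    while k < df:
        q += half ** (k / 2) * exp(-half) / gamma(k / 2 + 1)
        k += 2
    return q

def independence_test(table):
    """Return (chi0^2, degrees of freedom, P-value) for an r x c table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2_0 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # E_ij = n * u_i * v_j
            chi2_0 += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(col_totals) - 1)
    return chi2_0, df, chi2_upper_tail(chi2_0, df)

# Rows: mounting positions 1 and 2; columns: failure types A-D.
observed = [[20, 48, 20, 7],
            [4, 17, 6, 12]]
chi2_0, df, p_value = independence_test(observed)
print(f"chi0^2 = {chi2_0:.2f}, df = {df}, P-value = {p_value:.4f}")
```

The chi-square approximation is usually trusted only when the expected counts are not too small, so it is worth inspecting the `expected` values for sparse tables.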

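Nonparametric Procedures

The sign test
Test hypotheses about the median $\tilde{\mu}$ of a continuous distribution.
$R^+$: the observed number of plus signs ($X_i - \tilde{\mu}_0 > 0$)

Hypotheses, two-sided alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} \neq \tilde{\mu}_0$
P-value: $P = 2P(R^+ \le r^+ \text{ when } p = 1/2)$ if $r^+ < n/2$, or $P = 2P(R^+ \ge r^+ \text{ when } p = 1/2)$ if $r^+ > n/2$

Hypotheses, upper-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} > \tilde{\mu}_0$
P-value: $P = P(R^+ \ge r^+ \text{ when } p = 1/2)$

Hypotheses, lower-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} < \tilde{\mu}_0$
P-value: $P = P(R^+ \le r^+ \text{ when } p = 1/2)$

Reject $H_0$ if the P-value is less than $\alpha$.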

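Appendix Table VIII (critical values $r_\alpha^*$)

Hypotheses, two-sided alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} \neq \tilde{\mu}_0$
Reject $H_0$ if $r \le r_\alpha^*$, where $r = \min(r^+, r^-)$

Hypotheses, upper-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} > \tilde{\mu}_0$
Reject $H_0$ if $r^- \le r_\alpha^*$

Hypotheses, lower-tailed alternative: $H_0: \tilde{\mu} = \tilde{\mu}_0$, $H_1: \tilde{\mu} < \tilde{\mu}_0$
Reject $H_0$ if $r^+ \le r_\alpha^*$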

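Ties in the sign test
Values of $X_i$ exactly equal to $\tilde{\mu}_0$ should be set aside and the sign test applied to the remaining data.

Normal approximation for sign test statistic
$Z_0 = \dfrac{R^+ - 0.5n}{0.5\sqrt{n}}$

Reject $H_0$ if $|z_0| > z_{\alpha/2}$ for $H_1: \tilde{\mu} \neq \tilde{\mu}_0$, or if $z_0 > z_\alpha$ for $H_1: \tilde{\mu} > \tilde{\mu}_0$, or if $z_0 < -z_\alpha$ for $H_1: \tilde{\mu} < \tilde{\mu}_0$.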
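The exact sign-test P-value is a binomial tail probability with $p = 1/2$, which can be summed directly. The sketch below is illustrative: the function name is hypothetical, and the counts ($r^+ = 14$ plus signs out of $n = 20$ untied observations) are made-up inputs rather than data taken from the slides.

```python
from math import comb

def sign_test_p_value(r_plus, n, alternative="two-sided"):
    """Exact sign-test P-value; n counts only observations not tied with the median."""
    def tail_ge(r):   # P(R+ >= r) when p = 1/2
        return sum(comb(n, k) for k in range(r, n + 1)) / 2 ** n

    def tail_le(r):   # P(R+ <= r) when p = 1/2
        return sum(comb(n, k) for k in range(0, r + 1)) / 2 ** n

    if alternative == "greater":     # H1: median > median0
        return tail_ge(r_plus)
    if alternative == "less":        # H1: median < median0
        return tail_le(r_plus)
    if alternative == "two-sided":
        return min(1.0, 2 * min(tail_ge(r_plus), tail_le(r_plus)))
    raise ValueError("alternative must be 'two-sided', 'greater', or 'less'")

p_two_sided = sign_test_p_value(r_plus=14, n=20)
z0 = (14 - 0.5 * 20) / (0.5 * 20 ** 0.5)   # normal approximation statistic
print(f"exact two-sided P-value = {p_two_sided:.4f}, z0 = {z0:.2f}")
```

Doubling the smaller tail reproduces the two cases $r^+ < n/2$ and $r^+ > n/2$ in a single expression; the cap at 1 only matters when $r^+ = n/2$.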

Type II error for the sign test
Finding the probability of type II error
Not only must a particular value of $\tilde{\mu}$, say $\tilde{\mu}_0 + \Delta$, be used, but the form of the underlying distribution will also affect the calculations.

Wilcoxon signed-rank test
Appendix Table IX (critical values $w_\alpha^*$)
Rank the absolute differences $|X_i - \mu_0|$ in ascending order, and then give the ranks the signs of their corresponding differences.

$W^+$: the sum of the positive ranks
$W^-$: the absolute value of the sum of the negative ranks

Hypotheses, two-sided alternative: $H_0: \mu = \mu_0$, $H_1: \mu \neq \mu_0$
Reject $H_0$ if $w \le w_\alpha^*$, where $w = \min(w^+, w^-)$

Hypotheses, upper-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu > \mu_0$
Reject $H_0$ if $w^- \le w_\alpha^*$

Hypotheses, lower-tailed alternative: $H_0: \mu = \mu_0$, $H_1: \mu < \mu_0$
Reject $H_0$ if $w^+ \le w_\alpha^*$

Ties in the Wilcoxon signed-rank test
If several observations have the same absolute magnitude, they are assigned the average of the ranks that they would receive if they differed slightly from one another.

Normal approximation for Wilcoxon signed-rank test statistic
$Z_0 = \dfrac{W^+ - n(n+1)/4}{\sqrt{n(n+1)(2n+1)/24}}$

Reject $H_0$ if $|z_0| > z_{\alpha/2}$ for $H_1: \mu \neq \mu_0$, or if $z_0 > z_\alpha$ for $H_1: \mu > \mu_0$, or if $z_0 < -z_\alpha$ for $H_1: \mu < \mu_0$.
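The ranking step, the tie rule, and the normal approximation for the Wilcoxon signed-rank statistic can be sketched together. Everything named here is hypothetical: the function `signed_rank_z`, the made-up data, and the choice to drop observations exactly equal to $\mu_0$ (an assumption, mirroring the tie rule given for the sign test). The small sample is for illustration only; the normal approximation is intended for moderately large $n$.

```python
from math import sqrt

def signed_rank_z(data, mu0):
    """Return (w_plus, z0) for the Wilcoxon signed-rank test of H0: mu = mu0."""
    diffs = [x - mu0 for x in data if x != mu0]   # assumption: drop zero differences
    n = len(diffs)
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied absolute differences.
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        average_rank = (i + j) / 2 + 1            # mean of ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    z0 = (w_plus - n * (n + 1) / 4) / sqrt(n * (n + 1) * (2 * n + 1) / 24)
    return w_plus, z0

# Made-up integer measurements, tested against mu0 = 100.
sample = [121, 94, 130, 108, 115, 89, 126, 103, 119, 97]
w_plus, z0 = signed_rank_z(sample, mu0=100)
print(f"W+ = {w_plus}, z0 = {z0:.3f}")
```

In this sample the differences +3 and -3 tie in absolute value, so each receives the average rank 1.5.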

Example 9-15 Propellant Shear Strength Sign Test
We would like to test the hypothesis that the median shear strength is 13790 kN/m², using $\alpha = 0.05$.

Example 9-16 Propellant Shear Strength Wilcoxon Signed-Rank Test
We would like to test the hypothesis that the median shear strength is 13790 kN/m², using $\alpha = 0.05$.

Exercise 9-117
A primer paint can be used on aluminum panels. The drying time of the primer is an important consideration in the manufacturing process. Twenty panels are selected and the drying times are as follows: 1.6, …

Is there evidence that the mean drying time of the primer exceeds 1.5 hr?
