Inference on a Single Mean

L. Wang, Department of Statistics, University of South Carolina

Upload: tariqmtch

Post on 18-Jan-2016


DESCRIPTION

Normal Probability Distribution for Students.

TRANSCRIPT

Page 1: Inference on Single Mean_new

L. Wang, Department of Statistics

University of South Carolina

Inference on a Single Mean

Page 2: Inference on Single Mean_new

Use Calculation from Sample to Estimate Population Parameter

[Diagram: Population → (select) → Sample → (calculate) → Statistic → (estimate) → Parameter → (describes) → Population]

Example: $\hat{p} = 63\%$, $p = ?$

Page 3: Inference on Single Mean_new

Use Calculation from Sample to Estimate Population Parameter

[Diagram: Population → (select) → Sample → (calculate) → Statistic → (estimate) → Parameter → (describes) → Population]

Example: $\bar{y} = 2{,}200$ hrs, $\mu = ?$

Page 4: Inference on Single Mean_new

Statistic vs. Parameter

Statistic:

• Describes a sample.
• Always known.
• Changes upon repeated sampling.
• Examples: $\bar{y}$, $s^2$, $s$, $\hat{p}$

Parameter:

• Describes a population.
• Usually unknown.
• Is fixed.
• Examples: $\mu$, $\sigma^2$, $\sigma$, $p$

Page 5: Inference on Single Mean_new

A Statistic is a Random Variable

Upon repeated sampling of the same population, the value of a statistic changes.

While we don’t know what the next value will be, we do know the overall pattern over many, many samplings.

The distribution of possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic.

Page 6: Inference on Single Mean_new

Sampling Distribution of $\bar{y}$

• If a random sample of size n is taken from a normal population having mean $\mu_y$ and variance $\sigma_y^2$, then $\bar{y}$ is a random variable which is also normally distributed with mean $\mu_y$ and variance $\sigma_y^2/n$.

Page 7: Inference on Single Mean_new

Sampling Distribution of $\bar{y}$

[Figure: four panels on a common axis from 80 to 120. Original Population: N(100, 5). Averages, Sample Size = 2: N(100, 3.54). Averages, Sample Size = 10: N(100, 1.58). Averages, Sample Size = 25: N(100, 1).]
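The standard deviations quoted for the four panels are consistent with σ/√n for σ = 5. A minimal Python sketch of that arithmetic follows; the function name `standard_error` is illustrative and does not come from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the mean of n independent readings."""
    return sigma / math.sqrt(n)

# Population N(100, 5); sample sizes as in the panels
for n in (1, 2, 10, 25):
    print(f"n = {n:2d}: standard error = {standard_error(5, n):.2f}")
```

The mean stays at 100 in every panel; only the spread shrinks as n grows.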

Page 8: Inference on Single Mean_new

Light Bulbs

The life of a light bulb is normally distributed with a mean of 2000 hours and standard deviation of 300 hours.

What is the probability that a randomly chosen light bulb will have a life of less than 1700 hours?

What is the probability that the mean life of three randomly chosen light bulbs will be less than 1700 hours?
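One way to work both questions, sketched in Python with the standard library's `statistics.NormalDist`. The variable names are illustrative; the only inputs are the mean, standard deviation, and cutoff stated on the slide.

```python
import math
from statistics import NormalDist

MU, SIGMA = 2000, 300   # bulb life in hours, from the slide

# Single bulb: Y ~ N(2000, 300)
p_single = NormalDist(MU, SIGMA).cdf(1700)

# Mean of three bulbs: ybar ~ N(2000, 300 / sqrt(3))
p_mean3 = NormalDist(MU, SIGMA / math.sqrt(3)).cdf(1700)

print(f"P(Y < 1700)    = {p_single:.4f}")
print(f"P(ybar < 1700) = {p_mean3:.4f}")
```

The second probability is smaller because the mean of three bulbs varies less than a single bulb does.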

Page 9: Inference on Single Mean_new

Why Averages Instead of Single Readings?

Suppose we are manufacturing light bulbs. The life of these bulbs has historically followed a normal distribution with a mean of 2000 hours and standard deviation of 300 hours.

We change the filament material and, unbeknown to us, the average life of the bulbs decreases to 1500 hours. (We will assume that the distribution remains normal with a standard deviation of 300 hours.)

If we randomly sample 1 bulb, will we realize that the average life has decreased? What if we sample 3 bulbs? 9 bulbs?

Page 10: Inference on Single Mean_new

Why Averages Instead of Single Readings?

[Figure: Single Readings. Two normal curves with σ = 300, centered at μ = 1500 and μ = 2000, on an axis from 800 to 2800.]

Y < 1400 would signal shift

Page 11: Inference on Single Mean_new

Why Averages Instead of Single Readings?

[Figure: Averages of n = 3. Two normal curves with σ = 173, centered at μ = 1500 and μ = 2000, on an axis from 800 to 2800.]

$\bar{Y}$ < 1650 would signal shift

Page 12: Inference on Single Mean_new

Why Averages Instead of Single Readings?

[Figure: Averages of n = 9. Two normal curves with σ = 100, centered at μ = 1500 and μ = 2000, on an axis from 800 to 2800.]

$\bar{Y}$ < 1800 would signal shift
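To compare the three cases, the sketch below computes the chance that a reading (or average) falls below the signal cutoff once the mean has dropped to 1500. The cutoffs 1400, 1650, and 1800 are taken from the slides; the function name `detection_probability` is an illustrative choice.

```python
import math
from statistics import NormalDist

NEW_MU, SIGMA = 1500, 300            # shifted mean and unchanged sigma
CUTOFFS = {1: 1400, 3: 1650, 9: 1800}  # signal cutoffs shown on the slides

def detection_probability(n):
    """P(average of n readings < cutoff) when the true mean is 1500."""
    se = SIGMA / math.sqrt(n)
    return NormalDist(NEW_MU, se).cdf(CUTOFFS[n])

for n in CUTOFFS:
    print(f"n = {n}: P(signal) = {detection_probability(n):.4f}")
```

The probability of noticing the shift rises sharply with n, which is the argument for averaging.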

Page 13: Inference on Single Mean_new

What if the original distribution is not normal? Consider the roll of a fair die:

[Figure: "Rolling A Fair Die". Bar chart of Probability (0.00 to 0.20) against # of Dots (1 to 6).]

Page 14: Inference on Single Mean_new

Suppose the single measurements are not normally distributed.

Let Y = life of a light bulb in hours

Y is exponentially distributed with λ = 0.0005 = 1/2000

[Figure: density curve on an axis from 0 to 8000 hours, with 0.0005 marked on the vertical axis.]

Page 15: Inference on Single Mean_new

[Figure: four panels: Single measurements; Averages of 2 measurements; Averages of 4 measurements; Averages of 25 measurements.]

Source: Lawrence L. Lapin, Statistics in Modern Business Decisions, 6th ed., 1993, Dryden Press, Ft. Worth, Texas.

Page 16: Inference on Single Mean_new

[Figure: four panels labeled n = 1, n = 2, n = 4, n = 25.]

As n increases, what happens to the variance?

A. Variance increases.

B. Variance decreases.

C. Variance remains the same.

Page 17: Inference on Single Mean_new

[Figure: four panels labeled n = 1, n = 2, n = 4, n = 25.]

Page 18: Inference on Single Mean_new

Central Limit Theorem

If n is sufficiently large, the sample means of random samples from a population with mean μ and standard deviation σ are approximately normally distributed with mean μ and standard deviation $\sigma/\sqrt{n}$.
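A small simulation can illustrate the theorem. The sketch below draws sample means from the exponential light-bulb population with λ = 1/2000 and compares their spread with σ/√n. The seed, the number of repetitions, and the function name `simulate_means` are arbitrary choices for illustration, not part of the slides.

```python
import random
import statistics

random.seed(0)        # fixed seed so the run is repeatable
LAMBDA = 1 / 2000     # exponential rate; mean = sd = 2000 hours
REPS = 10000          # simulated samples per sample size (arbitrary)

def simulate_means(n):
    """Return REPS sample means, each from n exponential draws."""
    return [sum(random.expovariate(LAMBDA) for _ in range(n)) / n
            for _ in range(REPS)]

results = {}
for n in (1, 2, 4, 25):
    means = simulate_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means))
    print(f"n = {n:2d}: mean of means = {results[n][0]:7.1f}, "
          f"sd of means = {results[n][1]:6.1f}, "
          f"sigma/sqrt(n) = {2000 / n ** 0.5:6.1f}")
```

The simulated standard deviations should track σ/√n closely, even though the individual lifetimes are far from normal.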

Page 19: Inference on Single Mean_new

Random Behavior of Means Summary

If Y is distributed N(μ, σ), then $\bar{y}_n$ is distributed N(μ, $\sigma/\sqrt{n}$).

If Y is distributed non-N(μ, σ), then $\bar{y}_{n \ge 30}$ is distributed approximately N(μ, $\sigma/\sqrt{n}$).

Page 20: Inference on Single Mean_new

If We Can Consider $\bar{y}$ to be Normal …

Recall: If Y is distributed normally with mean μ and standard deviation σ, then

$Z = \dfrac{Y - \mu}{\sigma}$

So if $\bar{y}$ is distributed normally with mean μ and standard deviation $\sigma/\sqrt{n}$, then

$Z = \dfrac{\bar{y} - \mu}{\sigma/\sqrt{n}}$

Page 21: Inference on Single Mean_new

If the time between adjacent accidents in an industrial plant follows an exponential distribution with an average of 700 days, what is the probability that the average time between 49 pairs of adjacent accidents will be greater than 900 days?
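A hedged sketch of one way to approach this problem in Python. It relies on two facts: an exponential distribution's standard deviation equals its mean, and with n = 49 the Central Limit Theorem makes a normal approximation for the sample mean reasonable. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mean_days = 700
sd_days = 700      # exponential: standard deviation equals the mean
n = 49

se = sd_days / math.sqrt(n)       # standard error of the sample mean
z = (900 - mean_days) / se        # standardized cutoff
p = 1 - NormalDist().cdf(z)       # upper-tail probability (CLT approximation)

print(f"SE = {se:.1f} days, z = {z:.2f}, P(ybar > 900) = {p:.4f}")
```

The result is an approximation, since the sample mean of exponential data is only approximately normal.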

Page 22: Inference on Single Mean_new

XYZ Bottling Company claims that the distribution of fill on its 16 oz bottles averages 16.2 ounces with a standard deviation of 0.1 oz. We randomly sample 36 bottles and get $\bar{y}$ = 16.15. If we assume a standard deviation of 0.1 oz, do we believe XYZ’s claim of averaging 16.2 ounces?
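A minimal sketch of the calculation, assuming the claimed mean of 16.2 oz is true and σ = 0.1 oz is known. It asks how likely a sample mean of 16.15 or less would be under that claim; the variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_claimed, sigma, n = 16.2, 0.1, 36
ybar = 16.15

se = sigma / math.sqrt(n)          # standard error of the mean
z = (ybar - mu_claimed) / se       # standardized sample mean
p = NormalDist().cdf(z)            # P(ybar <= 16.15 | mu = 16.2)

print(f"z = {z:.2f}, P(ybar <= {ybar}) = {p:.5f}")
```

A sample mean this far below 16.2 would be rare if the claim were true, which is the basis for doubting it.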

Page 23: Inference on Single Mean_new

Up Until Now We Have Been Assuming that We Knew the True Standard Deviation (σ), But Let’s Face Facts …

When we use s to estimate σ, then the calculated value

$t = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$

follows a t-distribution with n-1 degrees of freedom.

Note: we must be able to assume that we are sampling from a normal population.

Page 24: Inference on Single Mean_new

Let’s take another look at XYZ Bottling Company. If we assume that fill on the individual bottles follows a normal distribution, does the following data support the claim of an average fill of 16.2 oz?

16.1 16.0 16.3 16.2 16.1
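The t statistic for these five fills can be sketched as follows. The critical value 2.776 is the standard t-table entry for df = 4 with 0.025 in each tail; using a two-sided 5% level is an assumption made here for illustration, since the slide does not state one.

```python
import math
import statistics

fills = [16.1, 16.0, 16.3, 16.2, 16.1]
mu_claimed = 16.2

n = len(fills)
ybar = statistics.mean(fills)
s = statistics.stdev(fills)                   # sample sd, n - 1 divisor
t = (ybar - mu_claimed) / (s / math.sqrt(n))

T_CRIT = 2.776   # t-table value, df = 4, two-sided 5% (assumed level)
consistent_with_claim = abs(t) < T_CRIT

print(f"ybar = {ybar:.2f}, s = {s:.4f}, t = {t:.3f}")
print("Data consistent with 16.2 oz at the assumed level:", consistent_with_claim)
```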

Page 25: Inference on Single Mean_new

In Summary

When we know σ:

$Z = \dfrac{\bar{y} - \mu}{\sigma/\sqrt{n}}$

When we estimate σ with s:

$t_{df=n-1} = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$

We assume we are sampling from a normal population.

Page 26: Inference on Single Mean_new

Relationship Between Z and t Distributions

[Figure: density curves for Z, t (df = 3), and t (df = 1) on an axis from -4 to 4.]

Page 27: Inference on Single Mean_new

Internal Combustion Engine

The nominal power produced by a student-designed internal combustion engine is 100 hp. The student team that designed the engine conducted 10 tests to determine the actual power. The data follow:

98, 101, 102, 97, 101, 98, 100, 92, 98, 100

Assume data came from a normal distribution.

Page 28: Inference on Single Mean_new

Internal Combustion Engine

Summary Data:

Column   n    Mean   Std. Dev.
hp       10   98.7   2.9

What is the probability of getting a sample mean of 98.7 hp or less if the true mean is 100 hp?

Page 29: Inference on Single Mean_new

Internal Combustion Engine

[Figure: t distribution with df = 9 on an axis from -4 to 4.]

$P(\bar{y} \le 98.7 \mid \mu = 100) = P\left(t_{df=9} \le \dfrac{98.7 - 100}{2.9/\sqrt{10}}\right) = P(t_{df=9} \le -1.418) = 0.0949$

What did we assume when doing this analysis?

Are you comfortable with the assumption?
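The tail probability can be checked without a t-table. The sketch below integrates the t density numerically with Simpson's rule, using only the Python standard library. The helpers `t_pdf` and `t_cdf` are written for this illustration and are not library functions; a statistics package would normally supply them.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    c = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return c * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), via Simpson's rule on [0, |x|] plus symmetry (steps must be even)."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

n, ybar, s, mu0 = 10, 98.7, 2.9, 100   # summary data from the slide
t_stat = (ybar - mu0) / (s / math.sqrt(n))
p = t_cdf(t_stat, df=n - 1)

print(f"t = {t_stat:.3f}, P(t_9 <= t) = {p:.4f}")
```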

Page 30: Inference on Single Mean_new

Can We Assume Sampling from a Normal Population?

If data are from a normal population, there is a linear relationship between the data and their corresponding Z values.

$Z = \dfrac{Y - \mu}{\sigma} \quad\Rightarrow\quad Y = \mu + \sigma Z$

If we plot y on the vertical axis and z on the horizontal axis, the y intercept estimates μ and the slope estimates σ.

Page 31: Inference on Single Mean_new

How to Calculate Corresponding Z-Values

• Order data.

• Estimate percent of population below each data point:

$P_i = \dfrac{i - 0.5}{n}$

where i is a data point’s position in the ordered set and n is the number of data points in the set.

• Look up Z-Value that has $P_i$ proportion of distribution below it.

Page 32: Inference on Single Mean_new

Normal Probability (QQ) Plot

Data set: 2 4 7 10

i    y_i    P_i     Z
1    2      .125    -1.15
2    4      .375    -0.32
3    7      .625    +0.32
4    10     .875    +1.15

[Figure: "Normal QQ Plot". Data (0 to 12) on the vertical axis against Z values (-1.5 to 1.5) on the horizontal axis.]
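The Z column for this data set can be reproduced with a short Python sketch that orders the data, computes P_i = (i - 0.5)/n, and looks up the matching standard normal quantile. It uses `statistics.NormalDist.inv_cdf`; the variable names are illustrative.

```python
from statistics import NormalDist

data = sorted([2, 4, 7, 10])
n = len(data)

rows = []
for i, y in enumerate(data, start=1):
    p = (i - 0.5) / n                  # estimated proportion below y
    z = NormalDist().inv_cdf(p)        # Z-value with proportion p below it
    rows.append((i, y, p, round(z, 2)))

for i, y, p, z in rows:
    print(f"i = {i}, y = {y:2d}, P = {p:.3f}, Z = {z:+.2f}")
```

Plotting y against Z for these rows gives the QQ plot; a roughly straight line supports the normality assumption.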

Page 33: Inference on Single Mean_new

Normal Probability (QQ) Plot

[Figure: "QQ Plot with Data on Vertical Axis". Data (0 to 16) against Z values (-3 to 3).]

This data is a random sample from a N(10,2) population.

Page 34: Inference on Single Mean_new

Normal Probability (QQ) Plot

[Figure: "QQ Plot with Data on Vertical Axis". Data (0 to 16) against Z values (-3 to 3).]

Page 35: Inference on Single Mean_new

Estimation of the Mean

Page 36: Inference on Single Mean_new

Point Estimators

A point estimator is a single number calculated from sample data that is used to estimate the value of a parameter.

Recall that statistics change value upon repeated sampling of the same population while parameters are fixed, but unknown.

Examples:

$\hat{\mu} = \bar{y}$ estimates $\mu$

$\hat{\sigma} = s$ estimates $\sigma$

$\hat{\sigma}^2 = s^2$ estimates $\sigma^2$

$\hat{p}$ estimates $p$

Page 37: Inference on Single Mean_new

In General: $\hat{\theta}$ is an estimator of the arbitrary parameter $\theta$.

What makes a “Good” estimator?

(1) Accuracy: An unbiased estimator of a parameter is one whose expected value is equal to the parameter of interest.

(2) Precision: An estimator is more precise if its sampling distribution has a smaller standard error*.

*Standard error is the standard deviation for the sampling distribution.

Page 38: Inference on Single Mean_new

Unbiased Estimators

For normal populations, both the sample mean and sample median are unbiased estimators of μ.

[Figure: "Sampling Distributions for Mean and Median". Two curves, mean and median, both centered at µ, on an axis from -8 to 8.]

Page 39: Inference on Single Mean_new

Most Efficient Estimators

If you have multiple unbiased estimators, then you choose the estimator whose sampling distribution has the least variation. This is called the most efficient estimator.

[Figure: "Sampling Distributions for Mean and Median". Two curves, mean and median, on an axis from -8 to 8.]

For normal populations, the sample mean is the most efficient estimator of μ.
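A small simulation can illustrate the efficiency claim. The sketch below repeatedly samples from a standard normal population and compares the spread of the sample means with the spread of the sample medians. The seed, sample size, and number of repetitions are arbitrary choices for illustration.

```python
import random
import statistics

random.seed(1)         # fixed seed so the run is repeatable
N, REPS = 25, 5000     # sample size and repetitions (arbitrary)

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(0, 1) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

sd_mean = statistics.stdev(means)
sd_median = statistics.stdev(medians)

print(f"average of means   = {statistics.fmean(means):+.4f}, sd = {sd_mean:.4f}")
print(f"average of medians = {statistics.fmean(medians):+.4f}, sd = {sd_median:.4f}")
```

Both averages sit near the true mean of 0, in line with unbiasedness, while the medians show the larger spread.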

Page 40: Inference on Single Mean_new

Interval Estimate of the Mean

$Z = \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$ follows a standard normal distribution

$P\left(-1.96 < \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$

So we say that we are 95% sure that μ is in the interval

$\bar{Y} \pm 1.96\,\dfrac{\sigma}{\sqrt{n}}$

(with a little algebra)

What assumptions have we made?

$P\left(\bar{Y} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{Y} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = (1 - \alpha)$

Page 41: Inference on Single Mean_new

Interval Estimate of the Mean

[Figure: Standard Normal curve on an axis from -4 to 4, with central area 0.95 between Z = -1.96 and Z = 1.96 and area .025 in each tail.]

Page 42: Inference on Single Mean_new

Interval Estimate of the Mean

Let’s go from 95% confidence to the general case.

The symbol $z_\alpha$ is the z-value that has an area of α to the right of it.

$P\left(-z_{\alpha/2} < \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}} < z_{\alpha/2}\right) = (1 - \alpha)$

$P\left(\bar{Y} - z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} < \mu < \bar{Y} + z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}\right) = (1 - \alpha)$

Page 43: Inference on Single Mean_new

Interval Estimate of the Mean

[Figure: Standard Normal curve on an axis from -4 to 4, with central area 1 - α between $-Z_{\alpha/2}$ and $+Z_{\alpha/2}$ and area α/2 in each tail.]

(1 - α) 100% Confidence Interval

Page 44: Inference on Single Mean_new

What Does (1 - α) 100% Confidence Mean?

[Figure: the sampling distribution of $\bar{y}$, N(μ, $\sigma/\sqrt{n}$), centered at μ, drawn above a set of (1-α)100% confidence intervals, each centered on the $\bar{y}$ of one sample; the vertical axis is labeled Sample, 0 to 8.]
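The long-run meaning of a confidence level can be illustrated by simulation: draw many samples, build an interval from each, and count how often the interval contains μ. The sketch below does this for 95% intervals with known σ. The population values, seed, and repetition count are hypothetical choices made for illustration.

```python
import math
import random

random.seed(2)               # fixed seed so the run is repeatable
MU, SIGMA, N = 100, 8, 4     # hypothetical population and sample size
Z = 1.96                     # z-value with 0.025 in the upper tail
REPS = 10000                 # number of simulated samples (arbitrary)

half_width = Z * SIGMA / math.sqrt(N)
covered = 0
for _ in range(REPS):
    ybar = sum(random.gauss(MU, SIGMA) for _ in range(N)) / N
    if ybar - half_width < MU < ybar + half_width:
        covered += 1

coverage = covered / REPS
print(f"Fraction of intervals containing mu: {coverage:.3f}")
```

Each individual interval either contains μ or it does not; the 95% describes the fraction that do over many repetitions.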

Page 45: Inference on Single Mean_new

If $Z_{0.05}$ = 1.645, we are _____% confident that the mean is between

$\bar{y} \pm 1.645\,\dfrac{\sigma}{\sqrt{n}}$

A. 99%

B. 95%

C. 90%

D. 85%

Page 46: Inference on Single Mean_new

Which z-value would you use to calculate a 99% confidence interval on a mean?

A. $Z_{0.10}$ = 1.282

B. $Z_{0.01}$ = 2.326

C. $Z_{0.005}$ = 2.576

D. $Z_{0.0005}$ = 3.291
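The tabled z-values themselves can be reproduced from the definition of $z_\alpha$ as the value with area α to its right. A minimal Python sketch follows; the function name `z_upper` is illustrative.

```python
from statistics import NormalDist

def z_upper(alpha):
    """z-value with area alpha to its right under the standard normal curve."""
    return NormalDist().inv_cdf(1 - alpha)

for alpha in (0.10, 0.05, 0.01, 0.005, 0.0005):
    print(f"z_{alpha} = {z_upper(alpha):.3f}")
```

For a (1 - α)100% interval the area α is split between the two tails, so the lookup uses α/2.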

Page 47: Inference on Single Mean_new

Plastic Injection Molding Process

A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.

Periodically, clogs from one of the feeder lines cause the mean width to change. As a result, the operator periodically takes random samples of size 4.

Page 48: Inference on Single Mean_new

Plastic Injection Molding

A recent sample of four yielded a sample mean of 101.4.

Construct a 95% confidence interval for the true mean width.

Construct a 99% confidence interval for the true mean width.
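Both intervals follow from $\bar{y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$ with σ = 8 and n = 4. A sketch in Python, where the helper name `z_interval` is an illustrative choice:

```python
import math
from statistics import NormalDist

def z_interval(ybar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * sigma / math.sqrt(n)
    return ybar - half_width, ybar + half_width

ci95 = z_interval(101.4, 8, 4, 0.95)
ci99 = z_interval(101.4, 8, 4, 0.99)

print(f"95% CI: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"99% CI: ({ci99[0]:.2f}, {ci99[1]:.2f})")
```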

Page 49: Inference on Single Mean_new

When going from a 95% confidence interval to a 99% confidence interval, the width of the interval will

A. Increase.

B. Decrease.

C. Remain the same.

Page 50: Inference on Single Mean_new

Interval Width, Level of Confidence and Sample Size

At a given sample size, as level of confidence increases, interval width __________.

At a given level of confidence, as sample size increases, interval width __________.

Page 51: Inference on Single Mean_new

Calculate Sample Size Before Sampling!

The width of the interval is determined by:

$z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}}$

Suppose we wish to estimate the mean to a maximum error of d:

$\text{Max error} = d = z_{\alpha/2}\,\dfrac{\sigma}{\sqrt{n}} \quad\Rightarrow\quad n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$

Page 52: Inference on Single Mean_new

Plastic Injection Molding

A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.

What sample size is required to estimate the true mean width to within ± 2 units at 95% confidence?

What sample size is required to estimate the true mean width to within ± 2 units at 99% confidence?
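The sample-size formula $n = (z_{\alpha/2}\,\sigma/d)^2$ can be applied directly, rounding up because n must be a whole number at least as large as the computed value. A sketch in Python, with `required_n` as an illustrative helper name:

```python
import math
from statistics import NormalDist

def required_n(sigma, d, confidence):
    """Smallest n giving a maximum error of at most d at the given confidence."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)

n95 = required_n(sigma=8, d=2, confidence=0.95)
n99 = required_n(sigma=8, d=2, confidence=0.99)

print(f"95% confidence: n = {n95}")
print(f"99% confidence: n = {n99}")
```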

Page 53: Inference on Single Mean_new

If we don’t have prior knowledge of the standard deviation, but can assume we are sampling from a normal population…

Instead of using a z-value to calculate the confidence interval…

$P\left(-t_{\alpha/2} < \dfrac{\bar{Y} - \mu}{s/\sqrt{n}} < t_{\alpha/2}\right) = (1 - \alpha)$

$P\left(\bar{Y} - t_{\alpha/2}\,\dfrac{s}{\sqrt{n}} < \mu < \bar{Y} + t_{\alpha/2}\,\dfrac{s}{\sqrt{n}}\right) = (1 - \alpha)$

Page 54: Inference on Single Mean_new

Interval Estimate of the Mean

[Figure: t distribution with df = n-1 on an axis from -4 to 4, with central area 1 - α between $-t_{\alpha/2}$ and $+t_{\alpha/2}$ and area α/2 in each tail.]

(1 - α) 100% Confidence Interval

Page 55: Inference on Single Mean_new

Plastic Injection Molding - Reworded

A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.

A recent sample of four yielded a sample mean of 101.4 and sample standard deviation of 8.

Estimate the true mean width with a 95% confidence interval.
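With σ unknown, the interval uses a t-value with n - 1 = 3 degrees of freedom in place of the z-value. In the sketch below, 3.182 is the standard t-table entry for df = 3 with 0.025 in the upper tail; it is hard-coded because the Python standard library has no t quantile function.

```python
import math

ybar, s, n = 101.4, 8, 4
T_CRIT = 3.182            # t-table value, df = 3, upper-tail area 0.025

half_width = T_CRIT * s / math.sqrt(n)
ci = (ybar - half_width, ybar + half_width)

print(f"95% CI: ({ci[0]:.2f}, {ci[1]:.2f})")
```

The interval is wider than the one based on z = 1.96, reflecting the extra uncertainty from estimating σ with s from only four readings.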

Page 56: Inference on Single Mean_new

Hypothesis Testing

Page 57: Inference on Single Mean_new

Statistical Hypothesis

A statistical hypothesis is an assertion or conjecture concerning one or more population parameters.

Examples:

– More than 7% of the landings for a certain airline exceed the runway.
– The defective rate on a manufacturing line is less than 10%.
– The mean lifetime of the bulbs is above 2200 hours.

Page 58: Inference on Single Mean_new

The Null and Alternative Hypotheses

Null Hypothesis, H0, represents what we assume to be true. It is always stated so as to specify an exact value of the parameter.

Alternative (Research) Hypothesis, H1 or Ha, represents the alternative to the null hypothesis and allows for the possibility of several values. It carries the burden of proof.

In most situations, the researcher hopes to disprove or reject the null hypothesis in favor of the alternative hypothesis.

Page 59: Inference on Single Mean_new

Steps to a Hypothesis Test

(1) Determine the null and alternative hypotheses.

(2) Collect data and calculate the test statistic, assuming the null hypothesis is true.

(3) Assuming the null hypothesis is true, calculate the p-value or use the rejection region method.

(4) Draw a conclusion and state it in English.

Page 60: Inference on Single Mean_new


Two types of mistakes

(1) Type I error

Reject the null hypothesis when it is true.

(2) Type II error

Fail to reject the null hypothesis when the alternative hypothesis is true.

Let α = P(type I error), β = P(type II error).

Power of the test is 1 - β.

Page 61: Inference on Single Mean_new


Combustion Engine

The nominal power produced by a student-designed combustion engine is assumed to be at least 100 hp. We wish to test the alternative that the power is less than 100 hp.

Let µ = nominal power of engine.

QQ plots show it is reasonable to assume the data came from a normal distribution.

Sample Data: $\bar{y} = 98.7$, $s = 2.8694$, $n = 10$

Page 62: Inference on Single Mean_new


Combustion Engine

(1) State hypotheses, set alpha.

(2) Choose test statistic.

(3,4) Designate critical value for test (if using the rejection region method) and draw conclusion

or

Calculate p-value and draw conclusion.

Page 63: Inference on Single Mean_new


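(3) Designate Rejection Region

[Figure: t distribution with df = 9, drawn assuming H0: µ = 100 is true. The horizontal axis shows t from -4 to 4, with a matching axis for Y = avg hp centered at 100. The lower tail to the left of the critical value -1.833 has area 0.05.]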

Page 64: Inference on Single Mean_new


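Draw conclusion:

$t_{df=9} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{98.7 - 100}{2.8694/\sqrt{10}} = -1.4327$

[Figure: t distribution with df = 9, axis from -4 to 4, marking the critical value -1.833 and the test statistic t = -1.4327.]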
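The rejection region decision for the combustion engine example can be checked with a few lines of Python. This is a minimal sketch using only the standard library. The variable names are illustrative, and the critical value -1.833 is copied from the slides rather than computed.

```python
import math

# Sample summary from the combustion engine example
y_bar = 98.7      # sample mean (hp)
s = 2.8694        # sample standard deviation
n = 10            # sample size
mu_0 = 100        # value of the mean under H0

# One-sample t statistic: (y_bar - mu_0) / (s / sqrt(n))
standard_error = s / math.sqrt(n)
t_stat = (y_bar - mu_0) / standard_error

# Lower-tail critical value for alpha = 0.05 and df = 9 (taken from the slides)
t_crit = -1.833

# Reject H0 only if the statistic falls in the lower-tail rejection region
reject_h0 = t_stat < t_crit

print(f"t = {t_stat:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Because the alternative is µ < 100, only the lower tail is used. A statistic to the right of the critical value leads to "fail to reject".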

Page 65: Inference on Single Mean_new


p-value

The p-value is the probability, assuming the null hypothesis is true, of getting the sample result we got or something more extreme.

[Figure: t distribution with df = 9, axis from -4 to 4. The lower-tail area to the left of t = -1.4327 is 0.0928.]

Page 66: Inference on Single Mean_new


p-value

$P(t_{df=9} < -1.4327) = 0.0928$

Note:

If p-value < α, reject H0.

If p-value > α, fail to reject H0.

[Figure: t distribution with df = 9, axis from -4 to 4. The area to the left of the critical value -1.833 is 0.05. The area to the left of the test statistic -1.4327 is 0.0928.]
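The lower-tail probability for the engine example can be approximated without a statistics package. The sketch below integrates the Student t density numerically with Simpson's rule. The helper names `t_pdf` and `t_cdf` are made up for this illustration, and a library routine would normally be used instead.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3  # area under the density between 0 and |x|
    return 0.5 + area if x >= 0 else 0.5 - area

alpha = 0.05
t_stat = -1.4327
p_value = t_cdf(t_stat, df=9)  # lower-tail test: P(t_9 < t_stat)

print(f"p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The result should agree with the slide's 0.0928 to about three decimal places. Since it exceeds α = 0.05, the decision rule gives "fail to reject H0".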

Page 67: Inference on Single Mean_new


Average Life of a Light Bulb

Historically, a particular light bulb has had a mean life of no more than 2000 hours. We have changed the production process and believe that the life of the bulb has increased.

Let µ = mean life.

(1) Set Up Hypotheses, α = 0.05

H0:

Ha:

Page 68: Inference on Single Mean_new


Average Life of a Light Bulb

(2) Collect Data and calculate test statistic:

$\bar{y} = 2141$, $s = 216$, $n = 15$

$t_{df=14} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{2141 - 2000}{216/\sqrt{15}} = 2.5282$

p-value = $P(t_{df=14} > 2.5282) = 0.0121$

[Figure: t distribution with df = 14, axis from -4 to 4. The area to the right of the critical value 1.761 is 0.05. The area to the right of the test statistic 2.5282 is 0.0121.]
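The light bulb calculation can be reproduced from the summary values alone. The sketch below assumes an upper-tailed test of µ0 = 2000 and uses a hand-rolled Simpson's rule integration of the t density. The helper names are illustrative, not from the slides.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary values from the slide
y_bar, s, n = 2141, 216, 15
mu_0 = 2000
alpha = 0.05

t_stat = (y_bar - mu_0) / (s / math.sqrt(n))
p_value = 1 - t_cdf(t_stat, df=n - 1)  # upper-tail test: P(t_14 > t_stat)

print(f"t = {t_stat:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

The upper tail is used because the alternative claims the mean life has increased.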

Page 69: Inference on Single Mean_new


Average Life of a Light Bulb

State Conclusion:

A. At the 0.05 level of significance there is insufficient evidence to conclude that µ > 2000 hours.

B. At the 0.05 level of significance there is sufficient evidence to conclude that µ > 2000 hours.

Page 70: Inference on Single Mean_new


Mean Width of a Manufactured Part

Test the theory that the mean width of a manufactured part differs from 100 cm.

Let µ = mean width.

(1) Set up Hypotheses, α = 0.05

Page 71: Inference on Single Mean_new


Mean Width of a Manufactured Part

(2,3) Collect data and calculate test statistic.

$\bar{y} = 105$, $s = 6$, $n = 20$

$t_{df=19} =$

p-value = $2 \cdot P(t_{df=19} > \ldots)$

(4) State conclusion.
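For a two-sided alternative, the tail area beyond the observed statistic is doubled. The sketch below works through the manufactured part numbers under the assumption that the hypotheses are H0: µ = 100 against Ha: µ ≠ 100. The helper functions are illustrative and integrate the t density with Simpson's rule.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary values from the slide; mu_0 = 100 is the assumed null value
y_bar, s, n = 105, 6, 20
mu_0 = 100
alpha = 0.05

t_stat = (y_bar - mu_0) / (s / math.sqrt(n))

# Two-sided p-value: twice the area beyond |t| in one tail
p_value = 2 * (1 - t_cdf(abs(t_stat), df=n - 1))

print(f"t = {t_stat:.4f}, two-sided p-value = {p_value:.4f}")
print("Reject H0" if p_value < alpha else "Fail to reject H0")
```

Using `abs(t_stat)` keeps the calculation valid whether the sample mean falls above or below 100.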

Page 72: Inference on Single Mean_new


Given population parameter µ and value µ0:

For H0: µ = µ0

Ha: µ ≠ µ0 (rejection region of area α/2 in each tail)

Ha: µ > µ0 (rejection region of area α in the upper tail)

Ha: µ < µ0 (rejection region of area α in the lower tail)

Page 73: Inference on Single Mean_new


Focus on the two types of errors in a hypothesis test

1) Reject H0 when H0 is true. This is called a type I error.

P(Rej H0 | H0 is true) = α

2) Fail to reject H0 when Ha is true at some value. This is called a type II error.

P(Fail to Rej H0 | Ha is true at some value) = β

Page 74: Inference on Single Mean_new


Avg Life of Light Bulb - Type I Error

α = Probability that we will reject H0 when H0 is true.

H0: µ ≤ 2000

Ha: µ > 2000

[Figure: sampling distribution on a Z axis, drawn assuming H0 is true, with the "Fail to reject H0" region marked.]

Page 75: Inference on Single Mean_new


Type I and Type II Errors

H0: µ = 2000. What if µ = 2200?

α = Probability that we will reject H0 when H0 is true.

β = Probability we will fail to reject H0 when Ha is true at µ = 2200.

Page 76: Inference on Single Mean_new


How can we control the size of β?

The value of α.

Location of our point of interest.

Sample size.
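The effect of each of these three factors can be seen numerically. The sketch below is an illustration only. It borrows the light bulb setup used in these slides (µ0 = 2000, σ = 216 treated as known, n = 15) and assumes an upper-tailed z test. The function name `beta_upper_tail` is hypothetical.

```python
import math
from statistics import NormalDist

def beta_upper_tail(mu_0, mu_a, sigma, n, alpha):
    """Type II error probability for an upper-tailed z test with known sigma."""
    z = NormalDist()
    se = sigma / math.sqrt(n)
    cutoff = mu_0 + z.inv_cdf(1 - alpha) * se  # reject H0 when y_bar > cutoff
    return z.cdf((cutoff - mu_a) / se)         # P(y_bar < cutoff | mu = mu_a)

base = beta_upper_tail(2000, 2200, 216, 15, 0.05)

# Change one factor at a time and compare with the baseline
larger_alpha = beta_upper_tail(2000, 2200, 216, 15, 0.10)
farther_mu = beta_upper_tail(2000, 2300, 216, 15, 0.05)
larger_n = beta_upper_tail(2000, 2200, 216, 30, 0.05)

for label, value in [("baseline", base), ("alpha = 0.10", larger_alpha),
                     ("mu_a = 2300", farther_mu), ("n = 30", larger_n)]:
    print(f"{label:>12}: beta = {value:.4f}")
```

Each change (a larger α, a point of interest farther from µ0, or a larger sample) lowers β relative to the baseline. The baseline differs slightly from a hand calculation because the cutoff is not rounded here.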

Page 77: Inference on Single Mean_new


Calculating β

If µ = 2200, what is the probability of a type II error?

Given: α = 0.05 and we are assuming µ = 2000. We will also assume we know σ = 216.

$P(Z > 1.645) = 0.05$

$1.645 = \frac{\bar{y} - 2000}{216/\sqrt{15}} \;\Rightarrow\; \bar{y} = 2091$
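The cutoff for the sample mean can be computed directly. This short sketch uses `statistics.NormalDist` from the Python standard library. The variable names are illustrative.

```python
import math
from statistics import NormalDist

alpha = 0.05
mu_0 = 2000    # mean under H0
sigma = 216    # treated as the known population standard deviation
n = 15

# Upper-tail critical value of the standard normal, about 1.645
z_crit = NormalDist().inv_cdf(1 - alpha)

# Solve z_crit = (cutoff - mu_0) / (sigma / sqrt(n)) for the cutoff
standard_error = sigma / math.sqrt(n)
cutoff = mu_0 + z_crit * standard_error

print(f"z critical = {z_crit:.3f}")
print(f"Reject H0 when the sample mean exceeds {cutoff:.1f}")
```

The unrounded cutoff lands a little above 2091, so the slides' value of 2091 drops the decimal part.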

Page 78: Inference on Single Mean_new


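Calculating β

H0: µ = 2000. What if µ = 2200?

[Figure: sampling distributions of the sample mean under µ = 2000 and under µ = 2200. The cutoff 2091 separates the "Fail to Reject H0" region from the "Reject H0" region.]

$P(\bar{y} < 2091 \mid \mu = 2200)$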

Page 79: Inference on Single Mean_new


Calculating β

$P(\bar{y} < 2091 \mid \mu = 2200) = P\left(z < \frac{2091 - 2200}{216/\sqrt{15}}\right) = P(z < -1.9544) = 0.0254$

$P(\text{Fail to Reject } H_0 \mid \mu = 2200) = 0.0254$
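The type II error probability at µ = 2200 can be checked with the standard normal CDF. The sketch below uses the cutoff 2091 exactly as rounded in the slides. The variable names are illustrative.

```python
import math
from statistics import NormalDist

mu_a = 2200     # alternative value of the mean being examined
sigma = 216     # treated as known
n = 15
cutoff = 2091   # rejection cutoff for the sample mean, as rounded in the slides

# Standardize the cutoff under the alternative mean
standard_error = sigma / math.sqrt(n)
z = (cutoff - mu_a) / standard_error

# beta = P(fail to reject H0 | mu = 2200) = P(Z < z)
beta = NormalDist().cdf(z)
power = 1 - beta

print(f"z = {z:.4f}")
print(f"beta = {beta:.4f}, power = {power:.4f}")
```

The computed β should land within rounding of the 0.0254 shown above. Small differences in the last digit come from table lookup versus an exact CDF.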

Page 80: Inference on Single Mean_new


α, β and Power

α = P(Reject H0 | µ = 2000) = 0.05

β = P(Fail to Rej H0 | µ = 2200) = 0.0254

We say that the power of this test at µ = 2200 is 1 - 0.0254 = 0.9746

Power = 1 - β

Power = P(Rej H0 | µ is at some Ha level)

Page 81: Inference on Single Mean_new


Plastic Injection Molding

A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.

A recent sample of n = 4 yielded a sample mean of 101.4 and sample standard deviation of 8.

Does this data support the statement: “The true average width is greater than 95.”?

Page 82: Inference on Single Mean_new


Plastic Injection Molding: Confidence Interval Approach

95% confidence interval on µ:

$\bar{y} \pm t_{df=3,\,0.025} \frac{s}{\sqrt{n}}$

$101.4 \pm 3.182 \cdot \frac{8}{\sqrt{4}} = 101.4 \pm 12.728$

$(88.67, 114.13)$
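The t-based interval can be computed from the summary values n = 4, sample mean 101.4 and s = 8. The sketch below finds the critical value by bisection on a numerically integrated t CDF. All helper names are illustrative, and a statistics library would normally supply the quantile.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

def t_quantile(p, df):
    """Value q with P(T <= q) = p, found by bisection on t_cdf."""
    lo, hi = -100.0, 100.0
    for _ in range(60):
        mid = (lo + hi) / 2
        if t_cdf(mid, df) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

y_bar, s, n = 101.4, 8, 4
confidence = 0.95

# Two-sided interval: put (1 - confidence) / 2 in each tail
t_crit = t_quantile(1 - (1 - confidence) / 2, df=n - 1)
margin = t_crit * s / math.sqrt(n)
lower, upper = y_bar - margin, y_bar + margin
contains_95 = lower < 95 < upper

print(f"t critical = {t_crit:.3f}, margin = {margin:.3f}")
print(f"95% CI: ({lower:.2f}, {upper:.2f}); contains 95: {contains_95}")
```

With only n = 4 observations the t critical value is large, which makes the interval wide.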

Page 83: Inference on Single Mean_new


Plastic Injection Molding: Hypothesis Test Approach

H0:

Ha:

α = 0.05

Test statistic is

p-value =

Conclusion:
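One way to carry out this test is sketched below. The hypotheses H0: µ = 95 against Ha: µ > 95 are an assumption read off the question "Is the true average width greater than 95?" and are not stated on the slide. The helper functions are illustrative and integrate the t density with Simpson's rule.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)

def t_cdf(x, df, steps=2000):
    """P(T <= x), using Simpson's rule on [0, |x|] and symmetry about 0."""
    a = abs(x)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if x >= 0 else 0.5 - area

# Summary values from the plastic injection molding example
y_bar, s, n = 101.4, 8, 4
mu_0 = 95      # assumed null value (see the note above)
alpha = 0.05

t_stat = (y_bar - mu_0) / (s / math.sqrt(n))
p_value = 1 - t_cdf(t_stat, df=n - 1)  # upper-tail test: P(t_3 > t_stat)
reject_h0 = p_value < alpha

print(f"t = {t_stat:.4f}, p-value = {p_value:.4f}")
print("Reject H0" if reject_h0 else "Fail to reject H0")
```

Under these assumed hypotheses the p-value comes out above 0.05, so the sketch reports "fail to reject H0". A different choice of hypotheses would change the tail used.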