inference on single mean_new
L. Wang, Department of Statistics
University of South Carolina
Inference on a Single Mean

Use Calculation from Sample to Estimate Population Parameter
[Diagram: a Sample is selected from the Population; a Statistic is calculated from the Sample; the Statistic estimates the Parameter; the Parameter describes the Population. Example: the statistic $\hat{p} = 63\%$ estimates the parameter $p = ?$]

Use Calculation from Sample to Estimate Population Parameter
[Diagram: a Sample is selected from the Population; a Statistic is calculated from the Sample; the Statistic estimates the Parameter; the Parameter describes the Population. Example: the statistic $\bar{y} = 2{,}200$ hrs estimates the parameter $\mu = ?$]

Statistic vs. Parameter
Statistic:
Describes a sample.
Always known.
Changes upon repeated sampling.
Examples: $\bar{y}$, $s$, $s^2$, $\hat{p}$
Parameter:
Describes a population.
Usually unknown.
Is fixed.
Examples: $\mu$, $\sigma$, $\sigma^2$, $p$

A Statistic is a Random Variable
Upon repeated sampling of the same population, the value of a statistic changes.
While we don't know what the next value will be, we do know the overall pattern over many, many samplings.
The distribution of possible values of a statistic for repeated samples of the same size from a population is called the sampling distribution of the statistic.

Sampling Distribution of $\bar{y}$
• If a random sample of size $n$ is taken from a normal population having mean $\mu_y$ and variance $\sigma_y^2$, then $\bar{y}$ is a random variable which is also normally distributed with mean $\mu_y$ and variance $\sigma_y^2/n$.

Sampling Distribution of $\bar{y}$
[Figure: four distributions drawn on a common axis from 80 to 120, labeled N(mean, standard deviation). Original Population: N(100, 5). Averages, Sample Size = 2: N(100, 3.54). Averages, Sample Size = 10: N(100, 1.58). Averages, Sample Size = 25: N(100, 1).]
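The standard deviations in the figure can be checked with a few lines of Python. This is only a sketch: the helper name `sd_of_mean` is made up for illustration, and the population standard deviation of 5 is read off the figure label N(100, 5).

```python
import math

SIGMA = 5.0  # population standard deviation, read from the figure label N(100, 5)

def sd_of_mean(sigma, n):
    """Standard deviation of the mean of n independent readings: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

# Sample sizes shown in the figure (n = 1 is the original population).
sd_by_n = {n: round(sd_of_mean(SIGMA, n), 2) for n in (1, 2, 10, 25)}
for n, sd in sd_by_n.items():
    print(f"n = {n:2d}: sd of the average = {sd}")
```

The spread of the averages shrinks with the square root of the sample size, not with the sample size itself.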

Light Bulbs
The life of a light bulb is normally distributed with a mean of 2000 hours and standard deviation of 300 hours.
What is the probability that a randomly chosen light bulb will have a life of less than 1700 hours?
What is the probability that the mean life of three randomly chosen light bulbs will be less than 1700 hours?
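One possible way to work the two light bulb questions numerically is sketched below, using `statistics.NormalDist` from the Python standard library. The variable names are illustrative; the only inputs are the mean of 2000 hours and standard deviation of 300 hours stated above.

```python
import math
from statistics import NormalDist

MU, SIGMA = 2000.0, 300.0  # bulb life in hours, from the problem statement

# A single bulb: Y ~ N(2000, 300)
p_single = NormalDist(MU, SIGMA).cdf(1700)

# Mean of three bulbs: ybar ~ N(2000, 300 / sqrt(3))
p_mean3 = NormalDist(MU, SIGMA / math.sqrt(3)).cdf(1700)

print(f"P(Y < 1700)    = {p_single:.4f}")
print(f"P(ybar < 1700) = {p_mean3:.4f}")
```

The average of three bulbs is much less likely than a single bulb to fall below 1700 hours, because the sampling distribution of the mean is narrower.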

Why Averages Instead of Single Readings?
Suppose we are manufacturing light bulbs. The life of these bulbs has historically followed a normal distribution with a mean of 2000 hours and standard deviation of 300 hours.
We change the filament material and, unbeknown to us, the average life of the bulbs decreases to 1500 hours. (We will assume that the distribution remains normal with a standard deviation of 300 hours.)
If we randomly sample 1 bulb, will we realize that the average life has decreased? What if we sample 3 bulbs? 9 bulbs?

Why Averages Instead of Single Readings?
[Figures: in each panel, two normal curves centered at μ = 1500 and μ = 2000 on an axis from 800 to 2800 hours.]
Single Readings: $\sigma = 300$. $Y < 1400$ would signal shift.
Averages of n = 3: $\sigma_{\bar{y}} = 173$. $\bar{Y} < 1650$ would signal shift.
Averages of n = 9: $\sigma_{\bar{y}} = 100$. $\bar{Y} < 1800$ would signal shift.
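The three panels can be compared numerically. The sketch below takes the signal thresholds (1400, 1650, 1800) from the slides and computes, for each sample size, the chance of signaling when the mean really has dropped to 1500 and the chance of a false signal when the mean is still 2000. The names `detect` and `false_alarm` are illustrative, not from the slides.

```python
import math
from statistics import NormalDist

SIGMA = 300.0
OLD_MU, NEW_MU = 2000.0, 1500.0
CUTOFFS = {1: 1400.0, 3: 1650.0, 9: 1800.0}  # sample size -> threshold from the slides

results = {}
for n, cutoff in CUTOFFS.items():
    se = SIGMA / math.sqrt(n)                         # sd of the average of n bulbs
    detect = NormalDist(NEW_MU, se).cdf(cutoff)       # P(signal | mean shifted to 1500)
    false_alarm = NormalDist(OLD_MU, se).cdf(cutoff)  # P(signal | mean still 2000)
    results[n] = (detect, false_alarm)
    print(f"n = {n}: P(detect) = {detect:.4f}, P(false alarm) = {false_alarm:.4f}")
```

With roughly the same false-alarm rate, the chance of catching the shift rises sharply as the sample size grows.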

What if the original distribution is not normal? Consider the roll of a fair die:
[Figure: "Rolling A Fair Die", a bar chart of Probability (0.00 to 0.20) against # of Dots (1 to 6).]

Suppose the single measurements are not normally distributed.
Let Y = life of a light bulb in hours.
Y is exponentially distributed with $\lambda = 0.0005 = 1/2000$.
[Figure: exponential density starting at 0.0005, plotted from 0 to 8000 hours.]

[Figure: four panels showing the distribution of single measurements, averages of 2 measurements, averages of 4 measurements, and averages of 25 measurements.]
Source: Lawrence L. Lapin, Statistics in Modern Business Decisions, 6th ed., 1993, Dryden Press, Ft. Worth, Texas.

[Figure: sampling distributions for n = 1, n = 2, n = 4, n = 25.]
As n increases, what happens to the variance?
A. Variance increases.
B. Variance decreases.
C. Variance remains the same.

[Figure: sampling distributions for n = 1, n = 2, n = 4, n = 25.]
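The behavior in these panels can be reproduced with a small simulation. This is a sketch only: the seed, the number of repetitions, and the variable names are arbitrary choices, and the exponential rate of 1/2000 is the light bulb example used earlier in the deck.

```python
import random
import statistics

random.seed(1)    # arbitrary seed so the run is repeatable
LAM = 1 / 2000    # exponential rate; mean life = 2000 hours
REPS = 5000       # number of simulated samples per sample size (arbitrary)

sd_of_means = {}
for n in (1, 2, 4, 25):
    # Draw REPS samples of size n and record each sample's average.
    means = [
        statistics.fmean(random.expovariate(LAM) for _ in range(n))
        for _ in range(REPS)
    ]
    sd_of_means[n] = statistics.stdev(means)
    print(f"n = {n:2d}: sd of averages = {sd_of_means[n]:7.1f} "
          f"(theory: {2000 / n ** 0.5:7.1f})")
```

The simulated standard deviations should track 2000/√n, so the variance of the average decreases as n increases even though the single measurements are far from normal.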

Central Limit Theorem
If $n$ is sufficiently large, the sample means of random samples from a population with mean $\mu$ and standard deviation $\sigma$ are approximately normally distributed with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$.

Random Behavior of Means Summary
If Y is distributed $N(\mu, \sigma)$, then $\bar{y}_n$ is distributed $N(\mu, \sigma/\sqrt{n})$.
If Y is distributed non-$N(\mu, \sigma)$, then $\bar{y}_{n \ge 30}$ is distributed approximately $N(\mu, \sigma/\sqrt{n})$.

If We Can Consider $\bar{y}$ to be Normal …
Recall: If Y is distributed normally with mean $\mu$ and standard deviation $\sigma$, then
$Z = \dfrac{Y - \mu}{\sigma}$
So if $\bar{y}$ is distributed normally with mean $\mu$ and standard deviation $\sigma/\sqrt{n}$, then
$Z = \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$

If the time between adjacent accidents in an industrial plant follows an exponential distribution with an average of 700 days, what is the probability that the average time between 49 pairs of adjacent accidents will be greater than 900 days?
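A sketch of one way to approach the accident question with the Central Limit Theorem follows. It relies on the fact that an exponential distribution has a standard deviation equal to its mean, and it treats n = 49 as large enough for the normal approximation; the variable names are illustrative.

```python
import math
from statistics import NormalDist

MEAN_DAYS = 700.0
SD_DAYS = 700.0   # exponential distribution: standard deviation equals the mean
N = 49

se = SD_DAYS / math.sqrt(N)   # standard error of the average = 700 / 7 = 100
z = (900 - MEAN_DAYS) / se    # z-score of an average of 900 days
p_over_900 = 1 - NormalDist().cdf(z)

print(f"z = {z:.2f}, P(ybar > 900) is approximately {p_over_900:.4f}")
```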

XYZ Bottling Company claims that the distribution of fill on its 16 oz bottles averages 16.2 ounces with a standard deviation of 0.1 oz. We randomly sample 36 bottles and get $\bar{y} = 16.15$. If we assume a standard deviation of 0.1 oz, do we believe XYZ's claim of averaging 16.2 ounces?
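One way to judge the claim is to ask how unusual a sample mean of 16.15 oz would be if the claim were true. The sketch below computes the z-score and the lower-tail probability; treating "16.15 or lower" as the event of interest is an assumption made here for illustration.

```python
import math
from statistics import NormalDist

MU_CLAIMED = 16.2   # claimed mean fill (oz)
SIGMA = 0.1         # assumed known standard deviation (oz)
N = 36
YBAR = 16.15        # observed sample mean (oz)

se = SIGMA / math.sqrt(N)       # standard error of the mean
z = (YBAR - MU_CLAIMED) / se    # how many standard errors below the claim
p_lower = NormalDist().cdf(z)   # P(ybar <= 16.15) if the claim is true

print(f"z = {z:.2f}, P(ybar <= 16.15 | mu = 16.2) = {p_lower:.5f}")
```

A sample mean three standard errors below the claimed value would be rare if the claim were true, which casts doubt on the claim.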

Up Until Now We have been Assuming that We Knew the True Standard Deviation ($\sigma$), But Let's Face Facts …
When we use $s$ to estimate $\sigma$, then the calculated value
$\dfrac{\bar{y} - \mu}{s/\sqrt{n}}$
follows a t-distribution with $n-1$ degrees of freedom.
Note: we must be able to assume that we are sampling from a normal population.

Let's take another look at XYZ Bottling Company. If we assume that fill on the individual bottles follows a normal distribution, does the following data support the claim of an average fill of 16.2 oz?
16.1 16.0 16.3 16.2 16.1
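A sketch of the t calculation for these five bottles follows. The slide does not say which significance level or alternative to use, so the two-sided 5% comparison below is an assumption; the critical value 2.776 for 4 degrees of freedom is taken from a standard t table.

```python
import math
import statistics

fills = [16.1, 16.0, 16.3, 16.2, 16.1]  # data from the slide (oz)
MU0 = 16.2                               # claimed mean fill

n = len(fills)
ybar = statistics.mean(fills)
s = statistics.stdev(fills)              # sample standard deviation (n - 1 divisor)
t_stat = (ybar - MU0) / (s / math.sqrt(n))

T_CRIT = 2.776  # t table value for df = 4, two-sided alpha = 0.05 (assumed level)
reject = abs(t_stat) > T_CRIT

print(f"ybar = {ybar:.2f}, s = {s:.4f}, t = {t_stat:.4f}, df = {n - 1}")
print("Reject the claim" if reject else "Data are consistent with the claim")
```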

In Summary
When we know $\sigma$:
$Z = \dfrac{\bar{y} - \mu}{\sigma/\sqrt{n}}$
When we estimate $\sigma$ with $s$:
$t_{df=n-1} = \dfrac{\bar{y} - \mu}{s/\sqrt{n}}$
We assume we are sampling from a normal population.

Relationship Between Z and t Distributions
[Figure: the standard normal (Z) density compared with t densities for df = 3 and df = 1, plotted from -4 to 4.]

Internal Combustion Engine
The nominal power produced by a student-designed internal combustion engine is 100 hp. The student team that designed the engine conducted 10 tests to determine the actual power. The data follow:
98, 101, 102, 97, 101, 98, 100, 92, 98, 100
Assume data came from a normal distribution.

Internal Combustion Engine
Summary Data:
Column   n    Mean   Std. Dev.
hp       10   98.7   2.9
What is the probability of getting a sample mean of 98.7 hp or less if the true mean is 100 hp?

Internal Combustion Engine
$P(\bar{y} \le 98.7 \mid \mu = 100) = P\left(t_{df=9} \le \dfrac{98.7 - 100}{2.9/\sqrt{10}}\right) = P(t_{df=9} \le -1.418) = 0.0949$
[Figure: t distribution with df = 9, plotted from -4 to 4, with lower-tail area 0.0949.]
What did we assume when doing this analysis?
Are you comfortable with the assumption?
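The probability on this slide can be checked by simulation rather than from a t table. The sketch below repeatedly draws 10 readings from a normal population with mean 100 and computes the t statistic each time. The population standard deviation used for the draws (2.9) is an assumption for the simulation only, since the distribution of the t statistic does not depend on it; the seed and repetition count are arbitrary.

```python
import math
import random
import statistics

random.seed(2024)        # arbitrary seed for repeatability
MU0, N = 100.0, 10
SIGMA_ASSUMED = 2.9      # assumption for the simulation; t does not depend on it
REPS = 20000

# Observed t statistic from the summary data (ybar = 98.7, s = 2.9, n = 10).
t_obs = (98.7 - MU0) / (2.9 / math.sqrt(N))

count = 0
for _ in range(REPS):
    sample = [random.gauss(MU0, SIGMA_ASSUMED) for _ in range(N)]
    t = (statistics.fmean(sample) - MU0) / (statistics.stdev(sample) / math.sqrt(N))
    if t <= t_obs:
        count += 1

p_sim = count / REPS
print(f"t_obs = {t_obs:.3f}, simulated P(t <= t_obs) = {p_sim:.4f}")
```

The simulated proportion should land close to the 0.0949 shown on the slide, and it only holds if the readings really come from a normal population.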

Can We Assume Sampling from a Normal Population?
If data are from a normal population, there is a linear relationship between the data and their corresponding Z values.
$Z = \dfrac{Y - \mu}{\sigma} \;\Rightarrow\; Y = \mu + \sigma Z$
If we plot y on the vertical axis and z on the horizontal axis, the y intercept estimates $\mu$ and the slope estimates $\sigma$.

How to Calculate Corresponding Z-Values
Order data.
Estimate percent of population below each data point:
$P_i = \dfrac{i - 0.5}{n}$
where $i$ is a data point's position in the ordered set and $n$ is the number of data points in the set.
Look up Z-Value that has $P_i$ proportion of distribution below it.

Normal Probability (QQ) Plot
Data set: 2 4 7 10
i    y_i   P_i    Z
1    2     .125   -1.15
2    4     .375   -0.32
3    7     .625   +0.32
4    10    .875   +1.15
[Figure: "Normal QQ Plot", with Data (0 to 12) on the vertical axis against Z values (-1.5 to 1.5) on the horizontal axis.]
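The table above can be reproduced in a few lines. The sketch below follows the three steps from the previous slide (order, compute $P_i$, look up Z) and then fits a least-squares line through the points. The fitting step and the names `slope` and `intercept` are additions for illustration, tying back to the statement that the intercept estimates μ and the slope estimates σ.

```python
from statistics import NormalDist

data = sorted([2, 4, 7, 10])   # step 1: order the data
n = len(data)

rows = []
for i, y in enumerate(data, start=1):
    p_i = (i - 0.5) / n               # step 2: estimated proportion below y
    z = NormalDist().inv_cdf(p_i)     # step 3: Z-value with p_i below it
    rows.append((i, y, p_i, z))
    print(f"i = {i}  y = {y:2d}  P_i = {p_i:.3f}  Z = {z:+.2f}")

# Least-squares line y = intercept + slope * z through the QQ points.
zs = [r[3] for r in rows]
ys = [r[1] for r in rows]
z_mean = sum(zs) / n
y_mean = sum(ys) / n
slope = sum((z - z_mean) * (y - y_mean) for z, y in zip(zs, ys)) / sum(
    (z - z_mean) ** 2 for z in zs
)
intercept = y_mean - slope * z_mean
print(f"intercept (estimates mu) = {intercept:.2f}, slope (estimates sigma) = {slope:.2f}")
```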

Normal Probability (QQ) Plot
[Figure: "QQ Plot with Data on Vertical Axis", with data (0 to 16) against Z values (-3 to 3).]
This data is a random sample from a N(10,2) population.

Normal Probability (QQ) Plot
[Figure: "QQ Plot with Data on Vertical Axis", with data (0 to 16) against Z values (-3 to 3).]

Estimation of the Mean

Point Estimators
A point estimator is a single number calculated from sample data that is used to estimate the value of a parameter.
Recall that statistics change value upon repeated sampling of the same population while parameters are fixed, but unknown.
Examples:
$\hat{\mu} = \bar{y}$ estimates $\mu$
$\hat{\sigma} = s$ estimates $\sigma$
$\hat{\sigma}^2 = s^2$ estimates $\sigma^2$
$\hat{p}$ estimates $p$

In General: $\hat{\theta}$ is an estimator of the arbitrary parameter $\theta$.
What makes a "Good" estimator?
(1) Accuracy: An unbiased estimator of a parameter is one whose expected value is equal to the parameter of interest.
(2) Precision: An estimator is more precise if its sampling distribution has a smaller standard error*.
*Standard error is the standard deviation for the sampling distribution.

Unbiased Estimators
For normal populations, both the sample mean and sample median are unbiased estimators of $\mu$.
[Figure: "Sampling Distributions for Mean and Median", two curves centered at $\mu$ on an axis from -8 to 8, one labeled mean and one labeled median.]

Most Efficient Estimators
If you have multiple unbiased estimators, then you choose the estimator whose sampling distribution has the least variation. This is called the most efficient estimator.
[Figure: "Sampling Distributions for Mean and Median", two curves on an axis from -8 to 8, one labeled mean and one labeled median.]
For normal populations, the sample mean is the most efficient estimator of $\mu$.
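A short simulation can illustrate the comparison between the mean and the median. The population N(0, 1), the sample size of 25, the seed, and the repetition count below are all arbitrary choices made for this sketch.

```python
import random
import statistics

random.seed(7)                           # arbitrary seed for repeatability
MU, SIGMA, N, REPS = 0.0, 1.0, 25, 5000  # illustrative population and sample size

means, medians = [], []
for _ in range(REPS):
    sample = [random.gauss(MU, SIGMA) for _ in range(N)]
    means.append(statistics.fmean(sample))
    medians.append(statistics.median(sample))

# Both estimators should center on MU (unbiased); the mean should vary less.
se_mean = statistics.stdev(means)
se_median = statistics.stdev(medians)
print(f"average of means   = {statistics.fmean(means):+.4f}, standard error = {se_mean:.4f}")
print(f"average of medians = {statistics.fmean(medians):+.4f}, standard error = {se_median:.4f}")
```

Both averages should sit near the true mean, while the median shows the larger standard error, which is what makes the sample mean the more efficient choice for normal data.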

Interval Estimate of the Mean
$Z = \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}}$ follows a standard normal distribution.
$P\left(-1.96 \le \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le 1.96\right) = 0.95$
(with a little algebra)
$P\left(\bar{Y} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{Y} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = (1 - \alpha)$, where $z_{\alpha/2} = 1.96$ and $1 - \alpha = 0.95$ here.
So we say that we are 95% sure that $\mu$ is in the interval
$\bar{Y} \pm 1.96\dfrac{\sigma}{\sqrt{n}}$
What assumptions have we made?

Interval Estimate of the Mean
[Figure: standard normal curve from -4 to 4, with central area 0.95 between Z = -1.96 and Z = 1.96 and area .025 in each tail.]

Interval Estimate of the Mean
Let's go from 95% confidence to the general case.
The symbol $z_\alpha$ is the z-value that has an area of $\alpha$ to the right of it.
$P\left(-z_{\alpha/2} \le \dfrac{\bar{Y} - \mu}{\sigma/\sqrt{n}} \le z_{\alpha/2}\right) = (1 - \alpha)$
$P\left(\bar{Y} - z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \le \mu \le \bar{Y} + z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}\right) = (1 - \alpha)$

Interval Estimate of the Mean
[Figure: standard normal curve from -4 to 4, with central area $1 - \alpha$ between $-Z_{\alpha/2}$ and $+Z_{\alpha/2}$ and area $\alpha/2$ in each tail; the central region is labeled "(1 – α) 100% Confidence Interval".]

What Does (1 – α) 100% Confidence Mean?
[Figure: the sampling distribution of $\bar{y}$, $N(\mu, \sigma/\sqrt{n})$, drawn above a set of (1 – α)100% confidence intervals from repeated samples, each centered on its own $\bar{y}$ and plotted by sample number against $\mu$.]

If $Z_{0.05} = 1.645$, we are _____% confident that the mean is between
$\bar{y} \pm 1.645\dfrac{\sigma}{\sqrt{n}}$
A. 99%
B. 95%
C. 90%
D. 85%

Which z-value would you use to calculate a 99% confidence interval on a mean?
A. $Z_{0.10} = 1.282$
B. $Z_{0.01} = 2.326$
C. $Z_{0.005} = 2.576$
D. $Z_{0.0005} = 3.291$

Plastic Injection Molding Process
A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.
Periodically, clogs from one of the feeder lines cause the mean width to change. As a result, the operator periodically takes random samples of size 4.

Plastic Injection Molding
A recent sample of four yielded a sample mean of 101.4.
Construct a 95% confidence interval for the true mean width.
Construct a 99% confidence interval for the true mean width.
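The two intervals can be sketched with the known-σ formula $\bar{Y} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$. The helper name `z_interval` is made up for this example; the inputs (sample mean 101.4, σ = 8, n = 4) come from the slides.

```python
import math
from statistics import NormalDist

def z_interval(ybar, sigma, n, confidence):
    """Confidence interval for the mean when sigma is known."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)   # z_{alpha/2}
    half_width = z * sigma / math.sqrt(n)
    return ybar - half_width, ybar + half_width

ci95 = z_interval(101.4, 8, 4, 0.95)
ci99 = z_interval(101.4, 8, 4, 0.99)
print(f"95% CI: ({ci95[0]:.2f}, {ci95[1]:.2f})")
print(f"99% CI: ({ci99[0]:.2f}, {ci99[1]:.2f})")
```

Asking for more confidence with the same data gives a wider interval.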

When going from a 95% confidence interval to a 99% confidence interval, the width of the interval will
A. Increase.
B. Decrease.
C. Remain the same.

Interval Width, Level of Confidence and Sample Size
At a given sample size, as level of confidence increases, interval width __________.
At a given level of confidence, as sample size increases, interval width __________.

Calculate Sample Size Before Sampling!
The width of the interval is determined by:
$z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}}$
Suppose we wish to estimate the mean to a maximum error of $d$:
$\text{Max error} = d = z_{\alpha/2}\dfrac{\sigma}{\sqrt{n}} \;\Rightarrow\; n = \left(\dfrac{z_{\alpha/2}\,\sigma}{d}\right)^2$

Plastic Injection Molding
A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution with a standard deviation of 8.
What sample size is required to estimate the true mean width to within ± 2 units at 95% confidence?
What sample size is required to estimate the true mean width to within ± 2 units at 99% confidence?
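A sketch of the sample size calculation $n = (z_{\alpha/2}\,\sigma/d)^2$ follows, rounding up to the next whole unit so the maximum error does not exceed d. The helper name `required_n` is hypothetical.

```python
import math
from statistics import NormalDist

def required_n(sigma, d, confidence):
    """Smallest n for which z_{alpha/2} * sigma / sqrt(n) <= d."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / d) ** 2)   # round up to a whole observation

n95 = required_n(sigma=8, d=2, confidence=0.95)
n99 = required_n(sigma=8, d=2, confidence=0.99)
print(f"95% confidence: n = {n95}")
print(f"99% confidence: n = {n99}")
```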

If we don't have prior knowledge of the standard deviation, but can assume we are sampling from a normal population…
Instead of using a z-value to calculate the confidence interval…
$P\left(-t_{\alpha/2} \le \dfrac{\bar{Y} - \mu}{s/\sqrt{n}} \le t_{\alpha/2}\right) = (1 - \alpha)$
$P\left(\bar{Y} - t_{\alpha/2}\dfrac{s}{\sqrt{n}} \le \mu \le \bar{Y} + t_{\alpha/2}\dfrac{s}{\sqrt{n}}\right) = (1 - \alpha)$

Interval Estimate of the Mean
[Figure: t distribution with df = n-1, plotted from -4 to 4, with central area $1 - \alpha$ between $-t_{\alpha/2}$ and $+t_{\alpha/2}$ and area $\alpha/2$ in each tail; the central region is labeled "(1 – α) 100% Confidence Interval".]

Plastic Injection Molding – Reworded
A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.
A recent sample of four yielded a sample mean of 101.4 and sample standard deviation of 8.
Estimate the true mean width with a 95% confidence interval.
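With σ replaced by s, the interval uses a t critical value with n - 1 = 3 degrees of freedom. In the sketch below the value 3.182 is taken from a standard t table (upper-tail area 0.025, df = 3), since the Python standard library has no t quantile function; the variable names are illustrative.

```python
import math

YBAR, S, N = 101.4, 8.0, 4   # sample mean, sample standard deviation, sample size
T_CRIT = 3.182               # t table value for df = 3, upper-tail area 0.025

half_width = T_CRIT * S / math.sqrt(N)
lower, upper = YBAR - half_width, YBAR + half_width
print(f"95% CI using t (df = {N - 1}): ({lower:.2f}, {upper:.2f})")

# For comparison, the known-sigma interval would use 1.96 instead of 3.182.
z_half_width = 1.96 * S / math.sqrt(N)
print(f"t half-width = {half_width:.2f} vs z half-width = {z_half_width:.2f}")
```

Estimating the standard deviation from only four observations costs precision, so the t interval is noticeably wider than the z interval built from the same numbers.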

Hypothesis Testing

Statistical Hypothesis
A statistical hypothesis is an assertion or conjecture concerning one or more population parameters.
Examples:
– More than 7% of the landings for a certain airline exceed the runway.
– The defective rate on a manufacturing line is less than 10%.
– The mean lifetime of the bulbs is above 2200 hours.

The Null and Alternative Hypotheses
Null Hypothesis, $H_0$, represents what we assume to be true. It is always stated so as to specify an exact value of the parameter.
Alternative (Research) Hypothesis, $H_1$ or $H_a$, represents the alternative to the null hypothesis and allows for the possibility of several values. It carries the burden of proof.
In most situations, the researcher hopes to disprove or reject the null hypothesis in favor of the alternative hypothesis.

Steps to a Hypothesis Test
(1) Determine the null and alternative hypotheses.
(2) Collect data and calculate the test statistic, assuming the null hypothesis is true.
(3) Assuming the null hypothesis is true, calculate the p-value or use the rejection region method.
(4) Draw a conclusion and state it in English.

Two types of mistakes
(1) Type I error
Reject null hypothesis when it is true.
(2) Type II error
Fail to reject the null hypothesis when the alternative hypothesis is true.
Let $\alpha$ = P(type I error), $\beta$ = P(type II error).
Power of the test is $1 - \beta$.

Combustion Engine
The nominal power produced by a student-designed combustion engine is assumed to be at least 100 hp. We wish to test the alternative that the power is less than 100 hp.
Let $\mu$ = nominal power of engine.
A QQ plot shows it is reasonable to assume the data came from a normal distribution.
Sample Data: $\bar{y} = 98.7$, $s = 2.8694$, $n = 10$

Combustion Engine
(1) State hypotheses, set alpha.
(2) Choose test statistic.
(3,4) Designate critical value for test (if using the rejection region method) and draw conclusion,
or
calculate p-value and draw conclusion.

(3) Designate Rejection Region
[Figure: t distribution with df = 9, centered at $\bar{Y}$ = avg hp = 100 (assumes $H_0$: µ = 100 is true), with a lower-tail rejection region of area 0.05 to the left of the critical value -1.833.]

Draw conclusion:
$t_{df=9} = \dfrac{\bar{y} - \mu_0}{s/\sqrt{n}} = \dfrac{98.7 - 100}{2.8694/\sqrt{10}} = -1.4327$
[Figure: t distribution with df = 9, plotted from -4 to 4, marking the critical value -1.833 and the observed value -1.4327.]
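The rejection region decision can be sketched as follows. The critical value -1.833 and α = 0.05 are read from the rejection region slide; the variable names are illustrative.

```python
import math

YBAR, S, N = 98.7, 2.8694, 10   # sample data from the slides
MU0 = 100.0                     # value of the mean under H0
T_CRIT = -1.833                 # lower-tail critical value, df = 9, alpha = 0.05

t_stat = (YBAR - MU0) / (S / math.sqrt(N))
reject = t_stat < T_CRIT        # reject only if t falls in the lower tail

print(f"t = {t_stat:.4f}, critical value = {T_CRIT}")
if reject:
    print("Reject H0: evidence that mean power is below 100 hp.")
else:
    print("Fail to reject H0: insufficient evidence that mean power is below 100 hp.")
```

Because -1.4327 is not to the left of -1.833, the observed statistic is outside the rejection region.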
University of South Carolina; Slide 65
p-value
The p-value is the probability of getting the sample result we got or something more extreme.
[Figure: t distribution with df = 9; the area to the left of -1.4327 is 0.0928.]
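The definition above can be illustrated by brute force: draw many samples from a population where H0 is true and count how often the t statistic comes out at or below the observed one. The sketch below is only an illustration, not the method used on the slides. The function name, the seed, the number of simulations, and the population standard deviation of 3.0 are all assumptions (the t statistic does not depend on the choice of sigma).

```python
import math
import random


def simulated_lower_p_value(t_obs, n, mu0, sigma, n_sims=40_000, seed=1):
    """Estimate P(t <= t_obs) when H0 is true, by repeated sampling."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sims):
        # One sample of size n from a normal population with mean mu0.
        sample = [rng.gauss(mu0, sigma) for _ in range(n)]
        ybar = sum(sample) / n
        s = math.sqrt(sum((y - ybar) ** 2 for y in sample) / (n - 1))
        t = (ybar - mu0) / (s / math.sqrt(n))
        # "As extreme or more extreme" means further into the lower tail.
        if t <= t_obs:
            hits += 1
    return hits / n_sims


# Combustion engine example: observed t = -1.4327, n = 10, mu0 = 100.
# sigma = 3.0 is an arbitrary assumption for the simulation.
p_hat = simulated_lower_p_value(t_obs=-1.4327, n=10, mu0=100.0, sigma=3.0)
print(f"simulated p-value: {p_hat:.4f}")
```

The estimate should land close to the 0.0928 shown on the slide, with some simulation noise in the third decimal place.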
University of South Carolina; Slide 66
p-value
$P(t_{df=9} < -1.4327) = 0.0928$
Note:
If p-value < α, reject H0.
If p-value > α, fail to reject H0.
[Figure: t distribution with df = 9; the area 0.0928 to the left of the observed t = -1.4327 is larger than the area 0.05 to the left of the critical value -1.833.]
University of South Carolina; Slide 67
Average Life of a Light Bulb
Historically, a particular light bulb has had a mean life of no more than 2000 hours. We have changed the production process and believe that the life of the bulb has increased.
Let μ = mean life.
(1) Set Up Hypotheses, α = 0.05
H0: µ ≤ 2000
Ha: µ > 2000
University of South Carolina; Slide 68
Average Life of a Light Bulb
(2) Collect Data and calculate test statistic:
$\bar{y} = 2141$, $s = 216$, $n = 15$
$$t_{df=14} = \frac{\bar{y} - \mu_0}{s/\sqrt{n}} = \frac{2141 - 2000}{216/\sqrt{15}} = 2.5282$$
p-value = $P(t_{df=14} > 2.5282) = 0.0121$
[Figure: t distribution with df = 14; the critical value 1.761 cuts off an upper-tail area of 0.05, and the observed t = 2.5282 cuts off an upper-tail area of 0.0121.]
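The test statistic and upper-tail p-value above can be checked numerically. The sketch below uses only the Python standard library and integrates the t density with Simpson's rule. The helper names `t_pdf` and `t_upper_tail` and the step count are assumptions made for illustration; a statistics package would normally supply the t distribution directly.

```python
import math


def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def t_upper_tail(t, df, steps=2000):
    """P(T > t) for t >= 0: Simpson's rule on [0, t], then symmetry."""
    if t < 0:
        raise ValueError("this sketch only handles t >= 0")
    h = t / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area_0_to_t = total * h / 3
    # Half of the distribution lies above 0, so subtract the part below t.
    return 0.5 - area_0_to_t


# Light bulb example from the slide.
ybar, mu0, s, n = 2141, 2000, 216, 15
t_stat = (ybar - mu0) / (s / math.sqrt(n))
p_value = t_upper_tail(t_stat, df=n - 1)
print(f"t = {t_stat:.4f}, p-value = {p_value:.4f}")
```

The printed values can be compared with the slide's t = 2.5282 and p-value = 0.0121.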
University of South Carolina; Slide 69
Average Life of a Light Bulb
State Conclusion:
A. At 0.05 level of significance there is insufficient evidence to conclude that µ > 2000 hours.
B. At 0.05 level of significance there is sufficient evidence to conclude that µ > 2000 hours.
University of South Carolina; Slide 70
Mean Width of a Manufactured Part
Test the theory that the mean width of a manufactured part differs from 100 cm.
Let µ = mean width.
(1) Set up Hypotheses, α = 0.05
University of South Carolina; Slide 71
Mean Width of a Manufactured Part
(2,3) Collect data and calculate test statistic.
$\bar{y} = 105$, $s = 6$, $n = 20$
t_{df=19} =
p-value = 2 * P(t_{df=19} > ....)
(4) State conclusion.
University of South Carolina; Slide 72
Given population parameter µ and value µ0:
For H0: µ = µ0
Ha: µ ≠ µ0
Ha: µ > µ0
Ha: µ < µ0
[Figure: three sketches of the sampling distribution under H0, one per alternative. For Ha: µ ≠ µ0, the Ha (rejection) regions are both tails with area α/2 each; for Ha: µ > µ0, the upper tail with area α; for Ha: µ < µ0, the lower tail with area α. The remaining central region is labeled H0.]
University of South Carolina; Slide 73
Focus on the two types of errors in hypothesis testing
1) Reject H0 when H0 is true. This is called a type I error.
P(Rej H0 | H0 is true) = α
2) Fail to reject H0 when Ha is true at some value. This is called a type II error.
P(Fail to Rej H0 | Ha is true at some value) = β
University of South Carolina; Slide 74
Avg Life of Light Bulb - Type I Error
α = Probability that we will reject H0 when H0 is true.
H0: µ ≤ 2000
Ha: µ > 2000
[Figure: distribution of Z, drawn assuming H0 is true; the "Fail to reject H0" region is marked.]
University of South Carolina; Slide 75
Type I and Type II Errors
[Figure: two sampling distributions, one centered at H0: µ = 2000 and one at "What if µ = 2200".]
β = Probability that we will fail to reject H0 when Ha is true at µ = 2200.
α = Probability that we will reject H0 when H0 is true.
University of South Carolina; Slide 76
How can we control the size of β?
The value of α.
Location of our point of interest.
Sample size.
University of South Carolina; Slide 77
Calculating β
If µ = 2200, what is the probability of a type II error?
Given: α = 0.05 and we are assuming µ = 2000. We will also assume we know σ = 216.
$P(Z > 1.645) = 0.05$
$$1.645 = \frac{\bar{y} - 2000}{216/\sqrt{15}} \implies \bar{y} \approx 2091$$
University of South Carolina; Slide 78
Calculating β
[Figure: two sampling distributions of ȳ, one centered at H0: µ = 2000 and one at "What if µ = 2200", split at the cutoff 2091 into "Fail to Reject H0" (left) and "Reject H0" (right); the area of the µ = 2200 curve to the left of the cutoff is $P(\bar{y} < 2091 \mid \mu = 2200)$.]
University of South Carolina; Slide 79
Calculating β
$$P(\bar{y} < 2091 \mid \mu = 2200) = P\left(z < \frac{2091 - 2200}{216/\sqrt{15}}\right) = P(z < -1.9544) = 0.0254$$
$P(\text{Fail to Reject } H_0 \mid \mu = 2200) = 0.0254$
University of South Carolina; Slide 80
α, β and Power
α = P(Reject H0 | µ = 2000) = 0.05
β = P(Fail to Rej H0 | µ = 2200) = 0.0254
We say that the power of this test at µ = 2200 is 1 - 0.0254 = 0.9746.
Power = 1 - β
Power = P(Rej H0 | µ is at some Ha level)
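The β calculation, and the three levers for controlling β (the value of α, the location of the point of interest, and the sample size), can be sketched in a few lines using the standard library's `statistics.NormalDist`. The function name `beta_upper_tailed` and its optional `cutoff` argument are assumptions made for illustration. Passing `cutoff=2091` mirrors the rounded cutoff on the slides; leaving it out uses the unrounded cutoff, which gives a slightly larger β.

```python
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution


def beta_upper_tailed(mu0, mu_true, sigma, n, alpha=0.05, cutoff=None):
    """P(fail to reject H0 | mu = mu_true) for an upper-tailed z test."""
    se = sigma / math.sqrt(n)
    if cutoff is None:
        # Unrounded rejection cutoff for ybar when H0: mu = mu0 is true.
        cutoff = mu0 + Z.inv_cdf(1 - alpha) * se
    # We fail to reject when ybar falls below the cutoff.
    return Z.cdf((cutoff - mu_true) / se)


beta_slide = beta_upper_tailed(2000, 2200, 216, 15, cutoff=2091)  # slide's cutoff
beta_exact = beta_upper_tailed(2000, 2200, 216, 15)               # unrounded cutoff
beta_n30 = beta_upper_tailed(2000, 2200, 216, 30)                 # larger sample
beta_alpha01 = beta_upper_tailed(2000, 2200, 216, 15, alpha=0.01) # smaller alpha
beta_mu2100 = beta_upper_tailed(2000, 2100, 216, 15)              # closer alternative

print(f"beta (cutoff 2091): {beta_slide:.4f}, power: {1 - beta_slide:.4f}")
print(f"beta (unrounded):   {beta_exact:.4f}")
print(f"beta with n = 30:   {beta_n30:.4f}")
print(f"beta with a = 0.01: {beta_alpha01:.4f}")
print(f"beta at mu = 2100:  {beta_mu2100:.4f}")
```

In this sketch a larger sample lowers β, while a smaller α or an alternative closer to 2000 raises it.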
University of South Carolina; Slide 81
Plastic Injection Molding
A plastic injection molding process for a part that has a critical width dimension historically follows a normal distribution.
A recent sample of n = 4 yielded a sample mean of 101.4 and sample standard deviation of 8.
Do these data support the statement “The true average width is greater than 95”?
University of South Carolina; Slide 82
Plastic Injection Molding
Confidence Interval Approach
95% confidence interval on µ:
$$\bar{y} \pm t_{df=3,\,0.025}\,\frac{s}{\sqrt{n}} = 101.4 \pm 3.182 \cdot \frac{8}{\sqrt{4}} = 101.4 \pm 12.728 = (88.67,\ 114.13)$$
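The interval arithmetic can be checked with a short sketch. The function name `t_interval` is an assumption; the critical value 3.182 for df = 3 and upper-tail area 0.025 is the one quoted on the slide and is passed in rather than computed. With these inputs the half-width is 3.182 × 8/√4 = 12.728.

```python
import math


def t_interval(ybar, s, n, t_crit):
    """Two-sided interval ybar +/- t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return ybar - half_width, ybar + half_width


# Plastic injection molding example: n = 4, ybar = 101.4, s = 8.
low, high = t_interval(ybar=101.4, s=8, n=4, t_crit=3.182)
print(f"95% CI: ({low:.2f}, {high:.2f})")
```

Whether the claimed value 95 falls inside the resulting interval is what the confidence interval approach looks at.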
University of South Carolina; Slide 83
Plastic Injection Molding
Hypothesis Test Approach
H0:
Ha:
α = 0.05
Test statistic is
p-value =
Conclusion: