
Estimation From Sample Data

Chapter 08

Chapter 8 - Learning Objectives

• Explain the difference between a point and an interval estimate.

• Construct and interpret confidence intervals: with a z for the population mean or proportion; with a t for the population mean.

• Determine appropriate sample size to achieve specified levels of accuracy and confidence.

8.1 Introduction

Statistical inference is the process by which we acquire information about populations from samples.

There are two types of inference:

1. Estimation of a parameter

2. Testing a hypothesis about a parameter

8.2 Concept of Estimation

The main objective of estimation is to determine the value of a population parameter on the basis of a sample statistic.

There are two types of estimators:

1. Point Estimator

2. Interval Estimator

Point Estimator

A point estimator allows us to draw inferences about a population parameter (say, the mean or the proportion) by computing a statistic from a sample.

The statistic provides an estimate of the value of the parameter at a single point (value), hence the name point estimate.

Interval Estimator

An interval estimator draws inferences about a population parameter by providing a range (interval) of values within which the unknown population parameter is expected to lie.

[Figure: Interval estimator. A sample distribution drawn from the population distribution, with an interval around the parameter.]

Example

You want to know the average weekly summer income of UMD students.

Take a sample of UMD students (say, 600) and compute the average weekly summer income of the students in your sample.

Point estimate: µ = $400

Interval estimate: µ = $380 to $420

Interval estimators are used more frequently than point estimators because (1) a point estimate is more prone to faulty inference, and (2) with an interval estimator we can specify how confident we are in our estimate.

In estimation, you would want to select the right sample and sample statistic that allow you to estimate a parameter with less error.

The selection of the right statistic depends on some important characteristics.


Desirable Characteristics of Estimators

1. Unbiasedness: An unbiased estimator is one whose expected value is equal to the parameter it estimates.

2. Consistency: An unbiased estimator is said to be consistent if the difference between the estimator and the parameter grows smaller as the sample size increases.

3. Relative efficiency: Among two or more unbiased estimators, the one with the smaller variance is said to be relatively efficient.
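The first two characteristics can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the population is taken to be normal with mean 400 and standard deviation 50 (both values are hypothetical), and the sample mean is the estimator being examined.

```python
import random
import statistics

def sample_means(mu, sigma, n, trials, rng):
    """Draw `trials` samples of size n from N(mu, sigma) and return their means."""
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(trials)
    ]

# Hypothetical population parameters, chosen only for illustration.
MU, SIGMA = 400.0, 50.0
rng = random.Random(8)  # fixed seed so the run is repeatable

means_small = sample_means(MU, SIGMA, n=25, trials=2000, rng=rng)
means_large = sample_means(MU, SIGMA, n=400, trials=2000, rng=rng)

# Unbiasedness: the average of many sample means sits close to MU.
avg_small = statistics.fmean(means_small)

# Consistency: the spread of the sample means shrinks as n grows.
spread_small = statistics.stdev(means_small)
spread_large = statistics.stdev(means_large)

print(f"average of sample means (n=25): {avg_small:.2f}")
print(f"spread of sample means: n=25 -> {spread_small:.2f}, n=400 -> {spread_large:.2f}")
```

With these assumed values, the spread of the sample means should be roughly 50/√25 = 10 for n = 25 and 50/√400 = 2.5 for n = 400.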

8.3 Interval Estimation of the Population Mean and Proportion

8.3.1 When the Population Variance is Known

8.3.2 When the Population Variance is Unknown

8.3.1 Estimating the Population Mean when the Population Variance is Known

We are able to provide an interval estimate of a population mean or proportion based on the following characteristics of a sampling distribution.

1. Given the sampling distribution, we can draw a sample of size n from the population and calculate the sample mean or proportion.

2. Given the central limit theorem, we consider that the sampling distribution of the sample means or proportions is normal (or approximately normal) and thus provide probability estimates for the sample mean or proportion that we estimate.

3. Given the formula for standardizing any random variable, we can relate the standardized value obtained from a normal distribution to the sample mean or proportion we are estimating. For the sample mean:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

Margin of Error and the Interval Estimate

The general form of an interval estimate of a population mean is

$$\bar{x} \pm \text{Margin of Error}$$

We estimate the range (interval) that contains the value of the unknown population parameter (say, the mean) as follows:

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

where:

$\bar{x}$ is the sample mean

$z_{\alpha/2}$ is the standardized value of the random variable that leaves an area of $\alpha/2$ in one tail of the standard normal probability distribution

$\sigma$ is the population standard deviation

$n$ is the sample size

$1 - \alpha$ is the confidence coefficient

The Confidence Interval for μ (σ is known)

In its expanded form, the interval can be stated as follows:

$$P\left(\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \le \mu \le \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right) = 1 - \alpha$$

Interpreting the Confidence Interval for μ

Based on the estimate, we can say with 100(1 - α) percent confidence that the interval

$$\left[\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}},\ \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}}\right]$$

contains the true value of the unknown population parameter.

Interval Estimation of a Population Proportion

$$\bar{p} \pm z_{\alpha/2} \sqrt{\frac{\bar{p}(1 - \bar{p})}{n}}$$

where:

$1 - \alpha$ is the confidence coefficient

$z_{\alpha/2}$ is the z value providing an area of $\alpha/2$ in the upper tail of the standard normal probability distribution

$\bar{p}$ is the sample proportion
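As a minimal sketch of the proportion interval, the function below applies the formula directly. The function name and the survey figures (200 respondents, sample proportion 0.40) are hypothetical and chosen only for illustration; the z value 1.960 is the 95% entry from the standard normal table.

```python
import math

def proportion_interval(p_bar, n, z):
    """Interval estimate p_bar +/- z * sqrt(p_bar * (1 - p_bar) / n)."""
    margin = z * math.sqrt(p_bar * (1.0 - p_bar) / n)
    return p_bar - margin, p_bar + margin

# Hypothetical survey: 200 respondents, 40% answered "yes", 95% confidence.
low, high = proportion_interval(p_bar=0.40, n=200, z=1.960)
print(f"95% interval for the proportion: [{low:.3f}, {high:.3f}]")
```

The normal approximation behind this formula is usually considered reasonable only when the sample is large enough that both n·p̄ and n·(1 - p̄) are not small.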

Commonly used confidence levels and their corresponding z scores

Confidence level    α      z (for α/2)
90%                 10%    1.645
95%                 5%     1.960
99%                 1%     2.575
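The z scores in the table can be reproduced from the standard normal distribution. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `z_critical` is an assumption introduced here for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence
    # The upper-tail area alpha/2 corresponds to the (1 - alpha/2) quantile.
    return NormalDist().inv_cdf(1.0 - alpha / 2.0)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%}: z = {z_critical(level):.4f}")
```

For 99% confidence the computed value is about 2.5758, which some tables (including the one above) list as 2.575 and others as 2.576.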

Interval Estimate of Population Mean: σ Known

Step-1: Identify the coefficient (α) and the confidence coefficient (1 - α) at which the margin of error is to be computed (for example, α = 5%).

Step-2: Compute the corresponding margin of error for the selected confidence coefficient.

Step-3: Establish the interval estimate of μ by adding the margin of error to, and subtracting it from, the sample mean.

Hands-On-Practice Problems

Interval Estimate of Population Mean: σ Known: Example

A random sample of 81 credit card sales in a department store showed an average sale of $68. From past data, it is known that the standard deviation of credit card sales is $27.

8.1) Determine the 90% confidence interval estimate of sales on credit.

8.2) Determine the 95% confidence interval estimate of sales on credit.

8.3) Determine the 99% confidence interval estimate of sales on credit.

Solution to 8.1: n = 81; sample mean = $68; σ = $27.

Step-1: Identify the coefficient (α) and the confidence coefficient (1 - α) at which the margin of error is to be computed: α = 10%, so the confidence coefficient is 1 - α = 90%.

Step-2: Compute the corresponding margin of error for the selected confidence coefficient: z(α/2) = z(0.05) = 1.645; standard error = σ/√n = 27/√81 = 3; margin of error = 1.645 x 3 = 4.935.

Step-3: Establish the interval estimate: 68 - 4.935 = 63.065; 68 + 4.935 = 72.935; [63.065, 72.935]

We are 90 percent confident that the average credit card sale of the store lies between about $63 and $73.

Solution to 8.2: For the 95% confidence interval, z(α/2) = z(0.025) = 1.960; margin of error = 1.960 x 3 = 5.88.

68 - 5.88 = 62.12; 68 + 5.88 = 73.88; [62.12, 73.88]

Solution to 8.3: For the 99% confidence interval, z(α/2) = z(0.005) = 2.575; margin of error = 2.575 x 3 = 7.725.

68 - 7.725 = 60.275; 68 + 7.725 = 75.725; [60.275, 75.725]

Summary of results

8.1) The 90% confidence interval estimate of sales on credit cards: [63.065, 72.935]

8.2) The 95% confidence interval estimate of sales on credit cards: [62.12, 73.88]

8.3) The 99% confidence interval estimate of sales on credit cards: [60.275, 75.725]
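The three intervals for the credit card example can be checked with a few lines of code. This is a minimal sketch: the function name `z_interval` is introduced here for illustration, and the z values are taken from the table of commonly used confidence levels (1.645, 1.960, 2.575).

```python
import math

def z_interval(sample_mean, sigma, n, z):
    """Interval estimate sample_mean +/- z * sigma / sqrt(n) (sigma known)."""
    margin = z * sigma / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# z values as listed in the table of commonly used confidence levels.
Z_TABLE = {"90%": 1.645, "95%": 1.960, "99%": 2.575}

# Credit card example: n = 81, sample mean = 68, sigma = 27.
intervals = {level: z_interval(68, 27, 81, z) for level, z in Z_TABLE.items()}

for level, (low, high) in intervals.items():
    print(f"{level}: [{low:.3f}, {high:.3f}]")
```

Each step up in confidence level widens the interval around the same sample mean of 68.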

Implications…

As we increase the confidence coefficient (say from 90% to 95% or to 99%), the interval that contains the mean of the population widens.

There is a trade-off between the width of the interval and the confidence with which we can make the estimate.

The Confidence Interval for μ (When the Population Standard Deviation Is Unknown)

Recall that when the population variance is known, we use the following statistic to provide an interval estimate of a population mean:

$$z = \frac{\bar{x} - \mu}{\sigma / \sqrt{n}}$$

The t-Statistic

However, information about the population variance may not always be available. Provided that the sampled population is normally distributed, even if the population variance is unknown, we can use the variance estimated from the sample and a t statistic (Student t distribution) to make inferences about the population mean:

$$t = \frac{\bar{x} - \mu}{s / \sqrt{n}}$$

The t distribution is mound-shaped, and symmetrical around zero.

The variance of a t distribution depends on the sample size. Generally, it has a higher variance than the standard normal distribution.

[Figure: t Distribution. The standard normal distribution compared with t distributions with 20 and 10 degrees of freedom; horizontal axis: z, t.]

When the degrees of freedom (n - 1, the sample size minus one) exceed 100, the standard normal z value provides a good approximation to the t value.

The variance (spread) of a t distribution, compared to that of the normal distribution, is largely determined by the "degrees of freedom" (n - 1).

The interval estimate of the population mean is thus computed as:

$$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$$

Example 8.2.1. In a random sample of 100 oil changes, it was found that on average it takes about 22 minutes to change the oil in a car, with a standard deviation of 5 minutes.

Assuming that the amount of time it takes to change the oil in a car is normally distributed, provide the 99% confidence interval estimate of the average amount of time it takes to change the oil in a typical car.

Example 8.2.2. Using the same information, but assuming a standard deviation of 25 minutes, provide the 99% confidence interval estimate of the population mean (the average amount of time it takes to change the oil in a car).

Example 8.2.3. Using the same information (standard deviation = 5, sample mean = 22), but assuming a sample size of 400 oil changes, provide the 99% confidence interval estimate of the population mean (that is, the average amount of time it takes to change the oil in a typical car).
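The oil change examples can be worked with a short function once the t critical value is read from a t table. In the sketch below, the function name `t_interval` is introduced for illustration, and the critical values are assumptions taken from a standard t table: approximately 2.626 for 99 degrees of freedom and approximately 2.588 for 399 degrees of freedom, both at α/2 = 0.005.

```python
import math

def t_interval(sample_mean, s, n, t_crit):
    """Interval estimate sample_mean +/- t_crit * s / sqrt(n) (sigma unknown).

    t_crit is the t value for alpha/2 with n - 1 degrees of freedom,
    looked up from a t table by the caller.
    """
    margin = t_crit * s / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# Approximate table values (assumptions): t(0.005, 99) ~ 2.626, t(0.005, 399) ~ 2.588.
examples = {
    "8.2.1 (n=100, s=5)": t_interval(22, 5, 100, 2.626),
    "8.2.2 (n=100, s=25)": t_interval(22, 25, 100, 2.626),
    "8.2.3 (n=400, s=5)": t_interval(22, 5, 400, 2.588),
}

for label, (low, high) in examples.items():
    print(f"Example {label}: [{low:.3f}, {high:.3f}]")
```

Comparing the three results shows the larger standard deviation widening the interval and the larger sample narrowing it.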

The width of the confidence interval is affected by

1. The confidence level (1 - α): The higher the confidence level, the wider the interval estimate.

2. The population standard deviation (σ): The higher the variance, the wider the interval estimate.

3. The sample size (n): The larger the sample size, the narrower the interval estimate.

Implications for the Width of the Confidence Interval

A wide interval estimate provides little information about where μ is.

But we want a narrower confidence interval and a higher confidence level.

Information and the Width of the Interval

The width of the confidence interval is affected by the confidence level, the variance of the population, and the sample size.

1. We want a higher confidence level and a narrow interval estimate. However, there is a trade-off between the confidence level and the width of the interval estimate we want to establish.

2. Although a lower variance can provide us with a narrower interval estimate, the variance of the population or sample is beyond our control.

3. Therefore, the only way we can establish a narrow (more informative) interval while maintaining a higher confidence level is by adjusting (increasing) our sample size.

Determining the proper sample size is thus a critical component of establishing a narrow interval estimate.

The Sample Size

From the formula that we used to establish the interval estimate of the population parameter, we can derive a formula that allows us to determine the appropriate sample size.

Two important requirements:

1. At what confidence level do we want to provide the interval estimate?

2. What interval width (W) do we need?

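8.3 Selecting the Sample Size

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2$$

where W is the interval width we want to maintain, measured as the maximum distance (±W) allowed between the sample mean and the population mean. Thus, to compute the sample size, we first need to determine the interval width.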

Example 10.2. In order to estimate the amount of lumber that can be harvested from a tract of land with 99% confidence, it was indicated that the mean diameter of trees in the tract must be estimated to within one inch.

Assuming that diameters are normally distributed with a standard deviation of 6 inches, how many trees should be sampled to provide the interval estimate of the mean diameter of the trees in the tract at the specified confidence level?

Solution: The estimate accuracy is +/-1 inch. That is, W = 1.

The confidence level of 99% leads to α = .01, thus z(α/2) = z(.005) = 2.575.

The standard deviation was given as σ = 6.

Thus, we can compute the required sample size as follows:

$$n = \left(\frac{z_{\alpha/2}\,\sigma}{W}\right)^2 = \left(\frac{2.575 \times 6}{1}\right)^2 = 238.7 \approx 239$$
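The sample size calculation can be sketched in code as follows. The function name `required_sample_size` is introduced here for illustration; rounding up with `math.ceil` reflects the fact that a fractional observation cannot be sampled, so the result is the smallest whole sample that meets the target.

```python
import math

def required_sample_size(z, sigma, w):
    """Smallest n such that z * sigma / sqrt(n) <= w, i.e. n >= (z * sigma / w)**2."""
    return math.ceil((z * sigma / w) ** 2)

# Lumber example: 99% confidence (z = 2.575), sigma = 6 inches, W = 1 inch.
n_required = required_sample_size(z=2.575, sigma=6, w=1)
print(f"required sample size: {n_required}")

# Halving the allowed error roughly quadruples the required sample.
n_half_inch = required_sample_size(z=2.575, sigma=6, w=0.5)
print(f"required sample size for W = 0.5: {n_half_inch}")
```

Because W enters the formula squared, tightening the interval is costly: cutting W in half requires about four times as many observations.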

Computing Interval Estimates: Summary

1. Determine the sample size and the values of the variables of interest (interval width, spread of the population or sample).

2. Select the confidence level for the interval estimation.

3. Compute the sample mean (the population variance may be known or unknown).

4. Determine the critical value (z from the standard normal table, or t from the t table).

5. Compute the confidence interval.

Summary of Interval Estimation Procedures for a Population Mean

Is the population standard deviation σ known?

Yes (σ known case): use

$$\bar{x} \pm z_{\alpha/2} \frac{\sigma}{\sqrt{n}}$$

No (σ unknown case): use the sample standard deviation s to estimate σ, and use

$$\bar{x} \pm t_{\alpha/2,\,n-1} \frac{s}{\sqrt{n}}$$

