confidence intervals for population meanweb2.slc.qc.ca/pfoth/qm2/presentations/09_estimate_mu... ·...


Confidence Intervals for Population Mean

Quantitative Methods II

Plan

• Inferential Statistics

• Point and Interval Estimates

• Confidence Intervals

• Estimating the required sample size

• Examples


Inferential Statistics

• Goal = use information obtained from a sample to increase our knowledge about the population from which the sample was taken (i.e., to estimate or make inferences about the population)

• 2 types:

– Estimating the value of a population parameter

– Testing a hypothesis

• Using the Sampling Distribution of the Sample Mean (SDSM) is key

Estimating a population mean

• One of the purposes of randomly sampling a population is to get an estimate of the mean of the population

• Usually, the best estimate of a population mean is the sample mean. Example: mean SEL test score for a group of 64 students is 77.4, thus 77.4 is the best estimate for the population of all students who take SEL test

• The logic is that a sample mean of 77.4 is more likely to come from a population with a mean of 77.4 than from a population with any other mean: this is a point estimate


Point and Interval Estimates

• Point estimate is when you estimate a specific value of a population parameter

– Accuracy of the point estimate = SD of the distribution of sample means (how much the sample means in this distribution typically vary)

• Interval estimate is when you estimate a range in which the population parameter is likely to fall

– You can do this because the distribution of means is generally a normal curve, thus you know the percentage of sample means that lie in a given area of the distribution: about 68% of all sample means lie within ± 1 SD of the mean

Terminology

• Point estimate: a single number designed to estimate a quantitative parameter of a population, usually the corresponding sample statistic

• Interval estimate: an interval bounded by two values that is calculated from the sample and that is used to estimate the value of a population parameter

• Confidence interval: an interval estimate with a specified level of confidence

• Level of confidence 1 − 𝛼: the proportion of all interval estimates that include the parameter being estimated

– Usually 90%, 95%, 98% or 99%


Example

Take a city, like Trenton, NJ. We want to know how much time it takes workers living in Trenton to get to work and back: the commuting time

• Sample = 36 workers from Trenton

• Mean = 49 minutes

• This mean becomes the point estimate for the population of all Trenton workers

• σ = 15 minutes

Example: continued

• This mean should be close to the population mean, μ

• SDSM and the CLT tell us how close this mean, a point estimate, is to the population mean, μ

• Recall: with a large enough sample the SDSM will be close to normally distributed


Recall: the Empirical Rule

Example: continued

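If we knew the value of 𝜇, the population mean, then we could have calculated an interval within which about 95% of the sample average commuting times should fall:

from \(\mu - 2\sigma_{\bar{x}}\) to \(\mu + 2\sigma_{\bar{x}}\), i.e.

from \(\mu - 2\frac{\sigma}{\sqrt{n}}\) to \(\mu + 2\frac{\sigma}{\sqrt{n}}\), i.e.

from \(\mu - 2 \cdot \frac{15}{\sqrt{36}}\) to \(\mu + 2 \cdot \frac{15}{\sqrt{36}}\), i.e.

from 𝜇 − 5 to 𝜇 + 5 minutes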
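The arithmetic behind the ±5 minutes can be checked with a minimal Python sketch. The function name `standard_error` is illustrative and does not come from the slides.

```python
import math

def standard_error(sigma, n):
    """Standard deviation of the sampling distribution of the sample mean."""
    return sigma / math.sqrt(n)

# Trenton commuting example: sigma = 15 minutes, n = 36 workers
se = standard_error(15, 36)   # 15 / 6 = 2.5 minutes
half_width = 2 * se           # two standard errors = 5.0 minutes
print(se, half_width)
```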


Sampling Distribution of \(\bar{x}\)'s, unknown μ

In algebraic terms: \(P(\mu - 5 < \bar{x} < \mu + 5) \approx 95\%\)

Interval Estimates

• Interval estimate: an interval bounded by two values that is calculated from the sample and that is used to estimate the value of a population parameter

• Level of confidence 1 − 𝛼: the proportion of all interval estimates that include the parameter being estimated

• Confidence interval: an interval estimate with a specified level of confidence


Example: continued

What are the bounds of the interval centered at \(\bar{x} = 49\) minutes?

From \(\bar{x} - 2\sigma_{\bar{x}}\) to \(\bar{x} + 2\sigma_{\bar{x}}\), i.e.

from 49 − 5 to 49 + 5 minutes

This means that the 95.44% confidence interval for μ is from 44 to 54 minutes.
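The 95.44% figure is the area under the normal curve within 2 standard deviations of the mean. A four-decimal table gives 2 × 0.4772 = 0.9544. The short sketch below, which assumes Python 3.8+ for `statistics.NormalDist`, computes the same area directly and rebuilds the interval around the sample mean. The variable names are illustrative.

```python
from statistics import NormalDist

std_normal = NormalDist()  # standard normal: mean 0, standard deviation 1

# Probability that a normal value lies within 2 standard deviations of its mean
coverage = std_normal.cdf(2) - std_normal.cdf(-2)
print(round(coverage, 4))  # close to the table-based 0.9544

# Interval centered at the sample mean for the Trenton example
x_bar, sigma, n = 49, 15, 36
half_width = 2 * sigma / n ** 0.5
interval = (x_bar - half_width, x_bar + half_width)
print(interval)
```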

Confidence Intervals


Summary : Calculating Confidence Intervals

• Sample Mean: \(\bar{x}\)

• Sample Size: n

• Population standard deviation: σ

• Level of confidence we wish to have: 1 − 𝛼

\((1 - \alpha) \cdot 100\%\) tells you how confident you can be that the population mean falls within this interval

0.95 · 100% = 95%: you are 95% confident that the population mean falls within this interval
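One way to see what "95% confident" means is to simulate it: draw many samples from a population whose mean is known, build an interval around each sample mean, and count how often the interval contains the true mean. The sketch below is only an illustration. The population values (mean 50, standard deviation 15), the number of trials, and the seed are hypothetical choices, and the multiplier for a 95% level is taken from `statistics.NormalDist` rather than from a table.

```python
import random
from statistics import NormalDist, fmean

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical normal population and sampling setup (not from the slides)
mu, sigma, n, level = 50, 15, 36, 0.95

# Multiplier for the chosen confidence level (about 1.96 for 95%)
z = NormalDist().inv_cdf(1 - (1 - level) / 2)
half_width = z * sigma / n ** 0.5

trials = 2000
hits = 0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    x_bar = fmean(sample)
    # Does the interval around this sample mean contain the true mean?
    if x_bar - half_width < mu < x_bar + half_width:
        hits += 1

print(hits / trials)  # proportion of intervals that include mu
```

The printed proportion should land near 0.95: the confidence level describes the long-run share of intervals that capture μ, not a probability attached to any single interval.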

Estimation of Mean μ (σ known)

Assumption: either the general population has a bell-shaped, symmetric distribution, or the sample size is at least 25.

Step by step


Confidence Coefficient \(z(\alpha/2)\)

Constructing a Confidence Interval

• Step 1: Set-Up

– Describe the population parameter of interest

• Step 2: The Confidence Interval Criteria

– Check the assumptions

– Identify the probability distribution and the formula to be used

– State the level of confidence 𝟏 − 𝜶

• Step 3: The Sample Evidence

– Collect the sample information


Constructing a Confidence Interval

• Step 4: The Confidence Interval

– Determine the confidence coefficient \(z(\alpha/2)\)

– Find the error bound for a population mean: \(EBM = z(\alpha/2) \cdot \frac{\sigma}{\sqrt{n}}\)

– Find the lower and upper confidence limits

• Step 5: State the confidence interval

from \(\bar{x} - EBM\) to \(\bar{x} + EBM\) (units)

The confidence coefficient

• Some useful numbers from the table:

If 1 − 𝛼 = 0.80 (80%), then \(z(\alpha/2) = 1.28\)

If 1 − 𝛼 = 0.90 (90%), then \(z(\alpha/2) = 1.645\)

If 1 − 𝛼 = 0.94 (94%), then \(z(\alpha/2) = 1.88\)

If 1 − 𝛼 = 0.95 (95%), then \(z(\alpha/2) = 1.96\)

If 1 − 𝛼 = 0.96 (96%), then \(z(\alpha/2) = 2.055\)

If 1 − 𝛼 = 0.98 (98%), then \(z(\alpha/2) = 2.33\)

If 1 − 𝛼 = 0.99 (99%), then \(z(\alpha/2) = 2.575\)

Check for yourself!
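One way to check these numbers without a printed table is the inverse of the standard normal cumulative distribution. The sketch below assumes Python 3.8+ (`statistics.NormalDist`). The function name `confidence_coefficient` is illustrative. Computed values can differ from the table in the last digit because the table values are rounded.

```python
from statistics import NormalDist

def confidence_coefficient(level):
    """Return z(alpha/2) for a confidence level 1 - alpha, e.g. 0.95."""
    alpha = 1 - level
    # z(alpha/2) leaves an area of alpha/2 in the upper tail
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.80, 0.90, 0.94, 0.95, 0.96, 0.98, 0.99):
    print(f"{level:.2f} -> {confidence_coefficient(level):.3f}")
```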


Example: textbook cost

A random sample of 60 students from X University has revealed that their average annual textbook spending is $928. From previous studies, it is known that the standard deviation for annual textbook costs can be taken as $230. Find a 95% confidence interval for the mean annual textbook costs for all students at X University.

Example: textbook costs

Step 1: What is the population parameter of interest?

Step 2: 𝜎 = $230 is known. Is a sample of 60 students good enough? Yes: n = 60 is at least 25, so the sampling distribution is approximately normal and we use the standard normal distribution. The level of confidence is 1 − 𝛼 = 0.95 (95%)

Step 3: 𝑛 = 60, \(\bar{x}\) = $928


Example: textbook costs

Step 4: 0.95/2 = 0.475, \(z(\alpha/2) = 1.96\) (table)

\(EBM = z(\alpha/2) \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{230}{\sqrt{60}} = 58.2\)

\(\bar{x} - EBM = 869.8\), \(\bar{x} + EBM = 986.2\)

Step 5: The 95% confidence interval for the population mean 𝜇 is:

from $870 to $986

(same precision as the data)
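Steps 4 and 5 can be sketched as a small Python function. This is one possible way to organize the calculation, not a prescribed method. The name `confidence_interval` is illustrative, and the confidence coefficient comes from `statistics.NormalDist` (Python 3.8+) instead of the table.

```python
import math
from statistics import NormalDist

def confidence_interval(x_bar, sigma, n, level):
    """Confidence interval for the population mean when sigma is known."""
    alpha = 1 - level
    z = NormalDist().inv_cdf(1 - alpha / 2)   # confidence coefficient z(alpha/2)
    ebm = z * sigma / math.sqrt(n)            # error bound for the mean
    return x_bar - ebm, x_bar + ebm

# Textbook example: x_bar = 928, sigma = 230, n = 60, 95% confidence
lower, upper = confidence_interval(928, 230, 60, 0.95)
print(round(lower), round(upper))  # 870 986
```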

How to decrease the error?

• To decrease the value of EBM (and thus to narrow the confidence interval for 𝜇) there are two possibilities:

(A) Decrease the confidence level. A smaller confidence level results in a smaller \(z(\alpha/2)\) and thus a smaller EBM.

(B) Increase the sample size. A larger value of n means a larger value of \(\sqrt{n}\) in the denominator and thus a smaller EBM.

• Tradeoffs: (A) less certain, (B) more costly
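Both options can be compared numerically. The sketch below reuses 𝜎 = $230 and n = 60 from the textbook example. The alternative confidence level (90%) and the larger sample size (240) are hypothetical values chosen only for illustration.

```python
import math
from statistics import NormalDist

def ebm(sigma, n, level):
    """Error bound for the mean: z(alpha/2) * sigma / sqrt(n)."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    return z * sigma / math.sqrt(n)

sigma = 230
base = ebm(sigma, 60, 0.95)              # starting point
lower_confidence = ebm(sigma, 60, 0.90)  # option (A): lower confidence level
larger_sample = ebm(sigma, 240, 0.95)    # option (B): four times the sample

print(round(base, 1), round(lower_confidence, 1), round(larger_sample, 1))
```

Because EBM depends on \(\sqrt{n}\), quadrupling the sample size only halves the error bound.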


Example: practice

A survey by Future Shop involving 35 households in the area revealed the mean spending of $850 on home electronics during the last year. Construct a 98% confidence interval for the average annual spending on home electronics for all households in the area, if the population standard deviation is known to be $300.

Answer: from $732 to $968.

Estimating the sample size

• If we wish the error EBM to be smaller than a certain value, 𝜀, but the confidence level is fixed at 1 − 𝛼, we can choose the necessary sample size:

\(\varepsilon > EBM = z(\alpha/2) \cdot \frac{\sigma}{\sqrt{n}}\)

Thus, \(n > \left(\frac{z(\alpha/2) \cdot \sigma}{\varepsilon}\right)^2\)


Estimating the sample size

• The number \(\left(\frac{z(\alpha/2) \cdot \sigma}{\varepsilon}\right)^2\), rounded up to the nearest integer, is denoted by \(n_{min}\): the minimum required sample size.

• Example: a supermarket manager needs to estimate the average weekly grocery spending by his customers at a 90% level of confidence and with an error not exceeding $10. What is the minimum sample size needed, if he knows that the population standard deviation is $60?

Example: grocery shopping

• Solution.

• Given: 1 − 𝛼 = 0.9, 𝜎 = $60, 𝜀 = $10

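• Find: \(n_{min}\)

• First, we have \(z(\alpha/2) = 1.645\)

• Now, we compute:

\(\left(\frac{z(\alpha/2) \cdot \sigma}{\varepsilon}\right)^2 = \left(\frac{1.645 \cdot 60}{10}\right)^2 = 97.4\)

Thus, the minimum required sample size is \(n_{min} = 98\) customers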
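The same computation can be written as a short Python sketch. The function name `min_sample_size` is illustrative, and the confidence coefficient is computed with `statistics.NormalDist` (Python 3.8+) rather than read from the table.

```python
import math
from statistics import NormalDist

def min_sample_size(sigma, epsilon, level):
    """Smallest n for which the error bound does not exceed epsilon."""
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)  # z(alpha/2)
    return math.ceil((z * sigma / epsilon) ** 2)   # round up to an integer

# Grocery example: sigma = 60, epsilon = 10, 90% confidence
print(min_sample_size(60, 10, 0.90))  # 98
```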


Example: practice

An insurance company wants to estimate the average mileage driven by residents per week in Hamilton, so that the error does not exceed 20 km at the 99% level of confidence. From other studies they know that the population standard deviation can be taken as 100 km. Estimate the sample size needed for this study.

Answer: 166 drivers
