confidence intervals

Confidence Intervals, Sampling Error & Sample Size Determination. Readings: Chapter 8 (Sec 8.1-8.4), “Statistics for Managers Using MS Excel” by Levine et al., PHI

Upload: ankur-chugh

Post on 02-Feb-2016


Page 1: Confidence Intervals

Confidence Intervals, Sampling Error & Sample Size Determination

Readings: Chapter 8 (Sec 8.1-8.4), “Statistics for Managers Using MS Excel” by Levine et al., PHI

Page 2: Confidence Intervals

Confidence Intervals

• Confidence Intervals for the Population Mean, μ

– when Population Standard Deviation σ is Known

– when Population Standard Deviation σ is Unknown

• Confidence Intervals for the Population Proportion, p

• Determining the Required Sample Size

Page 3: Confidence Intervals

Point and Interval Estimates

• A point estimate is a single number.

• A confidence interval provides additional information about variability.

[Figure: the point estimate lies between the lower confidence limit and the upper confidence limit; the distance between the two limits is the width of the confidence interval]

Page 4: Confidence Intervals

Point Estimates

We can estimate a population parameter with a sample statistic (a point estimate):

Population Parameter    Sample Statistic (Point Estimate)
Mean: μ                 $\bar{X}$
Proportion: p           ps

Page 5: Confidence Intervals


Confidence Intervals

• How much uncertainty is associated with a point estimate of a population parameter?

• An interval estimate provides more information about a population characteristic than does a point estimate

• Such interval estimates are called confidence intervals

Page 6: Confidence Intervals


Confidence Interval Estimate

• An interval gives a range of values:

– Takes into consideration variation in sample statistics from sample to sample

– Based on observations from one sample

– Gives information about closeness to unknown population parameters

– Stated in terms of level of confidence

• Generally cannot be 100% confident

Page 7: Confidence Intervals

Estimation Process

[Figure: a random sample is drawn from a population whose mean, μ, is unknown. The sample mean is $\bar{X}$ = 50, which supports the statement: "I am 95% confident that μ is between 40 and 60."]

Page 8: Confidence Intervals

General Formula

• The general formula for all confidence intervals is:

Point Estimate ± (Critical Value) × (Standard Error)

Page 9: Confidence Intervals

Confidence Level

• Confidence Level

– Degree of confidence that the interval will contain the unknown population parameter

• A percentage (less than 100%)

Page 10: Confidence Intervals

Confidence Level, (1 − α) (continued)

• Suppose confidence level = 95%

• Also written (1 − α) = .95

• A relative frequency interpretation:

– In the long run, 95% of all the confidence intervals that can be constructed will contain the unknown true parameter
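The relative frequency interpretation can be checked with a small simulation. This is a minimal sketch, not part of the original slides: the population values (mean 50, standard deviation 10), the sample size of 25, the trial count, and the function name `coverage_rate` are all assumptions chosen for illustration.

```python
import random
from statistics import NormalDist


def coverage_rate(mu=50.0, sigma=10.0, n=25, level=0.95, trials=5000, seed=0):
    """Fraction of simulated intervals (sample mean +/- Z*sigma/sqrt(n)) that contain mu."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    # Critical value with alpha/2 in each tail (about 1.96 for 95%)
    z = NormalDist().inv_cdf(1 - (1 - level) / 2)
    half_width = z * sigma / n ** 0.5
    hits = 0
    for _ in range(trials):
        # Draw one sample of size n from the hypothetical normal population
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - half_width <= mu <= sample_mean + half_width:
            hits += 1
    return hits / trials


rate = coverage_rate()
print(f"Share of intervals containing mu: {rate:.3f}")
```

With many trials the printed share should settle close to the stated confidence level, which is what "95% of all intervals" means in the long run.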

Page 11: Confidence Intervals

Confidence Intervals

[Diagram: Confidence Intervals branch into Population Mean (σ Known, σ Unknown) and Population Proportion]

Page 12: Confidence Intervals

Confidence Interval for μ (σ Known)

• Assumptions

– Population standard deviation σ is known

– Population is normally distributed

– If population is not normal, use large sample

• Confidence interval estimate:

$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$

(where Z is the normal distribution critical value for a probability of α/2 in each tail)

Page 13: Confidence Intervals

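Finding the Critical Value, Z

• Consider a 95% confidence interval:

1 − α = .95, so α/2 = .025 in each tail

Z = ±1.96

[Figure: standard normal curve centered at 0, with central area .95 between −Z = −1.96 and Z = 1.96 and an area of α/2 = .025 in each tail]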
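The critical value can also be computed rather than read from a table. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is a hypothetical name introduced here for illustration.

```python
from statistics import NormalDist


def z_critical(confidence_level):
    """Two-sided standard normal critical value with alpha/2 in each tail."""
    alpha = 1 - confidence_level
    # inv_cdf returns the point with the given cumulative area to its left
    return NormalDist().inv_cdf(1 - alpha / 2)


z_95 = z_critical(0.95)
print(round(z_95, 2))  # 1.96
```

Other confidence levels can be passed the same way, for example `z_critical(0.90)`.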

Page 14: Confidence Intervals

Common Levels of Confidence

• Commonly used confidence levels are 90%, 95%, and 99%

Confidence Level    Confidence Coefficient, 1 − α    Z value
80%                 .80                              1.28
90%                 .90                              1.645
95%                 .95                              1.96
98%                 .98                              2.33
99%                 .99                              2.57
99.8%               .998                             3.08
99.9%               .999                             3.27

Page 15: Confidence Intervals

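Intervals and Level of Confidence

[Figure: Sampling Distribution of the Mean, centered at $\mu_{\bar{X}} = \mu$, with central area 1 − α and area α/2 in each tail; below it, confidence intervals built around different sample means x1, x2, ...]

Confidence intervals extend from $\bar{X} - Z \frac{\sigma}{\sqrt{n}}$ to $\bar{X} + Z \frac{\sigma}{\sqrt{n}}$.

(1 − α) × 100% of intervals constructed contain μ; α × 100% do not.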

Page 16: Confidence Intervals

Labor-Management Negotiations

• The president of a company argues that the company's blue-collar workers, who are paid an average of $30,000 per year, are well paid because the mean annual income of all blue-collar workers in the country is at most $30,000. This figure is disputed by the union.

• To test these beliefs, an arbitrator draws a random sample of 350 blue-collar workers from across the country and asks each to report his or her annual income. The arbitrator assumes that blue-collar incomes are normally distributed with a standard deviation of $8,000. Estimate the mean annual income of all blue-collar workers in the country and assess the claim that it is at most $30,000.

Page 17: Confidence Intervals

Labor-Management Negotiations: Use of Confidence Interval

Page 18: Confidence Intervals

Example

• A sample of 11 weekly returns on a stock over 3 years has a mean value of 2.20 per cent. From past data, the population standard deviation is known to be 0.35 per cent. Determine a 95% confidence interval for the true mean weekly return on the stock.

Page 19: Confidence Intervals

Example (continued)

• Solution:

$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}} = 2.20 \pm 1.96 \left( \frac{0.35}{\sqrt{11}} \right) = 2.20 \pm 0.2068 = (1.9932,\ 2.4068)$
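The stock-return interval can be reproduced in a few lines. This is a sketch using the numbers from the example (mean 2.20, σ = 0.35, n = 11, Z = 1.96); the function name `mean_ci_sigma_known` is a hypothetical helper, not something from the slides.

```python
from math import sqrt


def mean_ci_sigma_known(xbar, sigma, n, z):
    """Interval xbar +/- z * sigma / sqrt(n) for a mean when sigma is known."""
    margin = z * sigma / sqrt(n)  # critical value times standard error
    return xbar - margin, xbar + margin


lower, upper = mean_ci_sigma_known(xbar=2.20, sigma=0.35, n=11, z=1.96)
print(f"({lower:.4f}, {upper:.4f})")  # (1.9932, 2.4068)
```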

Page 20: Confidence Intervals

Interpretation

• We are 95% confident that the mean weekly return on the stock is between 1.9932 and 2.4068 per cent

• Although the true mean may or may not be in this interval, 95% of intervals formed in this manner will contain the true mean

Page 21: Confidence Intervals

Confidence Interval for μ (σ Unknown)

• If the population standard deviation σ is unknown, we can substitute the sample standard deviation, S

• This introduces extra uncertainty, since S varies from sample to sample

• So we use the t distribution instead of the normal distribution

Page 22: Confidence Intervals

Confidence Interval for μ (σ Unknown) (continued)

• Assumptions

– Population standard deviation is unknown

– Population is normally distributed

– If population is not normal, use large sample

• Use Student's t Distribution

• Confidence Interval Estimate:

$\bar{X} \pm t_{n-1} \frac{S}{\sqrt{n}}$

(where t is the critical value of the t distribution with n − 1 d.f. and an area of α/2 in each tail)

Page 23: Confidence Intervals

Student's t Distribution

• The t is a family of distributions

• The t value depends on degrees of freedom (d.f.)

– Number of observations that are free to vary after sample mean has been calculated

d.f. = n − 1

Page 24: Confidence Intervals

Degrees of Freedom (df)

Idea: Number of observations that are free to vary after sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

Let X1 = 7

Let X2 = 8

What is X3?

If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, n = 3, so degrees of freedom = n – 1 = 3 – 1 = 2

(2 values can be any numbers, but the third is not free to vary for a given mean)

Page 25: Confidence Intervals

Student's t Distribution

[Figure: density curves centered at 0 for t (df = 5), t (df = 13), and the Standard Normal (t with df = ∞)]

t-distributions are bell-shaped and symmetric, but have 'fatter' tails than the normal

Note: t → Z as n increases

Page 26: Confidence Intervals

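Student's t Table

          Upper Tail Area
df      .25      .10      .05
1     1.000    3.078    6.314
2     0.816    1.886    2.920
3     0.765    1.638    2.353

The body of the table contains t values, not probabilities

Let: n = 3, so df = n − 1 = 2

α = .10, so α/2 = .05

The critical value is t = 2.920 (row df = 2, column .05)

[Figure: t curve centered at 0 with an upper-tail area of α/2 = .05 to the right of t = 2.920]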

Page 27: Confidence Intervals

t distribution values, with comparison to the Z value

Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    Z
.80                 1.372          1.325          1.310          1.28
.90                 1.812          1.725          1.697          1.645
.95                 2.228          2.086          2.042          1.96
.99                 3.169          2.845          2.750          2.57

Note: t → Z as n increases

Page 28: Confidence Intervals

Example

A random sample of n = 25 has $\bar{X}$ = 50 and S = 8. Form a 95% confidence interval for μ

d.f. = n – 1 = 24, so

$t_{\alpha/2,\, n-1} = t_{.025,\, 24} = 2.0639$

The confidence interval is

$\bar{X} \pm t_{\alpha/2,\, n-1} \frac{S}{\sqrt{n}} = 50 \pm (2.0639) \frac{8}{\sqrt{25}}$

(46.698 , 53.302)
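The same arithmetic can be sketched in code. Python's standard library has no t distribution, so the critical value 2.0639 is taken directly from the example as a constant (an assumption of this sketch); the function name `mean_ci_sigma_unknown` is hypothetical.

```python
from math import sqrt

# Critical value t(.025, 24 d.f.) as quoted in the example; not computed here
T_CRIT_025_DF24 = 2.0639


def mean_ci_sigma_unknown(xbar, s, n, t_crit):
    """Interval xbar +/- t * S / sqrt(n), using the sample standard deviation S."""
    margin = t_crit * s / sqrt(n)
    return xbar - margin, xbar + margin


t_lower, t_upper = mean_ci_sigma_unknown(xbar=50, s=8, n=25, t_crit=T_CRIT_025_DF24)
print(f"({t_lower:.3f}, {t_upper:.3f})")  # (46.698, 53.302)
```

If SciPy happens to be installed, `scipy.stats.t.ppf(0.975, 24)` is one way to obtain the critical value instead of a table lookup.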

Page 29: Confidence Intervals

Confidence Intervals for the Population Proportion, p

• An interval estimate for the population proportion ( p ) can be calculated by adding an allowance for uncertainty to the sample proportion ( ps )

Page 30: Confidence Intervals

Confidence Intervals for the Population Proportion, p (continued)

• Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

$\sigma_p = \sqrt{\frac{p(1-p)}{n}}$

• We will estimate this with sample data:

$\sqrt{\frac{p_s(1-p_s)}{n}}$

Page 31: Confidence Intervals

Confidence Interval Endpoints

• Upper and lower confidence limits for the population proportion are calculated with the formula

$p_s \pm Z \sqrt{\frac{p_s(1-p_s)}{n}}$

• where

– Z is the standard normal value for the level of confidence desired

– ps is the sample proportion

– n is the sample size

Page 32: Confidence Intervals

Is a Counseling Program Effective?

A study of top executives’ midlife crises indicates that 45% of all top executives suffer from some form of mental crisis in the years following corporate success. Recently a clinic has started a counseling program for top executives to prevent such mental crisis from occurring. A random sample of 125 executives who went through this program indicated that only 39 showed signs of midlife crisis. Do you believe the program is beneficial?

Page 33: Confidence Intervals

Is a Counseling Program Effective? (Use of Confidence Intervals)
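One way to approach the question with a confidence interval is sketched below. The 95% level (Z = 1.96) is an assumption, since the slide does not state a confidence level, and `proportion_ci` is a hypothetical helper name. The data (39 of 125 executives, compared with the reported 45%) come from the problem statement.

```python
from math import sqrt


def proportion_ci(successes, n, z):
    """Interval p_s +/- z * sqrt(p_s * (1 - p_s) / n) for a population proportion."""
    p_s = successes / n  # sample proportion
    margin = z * sqrt(p_s * (1 - p_s) / n)
    return p_s - margin, p_s + margin


# 39 of the 125 sampled executives showed signs of midlife crisis
crisis_lower, crisis_upper = proportion_ci(successes=39, n=125, z=1.96)
reported_rate = 0.45  # rate reported for all top executives

print(f"({crisis_lower:.3f}, {crisis_upper:.3f})")
print("Interval lies below the reported rate:", crisis_upper < reported_rate)
```

Under these assumptions the interval is roughly (0.231, 0.393), which lies entirely below 45%. That is consistent with the program being beneficial, although the slide's own worked answer is not preserved in the transcript.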

Page 34: Confidence Intervals

Example

• A random sample of 100 executives shows that 25 are left-handed.

• Form a 95% confidence interval for the true proportion of left-handed executives

Page 35: Confidence Intervals

Example (continued)

• A random sample of 100 executives shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handed executives.

$p_s \pm Z \sqrt{p_s(1-p_s)/n} = 25/100 \pm 1.96 \sqrt{.25(.75)/100} = .25 \pm 1.96\,(.0433) = (0.1651,\ 0.3349)$
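The intermediate quantities in this calculation can be traced step by step. The sketch below uses only the numbers from the example; the variable names are illustrative.

```python
from math import sqrt

n = 100           # executives sampled
left_handed = 25  # number who are left-handed
z = 1.96          # critical value for 95% confidence

p_s = left_handed / n                  # sample proportion
std_error = sqrt(p_s * (1 - p_s) / n)  # estimated standard error of p_s
margin = z * std_error                 # sampling error (margin of error)
ci = (p_s - margin, p_s + margin)

print(round(std_error, 4))               # 0.0433
print(tuple(round(v, 4) for v in ci))    # (0.1651, 0.3349)
```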

Page 36: Confidence Intervals

Interpretation

• We are 95% confident that the true percentage of left-handed executives in the population is between 16.51% and 33.49%.

• Although this range may or may not contain the true proportion, 95% of intervals formed from samples of size 100 in this manner will contain the true proportion.

Page 37: Confidence Intervals

Determining Sample Size

[Diagram: Determining Sample Size branches into For the Mean and For the Proportion]

Page 38: Confidence Intervals

Sampling Error

• The required sample size can be found to reach a desired margin of error (e) with a specified level of confidence (1 − α)

• The margin of error is also called sampling error

– the amount of imprecision in the estimate of the population parameter

– the amount added and subtracted to the point estimate to form the confidence interval

Page 39: Confidence Intervals

Determining Sample Size

For the Mean:

$\bar{X} \pm Z \frac{\sigma}{\sqrt{n}}$

Sampling error (margin of error):

$e = Z \frac{\sigma}{\sqrt{n}}$

Page 40: Confidence Intervals

Determining Sample Size (continued)

For the Mean:

$e = Z \frac{\sigma}{\sqrt{n}}$

Now solve for n to get

$n = \frac{Z^2 \sigma^2}{e^2}$
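The algebra behind solving the margin-of-error equation for n can be written out step by step (multiply both sides by √n, divide by e, then square):

```latex
\begin{align*}
e &= Z\,\frac{\sigma}{\sqrt{n}} \\
\sqrt{n} &= \frac{Z\,\sigma}{e} \\
n &= \frac{Z^{2}\,\sigma^{2}}{e^{2}}
\end{align*}
```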

Page 41: Confidence Intervals

Determining Sample Size (continued)

• To determine the required sample size for the mean, you must know:

– The desired level of confidence (1 − α), which determines the critical Z value

– The acceptable sampling error (margin of error), e

– The standard deviation, σ

Page 42: Confidence Intervals

Required Sample Size Example

If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$n = \frac{Z^2 \sigma^2}{e^2} = \frac{(1.645)^2 (45)^2}{5^2} = 219.19$

So the required sample size is n = 220

(Always round up)
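The round-up rule is easy to get wrong with ordinary rounding, so a short sketch may help. It uses the example's values (Z = 1.645, σ = 45, e = 5); the function name `sample_size_for_mean` is a hypothetical helper.

```python
from math import ceil


def sample_size_for_mean(z, sigma, e):
    """Smallest whole n satisfying n >= (z * sigma / e)**2."""
    raw_n = (z * sigma / e) ** 2  # 219.19 for the example values
    return ceil(raw_n)            # always round up, never to the nearest integer


n_required = sample_size_for_mean(z=1.645, sigma=45, e=5)
print(n_required)  # 220
```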

Page 43: Confidence Intervals

Labor-Management Negotiations

• The president of a company argues that the company's blue-collar workers, who are paid an average of $30,000 per year, are well paid because the mean annual income of all blue-collar workers in the country is at most $30,000. This figure is disputed by the union.

• To test these beliefs, an arbitrator draws a random sample of 350 blue-collar workers from across the country and asks each to report his or her annual income. The arbitrator assumes that blue-collar incomes are normally distributed with a standard deviation of $8,000. Estimate the mean annual income of all blue-collar workers in the country and assess the claim that it is at most $30,000.

Page 44: Confidence Intervals

Labor-Management Negotiations: Sample Size

Page 45: Confidence Intervals

Determining Sample Size for Mean when σ is unknown

• If unknown, σ can be estimated when using the required sample size formula

– Use a value for σ that is expected to be at least as large as the true σ

– Select a pilot sample and estimate σ with the sample standard deviation, S

Page 46: Confidence Intervals

Determining Sample Size for Proportion

For the Proportion:

$p_s \pm Z \sqrt{\frac{p_s(1-p_s)}{n}}$

Sampling error (margin of error):

$e = Z \sqrt{\frac{p(1-p)}{n}}$

Page 47: Confidence Intervals

Determining Sample Size (continued)

For the Proportion:

$e = Z \sqrt{\frac{p(1-p)}{n}}$

Now solve for n to get

$n = \frac{Z^2\, p(1-p)}{e^2}$

Page 48: Confidence Intervals

Determining Sample Size (continued)

• To determine the required sample size for the proportion, you must know:

– The desired level of confidence (1 − α), which determines the critical Z value

– The acceptable sampling error (margin of error), e

– The true proportion of “successes”, p

• p can be estimated with a pilot sample, if necessary (or conservatively use p = .50)

Page 49: Confidence Intervals

Required Sample Size Example

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?

(Assume a pilot sample yields ps = .12)

Page 50: Confidence Intervals

Required Sample Size Example (continued)

Solution:

For 95% confidence, use Z = 1.96

e = .03

ps = .12, so use this to estimate p

$n = \frac{Z^2\, p(1-p)}{e^2} = \frac{(1.96)^2 (.12)(1-.12)}{(.03)^2} = 450.75$

So use n = 451
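A short sketch of this calculation, using the example's values (Z = 1.96, pilot proportion .12, e = .03). The function name `sample_size_for_proportion` is a hypothetical helper.

```python
from math import ceil


def sample_size_for_proportion(z, p, e):
    """Smallest whole n satisfying n >= z**2 * p * (1 - p) / e**2."""
    raw_n = z ** 2 * p * (1 - p) / e ** 2  # about 450.75 for the example values
    return ceil(raw_n)                     # always round up


n_defective = sample_size_for_proportion(z=1.96, p=0.12, e=0.03)
print(n_defective)  # 451
```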

Page 51: Confidence Intervals

Is a Counseling Program Effective?

A study of top executives’ midlife crises indicates that 45% of all top executives suffer from some form of mental crisis in the years following corporate success. Recently a clinic has started a counseling program for top executives to prevent such mental crisis from occurring. A random sample of 125 executives who went through this program indicated that only 39 showed signs of midlife crisis. Do you believe the program is beneficial?

Page 52: Confidence Intervals

Is a Counseling Program Effective? (Sample Size Requirement)

Page 53: Confidence Intervals

Sample Size Requirements: Examples

Confidence level        0.95    0.99    0.95
Margin of error         0.05    0.05    0.02
Required sample size    385     664     2401
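These sample sizes can be reproduced under one assumption that the table does not state explicitly: the conservative proportion p = .50. The sketch below also computes Z from the standard normal distribution instead of using rounded table values; the function name `conservative_sample_size` is hypothetical.

```python
from math import ceil
from statistics import NormalDist


def conservative_sample_size(confidence_level, e, p=0.5):
    """Required n for a proportion, assuming p = 0.5 unless told otherwise."""
    # Two-sided critical value with alpha/2 in each tail
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    return ceil(z ** 2 * p * (1 - p) / e ** 2)


# (confidence level, margin of error) pairs from the table
cases = [(0.95, 0.05), (0.99, 0.05), (0.95, 0.02)]
sizes = [conservative_sample_size(level, e) for level, e in cases]
print(sizes)  # [385, 664, 2401]
```

Tightening the margin of error from 0.05 to 0.02 raises the requirement far more than moving from 95% to 99% confidence, because n grows with 1/e².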

Page 54: Confidence Intervals

PHStat Interval Options

Page 55: Confidence Intervals


PHStat Sample Size Options

Page 56: Confidence Intervals

Using PHStat (for μ, σ unknown)

A random sample of n = 25 has $\bar{X}$ = 50 and S = 8. Form a 95% confidence interval for μ

Page 57: Confidence Intervals

Using PHStat (sample size for proportion)

How large a sample would be necessary to estimate the true proportion defective in a large population within ±3%, with 95% confidence?

(Assume a pilot sample yields ps = .12)

Page 58: Confidence Intervals

Applications in Auditing

• Advantages of statistical sampling in auditing:

– Sample result is objective and defensible

• Based on demonstrable statistical principles

– Provides sample size estimation in advance on an objective basis

– Provides an estimate of the sampling error

Page 59: Confidence Intervals

Applications in Auditing (continued)

– Can provide more accurate conclusions on the population

• Examination of the population can be time consuming and subject to more nonsampling error

– Samples can be combined and evaluated by different auditors

• Samples are based on scientific approach

• Samples can be treated as if they have been done by a single auditor

Page 60: Confidence Intervals

Testing & Interval Estimation: Mean when variance known

• H0: μ = μ0 vs (1) Ha: μ > μ0; (2) Ha: μ < μ0; (3) Ha: μ ≠ μ0

• (1) Accept H0 if and only if

• (2) Accept H0 if and only if

• (3) Accept H0 if and only if

Page 61: Confidence Intervals

Testing & Interval Estimation: Proportion

• H0: p = p0 vs (1) Ha: p > p0; (2) Ha: p < p0; (3) Ha: p ≠ p0

• (1) Accept H0 if and only if

• (2) Accept H0 if and only if

• (3) Accept H0 if and only if

Page 62: Confidence Intervals

Testing & Interval Estimation: Variance

• H0: σ² = σ0² vs (1) Ha: σ² > σ0²; (2) Ha: σ² < σ0²; (3) Ha: σ² ≠ σ0²

• (1) Accept H0 if and only if

• (2) Accept H0 if and only if

• (3) Accept H0 if and only if

Here χ² = χ²(n−1), and the χ² df is (n−1), sample size = n.