
Chapter 7

Frequency Distributions Probability Distributions

© Ray Panko


Same mean, different standard deviations

Different means


Frequency distribution for a variable

Sampling distribution to find the mean of the variable

[Figure: sampling from a population with mean µ; each sample yields a sample mean X̄]

Event is the estimation of the mean (X̄) from a sample of size n.


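Population Distribution: mean µ, standard deviation σ

Sampling Distribution for µ: the sample mean X̄ has mean $\mu_{\bar{X}} = \mu$ and standard deviation $\sigma_{\bar{X}} = \sigma/\sqrt{n}$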
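The relationship between the population distribution and the sampling distribution of the mean can be checked with a small simulation. This is a minimal sketch: the function names and the values µ = 50, σ = 8, n = 25 are illustrative assumptions, not taken from the slides.

```python
import math
import random
import statistics

def standard_error(sigma, n):
    """Standard deviation of the sample mean: sigma / sqrt(n)."""
    return sigma / math.sqrt(n)

def simulate_sample_means(mu, sigma, n, num_samples, seed=0):
    """Draw num_samples samples of size n from a normal population
    and return the list of their sample means."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    return [
        statistics.fmean(rng.gauss(mu, sigma) for _ in range(n))
        for _ in range(num_samples)
    ]

# Illustrative values only (assumed, not from the slides).
mu, sigma, n = 50.0, 8.0, 25
means = simulate_sample_means(mu, sigma, n, num_samples=5000)

theoretical_se = standard_error(sigma, n)   # 8 / sqrt(25) = 1.6
observed_mean = statistics.fmean(means)     # should be close to mu
observed_se = statistics.stdev(means)       # should be close to theoretical_se
print(theoretical_se, observed_mean, observed_se)
```

The simulated sample means should cluster around µ with a spread close to σ/√n, which is much narrower than the population spread σ.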


Forty percent of voters call themselves independents.

◦ 40% is a proportion (π)

◦ Take a sample to estimate π

◦ The sample proportion, p, is an unbiased estimator of π

◦ The sampling standard deviation, σp, is given by: $\sigma_p = \sqrt{\pi(1-\pi)/n}$


“Based on a sample of 1,500 households, the percentage of voters in favor of Proposition X is 40%, with a sampling error of plus or minus 3%.”

The sample mean (X̄) or proportion (p) is not likely to be exactly the population mean (µ) or proportion (π)

However, they should be close.

Confidence intervals allow us to estimate how close.

Example: “It is estimated that the proportion of independent voters is 49%, with a sampling error of plus or minus 3%.”


A confidence interval is an interval around the sample mean X̄ that is expected to contain the true population mean µ with a stated degree of confidence.

[Figure: 95% confidence interval centered on X̄]

If the confidence level is 95%, then the area outside the confidence interval, which we call α, is 0.05.

The upper and lower tails each have area α/2, or 0.025

[Figure: sampling distribution centered on X̄ with area α/2 = 0.025 in each tail; α = 0.05, so 1 − α = 95%]

Find the Z values for α/2.

For a cumulative probability of 1 − 0.025 = 0.975, Z is 1.96

So the Z values are -1.96 and 1.96
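The lookup above can be reproduced with Python's standard library. This is a minimal sketch; the helper name `z_critical` is an assumption made for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return Z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1.0 - confidence   # total area outside the interval
    tail = alpha / 2.0         # area in each tail
    # Z value whose cumulative probability is 1 - alpha/2
    return NormalDist().inv_cdf(1.0 - tail)

z = z_critical(0.95)
print(round(z, 2))    # 1.96
print(round(-z, 2))   # -1.96
```

Because the standard normal distribution is symmetric, the lower critical value is simply the negative of the upper one.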

[Figure: standard normal curve with area α/2 = 0.025 in each tail. Z units: −Zα/2 = −1.96, 0, Zα/2 = 1.96. X units: Lower Confidence Limit, Point Estimate, Upper Confidence Limit]

Confidence Level    Confidence Coefficient (1 − α)    Zα/2 value
80%                 0.80                              1.28
90%                 0.90                              1.645
95%                 0.95                              1.96
98%                 0.98                              2.33
99%                 0.99                              2.58
99.8%               0.998                             3.08
99.9%               0.999                             3.27


A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.

95% confidence interval for the true mean:

$\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n} = 2.20 \pm 1.96\,(0.35/\sqrt{11}) = 2.20 \pm 0.2068$

$1.9932 \le \mu \le 2.4068$


A sample of 11 circuits from a large normal population has a mean resistance of 2.20 ohms. We know from past testing that the population standard deviation is 0.35 ohms.

90% confidence interval for the true mean:

$\bar{X} \pm Z_{\alpha/2}\,\sigma/\sqrt{n} = 2.20 \pm 1.645\,(0.35/\sqrt{11}) = 2.20 \pm 0.1736$

$2.0264 \le \mu \le 2.3736$
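The two circuit examples (95% and 90% confidence) can be checked with a short script. This is a sketch under the slide's assumptions (normal population, σ known); the function name `z_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def z_interval(xbar, sigma, n, confidence):
    """Confidence interval for the mean when the population sigma is known."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * sigma / math.sqrt(n)   # Z_{alpha/2} * sigma / sqrt(n)
    return xbar - margin, xbar + margin

for level in (0.95, 0.90):
    low, high = z_interval(xbar=2.20, sigma=0.35, n=11, confidence=level)
    print(f"{level:.0%}: {low:.4f} to {high:.4f}")
# 95%: 1.9932 to 2.4068
# 90%: 2.0264 to 2.3736
```

Lowering the confidence level from 95% to 90% shrinks the critical value, so the interval becomes narrower for the same sample.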


Confidence Intervals

◦ Population Mean, σ Known: use the normal distribution with σ

◦ Population Mean, σ Unknown: use the t distribution, based on the sample standard deviation S computed from the sample instead of σ

◦ Population Proportion


Assumptions

◦ Population standard deviation is unknown

◦ Population is normally distributed

◦ If population is not normal, use large sample

Use Student’s t Distribution instead of the normal distribution

Confidence Interval Estimate:

$\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n}$

(where tα/2 is the critical value of the t distribution with n -1 degrees of freedom and an area of α/2 in each tail)


Idea: degrees of freedom (d.f.) is the number of observations that are free to vary after the sample mean has been calculated

Example: Suppose the mean of 3 numbers is 8.0

X1 = 7, X2 = 8, X3 = ?

If the mean of these three values is 8.0, then X3 must be 9 (i.e., X3 is not free to vary)

Here, the sample size (n) = 3

So degrees of freedom = n – 1 = 3 – 1 = 2
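The degrees-of-freedom example can be worked as arithmetic: once the mean and n − 1 of the values are fixed, the last value is determined. This is a minimal sketch; the function name `last_value_given_mean` is hypothetical.

```python
def last_value_given_mean(known_values, mean, n):
    """Return the one remaining value once n - 1 values and the mean are fixed."""
    if len(known_values) != n - 1:
        raise ValueError("expected exactly n - 1 known values")
    # The n values must sum to n * mean, so the last one is whatever is left over.
    return n * mean - sum(known_values)

x3 = last_value_given_mean([7, 8], mean=8.0, n=3)
print(x3)                    # 9.0

degrees_of_freedom = 3 - 1   # only two of the three values were free to vary
print(degrees_of_freedom)    # 2
```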


For confidence intervals based on sample standard deviations,

d.f. = n-1

Where n is the sample size


[Figure: standard normal curve (t with df = ∞) compared with t distributions for df = 13 and df = 5, all centered at 0]

t-distributions are bell-shaped and symmetric, but have ‘fatter’ tails than the normal

Note: t → Z as n increases (n − 1 → n)


Upper Tail Area

df    .25      .10      .05
1     1.000    3.078    6.314
2     0.817    1.886    2.920
3     0.765    1.638    2.353

The body of the table contains t values, not probabilities

Example: Sample Size = 3, so df = n − 1 = 2

90% confidence level, α = 0.10, so α/2 = 0.05

Reading the df = 2 row under the .05 column gives t = 2.920


Confidence Level    t (10 d.f.)    t (20 d.f.)    t (30 d.f.)    z
.90                 1.812          1.725          1.697          1.645
.95                 2.228          2.086          2.042          1.96
.99                 3.169          2.845          2.750          2.58

As sample size n increases, df (n-1) increases.

As df increases, t approaches z.

So at large sample sizes, t and z are nearly the same.


A random sample of n = 25 has X̄ = 50 and S = 8. Form a 95% confidence interval for μ

◦ d.f. = n – 1 = 24, and α/2 = .025

◦ From Table E.1, tα/2 = 2.0639

◦ So the confidence interval is

$\bar{X} \pm t_{\alpha/2}\,S/\sqrt{n} = 50 \pm (2.0639)(8/\sqrt{25})$

46.698 ≤ μ ≤ 53.302
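The interval arithmetic for this example can be sketched as follows. The critical value 2.0639 is passed in as the table value quoted above, since Python's standard library has no inverse t function; the function name `t_interval` is hypothetical.

```python
import math

def t_interval(xbar, s, n, t_crit):
    """Confidence interval for the mean when sigma is unknown.

    t_crit is the critical t value for n - 1 degrees of freedom,
    supplied by the caller (for example, from a printed t table).
    """
    margin = t_crit * s / math.sqrt(n)   # t_{alpha/2} * S / sqrt(n)
    return xbar - margin, xbar + margin

low, high = t_interval(xbar=50, s=8, n=25, t_crit=2.0639)
print(f"{low:.3f} to {high:.3f}")   # 46.698 to 53.302
```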


Excel function: TINV(Probability, df)

For a 95% confidence level, sample size of 25, and a standard deviation S of 8

◦ df is 24 (n-1)

◦ Probability is α (.05), not α/2 (.025)

◦ Equation is = TINV(.05,24)

◦ Its value is 2.063899

◦ This is the same value found with the table lookup
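For readers without a spreadsheet, the same two-tailed lookup can be approximated numerically. The sketch below uses only the Python standard library: it integrates the t density with Simpson's rule and solves for the critical value by bisection. The function names are hypothetical, this is not how Excel computes TINV, and the step count is only tuned for the moderate t values found in typical tables.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_coef = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
                - 0.5 * math.log(df * math.pi))
    return math.exp(log_coef - (df + 1) / 2 * math.log1p(x * x / df))

def two_tailed_area(t, df, steps=2000):
    """P(|T| > t) for t >= 0, integrating the density on [0, t] with Simpson's rule."""
    if t == 0:
        return 1.0
    h = t / steps
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return 1.0 - 2.0 * (total * h / 3.0)

def two_tailed_t_inverse(probability, df):
    """Find t such that P(|T| > t) = probability, by bisection."""
    lo, hi = 0.0, 1.0
    while two_tailed_area(hi, df) > probability:   # widen until the root is bracketed
        lo, hi = hi, hi * 2.0
    for _ in range(60):
        mid = (lo + hi) / 2.0
        if two_tailed_area(mid, df) > probability:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

t_crit = two_tailed_t_inverse(0.05, 24)
print(round(t_crit, 4))   # 2.0639
```

As with TINV, the probability argument is the total area in both tails (α), so a 95% confidence level uses 0.05.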


Confidence Intervals

◦ Population Mean (σ Known, σ Unknown)

◦ Population Proportion

Based on a sample of 70, 95% of our faculty members have PhDs.


Recall that the distribution of the sample proportion is approximately normal if the sample size is large, with standard deviation

$\sigma_p = \sqrt{\pi(1-\pi)/n}$

We will estimate this with sample data:

$\sqrt{p(1-p)/n}$


Upper and lower confidence limits for the population proportion are calculated with the formula

$p \pm Z_{\alpha/2}\sqrt{p(1-p)/n}$

where

◦ Zα/2 is the standard normal value for the level of confidence desired

◦ p is the sample proportion

◦ n is the sample size

Note: must have np > 5 and n(1-p) > 5


A random sample of 100 people shows that 25 are left-handed. Form a 95% confidence interval for the true proportion of left-handers.

$p \pm Z_{\alpha/2}\sqrt{p(1-p)/n} = 25/100 \pm 1.96\sqrt{0.25(0.75)/100} = 0.25 \pm 1.96\,(0.0433)$

0.1651 ≤ π ≤ 0.3349
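The left-hander example can be reproduced with a few lines of code. This is a sketch of the normal-approximation interval described in the slides, including the np > 5 and n(1 − p) > 5 check; the function name `proportion_interval` is hypothetical.

```python
import math
from statistics import NormalDist

def proportion_interval(successes, n, confidence):
    """Normal-approximation confidence interval for a population proportion."""
    p = successes / n
    # Rule of thumb from the slides: np > 5 and n(1 - p) > 5.
    if n * p <= 5 or n * (1 - p) <= 5:
        raise ValueError("need np > 5 and n(1 - p) > 5 for the normal approximation")
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    margin = z * math.sqrt(p * (1 - p) / n)   # Z_{alpha/2} * sqrt(p(1-p)/n)
    return p - margin, p + margin

low, high = proportion_interval(successes=25, n=100, confidence=0.95)
print(f"{low:.4f} to {high:.4f}")   # 0.1651 to 0.3349
```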


Determining the required sample size for a desired error size and confidence level

To determine the required sample size for the mean, you must know:

◦ The desired level of confidence (1 - α), which determines the critical value, Zα/2

◦ The acceptable sampling error, e (the plus or minus in the estimate).

◦ The population standard deviation, σ


If σ = 45, what sample size is needed to estimate the mean within ± 5 with 90% confidence?

$n = Z_{\alpha/2}^2\,\sigma^2 / e^2 = (1.645)^2 (45)^2 / 5^2 = 219.19$

(Always round up)

So the required sample size is n = 220
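The sample-size calculation can be sketched in code as below. The function name `required_sample_size` is hypothetical. Because the script uses an unrounded Z value rather than the table value 1.645, the intermediate result is about 219.15 instead of 219.19; both round up to the same answer.

```python
import math
from statistics import NormalDist

def required_sample_size(sigma, e, confidence):
    """Smallest n that estimates the mean within +/- e at the given confidence."""
    z = NormalDist().inv_cdf(1.0 - (1.0 - confidence) / 2.0)
    raw = (z * sigma / e) ** 2   # n = Z^2 * sigma^2 / e^2
    return math.ceil(raw)        # always round up

print(required_sample_size(sigma=45, e=5, confidence=0.90))   # 220
```

Rounding up rather than to the nearest integer ensures the sampling error does not exceed the acceptable value e.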


If unknown, σ can be estimated when using the required sample size formula

◦ Use a value for σ that is expected to be at least as large as the true σ

◦ Select a pilot sample and estimate σ with the sample standard deviation, S


A confidence interval estimate (reflecting sampling error) should always be included when reporting a point estimate

The level of confidence should always be reported

The sample size should be reported

An interpretation of the confidence interval estimate should also be provided
