chapter 3 confidence interval revby rao

Post on 29-Oct-2014


Medical Statistics (full English class)

Shaoqi Rao, PhD

School of Public Health

Sun Yat-Sen University

Slides adapted from Professor Fang Ji-Qian’s

Chapter 3

Sampling Error and Confidence Interval

For several samples from the same population:

Usually the sample means are not equal to the population mean, and the sample means differ from one another.

This is called sampling error.

3.1 The Distribution of Sample Mean

3.1.1 Distribution of sample mean from a population of normal distribution

Experiment 3.1 Sampling from a normal distribution. Assume the red cell counts of healthy males follow a normal distribution $N(4.6602, 0.5746^2)$. 100 samples of size $n = 5$ are drawn. The sample means are shown in the second column of Table 3.1.

Features of sample mean as a random variable

(1) A sample mean is not necessarily equal to the population mean;

(2) The sample means differ from one another;

(3) The distribution of sample means follows a pattern: more values in the center, fewer at the two ends, and symmetric around the center;

(4) The range of variation of the sample mean is much narrower than that of the initial variable.

(5) The range of variation of the sample means narrows as the sample size increases.

If random samples of n individuals are drawn from a normal distribution $N(\mu, \sigma^2)$, then the sample mean follows a normal distribution

$\bar{X} \sim N(\mu, \sigma_{\bar{x}}^2)$  (3.1)

$\sigma$: standard deviation of the initial variable

$\sigma_{\bar{x}}$: standard deviation of the sample mean, called the standard error of the sample mean, or simply the standard error:

$\sigma_{\bar{x}} = \sigma / \sqrt{n}$

For a sample,

$S_{\bar{x}} = S / \sqrt{n}$
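Experiment 3.1 can be imitated with a short simulation. The sketch below is only an illustration: it uses the population parameters and sample sizes stated in the experiment, but the random seed and variable names are arbitrary choices, so the numbers will not match Table 3.1.

```python
import math
import random
import statistics

MU, SIGMA = 4.6602, 0.5746   # population mean and SD from Experiment 3.1
N = 5                        # size of each sample
N_SAMPLES = 100              # number of samples drawn

rng = random.Random(2014)    # arbitrary fixed seed so the run is repeatable

# Draw 100 samples of size 5 and keep the mean of each one.
sample_means = [
    statistics.fmean(rng.gauss(MU, SIGMA) for _ in range(N))
    for _ in range(N_SAMPLES)
]

theoretical_se = SIGMA / math.sqrt(N)           # sigma / sqrt(n)
empirical_se = statistics.stdev(sample_means)   # SD of the 100 sample means

print(f"mean of sample means: {statistics.fmean(sample_means):.4f} (mu = {MU})")
print(f"SD of sample means:   {empirical_se:.4f} "
      f"(sigma/sqrt(n) = {theoretical_se:.4f})")
```

The spread of the simulated means should come out close to $\sigma/\sqrt{n}$ and clearly smaller than $\sigma$, which is feature (4) in numerical form.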

3.1.2 Distribution of sample mean from a population with non-normal distribution

Experiment 3.2 Sampling from a positively skewed distribution

(1) The distribution of sample means tends to become symmetric as the sample size increases; when n = 30, it looks similar to a normal distribution.

(2) The range of variation of the sample means also narrows as the sample size increases.

Experiment 3.3 Sampling from an asymmetric hook-like distribution

For a population with a non-normal distribution, although the distribution of sample means is not a normal distribution, it will be similar to a normal distribution when the sample size is big (say, $n \ge 30$). Approximately, we still have

$\bar{X} \sim N(\mu, \sigma^2 / n)$
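The behaviour described for non-normal populations can be checked numerically. The sketch below assumes an exponential population with rate 1 (mean 1, SD 1) as a stand-in for a positively skewed distribution; the slides do not say which distribution Experiments 3.2 and 3.3 used, and the seed and the 2000 replicates are arbitrary.

```python
import math
import random
import statistics

def skewness(values):
    """Sample skewness: mean cubed deviation divided by the SD cubed."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean((v - m) ** 3 for v in values) / sd ** 3

def simulate_means(n, n_samples, rng):
    """Means of n_samples samples, each of size n, from Exponential(rate=1)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(n_samples)
    ]

rng = random.Random(3)   # arbitrary fixed seed
results = {}             # n -> (SD of sample means, skewness of sample means)
for n in (5, 30):
    means = simulate_means(n, 2000, rng)
    results[n] = (statistics.stdev(means), skewness(means))
    print(f"n={n:2d}: SD of means={results[n][0]:.3f} "
          f"(sigma/sqrt(n)={1 / math.sqrt(n):.3f}), "
          f"skewness={results[n][1]:.3f}")
```

With n = 30 the skewness of the sample means should be much closer to 0 than with n = 5, and their SD should be close to $\sigma/\sqrt{n}$.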

3.2 t Distribution

3.2.1 Standard t deviate

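When $X \sim N(\mu, \sigma^2)$,

$\bar{X} \sim N(\mu, \sigma^2 / n)$

$\dfrac{\bar{X} - \mu}{\sigma_{\bar{x}}} \sim N(0, 1)$

Is $\dfrac{\bar{X} - \mu}{S_{\bar{x}}} \sim N(0, 1)$?

W.S. Gosset (1908) explored its distribution:

$\dfrac{\bar{X} - \mu}{S_{\bar{x}}} \sim t$ distribution, with $\nu = n - 1$ degrees of freedom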

3.2.2 The probability density and critical values of t distribution

The two-sided probabilities and corresponding critical values of the t distribution are given in Table 5 of Appendix 2.

For instance, when the degrees of freedom are 20, the critical value of the t distribution corresponding to two-sided probability 0.05 is

$t_{0.05/2,\,20} = 2.086 > 1.96 = Z_{0.05/2}$

Corresponding to one-sided probability 0.05, the critical value of the t distribution is

$t_{0.05,\,20} = 1.725 > 1.64 = Z_{0.05}$

In general,

$t_{\alpha/2,\,\nu} > Z_{\alpha/2}, \quad t_{\alpha,\,\nu} > Z_{\alpha}$
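The normal critical values in this comparison can be reproduced with Python's standard library. The sketch below is illustrative only: the standard library has no t distribution, so the two t values for 20 degrees of freedom are copied from the text rather than computed, and the variable names are arbitrary.

```python
from statistics import NormalDist

# t critical values for 20 degrees of freedom, copied from the text.
t_two_sided_20 = 2.086   # two-sided probability 0.05
t_one_sided_20 = 1.725   # one-sided probability 0.05

std_normal = NormalDist()                        # mean 0, SD 1
z_two_sided = std_normal.inv_cdf(1 - 0.05 / 2)   # Z_{0.05/2}
z_one_sided = std_normal.inv_cdf(1 - 0.05)       # Z_{0.05}

print(f"two-sided 0.05: t = {t_two_sided_20}, Z = {z_two_sided:.3f}")
print(f"one-sided 0.05: t = {t_one_sided_20}, Z = {z_one_sided:.3f}")
```

In both cases the t critical value is larger than the corresponding Z value, because the t distribution has heavier tails.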

3.3 The Confidence Interval for Population Mean of Normal Distribution

$X \sim N(\mu, \sigma^2)$, where $\mu$ and $\sigma$ are unknown. A sample is drawn, giving $\bar{X}$ and $S_{\bar{x}}$. What is $\mu$?

$\dfrac{\bar{X} - \mu}{S_{\bar{x}}} \sim t$ distribution, $\nu = n - 1$

With probability 0.95,

$-t_{0.05/2,\,\nu} < \dfrac{\bar{X} - \mu}{S_{\bar{x}}} < t_{0.05/2,\,\nu}$

Therefore, 95% of the sample means (but not all) meet the inequality

$\bar{X} - t_{0.05/2,\,\nu}\, S_{\bar{x}} < \mu < \bar{X} + t_{0.05/2,\,\nu}\, S_{\bar{x}}$

For any sample, if we claim that $\mu$ is located in such an interval, then in theory we will be right about 95 times out of 100.
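The "right about 95 times out of 100" statement can be illustrated by simulation. The sketch below makes several assumptions that are not part of this slide: it reuses the population of Experiment 3.1, takes samples of size 20, and uses the tabulated value $t_{0.05/2,\,19} = 2.093$ quoted in Example 3.1. The seed and number of repeats are arbitrary.

```python
import math
import random
import statistics

MU, SIGMA = 4.6602, 0.5746   # "true" population, unknown to the analyst
N = 20                       # sample size (assumed for this illustration)
T_CRIT = 2.093               # tabulated t_{0.05/2, 19}
N_REPEATS = 2000             # number of simulated studies

rng = random.Random(95)      # arbitrary fixed seed
covered = 0
for _ in range(N_REPEATS):
    sample = [rng.gauss(MU, SIGMA) for _ in range(N)]
    x_bar = statistics.fmean(sample)
    se = statistics.stdev(sample) / math.sqrt(N)   # S / sqrt(n)
    # Does this sample's interval contain the population mean?
    if x_bar - T_CRIT * se < MU < x_bar + T_CRIT * se:
        covered += 1

coverage = covered / N_REPEATS
print(f"{covered} of {N_REPEATS} intervals contain mu ({coverage:.1%})")
```

The proportion of intervals that contain $\mu$ should be close to 0.95, but any single interval either contains $\mu$ or does not.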

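In general, given a random sample of the population, if the sample size, sample mean and sample standard deviation are denoted as $n$, $\bar{x}$ and $s$, with $s_{\bar{x}} = s / \sqrt{n}$, then

$(\bar{x} - t_{\alpha/2,\,\nu}\, s_{\bar{x}},\ \bar{x} + t_{\alpha/2,\,\nu}\, s_{\bar{x}})$

is called the $(1 - \alpha)$ confidence interval of the population mean $\mu$.

$(1 - \alpha)$: confidence level

$t_{\alpha/2,\,\nu}\, s_{\bar{x}}$: precision of the confidence interval

When the sample size is big enough, the interval becomes

$(\bar{x} - Z_{\alpha/2}\, s_{\bar{x}},\ \bar{x} + Z_{\alpha/2}\, s_{\bar{x}})$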

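Example 3.1 Randomly select 20 cases from patients with a certain kind of disease. The sample mean of blood sedimentation (mm/h) is 9.15 and the sample standard deviation is 2.13. Estimate the 95% confidence interval and the 99% confidence interval (assume the blood sedimentation in this kind of disease follows a normal distribution).

$\bar{x} = 9.15,\ s = 2.13,\ n = 20$

95% confidence interval:

$\bar{x} \pm t_{0.05/2,\,19}\, s_{\bar{x}} = \bar{x} \pm t_{0.05/2,\,19} \dfrac{s}{\sqrt{n}} = 9.15 \pm 2.093 \times \dfrac{2.13}{\sqrt{20}} = 8.15 \text{ and } 10.15$

99% confidence interval:

$\bar{x} \pm t_{0.01/2,\,19}\, s_{\bar{x}} = \bar{x} \pm t_{0.01/2,\,19} \dfrac{s}{\sqrt{n}} = 9.15 \pm 2.861 \times \dfrac{2.13}{\sqrt{20}} = 7.79 \text{ and } 10.51$

Question: If both a higher confidence level and better precision are expected, what should we do?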
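The arithmetic of Example 3.1 can be rechecked with a few lines of code. This is a minimal sketch: the function name `mean_ci` is a hypothetical helper, and the t critical values are the tabulated ones quoted in the example, passed in by hand because the standard library does not provide a t distribution.

```python
import math

def mean_ci(x_bar, s, n, t_crit):
    """Return (lower, upper) = x_bar -/+ t_crit * s / sqrt(n)."""
    half_width = t_crit * s / math.sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Data from Example 3.1: n = 20, mean 9.15 mm/h, SD 2.13 mm/h.
ci95 = mean_ci(9.15, 2.13, 20, t_crit=2.093)   # t_{0.05/2, 19}
ci99 = mean_ci(9.15, 2.13, 20, t_crit=2.861)   # t_{0.01/2, 19}

print(f"95% CI: {ci95[0]:.2f} to {ci95[1]:.2f}")
print(f"99% CI: {ci99[0]:.2f} to {ci99[1]:.2f}")
```

The 99% interval is wider than the 95% interval for the same data, which is the trade-off the closing question of the example points at: for a fixed sample, higher confidence costs precision.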

3.4 Confidence Interval for the Difference between Two Population Means

$X_1 \sim N(\mu_1, \sigma^2)$, $X_2 \sim N(\mu_2, \sigma^2)$

$\mu_1$, $\mu_2$ and $\sigma^2$ are unknown. Two samples are drawn, with $n_1, \bar{x}_1, s_1$ and $n_2, \bar{x}_2, s_2$.

What is the confidence interval for $\mu_1 - \mu_2$?

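$\bar{X}_1 \sim N\left(\mu_1, \dfrac{\sigma^2}{n_1}\right)$, $\bar{X}_2 \sim N\left(\mu_2, \dfrac{\sigma^2}{n_2}\right)$

$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \dfrac{\sigma^2}{n_1} + \dfrac{\sigma^2}{n_2}\right)$

$\dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{\sigma^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$

Since $\sigma^2$ is unknown, it could be replaced by $S_c^2$:

$S_c^2 = \dfrac{(n_1 - 1) S_1^2 + (n_2 - 1) S_2^2}{n_1 + n_2 - 2}, \quad \nu = n_1 + n_2 - 2$

Is $\dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim N(0, 1)$?

$\dfrac{(\bar{X}_1 - \bar{X}_2) - (\mu_1 - \mu_2)}{\sqrt{S_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}} \sim t$ distribution, $\nu = n_1 + n_2 - 2$

The $(1 - \alpha)$ confidence interval of $\mu_1 - \mu_2$ is

$\left[(\bar{x}_1 - \bar{x}_2) - t_{\alpha/2,\,\nu} \sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)},\ (\bar{x}_1 - \bar{x}_2) + t_{\alpha/2,\,\nu} \sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}\right]$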

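Example 3.2 Assume the red cell counts of healthy male residents and healthy female residents of a certain city follow two normal distributions respectively. What is the 95% CI for the difference between males and females?

$n_1 = 20,\ n_2 = 15; \quad \bar{x}_1 = 4.66,\ \bar{x}_2 = 4.18; \quad s_1 = 0.47,\ s_2 = 0.45$

$s_1^2 = 0.47^2 = 0.2209, \quad s_2^2 = 0.45^2 = 0.2025$

$s_c^2 = \dfrac{(20 - 1)(0.47)^2 + (15 - 1)(0.45)^2}{20 + 15 - 2} = 0.2131, \quad \nu = 20 + 15 - 2 = 33$

The table gives $t_{0.05/2,\,30} = 2.042$ and $t_{0.05/2,\,40} = 2.021$; the value for $\nu = 33$ lies between them, $t_{0.05/2,\,33} \approx 2.034$.

$(\bar{x}_1 - \bar{x}_2) \pm t_{0.05/2,\,33} \sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} = (4.66 - 4.18) \pm 2.034 \sqrt{0.2131 \left(\dfrac{1}{20} + \dfrac{1}{15}\right)}$

$= 0.48 \pm 2.034\,(0.1577) = 0.16 \text{ and } 0.80$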
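The steps of Example 3.2 can be rechecked in code. This is a sketch under stated assumptions: `pooled_variance` and `mean_diff_ci` are hypothetical helper names ("pooled variance" being the usual name for $s_c^2$), and the critical value 2.034 is taken from the example rather than computed.

```python
import math

def pooled_variance(n1, s1, n2, s2):
    """s_c^2 = ((n1-1) s1^2 + (n2-1) s2^2) / (n1 + n2 - 2)."""
    return ((n1 - 1) * s1 ** 2 + (n2 - 1) * s2 ** 2) / (n1 + n2 - 2)

def mean_diff_ci(n1, x1, s1, n2, x2, s2, t_crit):
    """CI for mu1 - mu2, assuming both populations share one variance."""
    sc2 = pooled_variance(n1, s1, n2, s2)
    se = math.sqrt(sc2 * (1 / n1 + 1 / n2))   # standard error of x1 - x2
    diff = x1 - x2
    return diff - t_crit * se, diff + t_crit * se

# Data from Example 3.2 (males: sample 1, females: sample 2).
sc2 = pooled_variance(20, 0.47, 15, 0.45)
lower, upper = mean_diff_ci(20, 4.66, 0.47, 15, 4.18, 0.45, t_crit=2.034)

print(f"pooled variance: {sc2:.4f}")
print(f"95% CI for the difference: {lower:.2f} to {upper:.2f}")
```

Because the whole interval lies above 0, the data are compatible with a higher mean red cell count in males than in females.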

3.5 Confidence Intervals for Probability and the Difference between Two Probabilities

3.5.1 Confidence interval for population probability

When the sample size is small, given X and n, the 95% and 99% confidence intervals of $\pi$ can be obtained from Table 3 of Appendix 2.

When the sample size is big enough, the confidence interval of $\pi$ can be estimated by normal approximation:

$X \sim B(n, \pi), \quad P = \dfrac{X}{n}$

$P \sim N\left(\pi, \dfrac{\pi(1 - \pi)}{n}\right)$ (approximately)

$\pi$: $\left(p - Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}},\ p + Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}}\right)$

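Comparing with the confidence interval of $\mu$:

$\mu$: $(\bar{x} - Z_{\alpha/2}\, s_{\bar{x}},\ \bar{x} + Z_{\alpha/2}\, s_{\bar{x}})$

$\pi$: $\left(p - Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}},\ p + Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}}\right)$

The two intervals have the same form, with the correspondence

$\bar{x} \leftrightarrow p, \quad s_{\bar{x}} \leftrightarrow \sqrt{\dfrac{p(1 - p)}{n}}$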
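The normal-approximation interval for a probability takes only a few lines to compute. The sketch below is illustrative: `proportion_ci` is a hypothetical helper, and the data (40 events among 200 subjects) are invented for the demonstration, not taken from the slides.

```python
import math
from statistics import NormalDist

def proportion_ci(x, n, confidence=0.95):
    """Large-sample CI for pi: p -/+ Z_{alpha/2} * sqrt(p(1-p)/n)."""
    p = x / n
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    half_width = z * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width

# Hypothetical data: 40 events observed among 200 subjects.
lower, upper = proportion_ci(x=40, n=200)
print(f"p = {40 / 200:.3f}, 95% CI: {lower:.3f} to {upper:.3f}")
```

This approximation is only meant for samples that are big enough; for small samples the text refers to the tabulated intervals instead.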

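3.5.2 Confidence intervals for two population probabilities

$X_1 \sim B(n_1, \pi_1), \quad X_2 \sim B(n_2, \pi_2)$

$P_1 = \dfrac{X_1}{n_1}, \quad P_2 = \dfrac{X_2}{n_2}$

$P_1 \sim N\left(\pi_1, \dfrac{\pi_1(1 - \pi_1)}{n_1}\right), \quad P_2 \sim N\left(\pi_2, \dfrac{\pi_2(1 - \pi_2)}{n_2}\right)$ (approximately)

$\pi_1 - \pi_2$: $(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$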

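Comparing with the confidence interval of $\mu_1 - \mu_2$:

$\mu_1 - \mu_2$: $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

$\pi_1 - \pi_2$: $(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

The two intervals have the same form, with the correspondence

$\mu_1 - \mu_2 \leftrightarrow \pi_1 - \pi_2, \quad \bar{x}_1 - \bar{x}_2 \leftrightarrow p_1 - p_2$

$\sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)} \leftrightarrow \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$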

Example 3.4 Comparison between two drugs.
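The data of Example 3.4 are not reproduced in this transcript, so the sketch below uses invented counts for two drugs purely to show how the interval for $\pi_1 - \pi_2$ is computed. The function name `proportion_diff_ci` and all numbers are assumptions.

```python
import math
from statistics import NormalDist

def proportion_diff_ci(x1, n1, x2, n2, confidence=0.95):
    """Large-sample CI for pi1 - pi2 by normal approximation."""
    p1, p2 = x1 / n1, x2 / n2
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # Z_{alpha/2}
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    diff = p1 - p2
    return diff - z * se, diff + z * se

# Hypothetical data: drug A is effective in 60 of 100 patients,
# drug B in 45 of 100 patients.
lower, upper = proportion_diff_ci(60, 100, 45, 100)
print(f"95% CI for pi1 - pi2: {lower:.3f} to {upper:.3f}")
```

If the interval excludes 0, as it does for these invented counts, the two sample proportions differ by more than sampling error alone would readily explain at this confidence level.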

3.6 The Sample Size for Estimation of Confidence Interval

3.6.1 Sample size for confidence interval of the mean of normal population

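Given (1) the confidence level $(1 - \alpha)$

(2) the half width of the confidence interval δ

(3) the estimate of the standard deviation s

Let

$\delta = t_{\alpha/2,\,\nu} \dfrac{s}{\sqrt{n}}$

Replacing $t_{\alpha/2,\,\nu}$ with $Z_{\alpha/2}$, approximately

$\delta = Z_{\alpha/2} \dfrac{s}{\sqrt{n}}$

$n = \left(\dfrac{Z_{\alpha/2}\, s}{\delta}\right)^2$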

Example 3.5 It is learnt from a pilot study that the standard deviation of a biochemical index is about 10 units. A 95% confidence interval of the population mean is wanted whose half width equals 2.5 units. What sample size is needed?

Since s = 10, δ = 2.5, $Z_{0.05/2} \approx 2$,

$n = \left(\dfrac{Z_{\alpha/2}\, s}{\delta}\right)^2 = \left(\dfrac{2 \times 10}{2.5}\right)^2 = 64$
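The sample-size formula for a mean is easy to wrap in a function. The sketch below is a hedged illustration: `sample_size_for_mean` is a hypothetical helper, the optional `z` argument exists only so that the rounded value $Z \approx 2$ of Example 3.5 can be reproduced, and rounding the result up to a whole subject is a common convention assumed here.

```python
import math
from statistics import NormalDist

def sample_size_for_mean(s, delta, confidence=0.95, z=None):
    """n = (Z_{alpha/2} * s / delta)^2, rounded up to a whole subject."""
    if z is None:
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_raw = (z * s / delta) ** 2
    # round() first, so floating-point noise cannot push the result up by 1
    return math.ceil(round(n_raw, 8))

n_rounded_z = sample_size_for_mean(10, 2.5, z=2.0)   # Z taken as 2
n_exact_z = sample_size_for_mean(10, 2.5)            # Z = 1.96 (approx.)

print(f"with Z = 2:      n = {n_rounded_z}")
print(f"with exact Z:    n = {n_exact_z}")
```

Using $Z \approx 2$ instead of 1.96 gives a slightly larger, more conservative sample size.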

3.6.2 Sample size for confidence interval of the probability of binomial population

Given (1) the confidence level $(1 - \alpha)$,

(2) the half width of the confidence interval δ

(3) the estimate of frequency p

Let

$\delta = Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}}$

$n = \left(\dfrac{Z_{\alpha/2}}{\delta}\right)^2 p(1 - p)$

This formula shows that a larger sample size will be needed if the population probability is close to 0.5 (big variation).

Example 3.6 It is learnt from a pilot study that the probability of relapse in one year for a disease is about 10%. Now a survey is planned to further estimate the 95% confidence interval for the probability of relapse in one year, with a required half width of 3%. What sample size is needed?

Since p = 10%, δ = 3%, $Z_{0.05/2} \approx 2$,

$n = \left(\dfrac{Z_{\alpha/2}}{\delta}\right)^2 p(1 - p) = \left(\dfrac{2}{0.03}\right)^2 \times 0.1 \times (1 - 0.1) = 400$
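The same calculation for a probability can be sketched in code. As before, `sample_size_for_proportion` is a hypothetical helper, the `z` argument is there to reproduce the rounded $Z \approx 2$ of Example 3.6, and rounding up to a whole subject is an assumed convention.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(p, delta, confidence=0.95, z=None):
    """n = (Z_{alpha/2} / delta)^2 * p * (1 - p), rounded up."""
    if z is None:
        z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n_raw = (z / delta) ** 2 * p * (1 - p)
    # round() first, so floating-point noise cannot push the result up by 1
    return math.ceil(round(n_raw, 8))

n_example = sample_size_for_proportion(0.10, 0.03, z=2.0)   # Example 3.6
n_exact_z = sample_size_for_proportion(0.10, 0.03)          # Z = 1.96 (approx.)
n_worst = sample_size_for_proportion(0.50, 0.03, z=2.0)     # p = 0.5

print(f"p = 0.10, Z = 2:     n = {n_example}")
print(f"p = 0.10, exact Z:   n = {n_exact_z}")
print(f"p = 0.50, Z = 2:     n = {n_worst}")
```

The last line shows the point made about the formula: for the same half width, a probability near 0.5 requires a much larger sample than one near 0.1.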

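Summary

1. Sampling error: The sample means are not equal to the population mean; the sample means differ from one another.

2. Distribution of sample mean: If random samples of n individuals are drawn from a normal distribution, then the sample mean follows a normal distribution.

If random samples of n individuals are drawn from a non-normal distribution, although the distribution of sample means is not a normal distribution, it will approximate a normal distribution when the sample size is big.

3. Confidence interval

When $X \sim N(\mu, \sigma^2)$,

$\dfrac{\bar{X} - \mu}{S_{\bar{x}}} \sim t$ distribution, $\nu = n - 1$

Given a random sample of $X \sim N(\mu, \sigma^2)$, if the sample size, sample mean and sample standard deviation are denoted as $n$, $\bar{x}$ and $s$, with $s_{\bar{x}} = s / \sqrt{n}$, then the confidence interval of the population mean is

$\mu$: $(\bar{x} - t_{\alpha/2,\,\nu}\, s_{\bar{x}},\ \bar{x} + t_{\alpha/2,\,\nu}\, s_{\bar{x}})$

Given a random sample of $X \sim B(n, \pi)$, with $P = X / n$, then the confidence interval of the population probability is

$\pi$: $\left(p - Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}},\ p + Z_{\alpha/2} \sqrt{\dfrac{p(1 - p)}{n}}\right)$

Given two random samples of $X_1 \sim N(\mu_1, \sigma^2)$ and $X_2 \sim N(\mu_2, \sigma^2)$, then the confidence interval of the difference is

$\mu_1 - \mu_2$: $(\bar{x}_1 - \bar{x}_2) \pm Z_{\alpha/2} \sqrt{s_c^2 \left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}$

(for large samples; with small samples use $t_{\alpha/2,\,\nu}$, $\nu = n_1 + n_2 - 2$, as in Section 3.4)

Given two random samples of $X_1 \sim B(n_1, \pi_1)$ and $X_2 \sim B(n_2, \pi_2)$, then the confidence interval of the difference is

$\pi_1 - \pi_2$: $(p_1 - p_2) \pm Z_{\alpha/2} \sqrt{\dfrac{p_1(1 - p_1)}{n_1} + \dfrac{p_2(1 - p_2)}{n_2}}$

4. Sample size

Sample size for confidence interval of the mean of normal population:

$n = \left(\dfrac{Z_{\alpha/2}\, s}{\delta}\right)^2$

Sample size for confidence interval of the probability of binomial population:

$n = \left(\dfrac{Z_{\alpha/2}}{\delta}\right)^2 p(1 - p)$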

Thank
