Biostatistics Unit 6 Confidence Intervals 1

Post on 21-Dec-2015


Biostatistics

Unit 6

Confidence Intervals


Statistical inference

• Statistical inference is the procedure by which we reach a conclusion about a population on the basis of the information contained in a sample drawn from that population. 

• Estimation involves the use of the data in the sample to calculate a statistic that serves as an estimate of the corresponding parameter in the population from which the sample was drawn.

Types of estimates

• A point estimate is a single numerical value used to estimate the corresponding population parameter.

• An interval estimate consists of two numerical values defining a range that, with a specified degree of confidence, we believe includes the parameter being estimated.

Estimator

• An estimator is a rule or formula that tells how to compute the estimate.

• An estimator is unbiased if its expected value (its mean over repeated sampling) equals the population parameter it estimates.

Table of unbiased estimators


Sampled and target populations

• The sampled population is the population from which we actually draw the sample. 

• The target population is the population about which we wish to make an inference.


• These two populations may or may not be the same. 

• When they are the same, it is possible to use statistical inference procedures to make conclusions about the target population. 

• If the sampled and target populations are different, conclusions can be made about the target population only on the basis of nonstatistical considerations.

Random and nonrandom samples

The strict validity of statistical procedures depends on the assumption of random samples.


Confidence intervals to be studied

A) Confidence Interval for a Population Mean

B) Confidence Interval for the Difference of Two Population Means

C) Confidence Interval for a Population Proportion

D) Confidence Interval for the Difference of Two Population Proportions

E) Confidence Interval for the Variance of a Normally Distributed Population

F) Confidence Interval for the Ratio of Variances of Two Normally Distributed Populations

A) Confidence interval for a population mean

Estimating the mean

• Estimating the mean of a normally distributed population entails drawing a sample of size n and computing $\bar{x}$, which is used as a point estimate of $\mu$.

• It is more meaningful to estimate $\mu$ by an interval that communicates information regarding the probable magnitude of $\mu$.

Sampling distributions and estimation

Interval estimates are based on sampling distributions. When the sample mean is being used as an estimator of a population mean, and the population is normally distributed, the sample mean will be normally distributed with mean $\mu_{\bar{x}}$ equal to the population mean $\mu$, and variance $\sigma_{\bar{x}}^2 = \sigma^2 / n$.

The 95% confidence interval

• About 95% of the values of $\bar{x}$ making up the distribution will lie within two standard deviations of the mean.

• The exact multiplier is 1.96.

• The interval is marked by the two points $\mu - 1.96\,\sigma_{\bar{x}}$ and $\mu + 1.96\,\sigma_{\bar{x}}$, so that 95% of the values are in the interval $\mu \pm 1.96\,\sigma_{\bar{x}}$.

The 95% confidence interval

• Since $\mu$ and $\mu_{\bar{x}}$ are unknown, the location of the distribution is uncertain.

• We can use $\bar{x}$ as a point estimate of $\mu$.

• In constructing intervals of $\bar{x} \pm 1.96\,\sigma_{\bar{x}}$, about 95% of these intervals would contain $\mu$.

Example

Suppose a researcher, interested in obtaining an estimate of the average level of some enzyme in a certain human population, takes a sample of 10 individuals, determines the level of the enzyme in each, and computes a sample mean of $\bar{x} = 22$. Suppose further it is known that the variable of interest is approximately normally distributed with a variance of 45. We wish to estimate $\mu$.

Solution

The 95% confidence interval for $\mu$ is

$\bar{x} \pm 1.96\,\sigma_{\bar{x}} = 22 \pm 1.96\sqrt{45/10} = 22 \pm 1.96\,(2.1213)$

(17.84, 26.16)
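The enzyme example can be checked with a few lines of Python. This is only a sketch under the example's stated assumptions (normal population, known variance of 45, n = 10, sample mean 22); the function name `z_interval` is illustrative and not part of the original slides.

```python
import math


def z_interval(mean, variance, n, z=1.96):
    """Return (lower, upper) for mean ± z * sqrt(variance / n)."""
    standard_error = math.sqrt(variance / n)  # sigma of the sample mean
    margin = z * standard_error
    return mean - margin, mean + margin


lower, upper = z_interval(mean=22, variance=45, n=10)
print(f"({lower:.2f}, {upper:.2f})")  # prints (17.84, 26.16)
```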

Components of an interval estimate

• The interval estimate of $\mu$ is centered on the point estimate of $\mu$.

• 95% of the values of the standard normal curve lie within 1.96 standard deviations of the mean.

• The z score of 1.96 used in this case is called the reliability coefficient.

General expression for an interval estimate

estimator ± (reliability coefficient) × (standard error)

Table of reliability coefficients for confidence intervals

Confidence level     α       z
90%                  .10     1.645
95%                  .05     1.96
99%                  .01     2.575

Interpretation of confidence intervals

The interval estimate for $\mu$ is expressed as:

$\bar{x} \pm z_{1-\alpha/2}\,\sigma_{\bar{x}}$

If $\alpha = .05$, we can say that, in repeated sampling, 95% of the intervals constructed this way will include $\mu$. This is based on the probability of occurrence of different values of $\bar{x}$.

The area under the curve of $\bar{x}$ that lies outside the interval is called $\alpha$.

The area inside the interval is called $1-\alpha$.

Probabilistic interpretation of the interval

In repeated sampling from a normally distributed population with a known standard deviation, $100(1-\alpha)$ percent of all intervals of the form

$\bar{x} \pm z_{1-\alpha/2}\,\sigma_{\bar{x}}$

will, in the long run, include the population mean, $\mu$.

The quantity $1-\alpha$ is called the confidence coefficient or confidence level, and the interval $\bar{x} \pm z_{1-\alpha/2}\,\sigma_{\bar{x}}$ is called the confidence interval for $\mu$.

Practical interpretation of the interval

When sampling is from a normally distributed population with known standard deviation, we are $100(1-\alpha)$ percent confident that the single computed interval,

$\bar{x} \pm z_{1-\alpha/2}\,\sigma_{\bar{x}}$

contains the population mean, $\mu$.
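The repeated-sampling interpretation can be illustrated by simulation. The sketch below is not from the slides: it assumes a hypothetical normal population (mean 22, variance 45, borrowed from the earlier enzyme example), draws many samples of size 10, and counts how often the 95% interval covers the true mean. The function name, seed, and trial count are arbitrary choices.

```python
import math
import random


def coverage(mu=22.0, sigma=math.sqrt(45), n=10, z=1.96, trials=10_000, seed=0):
    """Fraction of simulated intervals xbar ± z*sigma/sqrt(n) that contain mu."""
    rng = random.Random(seed)          # fixed seed so the run is repeatable
    standard_error = sigma / math.sqrt(n)
    hits = 0
    for _ in range(trials):
        sample_mean = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        if sample_mean - z * standard_error <= mu <= sample_mean + z * standard_error:
            hits += 1
    return hits / trials


print(coverage())  # should land close to 0.95
```

Any single interval either contains the mean or it does not; the 95% figure describes the long-run proportion over many such intervals.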

Precision

• Precision indicates how much the values deviate from their mean. 

• Precision is found by multiplying the reliability factor by the standard error of the mean. 

• This is also called the margin of error.


Exercise 6.2.2

We wish to estimate the mean serum indirect bilirubin level of 4-day-old infants. The mean for a sample of 16 infants was found to be 5.98 mg/dl. Assuming bilirubin levels in 4-day-old infants are approximately normally distributed with a standard deviation of 3.5 mg/dl, find:

A) The 90% confidence interval for $\mu$

B) The 95% confidence interval for $\mu$

C) The 99% confidence interval for $\mu$

Solution

(1) Given

$\bar{x} = 5.98$, $\sigma = 3.5$, $n = 16$

(2) Sketch


(3) Calculations

The standard error is $\sigma_{\bar{x}} = 3.5/\sqrt{16} = .875$.

A) 90% interval (z = 1.645)

5.98 ± 1.645 (.875)

5.98 - 1.439375, 5.98 + 1.439375

(4.5406, 7.4194); with the unrounded z of 1.644854, a calculator gives (4.5408, 7.4192)

B) 95% interval (z = 1.96)

5.98 ± 1.96 (.875)

(4.265, 7.695)

C) 99% interval (z = 2.575)

5.98 ± 2.575 (.875)

(3.7269, 8.2331); with the unrounded z of 2.575829, a calculator gives (3.7261, 8.2339)
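The three bilirubin intervals can be reproduced with Python's standard library. This is a sketch rather than the course's method: `statistics.NormalDist().inv_cdf` supplies unrounded z values, so the results match calculator output rather than hand calculations made with table values. The function name is illustrative.

```python
import math
from statistics import NormalDist


def z_interval_known_sigma(xbar, sigma, n, confidence):
    """Interval xbar ± z_(1 - alpha/2) * sigma / sqrt(n) for a known sigma."""
    alpha = 1 - confidence
    z = NormalDist().inv_cdf(1 - alpha / 2)  # unrounded reliability coefficient
    margin = z * sigma / math.sqrt(n)
    return xbar - margin, xbar + margin


for confidence in (0.90, 0.95, 0.99):
    low, high = z_interval_known_sigma(5.98, 3.5, 16, confidence)
    print(f"{confidence:.0%}: ({low:.4f}, {high:.4f})")
```

The intervals widen as the confidence level rises, since only the reliability coefficient changes while the standard error stays at .875.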

Solution

(4) ResultsA higher percent confidence level gives a wider band.  There is less chance of making an error but there is more uncertainty. Calculator answers are more accurate because the calculator uses exact values and derives its answers from calculus.

31

Page 32: Biostatistics Unit 6 Confidence Intervals 1. Statistical inference Statistical inference is the procedure by which we reach a conclusion about a population

The t distribution

In most real life situations the variance of the population is unknown. We know that the z score,

$z = \dfrac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

is normally distributed if the population is normally distributed and is approximately normally distributed when the sample is large. But it cannot be used because $\sigma$ is unknown.

Estimation of the standard deviation

The sample standard deviation,

$s = \sqrt{\dfrac{\sum (x_i - \bar{x})^2}{n-1}}$

can be used to replace $\sigma$. If $n \ge 30$, then s is a good approximation of $\sigma$. An alternate procedure, based on Student's t distribution, is used when the samples are small.

Student's t distribution

Student's t distribution is used as an alternative to z with small samples. It uses the following formula:

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

Student's t distribution was developed in 1908 by W. S. Gosset (1876-1937), who worked for the Guinness Brewery.

Properties of the t distribution

1.  Mean = 0

2.  It is symmetrical about the mean.

3.  Variance is greater than 1 but approaches 1 as the sample gets large. For df > 2, the variance = df/(df-2) or $(n-1)/(n-3)$.

4.  The range is $-\infty$ to $+\infty$.

5.  t is really a family of distributions, one for each value of the degrees of freedom n - 1, the divisor used in computing $s^2$.

6.  Compared with the normal distribution, t is less peaked and has higher tails.

7.  The t distribution approaches the normal distribution as n - 1 approaches infinity.


Confidence interval for a mean using t

General relationship

The reliability coefficient is obtained from the t distribution.


Confidence interval

When sampling is from a normal distribution whose standard deviation, $\sigma$, is unknown, the $100(1-\alpha)$ percent confidence interval for the population mean, $\mu$, is given by:

$\bar{x} \pm t_{1-\alpha/2}\,\dfrac{s}{\sqrt{n}}$

where t has n - 1 degrees of freedom.

Deciding between z and t

• When constructing a confidence interval for a population mean, we must decide whether to use z or t. 

• Which one to use depends on the size of the sample, whether the population is normally distributed or not, and whether or not the population variance is known.

• There are various flowcharts and decision keys that can be used to help decide.  Mine appears below.

Key for deciding between z and t in confidence interval construction

1.    Population normally distributed................2
      Population not normally distributed............5

2.    Sample size is large (30 or higher)............3
      Sample size is small (less than 30)............4

3.    Population variance is known.............use z
      Population variance not known......use t (or z)

4.    Population variance is known.............use z
      Population variance is not known.........use t

5.    Sample size is large...........................6
      Sample size is small...........................7

6.    Population variance is known.............use z
      Population variance not known
      (central limit theorem applies)..........use z

7.    Must use a non-parametric method
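The key above can be written as a small function. This is a direct transcription of the seven steps, offered as a sketch; the function name, argument names, and returned strings are illustrative, and the cutoff of 30 for a large sample is the one stated in the key.

```python
def choose_reliability_coefficient(population_normal, n, variance_known):
    """Follow the z-versus-t key and return which distribution to use."""
    large_sample = n >= 30                      # step 2 / step 5 cutoff
    if population_normal:
        if variance_known:
            return "z"                          # steps 3 and 4, variance known
        return "t (or z)" if large_sample else "t"
    if large_sample:
        return "z"                              # step 6: central limit theorem applies
    return "non-parametric method"              # step 7


# The preeclampsia example: normal population assumed, n = 10, sigma unknown.
print(choose_reliability_coefficient(True, 10, False))  # prints t
```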

Example

In a study of preeclampsia, Kaminski and Rechberger found the mean systolic blood pressure of 10 healthy, nonpregnant women to be 119 with a standard deviation of 2.1.


(Preeclampsia:  Development of hypertension, albuminuria, or edema between the 20th week of pregnancy and the first week postpartum.

Eclampsia:  Coma and/or convulsive seizures in the same time period, without other etiology.)


a.  What is the estimated standard error of the mean?

b.  Construct the 99% confidence interval for the mean of the population from which the 10 subjects may be presumed to be a random sample.

c.  What is the precision of the estimate?

d.  What assumptions are necessary for the validity of the confidence interval you constructed?

Solution

(1) Given

$n = 10$, $\bar{x} = 119$, $s = 2.1$

(2) Sketch of t distribution


Reading the t table


(3) Calculations

a.  Estimated standard error of the mean: $s/\sqrt{n} = 2.1/\sqrt{10} = .6640783086$

b.  99% confidence interval (t = 3.2498 with 9 degrees of freedom):

119 ± 3.2498 (.66407...)

(116.84, 121.16)

c.  Precision = 3.2498 (.66407...) = 2.158121687

d.  Assumptions: The population is normally distributed. The 10 subjects represent a random sample from this population.
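The blood pressure example can be reproduced as follows. This sketch takes the reliability coefficient 3.2498 (9 degrees of freedom, 99% confidence) from the t table as an input rather than computing it, because the Python standard library has no inverse t distribution; the function name is illustrative.

```python
import math


def t_interval(xbar, s, n, t_crit):
    """Return (standard error, precision, (lower, upper)) for xbar ± t * s / sqrt(n)."""
    standard_error = s / math.sqrt(n)
    precision = t_crit * standard_error   # also called the margin of error
    return standard_error, precision, (xbar - precision, xbar + precision)


se, precision, (low, high) = t_interval(xbar=119, s=2.1, n=10, t_crit=3.2498)
print(f"standard error = {se:.4f}")        # .6641
print(f"precision      = {precision:.4f}") # 2.1581
print(f"99% interval   = ({low:.2f}, {high:.2f})")  # (116.84, 121.16)
```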

B) Confidence interval for the difference of two population means

Introduction

From each of two populations an independent random sample is drawn. Sample means, $\bar{x}_1$ and $\bar{x}_2$, are calculated.

The difference is $\bar{x}_1 - \bar{x}_2$, which is an unbiased estimator of the difference between the two population means, $\mu_1 - \mu_2$. The variance of the estimator is

$\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}$

Conditions for use

Assuming the populations are normally distributed, there are three situations where we would determine the $100(1-\alpha)$ percent confidence interval for $\mu_1 - \mu_2$:

a) where the population variances are known (use z)

b) where the population variances are unknown but equal (use t)

c) where the population variances are unknown but unequal (use t').


Population variances are known

When the population variances are known, the $100(1-\alpha)$ percent confidence interval for $\mu_1 - \mu_2$ is given by

$(\bar{x}_1 - \bar{x}_2) \pm z_{1-\alpha/2}\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}$

Example 6.4.1

A research team is interested in the difference between serum uric acid levels in patients with and without Down's syndrome. In a large hospital for the treatment of the mentally retarded, a sample of 12 individuals with Down's syndrome yielded a mean of $\bar{x}_1 = 4.5$ mg/100 ml. In a general hospital a sample of 15 normal individuals of the same age and sex were found to have a mean value of $\bar{x}_2 = 3.4$ mg/100 ml. If it is reasonable to assume that the two populations of values are normally distributed with variances equal to 1 and 1.5, respectively, find the 95% confidence interval for $\mu_1 - \mu_2$.

Solution

(1) Given

$n_1 = 12$, $\bar{x}_1 = 4.5$, $\sigma_1^2 = 1$

$n_2 = 15$, $\bar{x}_2 = 3.4$, $\sigma_2^2 = 1.5$

(2) Calculations

The point estimate for $\mu_1 - \mu_2$ is

$\bar{x}_1 - \bar{x}_2 = 4.5 - 3.4 = 1.1$

The standard error is

$\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}} = \sqrt{\dfrac{1}{12} + \dfrac{1.5}{15}} = .4282$

The 95% confidence interval is

1.1 ± 1.96 (.4282)

(.26, 1.94)
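The uric acid calculation for Example 6.4.1 can be verified with a short sketch. It assumes, as the example does, independent samples from normal populations with known variances; the function name is illustrative.

```python
import math


def diff_means_z_interval(x1, var1, n1, x2, var2, n2, z=1.96):
    """Interval (x1 - x2) ± z * sqrt(var1/n1 + var2/n2) for known variances."""
    standard_error = math.sqrt(var1 / n1 + var2 / n2)
    difference = x1 - x2
    margin = z * standard_error
    return standard_error, (difference - margin, difference + margin)


se, (low, high) = diff_means_z_interval(4.5, 1.0, 12, 3.4, 1.5, 15)
print(f"standard error = {se:.4f}")                  # .4282
print(f"95% interval   = ({low:.2f}, {high:.2f})")   # (0.26, 1.94)
```

Because the interval excludes zero, the data are consistent with a real difference between the two population means at this confidence level.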

Population variances unknown but equal

If it can be assumed that the population variances are equal then each sample variance is actually a point estimate of the same quantity.  Therefore, we can combine the sample variances to form a pooled estimate.


Weighted averages

The pooled estimate of the common variance is made using weighted averages.  This means that each sample variance is weighted by its degrees of freedom.


Pooled estimate of the variance

The pooled estimate of the variance comes from the formula:

$s_p^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

Standard error of the estimate

The standard error of the estimate is

$\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$

Confidence interval

The $100(1-\alpha)$ percent confidence interval for $\mu_1 - \mu_2$ is:

$(\bar{x}_1 - \bar{x}_2) \pm t_{1-\alpha/2}\sqrt{\dfrac{s_p^2}{n_1} + \dfrac{s_p^2}{n_2}}$

where t has $n_1 + n_2 - 2$ degrees of freedom.

Example

(1) Given

$n_1 = 13$, $\bar{x}_1 = 21.0$, $s_1 = 4.9$

$n_2 = 17$, $\bar{x}_2 = 12.1$, $s_2 = 5.6$

$\alpha = .05$

(2) Calculations

The point estimate for $\mu_1 - \mu_2$ is

$\bar{x}_1 - \bar{x}_2 = 21.0 - 12.1 = 8.9$

The pooled estimate of the variance is

$s_p^2 = \dfrac{12(4.9)^2 + 16(5.6)^2}{13 + 17 - 2} = 28.21$

The standard error is

$\sqrt{\dfrac{28.21}{13} + \dfrac{28.21}{17}} = 1.9569$

The 95% confidence interval (t = 2.0484 with 28 degrees of freedom) is

8.9 ± 2.0484 (1.9569)

8.9 ± 4.0085

(4.9, 12.9)
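The pooled-variance example can be checked step by step. In this sketch the t value 2.0484 (28 degrees of freedom, 95% confidence) is passed in from the table, since the Python standard library does not provide an inverse t distribution; the function name is illustrative.

```python
import math


def pooled_t_interval(x1, s1, n1, x2, s2, n2, t_crit):
    """Pooled-variance interval for mu1 - mu2, assuming equal population variances."""
    df = n1 + n2 - 2
    # Each sample variance is weighted by its degrees of freedom.
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / df
    standard_error = math.sqrt(pooled_variance / n1 + pooled_variance / n2)
    margin = t_crit * standard_error
    difference = x1 - x2
    return pooled_variance, standard_error, (difference - margin, difference + margin)


sp2, se, (low, high) = pooled_t_interval(21.0, 4.9, 13, 12.1, 5.6, 17, t_crit=2.0484)
print(f"pooled variance = {sp2:.2f}")                # 28.21
print(f"standard error  = {se:.4f}")                 # 1.9569
print(f"95% interval    = ({low:.1f}, {high:.1f})")  # (4.9, 12.9)
```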

Population variances unknown and not equal

With unequal variances, the quantity used to calculate the test statistic does not follow the t distribution. A substitute reliability factor called t' has been proposed.


C) Confidence interval for a population proportion

To begin, a sample is drawn from the population of interest and the sample proportion, $\hat{p}$, is calculated. This sample proportion is used as the point estimator of the population proportion, p. The confidence interval is defined by the general formula:

estimator ± (reliability coefficient) × (standard error)

Distribution

When n is large, the reliability coefficient will be z from the standard normal distribution. Since p, the population proportion, is unknown, we use $\hat{p}$ as an estimate. The estimate of $\sigma_{\hat{p}}$, the standard error, is given by:

$\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

Confidence interval

The $100(1-\alpha)$ percent confidence interval for p is given by:

$\hat{p} \pm z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}(1 - \hat{p})}{n}}$

Probabilistic interpretation.

We say that we are 95% confident that the population proportion, p, lies  between the calculated limits since, in repeated sampling, about 95% of the intervals constructed this way would contain p.


Practical interpretation.

In a specific example, we would expect, with 95% confidence, to find the population proportion between the two boundaries.


Example 6.5.2

A research study obtained data regarding sexual behavior from a sample of unmarried men and women between the ages of 20 and 44 residing in geographic areas characterized by high rates of sexually transmitted diseases and admission to drug programs.  Fifty percent of 1229 respondents reported that they never used a condom.  Construct a 95 percent confidence interval for the population proportion never using a condom.


Solution

(1) Given

$n = 1229$, $\hat{p} = .50$

(for the TI-83, x = 615)

(2) Calculation
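The slide's worked calculation did not survive in this transcript, so the following is a sketch of how the interval for Example 6.5.2 can be computed from the given values (n = 1229, sample proportion .50, z = 1.96 for 95% confidence). The function name is illustrative.

```python
import math


def proportion_interval(p_hat, n, z=1.96):
    """Large-sample interval p_hat ± z * sqrt(p_hat * (1 - p_hat) / n)."""
    standard_error = math.sqrt(p_hat * (1 - p_hat) / n)
    margin = z * standard_error
    return standard_error, (p_hat - margin, p_hat + margin)


se, (low, high) = proportion_interval(0.50, 1229)
print(f"standard error = {se:.4f}")
print(f"95% interval   = ({low:.3f}, {high:.3f})")  # (0.472, 0.528)
```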

       


D) Confidence interval for the difference of two population proportions

When studying the difference between two population proportions, the difference between the two sample proportions, $\hat{p}_1 - \hat{p}_2$, can be used as an unbiased point estimator for the difference between the two population proportions, $p_1 - p_2$. This is used with the general formula:

estimator ± (reliability coefficient) × (standard error)

Distribution

When the central limit theorem applies, the normal distribution is used to obtain confidence intervals. The standard error is estimated by the formula:

$\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

Confidence interval

The $100(1-\alpha)$ percent confidence interval for $p_1 - p_2$ is given by:

$(\hat{p}_1 - \hat{p}_2) \pm z_{1-\alpha/2}\sqrt{\dfrac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \dfrac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

Probabilistic interpretation.

We say that we are 95% confident that the difference between the two population proportions, p1 – p2, lies between the calculated limits since, in repeated sampling, about 95% of the intervals constructed this way would contain p1 – p2.


Practical interpretation.

In a specific example, we would expect, with 95% confidence, to find the difference between the two population proportions between the two limits.


Example 6.6.1

A study of teenage suicide included a sample of 96 boys and 123 girls between ages of 12 and 16 years selected scientifically from admissions records to a private psychiatric hospital.  Suicide attempts were reported by 18 of the boys and 60 of the girls.  We assume that the girls constitute a simple random sample from a population of similar girls and likewise for the boys.  Construct a 99 percent confidence interval for the difference between the two proportions.


Solution

(1) Given

      n1 = 123 (girls)        n2 = 96 (boys)

      $\hat{p}_1$ = 60/123 = .4878        $\hat{p}_2$ = 18/96 = .1875

(2) Calculation
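The calculation for Example 6.6.1 can be sketched in a few lines. This is an illustration rather than the slide's own working: it uses the counts given in the example and takes the reliability coefficient from Python's `statistics.NormalDist` (about 2.576) instead of a rounded table value such as 2.58, so the last digit may differ slightly from a hand calculation.

```python
from statistics import NormalDist

# Counts from Example 6.6.1: group 1 = girls, group 2 = boys.
n1, x1 = 123, 60
n2, x2 = 96, 18

p1_hat = x1 / n1  # .4878
p2_hat = x2 / n2  # .1875

# Reliability coefficient for a 99 percent interval (alpha = .01).
z = NormalDist().inv_cdf(1 - 0.01 / 2)

# Estimated standard error of the difference between the proportions.
se = (p1_hat * (1 - p1_hat) / n1 + p2_hat * (1 - p2_hat) / n2) ** 0.5

lower = (p1_hat - p2_hat) - z * se
upper = (p1_hat - p2_hat) + z * se
print(f"difference = {p1_hat - p2_hat:.4f}, SE = {se:.4f}")
print(f"99% CI: ({lower:.4f}, {upper:.4f})")
```

Both limits are positive, so the interval does not include zero.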

        

Determining the sample size for estimating means

A sample must be large enough to give the required precision, but no larger. A method is therefore needed to determine the sample size for estimating a population mean or a population proportion. This matters most in business or commercial situations where money is involved: a sample that is too big wastes money, while one that is too small may give results too imprecise to be useful.

Objectives

The width of the confidence interval is determined by the magnitude of the margin of error which is given by:

d = (reliability coefficient) × (standard error)

The total width of the interval is twice this amount.  

Reducing the margin of error

In the standard error, $\sigma/\sqrt{n}$, the value of $\sigma$ is a constant. If the reliability coefficient is fixed, the only way to reduce the margin of error is to take a larger sample. The size of the sample depends on the size of $\sigma$, the degree of reliability, and the desired interval width.

Margin of error

$d = z\,\dfrac{\sigma}{\sqrt{n}}$, where z is the reliability coefficient.

Sample size for a large population

d = (reliability coefficient) × (standard error) $= z\,\sigma/\sqrt{n}$

Solving for n gives

$n = \dfrac{z^2\sigma^2}{d^2}$
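As a rough sketch of how this formula is applied, the function below computes n for a large population and rounds up to the next whole subject. The function name `sample_size_for_mean` and the inputs ($\sigma$ = 20, d = 5 or 2.5) are hypothetical values chosen for illustration, not figures from the slides.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, d, conf=0.95):
    """n = z^2 * sigma^2 / d^2, rounded up (large-population case)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # reliability coefficient
    return math.ceil((z * sigma / d) ** 2)


# Hypothetical inputs: sigma = 20, desired margin of error d = 5.
n_wide = sample_size_for_mean(sigma=20.0, d=5.0)
# Halving the margin of error roughly quadruples the required sample.
n_narrow = sample_size_for_mean(sigma=20.0, d=2.5)
print(n_wide, n_narrow)
```

Rounding up rather than to the nearest integer keeps the margin of error at or below the requested d.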

       

Estimating $\sigma^2$

Generally the variance of the population under study is unknown. As a result, $\sigma$ has to be estimated. The most common sources of estimates for $\sigma$ are:

1. A pilot sample drawn from the population, whose standard deviation is used as an estimate of $\sigma$.
2. Estimates of $\sigma$ from previous or similar studies.
3. In a normally distributed population, the range R is usually about 6 standard deviations, so $\sigma$ is estimated by R/6.

Determination of the sample size for estimating proportions

The manner of finding sample sizes for estimating a population proportion is basically the same as for estimating a mean.

The general formula is:

$d = z\sqrt{\dfrac{pq}{n}}$, where q = 1 − p and z is the reliability coefficient.

Sample size

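Assuming proper random sampling and an approximately normal distribution, the sample size is

$n = \dfrac{z^2 p q}{d^2}$, where q = 1 − p, z is the reliability coefficient, and d is the margin of error.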

Estimating the population proportion

It is necessary to estimate the population proportion, p, to use in the determination of the sample size.

1. If an upper limit is suspected or presumed, it could be used to represent p.
2. A pilot sample could be drawn and used to obtain an estimate for p.
3. With no better estimate, one may use p = .5, which gives the maximum value of n.
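The claim that p = .5 gives the largest n can be checked numerically. The sketch below assumes the large-population formula n = z²pq/d²; the function name `sample_size_for_proportion` and the margin of error d = .05 are hypothetical choices for illustration.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p, d, conf=0.95):
    """n = z^2 * p * (1 - p) / d^2, rounded up (large-population case)."""
    z = NormalDist().inv_cdf(1 - (1 - conf) / 2)  # reliability coefficient
    return math.ceil(z ** 2 * p * (1 - p) / d ** 2)


# Required n for a range of assumed proportions, with d = .05.
sizes = {k / 20: sample_size_for_proportion(k / 20, d=0.05) for k in range(1, 20)}
for p, n in sizes.items():
    print(f"p = {p:.2f}  n = {n}")

worst_case_p = max(sizes, key=sizes.get)
print("largest n occurs at p =", worst_case_p)
```

Because p(1 − p) peaks at p = .5, using .5 when nothing better is known is the conservative choice: the resulting sample is large enough for any true proportion.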

E) Confidence interval for the variance of a normally distributed population

Measures of dispersion

$E(s^2) = \sigma^2$ when sampling is with replacement.

$E(s^2) = S^2$ when sampling is without replacement.

Here $\sigma^2 = \sum (x_i - \mu)^2 / N$ and $S^2 = \sum (x_i - \mu)^2 / (N - 1)$.

Large population size

When N is large, N and N − 1 are approximately equal, so $\sigma^2$ (divisor N) and $S^2$ (divisor N − 1) will be approximately equal. These results justify why $s^2$ can be used to estimate the population variance.
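The two expectations can be verified exactly on a tiny finite population by listing every possible sample. This is a sketch under stated assumptions: the population (1, 2, 3, 4, 5) and the sample size n = 2 are hypothetical, and the variable names are illustrative.

```python
from itertools import combinations, product
from statistics import fmean, variance

population = [1, 2, 3, 4, 5]  # hypothetical finite population, N = 5
N = len(population)
mu = fmean(population)
sum_sq = sum((x - mu) ** 2 for x in population)

sigma_sq = sum_sq / N        # population variance with divisor N
big_s_sq = sum_sq / (N - 1)  # population variance with divisor N - 1

# Average the sample variance s^2 (divisor n - 1) over every sample of size 2.
mean_s2_without = fmean(variance(s) for s in combinations(population, 2))
mean_s2_with = fmean(variance(s) for s in product(population, repeat=2))

print("without replacement:", mean_s2_without, "vs S^2 =", big_s_sq)
print("with replacement:   ", mean_s2_with, "vs sigma^2 =", sigma_sq)
```

With only five units the two population variances differ noticeably (2.5 against 2.0); the gap shrinks as N grows, which is the point made above.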

Interval estimate of a population variance

• The value of $s^2$ is used as a point estimator of the population variance, $\sigma^2$.

• Confidence intervals for $\sigma^2$ are based on the sampling distribution of $(n-1)s^2/\sigma^2$.

• If samples of size n are drawn from a normally distributed population, this quantity has a distribution known as the chi-square distribution with n-1 degrees of freedom. 

• The assumption that the sample is drawn from a normally distributed population is crucial.

The chi-square distribution

The chi-square distribution is not symmetrical. Its shape depends on the degrees of freedom, n − 1, and changes markedly when they are few. The distribution does not have negative values.

Microsoft Excel Demonstration

Note how the shape of the curve changes depending on the degrees of freedom. With 1 degree of freedom, the curve looks hyperbolic: it falls steadily from very large values near zero.

[Here follows the Excel Worksheet.]
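For readers without the worksheet, the same comparison can be sketched in Python. The function `chi2_pdf` below is an illustrative helper written from the standard chi-square density formula, and the degrees of freedom and x values shown are arbitrary choices, not taken from the Excel demonstration.

```python
import math


def chi2_pdf(x, df):
    """Chi-square density at x > 0 with df degrees of freedom."""
    k = df / 2
    return x ** (k - 1) * math.exp(-x / 2) / (2 ** k * math.gamma(k))


xs = (0.5, 1, 2, 4, 8, 16)
print("x:     " + "  ".join(f"{x:>6}" for x in xs))
for df in (1, 2, 5, 10):
    row = "  ".join(f"{chi2_pdf(x, df):.4f}" for x in xs)
    print(f"df={df:>2}: {row}")
```

Reading across each row shows the pattern described above: with 1 degree of freedom the density only decreases, while with more degrees of freedom it rises to a peak and then tails off to the right.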

[Slides: reading the $\chi^2$ table; finding $\chi^2$ values]

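Confidence interval on the $\chi^2$ distribution

The 100(1 − α) percent confidence interval for the distribution of $(n-1)s^2/\sigma^2$ is a two-tailed $\chi^2$ interval between $\chi^2_{\alpha/2}$ and $\chi^2_{1-\alpha/2}$. This interval is given by

$\chi^2_{\alpha/2} < \dfrac{(n-1)s^2}{\sigma^2} < \chi^2_{1-\alpha/2}$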

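Confidence interval for $\sigma^2$

From the sampling distribution of $(n-1)s^2/\sigma^2$ the confidence interval for $\sigma^2$ is derived. The formula is:

$\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}} < \sigma^2 < \dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}$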

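Confidence interval for $\sigma$

To get the 100(1 − α) percent confidence interval for $\sigma$, the population standard deviation, the square root of each term is taken. The result is the formula below.

$\sqrt{\dfrac{(n-1)s^2}{\chi^2_{1-\alpha/2}}} < \sigma < \sqrt{\dfrac{(n-1)s^2}{\chi^2_{\alpha/2}}}$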

Example 6.9.1

In a study on cholesterol levels a sample of 12 men and women was chosen.  The plasma cholesterol levels (mmol/L) of the subjects were as follows: 6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, and 5.8.  We assume that these 12 subjects constitute a simple random sample of a population of similar subjects.  We wish to estimate the variance of the plasma cholesterol levels with a 95 percent confidence interval.

Solution

(1) Given

        6.0  6.4  7.0  5.8  6.0  5.8        5.9  6.7  6.1  6.5  6.3  5.8

Estimate the variance with a 95% confidence interval.

(2) Calculations

Value of s = .3918680978

Values of $\chi^2$ from the table, with n − 1 = 11 degrees of freedom:

$\chi^2_{.975}$ = 21.920        $\chi^2_{.025}$ = 3.816

Calculations
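The interval for Example 6.9.1 can be reproduced with a short script. This is a sketch rather than the slide's own working: it uses the twelve cholesterol values and the two tabled chi-square values quoted in the solution (21.920 and 3.816, for 11 degrees of freedom), and the variable names are illustrative.

```python
from statistics import variance

# Plasma cholesterol levels (mmol/L) from Example 6.9.1.
levels = [6.0, 6.4, 7.0, 5.8, 6.0, 5.8, 5.9, 6.7, 6.1, 6.5, 6.3, 5.8]
n = len(levels)
s_sq = variance(levels)  # sample variance, divisor n - 1

# Table values quoted in the solution, 11 degrees of freedom.
chi2_upper = 21.920  # chi-square at .975
chi2_lower = 3.816   # chi-square at .025

var_low = (n - 1) * s_sq / chi2_upper
var_high = (n - 1) * s_sq / chi2_lower
print(f"s = {s_sq ** 0.5:.6f}, s^2 = {s_sq:.6f}")
print(f"95% CI for the variance: ({var_low:.4f}, {var_high:.4f})")
print(f"95% CI for the standard deviation: ({var_low ** 0.5:.4f}, {var_high ** 0.5:.4f})")
```

Note that the larger chi-square value produces the lower limit, and that the interval is not symmetric about $s^2$ because the chi-square distribution is skewed.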

F) Confidence interval for the ratio of variances of two normally distributed populations

A way to compare the variances of two normally distributed populations is to use the variance ratio, $\sigma_1^2/\sigma_2^2$. The variance ratio is used, among other things, as the test statistic for analysis of variance (ANOVA). If the two variances are equal, then V.R. = 1.

Sampling distribution

The sampling distribution of $(s_1^2/\sigma_1^2)/(s_2^2/\sigma_2^2)$ is used. Since the population variances are usually not known, the sample variances are used. The assumptions are that $s_1^2$ and $s_2^2$ are computed from independent samples of size n1 and n2, respectively, drawn from two normally distributed populations.

If the assumptions are met, $(s_1^2/\sigma_1^2)/(s_2^2/\sigma_2^2)$ follows a distribution known as the F distribution with two values used for degrees of freedom.

Degrees of freedom

• The F distribution uses two values for degrees of freedom.

• The numerator degrees of freedom is the value of n1 − 1, which is used in calculating $s_1^2$.

• The denominator degrees of freedom is the value of n2 − 1, which is used in calculating $s_2^2$.

The F distribution

• The F distribution is not symmetrical.

• The distribution does not have negative values.

• Because it uses two values of degrees of freedom, there is a separate table for each cumulative probability (confidence level).

[Slide: F distribution tables]

Reading F tables

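F tables are indexed by the one-tail cumulative probability $1 - \alpha$ (for example $F_{.95}$, $F_{.975}$, and $F_{.995}$). For two-tail intervals, the lower boundary, $F_{\alpha/2}$, is not tabulated and must be calculated as the reciprocal of the upper-tail value with the degrees of freedom interchanged: $F_{\alpha/2;\,df_1,\,df_2} = 1/F_{1-\alpha/2;\,df_2,\,df_1}$.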

[Slides: reading F tables; two-tail F distribution boundaries; the $F_{.95}$, $F_{.975}$, and $F_{.995}$ tables]

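Confidence interval for $\sigma_1^2/\sigma_2^2$

The distribution of $(s_1^2/\sigma_1^2)/(s_2^2/\sigma_2^2)$ is used to establish the 100(1 − α) percent confidence interval for $\sigma_1^2/\sigma_2^2$. The starting point is

$F_{\alpha/2} < \dfrac{s_1^2/\sigma_1^2}{s_2^2/\sigma_2^2} < F_{1-\alpha/2}$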

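From this relation, it can be shown that the 100(1 − α) percent confidence interval for $\sigma_1^2/\sigma_2^2$ is

$\dfrac{s_1^2/s_2^2}{F_{1-\alpha/2}} < \dfrac{\sigma_1^2}{\sigma_2^2} < \dfrac{s_1^2/s_2^2}{F_{\alpha/2}}$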

Example 6.10.1

Among 11 patients in a certain study, the standard deviation of the property of interest was 5.8.  In another group of 4 patients, the standard deviation was 3.4.  We wish to construct a 95 percent confidence interval for the ratio of the variances of these two populations.

Solution

(1) Given

     n1 = 11        $s_1^2$ = $(5.8)^2$ = 33.64        α = .05

     n2 = 4         $s_2^2$ = $(3.4)^2$ = 11.56

     $F_{.975;\,10,\,3}$ = 14.42

     $F_{.025;\,10,\,3}$ = 1/$F_{.975;\,3,\,10}$ = 1/4.83 = .20704

(2) Calculations

Calculation of the 95% confidence interval for $\sigma_1^2/\sigma_2^2$
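The arithmetic for Example 6.10.1 can be sketched as follows. This is an illustration, not the slide's own working: it uses the standard deviations from the example and the two tabled F values quoted in the solution (14.42 and 4.83), with the lower boundary obtained as a reciprocal. Variable names are illustrative.

```python
# Sample standard deviations and sizes from Example 6.10.1.
s1, n1 = 5.8, 11
s2, n2 = 3.4, 4

# Table values quoted in the solution.
f_upper = 14.42     # F at .975 with (10, 3) degrees of freedom
f_lower = 1 / 4.83  # F at .025 with (10, 3) df = 1 / F at .975 with (3, 10) df

ratio = s1 ** 2 / s2 ** 2  # sample variance ratio, 33.64 / 11.56
lower = ratio / f_upper
upper = ratio / f_lower

print(f"variance ratio = {ratio:.4f}")
print(f"95% CI for the ratio of variances: ({lower:.4f}, {upper:.4f})")
```

The interval is very wide because the second sample has only 3 degrees of freedom, and it includes 1, the value the ratio takes when the two population variances are equal.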

       

fin
