math 3680 lecture #15 confidence intervals

Post on 04-Jan-2016

  • Math 3680

    Lecture #15

    Confidence Intervals

  • Review: Suppose that $E(X) = \mu$ and $SD(X) = \sigma$. Recall the following two facts about the average $\bar{X}$ of n observations drawn with replacement:

    $E(\bar{X}) = \mu$ and $SD(\bar{X}) = \sigma / \sqrt{n}$.

  • Estimation

  • Example: A university has 25,000 registered students. In a survey of 318 students, the average age of the sample is found to be 22.4, with a sample SD of 4.5 years. Estimate the average age of all 25,000 students, and attach a standard error to this estimate.

  • Wrong Answer: The average age of the student body is exactly 22.4 years. What is wrong with this simplistic analysis?

  • Answer: Of course, we estimate the average of the population to be 22.4 years, but this estimate will not be exact. To determine the magnitude of the error, we need to find the SE, and that means a box model.

    Box model: 25,000 tickets, average = ??, SD = ??; 318 draws.

  • Bootstrap Estimation: Although the SD of the box is unknown, we estimate the SD of the box from the SD of the sample:

    SD of box ≈ 4.5

    SE of the sample average ≈ $\frac{4.5}{\sqrt{318}} \sqrt{\frac{25{,}000 - 318}{25{,}000 - 1}} \approx 0.251$

    (Why?)

  • Conclusion: The average age is about 22.4 years, give or take 0.251 years or so.
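The 0.251 figure can be checked with a short Python sketch. It assumes the sample SD stands in for the SD of the box and that the correction factor for draws made without replacement is applied; the function name `se_of_sample_average` is illustrative, not something from the lecture.

```python
import math

def se_of_sample_average(sample_sd, n_draws, population_size=None):
    """Estimate the SE of a sample average.

    The sample SD is used as a stand-in for the unknown SD of the box
    (the bootstrap step). If population_size is given, the correction
    factor for drawing without replacement is applied as well.
    """
    se = sample_sd / math.sqrt(n_draws)
    if population_size is not None:
        se *= math.sqrt((population_size - n_draws) / (population_size - 1))
    return se

se_plain = se_of_sample_average(4.5, 318)
se_corrected = se_of_sample_average(4.5, 318, population_size=25_000)
print(f"without correction: {se_plain:.3f}")
print(f"with correction:    {se_corrected:.3f}")
```

With only 318 draws from 25,000 tickets the correction factor is close to 1, so it changes the SE only in the third decimal place.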

  • Confidence Intervals: Large samples or known $\sigma$

  • 68%: We say that the range

    22.4 ± 0.251 years = 22.149 to 22.651 years

    is a 68% confidence interval for the average age of the population.

  • 95%: We say that the range

    22.4 ± (1.96)(0.251) years = 21.909 to 22.891 years

    is a 95% confidence interval for the average age of the population.

  • 99.7%: We say that the range

    22.4 ± (2.968)(0.251) years = 21.656 to 23.144 years

    is a 99.7% confidence interval for the average age of the population.

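  • $1 - 2\alpha$: In general, we say that the range

    $\bar{X} \pm z_{1-\alpha} \cdot SE$

    is a $1 - 2\alpha$ confidence interval for the population average $\mu$. Here $z_{\alpha} = -z_{1-\alpha}$ and $z_{1-\alpha}$ are the points on the normal curve that enclose a central area of $1 - 2\alpha$.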
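As a rough illustration of the general recipe, the sketch below uses Python's standard-library `statistics.NormalDist` to look up the normal-curve multiplier for a given confidence level. The helper name `z_interval` is an assumption made for this example. It works from the rounded SE of 0.251, and it uses the exact multiplier for 68% (about 0.994) where the slides use 1, so the printed endpoints can differ from the slides in the last digit or two.

```python
from statistics import NormalDist

def z_interval(sample_mean, se, confidence):
    """Return (low, high) for a large-sample confidence interval.

    confidence is the central area 1 - 2*alpha, e.g. 0.95.
    """
    alpha = (1 - confidence) / 2
    z = NormalDist().inv_cdf(1 - alpha)  # z_{1 - alpha} on the standard normal curve
    return sample_mean - z * se, sample_mean + z * se

for level in (0.68, 0.95, 0.997):
    low, high = z_interval(22.4, 0.251, level)
    print(f"{level:.1%}: {low:.3f} to {high:.3f} years")
```

Raising the confidence level widens the interval, which is the trade-off between confidence and precision that the later observations discuss.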

  • Logic:

  • Observations:

    1) We are NOT saying that 95% of the students are between 21.9 and 22.9 years old; this is patently ridiculous, of course.

    2) We are NOT saying that there is a 95% chance that the average age is between 21.9 and 22.9 years. The population average is constant; it is either in this range or it is not.

  • Observations:

    3) The true interpretation is as follows: If several people run this experiment and they all find a 95%-confidence interval, then the true population parameter will lie in about 95% of these intervals.

  • 100 different 95% confidence intervals

  • 100 different 68% confidence intervals

  • 100 different 95% confidence intervals, n = 4 × 318 = 1272
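The repeated-experiment interpretation can be explored with a small simulation. The sketch below is not the procedure used to draw the lecture's figures. It assumes a hypothetical box of normally distributed ages with average 22.4 and SD 4.5, draws many samples, and counts how often the interval built from each sample contains the true average. The function name `coverage` and the fixed random seed are choices made for this example.

```python
import random
import statistics
from statistics import NormalDist

def coverage(n_intervals, n_draws, confidence,
             true_mean=22.4, true_sd=4.5, seed=0):
    """Fraction of simulated confidence intervals that contain true_mean."""
    rng = random.Random(seed)
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(n_intervals):
        # Draw one sample from the hypothetical box.
        sample = [rng.gauss(true_mean, true_sd) for _ in range(n_draws)]
        mean = statistics.fmean(sample)
        # Bootstrap step: the sample SD replaces the unknown SD of the box.
        se = statistics.stdev(sample) / n_draws ** 0.5
        if mean - z * se <= true_mean <= mean + z * se:
            hits += 1
    return hits / n_intervals

print("95% intervals, n = 318: ", coverage(100, 318, 0.95))
print("68% intervals, n = 318: ", coverage(100, 318, 0.68))
print("95% intervals, n = 1272:", coverage(100, 1272, 0.95))
```

Quadrupling the sample size does not change the coverage rate; it halves the width of each interval.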

  • Observations:

    4) In the previous problem, we replaced the population SD $\sigma$ with the sample SD s. (When did we do this?) As it turns out, this makes little practical difference for large samples.

    More on this later when we consider small samples.

  • Observations:

    5) The normal approximation has been used. As discussed earlier, a large number of draws is required for this assumption to hold.

    6) Remember: There is no such thing as a 100% confidence interval. In practice, scientists often use 95% as a balance between a high confidence level and a narrow confidence interval.

  • Example: In a simple random sample of 680 households (in a city of millions), the average number of TV sets is 1.86, with an SD of 0.80. Find a 95% confidence interval for the average number of TV sets per household in the city.

  • True or false:

    (i) 1.86 ± 0.06 is a 95%-confidence interval for this population average.

    (ii) 1.86 ± 0.06 is a 95%-confidence interval for this sample average.

    (iii) There is a 95% chance for the population average to be in the range 1.86 ± 0.06.
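The ± 0.06 in these statements comes from the usual large-sample arithmetic, which the following sketch reproduces. It assumes the city is large enough that no correction factor is needed, and the variable names are illustrative.

```python
import math

sample_mean, sample_sd, n = 1.86, 0.80, 680

se = sample_sd / math.sqrt(n)   # SE of the sample average
margin = 1.96 * se              # 95% multiplier from the normal curve
print(f"SE = {se:.4f}, margin of error = {margin:.2f}")
print(f"95% CI: {sample_mean - margin:.2f} to {sample_mean + margin:.2f} TV sets")
```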

  • Example: The chart below shows platelet counts among 120 geriatric patients. Find a 95% confidence interval for the average platelet count among geriatric patients.

    132  117  176  126  142  120
    127  125  198  208  105  146
    214  194  131  208  101  139
    184  163  129  138  110  247
    181  181  125  123  117  176
    211  108  254  244  139  179
    190  212  228  139  147  170
    139  129  174  108  106  141
    112  126  125  142  115  147
    105  256  142  175  131  119
    174  106  194  181  196  232
    143  142  104  184  112  141
    135  110  107  137  111  112
    185  114  188  106  102  104
    120  143  179  178  124  242
    235  129  198  150  180  187
    142  125  184  238  111  129
    129  203  115  101  178  133
    134  168  229  169  148  185
    154  162  103  105  125  151
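One way to work this example is to compute the sample average and SD directly from the 120 counts and then apply the large-sample interval. The sketch below assumes the values are typed in row by row from the chart, and it uses the sample SD with an n - 1 divisor, which differs only slightly from the n divisor at this sample size.

```python
import math
import statistics

# Platelet counts for the 120 patients, copied row by row from the chart.
counts = [
    132, 117, 176, 126, 142, 120,
    127, 125, 198, 208, 105, 146,
    214, 194, 131, 208, 101, 139,
    184, 163, 129, 138, 110, 247,
    181, 181, 125, 123, 117, 176,
    211, 108, 254, 244, 139, 179,
    190, 212, 228, 139, 147, 170,
    139, 129, 174, 108, 106, 141,
    112, 126, 125, 142, 115, 147,
    105, 256, 142, 175, 131, 119,
    174, 106, 194, 181, 196, 232,
    143, 142, 104, 184, 112, 141,
    135, 110, 107, 137, 111, 112,
    185, 114, 188, 106, 102, 104,
    120, 143, 179, 178, 124, 242,
    235, 129, 198, 150, 180, 187,
    142, 125, 184, 238, 111, 129,
    129, 203, 115, 101, 178, 133,
    134, 168, 229, 169, 148, 185,
    154, 162, 103, 105, 125, 151,
]

n = len(counts)
mean = statistics.fmean(counts)
sd = statistics.stdev(counts)   # sample SD (n - 1 divisor), an assumption
se = sd / math.sqrt(n)
margin = 1.96 * se              # large-sample normal approximation
low, high = mean - margin, mean + margin

print(f"n = {n}, mean = {mean:.1f}, SD = {sd:.1f}, SE = {se:.2f}")
print(f"95% CI: {low:.1f} to {high:.1f}")
```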

  • Fill in the blanks with either box or draws.

    Probabilities are used when reasoning from the __________ to the _____________.

    Confidence levels are used when reasoning from the ____________ to the ______________.

  • Fill in the blank with either observed or expected. The chance error is in the _______________ value.

  • Fill in the blank with either sample or population. The confidence level is for the ______________ average.

  • Confidence Intervals: Projecting Sample Size

  • Example: In a preliminary simple random sample of 680 households (in a city of millions), the average number of TV sets in the sample households is 1.86, with an SD of 0.80.

    Suppose that it is desired to construct a 90% confidence interval which has a margin of error of 0.03. How large a sample would be necessary?

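  • Solution: The margin of error is $z \cdot s / \sqrt{n}$, with $z \approx 1.645$ for 90% confidence and the preliminary SD of 0.80 as the estimate of s. Requiring $1.645 \cdot 0.80 / \sqrt{n} \le 0.03$ gives $n \ge (1.645 \cdot 0.80 / 0.03)^2 \approx 1924.3$. So, the sample size should be at least 1925.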
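The sample-size projection amounts to solving the margin-of-error formula for n and rounding up. The sketch below assumes the table value z = 1.645 for 90% confidence and treats the preliminary SD of 0.80 as the SD of the box. The function name `required_sample_size` is illustrative.

```python
import math

def required_sample_size(sd, margin, z):
    """Smallest n with z * sd / sqrt(n) <= margin."""
    return math.ceil((z * sd / margin) ** 2)

# z = 1.645 is the rounded table value for a 90% two-sided interval.
n_needed = required_sample_size(sd=0.80, margin=0.03, z=1.645)
print(n_needed)  # 1925
```

Because n enters through a square root, halving the margin of error requires roughly four times as many draws.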

  • Confidence Intervals: Small samples

  • Example: A biological research team measures the weights of 14 chipmunks, randomly chosen. Find a 90% confidence interval for the average weight of chipmunks.

    Weights (ounces):

    7.6    8.66   9.41   8.45   8.08   8.86   7.48
    8.2    9.24   9.34   9.58   10.1   8.55   9.15

    Sample mean = 8.7642857143
    Sample SD   = 0.7610389082

  • Note: The previous calculations used the fact that

    $\frac{\bar{X} - \mu}{s / \sqrt{n}}$

    approximately follows the normal curve for large values of n. In this problem, we cannot use this approximation.

  • However, for both small and large samples, we can use the fact that

    $\frac{\bar{X} - \mu}{s / \sqrt{n}}$

    approximately follows the Student's t-distribution with n - 1 degrees of freedom.

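  • $1 - 2\alpha$: In general, we say that the range

    $\bar{X} \pm t_{n-1,\,1-\alpha} \cdot \frac{s}{\sqrt{n}}$

    is a $1 - 2\alpha$ confidence interval for the population average $\mu$. Here $t_{n-1,\,\alpha} = -t_{n-1,\,1-\alpha}$ and $t_{n-1,\,1-\alpha}$ are the points on the t curve with n - 1 degrees of freedom that enclose a central area of $1 - 2\alpha$.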

  • 90%: Therefore, the 90% confidence interval is

    $8.764 \pm (1.771) \cdot \frac{0.761}{\sqrt{14}}$

    or 8.40 to 9.12 ounces.

    Excel: TINV(0.1, 13)
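The chipmunk interval can be reproduced from the 14 weights with a few lines of Python. The standard library has no t-distribution lookup, so this sketch hard-codes the critical value 1.771, which is what a t table (or Excel's TINV(0.1, 13)) gives to three decimals; that constant and the variable names are assumptions of the example.

```python
import math
import statistics

# Chipmunk weights in ounces, as listed in the example.
weights = [7.6, 8.66, 9.41, 8.45, 8.08, 8.86, 7.48,
           8.2, 9.24, 9.34, 9.58, 10.1, 8.55, 9.15]

T_CRIT = 1.771  # t table value for 13 degrees of freedom, 90% two-sided (assumed)

n = len(weights)
mean = statistics.fmean(weights)
sd = statistics.stdev(weights)        # sample SD with n - 1 divisor
margin = T_CRIT * sd / math.sqrt(n)
low, high = mean - margin, mean + margin

print(f"mean = {mean:.4f}, SD = {sd:.4f}")
print(f"90% CI: {low:.2f} to {high:.2f} ounces")
```

Using the normal multiplier 1.645 in place of 1.771 would give a narrower interval that overstates the precision of a sample this small.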

  • Note: Be sure you look up the correct number on the table in the back of the book. The numbers at the bottom of Table 4 specify the two-sided confidence levels.

  • Excel 2003: Place all data values in a single row or column. Choose Tools -> Data Analysis -> Descriptive Statistics. Select the data range, and check Columns or Rows, Confidence Level for Mean, and Summary Statistics.

    Row1
    Mean                        8.4378571429
    Standard Error              0.1177778621
    Median                      8.31
    Mode
    Standard Deviation          0.4406844078
    Sample Variance             0.1942027473
    Kurtosis                    0.291035133
    Skewness                    0.9319878034
    Range                       1.56
    Minimum                     7.84
    Maximum                     9.4
    Sum                         118.13
    Count                       14
    Confidence Level(95.0%)     0.2544435527

    Data: 7.98, 7.84, 8.33, 8.17, 8.15, 8.29, 9.09, 8.16, 8.34, 8.54, 8.74, 8.23, 9.4, 8.87

  • Excel 2007

    First, to get started: from the Office Button (the circle), select Excel Options. Click Add-Ins. Next to Manage Excel Add-Ins, click Go. Check Analysis ToolPak, and click OK.

  • Excel 2007: Place all data values in a single row or column. Click Descriptive Statistics. Select the data range, and check Columns or Rows, Confidence Level for Mean, and Summary Statistics.

  • Notes: The value Excel labels "Confidence Level" is really the margin of error. The last two rows (Mean - Margin of Error and Mean + Margin of Error) have to be entered by hand.

    Row1
    Mean                        8.4378571429
    Standard Error              0.1177778621
    Median                      8.31
    Mode
    Standard Deviation          0.4406844078
    Sample Variance             0.1942027473
    Kurtosis                    0.291035133
    Skewness                    0.9319878034
    Range                       1.56
    Minimum                     7.84
    Maximum                     9.4
    Sum                         118.13
    Count                       14
    Confidence Level(90.0%)     0.20857655
    Mean - Margin of Error      8.2292805929
    Mean + Margin of Error      8.6464336928

    Data: 7.98, 7.84, 8.33, 8.17, 8.15, 8.29, 9.09, 8.16, 8.34, 8.54, 8.74, 8.23, 9.4, 8.87
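For readers without Excel, the key rows of this output can be approximated in Python. The sketch below is not Excel's own algorithm; it recomputes the mean, standard error and margin of error from the 14 data values, using the assumed t table value 1.771 for 13 degrees of freedom at 90% confidence, and then builds the two hand-entered rows.

```python
import math
import statistics

values = [7.98, 7.84, 8.33, 8.17, 8.15, 8.29, 9.09,
          8.16, 8.34, 8.54, 8.74, 8.23, 9.4, 8.87]

T_CRIT_90 = 1.771  # assumed t table value; roughly what TINV(0.1, 13) returns

n = len(values)
mean = statistics.fmean(values)
sd = statistics.stdev(values)    # matches Excel's "Standard Deviation" row
se = sd / math.sqrt(n)           # matches the "Standard Error" row
margin = T_CRIT_90 * se          # what Excel labels "Confidence Level(90.0%)"

rows = {
    "Mean": mean,
    "Standard Error": se,
    "Standard Deviation": sd,
    "Confidence Level(90.0%)": margin,
    "Mean - Margin of Error": mean - margin,
    "Mean + Margin of Error": mean + margin,
}
for label, value in rows.items():
    print(f"{label:<26}{value:.4f}")
```

The margin computed this way agrees with Excel's value to about four decimal places; the small difference comes from rounding the t value to 1.771.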

  • Example: Duracell tests 12 batteries in flashlights. They determine that the average life of the batteries in this sample is 3.58 hours, with a sample SD of 1.58 hours. Find a 95% confidence interval for the average life of a Duracell battery in a flashlight.

    Repeat if 100 batteries were tested (with the same sample mean and SD as above)
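A sketch of both versions of the battery problem is given below, to show how the sample size affects the interval. The critical values 2.201 (11 degrees of freedom) and 1.984 (99 degrees of freedom) are standard t table entries for 95% two-sided intervals, hard-coded here as assumptions because Python's standard library has no t-distribution lookup. The helper name `t_interval` is illustrative.

```python
import math

def t_interval(sample_mean, sample_sd, n, t_crit):
    """Return (low, high) for a t-based confidence interval."""
    margin = t_crit * sample_sd / math.sqrt(n)
    return sample_mean - margin, sample_mean + margin

# 12 batteries: 11 degrees of freedom, table value 2.201 (assumed).
small = t_interval(3.58, 1.58, 12, t_crit=2.201)
# 100 batteries: 99 degrees of freedom, table value 1.984 (assumed).
large = t_interval(3.58, 1.58, 100, t_crit=1.984)

print(f"n = 12:  {small[0]:.2f} to {small[1]:.2f} hours")
print(f"n = 100: {large[0]:.2f} to {large[1]:.2f} hours")
```

The larger sample narrows the interval in two ways: the SE shrinks with the square root of n, and the t multiplier moves toward the normal value of 1.96.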

  • Note: In previous lectures, we considered another technique of inferring information about the box from the draws, namely hypothesis testing. Confidence intervals provide a method of estimating the average of the box. Hypothesis testing checks if the difference between the
