6-sampling distributions review



Student Notes - Prep Session Topic: Sampling Distributions


Gloria Barrett, Virginia Advanced Study Strategies edited by Daren Starnes

Content

The AP Statistics topic outline contains the following list of items related to sampling distributions. (Items (4), (5), and (8) will not be covered in this session.)

1. Sampling distribution of a sample proportion
2. Sampling distribution of a sample mean
3. Central Limit Theorem
4. Sampling distribution of a difference between two independent sample proportions
5. Sampling distribution of a difference between two independent sample means
6. Simulation of sampling distributions
7. t-distribution
8. Chi-square distribution

Sampling distributions are an extension of probability, so many free response questions that include questions on sampling distributions will also include parts that relate to material discussed and reviewed in the earlier prep session on probability.

Be sure you understand:

1. The difference between a parameter and a statistic
2. What we mean by the sampling distribution of a statistic (that is, the distribution of the values of that statistic obtained from all possible samples of a given size from a given population)
3. What we mean by an unbiased statistic
4. The formulas for $\sigma_{\hat{p}}$ and $\sigma_{\bar{x}}$ should be used only when the population is at least 10 times as large as the sample
5. The sampling distribution of $\hat{p}$ is approximately normal when the sample size is large (your textbook will have a definition of large, for example $np \ge 10$ and $n(1 - p) \ge 10$)
6. The sampling distribution of $\bar{x}$ is normally distributed, regardless of sample size, if the underlying population is normally distributed
7. The sampling distribution of $\bar{x}$ is approximately normally distributed, regardless of the shape of the underlying population, when the sample size is large (according to the Central Limit Theorem). In this case $n \ge 30$ is usually sufficiently large.
8. The CLT is a statement about shape. It says that the sampling distribution of sample means becomes more nearly normal as the sample size increases.
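Items 7 and 8 can be seen in a small simulation. The sketch below is only an illustration, not part of the original notes: it assumes a strongly right-skewed population (exponential with mean 1 and standard deviation 1), and the function names, the seed, and the number of repetitions are all chosen here for convenience. It draws many random samples of each size and reports the center, spread, and a rough skewness measure of the resulting sample means.

```python
import random
import statistics


def simulate_sample_means(n, reps=5000, seed=1):
    """Draw `reps` random samples of size n and return the list of sample means."""
    rng = random.Random(seed)
    # Assumed population: exponential with mean 1 and standard deviation 1
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(reps)
    ]


def skewness(values):
    """Rough skewness measure: average cubed z-score (near 0 for a symmetric shape)."""
    m = statistics.fmean(values)
    s = statistics.pstdev(values)
    return statistics.fmean(((v - m) / s) ** 3 for v in values)


results = {}
for n in (2, 10, 40):
    means = simulate_sample_means(n)
    results[n] = (statistics.fmean(means), statistics.stdev(means), skewness(means))
    center, spread, skew = results[n]
    print(f"n={n:>2}  mean of x-bar={center:.3f}  sd of x-bar={spread:.3f}  skewness={skew:.2f}")
```

For each sample size the simulated means should center near the population mean of 1, their standard deviation should be close to $1/\sqrt{n}$, and the skewness should shrink toward 0 as n grows, which is the change in shape that the CLT describes.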

Formulas

You will want to be familiar with the probability formulas that are provided on the exam. A partial list of formulas related to probability on the exam formula sheet is provided here. Note that several relate to the sampling distribution of sample means and sample proportions:

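If X has a binomial distribution with parameters n and p, then:

$P(X = k) = \binom{n}{k} p^k (1 - p)^{n - k}$

$\mu_X = np$

$\sigma_X = \sqrt{np(1 - p)}$

$\mu_{\hat{p}} = p$

$\sigma_{\hat{p}} = \sqrt{\dfrac{p(1 - p)}{n}}$

If $\bar{x}$ is the mean of a random sample of size n from an infinite population with mean $\mu$ and standard deviation $\sigma$, then:

$\mu_{\bar{x}} = \mu$

$\sigma_{\bar{x}} = \dfrac{\sigma}{\sqrt{n}}$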
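One way to build confidence in the sample proportion formulas is to compare them with a simulation. The sketch below is not from the exam formula sheet; the values n = 50 and p = 0.3 are hypothetical and were picked only so that np and n(1 - p) are both at least 10. It simulates many samples, computes the sample proportion for each, and compares the simulated mean and standard deviation with the formula values.

```python
import math
import random
import statistics

n, p = 50, 0.3  # hypothetical sample size and population proportion

# Values given by the formulas for the sampling distribution of p-hat
mu_phat = p
sigma_phat = math.sqrt(p * (1 - p) / n)

# Simulate 10,000 samples of size n; each p-hat is (number of successes) / n
rng = random.Random(2)
phats = [sum(rng.random() < p for _ in range(n)) / n for _ in range(10000)]

sim_mean = statistics.fmean(phats)
sim_sd = statistics.stdev(phats)

print(f"formula:    mean = {mu_phat:.4f}, sd = {sigma_phat:.4f}")
print(f"simulation: mean = {sim_mean:.4f}, sd = {sim_sd:.4f}")
```

The simulated values will not match the formulas exactly, but with this many repetitions they should agree to about two decimal places.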

MC B # 30, 38

AP Exam Free Response Questions for Practice and Discussion

2004, Form B #3

Trains carry bauxite from a mine in Canada to an aluminum processing plant in northern New York State in hopper cars. Filling equipment is used to load ore into the hopper car. When functioning properly, the actual weights of ore loaded into each car by the filling equipment at the mine are approximately normally distributed with a mean of 70 tons and a standard deviation of 0.9 ton. If the mean is greater than 70 tons, the loading mechanism is overfilling.

(a) If the filling equipment is functioning properly, what is the probability that the weight of the ore in a randomly selected car will be 70.7 tons or more? Show your work.

(b) Suppose that the weight of ore in a randomly selected car is 70.7 tons. Would that fact make you suspect that the loading mechanism is overfilling the cars? Justify your answer.

(c) If the filling equipment is functioning properly, what is the probability that a random sample of 10 cars will have a mean weight of 70.7 tons or more? Show your work.

(d) Based on your answer in part (c), if a random sample of 10 cars had a mean ore weight of 70.7 tons, would you suspect that the loading mechanism was overfilling the cars? Justify your answer.

2008, Form B, #2

Four different statistics have been proposed as estimators of a population parameter. To investigate the behavior of these estimators, 500 random samples are selected from a known population and each statistic is calculated for each sample. The true value of the population parameter is 75. The graphs below show the distribution of the values for each statistic.

(a) Which of the statistics appear to be unbiased estimators of the population parameter? How can you tell?

(b) Which of the statistics A or B would be a better estimator of the population parameter? Explain your choice.

(c) Which of the statistics C or D would be a better estimator of the population parameter? Explain your choice.

2007, Form B #2

The graph below shows the relative frequency distribution for X , the total number of dogs and cats owned per household, for the households in a large suburban area. For instance, 14 percent of the households own 2 of those pets.

(a) According to local law, each household in this area is prohibited from owning more than 3 of these pets. If a household in this area is selected at random, what is the probability that the selected household will be in violation of this law? Show your work.

(b) If 10 households in this area are selected at random, what is the probability that exactly 2 of them will be in violation of this law? Show your work.

(c) The mean and standard deviation of X are 1.65 and 1.851, respectively. Suppose that 150 households in this area are to be selected at random and $\bar{x}$, the mean number of dogs and cats per household, is to be computed. Describe the sampling distribution of $\bar{x}$, including its shape, center, and spread.

Solution, 2004 Form B Question 3

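Let X = weight of ore in a randomly selected car.

(a) $P(X \ge 70.7) = P\left(Z \ge \dfrac{70.7 - 70}{0.9}\right) = P(Z \ge 0.78) = 1 - 0.7823 = 0.2177$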

(b) No. Approximately 22% of the cars will have ore weights of 70.7 or greater when the filling equipment is working properly, so a car that was filled with 70.7 tons of ore would not be an unusual occurrence.

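(c) $P(\bar{x} \ge 70.7) = P\left(Z \ge \dfrac{70.7 - 70}{0.9/\sqrt{10}}\right) = P(Z \ge 2.46) = 1 - 0.9931 = 0.0069$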

(d) Yes, we would suspect that the filling mechanism is overfilling. If it is working properly, the probability that the mean weight of the ore in 10 randomly selected cars is 70.7 or greater is 0.0069 which is very small.

Note 1: To receive complete credit for part (a) or part (c), students must show how the probability is computed. Since part (a) and part (c) involve different normal distributions, it is important to identify which normal distribution is used in each part. As shown above, this could be done by displaying a probability statement containing the mean and standard deviation for the appropriate normal distribution. It could be done in other ways, such as listing the mean and standard deviation and displaying an appropriate graph.

Note 2: The response in part (b) could be justified by indicating that 70.7 tons is less than one standard deviation away from the desired mean of 70 tons. The response in part (d) could be justified by indicating that 70.7 tons is more than two standard deviations above the desired mean of 70 tons.
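The two probabilities in parts (a) and (c) of the 2004 Form B question can be checked without a table. The sketch below is an illustration rather than part of the scoring guidelines; it uses `NormalDist` from Python's standard `statistics` module, and the variable names are chosen here. The point to notice is that the two parts use different normal distributions: one for a single car and one for the mean of 10 cars.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 70, 0.9, 10  # values from the problem statement

# Part (a): weight of a single car, N(70, 0.9)
single_car = NormalDist(mu, sigma)
p_a = 1 - single_car.cdf(70.7)

# Part (c): mean weight of 10 cars, N(70, 0.9 / sqrt(10))
sample_mean = NormalDist(mu, sigma / sqrt(n))
p_c = 1 - sample_mean.cdf(70.7)

# z-scores, as used in Note 2
z_a = (70.7 - mu) / sigma
z_c = (70.7 - mu) / (sigma / sqrt(n))

print(f"(a) z = {z_a:.2f}, P(X >= 70.7) = {p_a:.4f}")
print(f"(c) z = {z_c:.2f}, P(x-bar >= 70.7) = {p_c:.4f}")
```

Software does not round the z-score to two decimal places, so the part (a) result can differ slightly in the third decimal place from a table-based answer. Both approaches give about 0.22 for part (a) and about 0.007 for part (c).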

Solution, 2008 Form B Question 2

(a) Statistics A, C, and D appear to be unbiased. This is indicated by the fact that the mean of the estimated sampling distribution for each of these statistics is about 75, the value of the population parameter.

Note: No other characteristic should be mentioned in the response. Students must clearly demonstrate an understanding of the term unbiased.

(b) Statistic A would be a better choice because it appears to be unbiased (or centered at 75). Although the variability of the two estimated sampling distributions is similar, statistic A would produce estimates that tend to be closer to the true population parameter value of 75 than would statistic B.

(c) Statistic C would be a better choice because it has smaller variability. Although both statistic C and statistic D appear to be unbiased, statistic C would produce estimates that tend to be closer to the true population parameter value of 75 than would statistic D.

Solution, 2007 Form B Question 2

(a) $P(X > 3) = 0.17$, the sum of the relative frequencies for households that own 4 or more of these pets.

(b) Let Y = number of households in violation. Y has a binomial distribution with n = 10 and p = 0.17.

$P(Y = 2) = \binom{10}{2}(0.17)^2(0.83)^8 \approx 0.2929$

(c) The distribution of $\bar{x}$ will:

1. be approximately normal (note that the word "approximately" is required for an essentially correct response) OR be more symmetric than the population distribution, which is highly skewed
2. have mean $\mu_{\bar{x}} = 1.65$
3. have standard deviation $\sigma_{\bar{x}} = \dfrac{1.851}{\sqrt{150}} \approx 0.151$
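The arithmetic for parts (b) and (c) of the 2007 Form B question can be checked with a few lines of Python. This is a sketch for checking work, not a substitute for showing the formulas on the exam; it uses only the standard `math` module, and the variable names are chosen here.

```python
from math import comb, sqrt

# Part (b): Y is binomial with n = 10 and p = 0.17; find P(Y = 2)
p_violation = 0.17
p_exactly_2 = comb(10, 2) * p_violation**2 * (1 - p_violation) ** 8

# Part (c): center and spread of the sampling distribution of x-bar
mu_x, sigma_x, n = 1.65, 1.851, 150
mu_xbar = mu_x                 # the mean of x-bar equals the population mean
sigma_xbar = sigma_x / sqrt(n)  # population sd divided by sqrt(n)

print(f"(b) P(Y = 2) = {p_exactly_2:.4f}")
print(f"(c) mean of x-bar = {mu_xbar}, sd of x-bar = {sigma_xbar:.3f}")
```

The shape in part (c) still has to be argued in words: with n = 150 the sample is large, so the CLT supports an approximately normal shape even though the population is highly skewed.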
