AP Statistics Chapters 0 & 1 Review

Upload: matthew-york

Post on 11-Jan-2016


TRANSCRIPT

Page 1: AP Statistics Chapters 0 & 1 Review. Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of

AP Statistics Chapters 0 & 1 Review

Variables fall into two main categories: A categorical, or qualitative, variable places an individual into one of several groups or categories. A quantitative variable takes numeric values for which arithmetic operations make sense.

The distribution of a variable tells us what values the variable takes on and how often it takes on those values.
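
As a small illustration of this definition, the sketch below tallies a categorical variable. The list of colors is made-up data, not from the slides; `Counter` is from Python's standard library.

```python
from collections import Counter

# Hypothetical data: favorite color reported by ten students (not from the slides).
colors = ["red", "blue", "blue", "green", "red",
          "blue", "green", "blue", "red", "blue"]

# The distribution: each value the variable takes, and how often it takes it.
distribution = Counter(colors)

for value, count in distribution.most_common():
    print(f"{value}: {count}")
```

The printed table is exactly what a bar graph of this variable would display.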

Statistical inference involves drawing conclusions about a large group, called the population, by gathering information from a smaller subgroup, called the sample.

The main statistical designs for producing data are surveys, experiments, and observational studies.

In an observational study, we observe individuals and measure variables of interest but do not attempt to influence the responses.

In an experiment, we deliberately do something to individuals in order to observe their response.

What two types of graphs are typically used for categorical variables?

Bar graphs and pie charts.

What types of graphs are typically used for quantitative variables?

Dot plots, stemplots, and histograms.

Please know:

Cumulative frequency histogram

Relative frequency histogram
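
A minimal sketch of the numbers behind these two graphs, assuming some made-up class counts (the bins and counts below are hypothetical, not from the slides). A relative frequency histogram plots each count divided by the total; a cumulative frequency histogram plots the running total.

```python
from itertools import accumulate

# Hypothetical class counts for a histogram (made up for illustration).
bins = ["0-9", "10-19", "20-29", "30-39"]
counts = [2, 5, 8, 5]

total = sum(counts)

# Relative frequency: the share of all observations that falls in each class.
relative = [c / total for c in counts]

# Cumulative frequency: how many observations fall in this class or an earlier one.
cumulative = list(accumulate(counts))
cumulative_relative = [c / total for c in cumulative]

for label, rel, cum in zip(bins, relative, cumulative):
    print(f"{label}: relative = {rel:.2f}, cumulative = {cum}")
```

The relative frequencies always sum to 1, and the last cumulative frequency always equals the number of observations.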

When you describe a distribution, pay special attention to the following.

Shape: the overall pattern, symmetric or skewed. The length of the “tails” tells us whether a distribution is left-skewed (the left tail is longer) or right-skewed (the right tail is longer).

Modes: the values that occur most often (i.e., the peaks).

Unimodal: one major peak. Bimodal: two major peaks.

Center: the middle of the distribution. The two most common measures of center are the mean and the median.

Spread: how varied (i.e., spread out) the data are. The IQR and standard deviation are probably the two most common measures of spread.

Outliers: any value(s) that fall outside the overall pattern.

When you have to describe a distribution, don’t get mad, CUSS:

Center
Unusual
Spread
Shape

Measuring Center: The Mean & Median

To calculate the mean, add the values of the observations and divide by the number of observations. The mean of a sample is denoted $\bar{x}$, pronounced x-bar. The mean of a population is denoted $\mu$, the Greek letter mu.
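
In symbols, the calculation described above can be written as follows. The subscripted $x_i$ for the individual observations and $n$ for their count are notation assumed here:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n} = \frac{1}{n}\sum_{i=1}^{n} x_i$$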

Measuring Center: The Median

The median (denoted by M) is the midpoint of a distribution. To calculate the median:

1. Order the observations from smallest to largest.

2. If the number of observations is odd, the median is simply the middle value in the list. You can find the location by counting (n+1)/2 observations from the bottom (or top).

3. If the number of observations is even, you should average the two middle numbers. The location of the median is again (n+1)/2 from the bottom or top of the list.
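
The three steps above can be sketched in Python as follows. The function name `median_by_steps` and the sample lists are illustrative choices, not part of the slides.

```python
def median_by_steps(values):
    """Median found by the ordering-and-counting steps described above."""
    ordered = sorted(values)  # step 1: order from smallest to largest
    n = len(ordered)
    if n == 0:
        raise ValueError("the median of an empty data set is undefined")
    mid = n // 2
    if n % 2 == 1:
        # step 2 (odd n): position (n + 1) / 2 counting from 1
        # is index n // 2 counting from 0
        return ordered[mid]
    # step 3 (even n): (n + 1) / 2 falls halfway between the two
    # middle positions, so average those two values
    return (ordered[mid - 1] + ordered[mid]) / 2


print(median_by_steps([7, 1, 4]))      # 4
print(median_by_steps([7, 1, 4, 10]))  # 5.5
```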

EXAMPLE: Consider the following set of numbers: 13, 25, 28, 36, 47

M = 28, $\bar{x}$ = 29.8

Now, consider adding a 6th number, say 104.

M = 32, $\bar{x}$ ≈ 42.17

The single large value moves the mean by more than 12 but the median by only 4. We say that the median is an outlier-resistant measure of center, while the mean is not.
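
The numbers in this example can be checked with a short script. This is only a verification sketch using Python's standard `statistics` module; the variable names are illustrative.

```python
from statistics import mean, median

data = [13, 25, 28, 36, 47]
with_outlier = data + [104]  # add the 6th number from the example

# Median and mean before the extra value is added
print(median(data), mean(data))

# Median and mean after the extra value is added
print(median(with_outlier), round(mean(with_outlier), 2))
```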

Mean versus Median

The mean and median of a roughly symmetric distribution will be close together. If the distribution is exactly symmetric, the mean and median are equal. In a skewed distribution, the mean is farther out in the long tail than the median, so the median is usually the better measure of center.

In descriptions of data, the “average” value of a variable is usually referred to as the mean whereas the “typical” value is usually referred to as the median.

Measuring Spread: The Quartiles

One way to measure spread, or variability, is to calculate the range, which is the difference between the largest and smallest observations.

Another way to describe the spread of a distribution is by considering different percentiles. The pth percentile of a distribution is the value that has p percent of the observations at or below it. The median is the 50th percentile. The 25th percentile is called the 1st quartile, while the 75th percentile is called the 3rd quartile.

The Five-Number Summary and Boxplots

The five-number summary of a set of observations consists of the smallest value, the 1st quartile, the median, the 3rd quartile, and the largest value.

The five-number summary can be presented visually by a boxplot.
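
A minimal sketch of computing a five-number summary. The slides do not say how to locate the quartiles, so this assumes one common classroom convention: Q1 and Q3 are the medians of the lower and upper halves of the ordered data, leaving out the overall median when the number of observations is odd. Other conventions (including those used by some software) can give slightly different quartiles.

```python
from statistics import median


def five_number_summary(values):
    """Return (min, Q1, median, Q3, max).

    Assumption: quartiles are the medians of the lower and upper halves,
    excluding the overall median when the number of observations is odd.
    """
    ordered = sorted(values)
    n = len(ordered)
    if n < 2:
        raise ValueError("need at least two observations")
    half = n // 2
    lower = ordered[:half]
    # Skip the middle observation when n is odd
    upper = ordered[half + 1:] if n % 2 == 1 else ordered[half:]
    return (ordered[0], median(lower), median(ordered), median(upper), ordered[-1])


print(five_number_summary([13, 25, 28, 36, 47, 104]))
# (13, 25, 32.0, 47, 104)
```

A boxplot draws the box from Q1 to Q3, a line at the median, and whiskers out to the smallest and largest values.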

The 1.5 × IQR Rule for Outliers

The distance between the 1st and 3rd quartiles is called the interquartile range, which is abbreviated IQR for obvious reasons.

The quartiles and IQR are resistant to changes in either tail of a distribution.

Since the median and the IQR are resistant to outliers, they should be used when describing a skewed distribution.

We will call a data value a “suspected” outlier if it falls more than 1.5 × IQR above Q3 or below Q1.

In a modified boxplot, the whiskers extend only to values not “flagged” as outliers, and asterisks are used to denote any outliers.
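
The rule can be sketched as a small function. The name `suspected_outliers` is illustrative, and the quartiles are passed in rather than computed, because the slides do not fix a method for locating them.

```python
def suspected_outliers(values, q1, q3):
    """Return the values flagged by the 1.5 x IQR rule, given Q1 and Q3."""
    iqr = q3 - q1
    low_fence = q1 - 1.5 * iqr   # anything below this is flagged
    high_fence = q3 + 1.5 * iqr  # anything above this is flagged
    return [v for v in values if v < low_fence or v > high_fence]


data = [13, 25, 28, 36, 47, 104]
# Q1 = 25 and Q3 = 47 assume the median-of-halves quartile convention.
print(suspected_outliers(data, q1=25, q3=47))  # [104]
```

Here IQR = 22, so the fences sit at 25 - 33 = -8 and 47 + 33 = 80, and only 104 lies outside them. In a modified boxplot of this data the upper whisker would stop at 47.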

Measuring Spread: The Standard Deviation

The standard deviation measures spread by determining how far each value is from the mean and then “averaging” these distances.

The standard deviation of a sample is denoted by s.

The standard deviation of a population is denoted $\sigma$, the Greek letter sigma.

The following formula is used to compute the standard deviation of a sample.

$$s = \sqrt{\frac{1}{n-1}\sum (x_i - \bar{x})^2}$$

The variance of a set of observations, $s^2$ or $\sigma^2$, is simply the square of the standard deviation.
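
A minimal sketch that applies the sample standard deviation formula term by term and compares the result with `statistics.stdev` from Python's standard library, which also divides by n - 1. The function name `sample_sd` and the data (reused from the median example) are illustrative choices.

```python
from math import sqrt
from statistics import stdev


def sample_sd(values):
    """Sample standard deviation: square root of the sum of squared
    deviations from the mean, divided by n - 1."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two observations")
    x_bar = sum(values) / n
    squared_deviations = sum((x - x_bar) ** 2 for x in values)
    return sqrt(squared_deviations / (n - 1))


data = [13, 25, 28, 36, 47]
s = sample_sd(data)

print(s)            # standard deviation from the formula
print(s ** 2)       # variance is the square of the standard deviation
print(stdev(data))  # library value, for comparison
```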

Properties of the Standard Deviation

1. s measures spread about the mean and should be used only when the mean is used as the measure of center.

2. s = 0 only when there is no spread/variability (i.e., all the values are the same). Otherwise, s > 0. As the observations become more spread out about their mean, s gets greater.

3. s, like the mean $\bar{x}$, is not resistant to outliers. A few outliers can make s very large. Distributions with outliers and strongly skewed distributions have very large standard deviations. As such, the number s does not give much helpful information about such distributions.

Choosing Measures of Center and Spread

The five-number summary, in particular the median and the IQR, is usually better than the mean and standard deviation for describing a skewed distribution or a distribution with strong outliers.

Use $\bar{x}$ and s only for reasonably symmetric distributions that are free of outliers.

Adding the same number, a, to each observation adds a to the measure of center but does not affect the measure of spread.

Multiplying each observation by the same number, b, multiplies the measure of center by b and the measure of spread by |b| (the same as b whenever b is positive).
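
Both rules can be checked numerically. In the sketch below the data, the shift `a`, and the scale factor `b` are hypothetical values chosen for illustration. A negative `b` is used on purpose: a measure of spread cannot be negative, so the standard deviation is multiplied by the absolute value of b.

```python
from statistics import mean, stdev

data = [13, 25, 28, 36, 47]
a, b = 10, -2  # hypothetical shift and scale factor

shifted = [x + a for x in data]
scaled = [b * x for x in data]

# Adding a: the center moves by a, the spread is unchanged
print(mean(data), mean(shifted))
print(stdev(data), stdev(shifted))

# Multiplying by b: the center is multiplied by b, the spread by |b|
print(mean(data), mean(scaled))
print(stdev(data), stdev(scaled))
```

The same pattern holds for the median and the IQR as measures of center and spread.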