Post on 15-Jan-2016

Copyright © 2014 Pearson Education, Inc. All rights reserved

Chapter 2

Picturing Variation with Graphs

Learning Objectives

Understand that a distribution of a sample of data displays a variable’s values and the frequencies (or relative frequencies) of those values.

Know how to make graphs of distributions of numerical and categorical variables and how to interpret the graphs in context.

Be able to compare centers and spreads of distributions of samples informally.

2.1

Visualizing Variation in Numerical Data

Visualizing Statistics

Organize the data using the chart that most effectively visually summarizes the data.

The distribution of the data describes the values, frequencies (counts), and “shape” of the data.

Is there a data value or data values that are far from the rest of the data?

Is there symmetry?

Is there a most common value or most common range of values?

Dot Plots

A Dot Plot is a chart that contains a dot for each data value.

Benefits: Shows the individual data values. Easy to spot outliers. Describes the distribution visually.

Drawbacks: Not as common as bar and pie charts. Not great for data that has too many individual values.

Dot Plot Example

Clearly shows the outlier just below $300.

The rest of the data is generally uniformly spread out.

Frequency Histograms

A histogram is a type of bar graph.

The horizontal axis is numerical.

The vertical axis represents the frequency of the data.

Groups the data into bins, also called intervals or classes.

Easy to visualize the distribution.

Histogram Example

Different bin widths depict the same data differently.

The smaller width shows more detail.

Too small a width shows too much detail and will not clearly display the main features.
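
The effect of bin width can be sketched in a few lines of Python. This is only an illustration: the `prices` list is hypothetical data (not the data behind the slide's histograms), and `histogram_counts` is a helper name made up for this sketch. Bins that contain no values are simply not listed.

```python
from collections import Counter

def histogram_counts(values, bin_width, start=0):
    """Count how many values fall in each bin [left, left + bin_width)."""
    counts = Counter()
    for v in values:
        # Left edge of the bin that contains v.
        left = start + ((v - start) // bin_width) * bin_width
        counts[left] += 1
    return dict(sorted(counts.items()))

# Hypothetical data, not taken from the slides.
prices = [12, 15, 21, 22, 24, 25, 31, 33, 34, 38, 41, 47, 58]

for width in (20, 10, 5):
    print(f"bin width {width}")
    for left, freq in histogram_counts(prices, width).items():
        # One '#' per observation in the bin.
        print(f"  {left:>2} to {left + width:<2} | {'#' * freq}")
```

With the widest bins most values pile into one bar, while the narrowest bins break the same data into many short bars.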

Relative Frequency Histograms

A Relative Frequency Histogram is a histogram where the vertical axis represents the relative frequencies, or percents, rather than the frequencies.

Compute the relative frequency by dividing the frequency by the sample size.

The relative frequency histogram always has the same shape as the frequency histogram. The scale of the vertical axis is just changed.
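
As a small sketch of the computation described above, the following Python divides each frequency by the sample size. The bin labels and counts are hypothetical, and `relative_frequencies` is a name chosen for this example.

```python
def relative_frequencies(frequencies):
    """Divide each bin's frequency by the sample size n."""
    n = sum(frequencies.values())
    return {bin_label: f / n for bin_label, f in frequencies.items()}

# Hypothetical bin frequencies (sample size 20), not from the slides.
freq = {"0-9": 2, "10-19": 8, "20-29": 6, "30-39": 4}
rel = relative_frequencies(freq)

for bin_label in freq:
    print(f"{bin_label:>5}  frequency {freq[bin_label]}  relative {rel[bin_label]:.2f}")
```

Every bar is divided by the same number, so the bars keep their heights relative to one another and only the vertical scale changes.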

Relative Frequency Example

Clearly shows that half of all women score on average between 0.7 and 0.8 goals per game.

Shows there are a small number of exceptional players.

Women’s Soccer Players, NCAA Division III 2009

Frequency vs. Relative Frequency Histograms

Use a frequency histogram when you want to emphasize how many are in each range.

Use a relative frequency histogram when you want to emphasize what proportion or percent of the total each range contains.

Stem and Leaf Plots

The Leaf is the last digit.

The Stem contains all digits before the last digit.

Shows individual data values.

Same as a histogram, but with a bin width that is a power of 10.

Example: The five 0’s next to the stem 4 show that there were five classes with 40 students.

Class Size
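
A stem-and-leaf plot can be built with a short Python sketch like the one below. The class sizes are hypothetical (chosen so that five classes have 40 students, as in the example), `stem_and_leaf` is a name made up here, and the sketch assumes non-negative whole numbers. Stems with no leaves are skipped.

```python
from collections import defaultdict

def stem_and_leaf(values):
    """Map each stem (all digits but the last) to its sorted leaves (last digits)."""
    plot = defaultdict(list)
    for v in sorted(values):
        stem, leaf = divmod(v, 10)
        plot[stem].append(leaf)
    return dict(plot)

# Hypothetical class sizes; five classes have exactly 40 students.
class_sizes = [18, 22, 25, 25, 31, 36, 40, 40, 40, 40, 40, 42, 47, 53]

for stem, leaves in stem_and_leaf(class_sizes).items():
    print(f"{stem} | {''.join(str(leaf) for leaf in leaves)}")
```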

2.2

Summarizing Important Features of a Numerical Distribution

Three Aspects of a Distribution

Shape: Is there symmetry? How many bumps or modes? Are there other distinguishing features?

Center: What is a typical value?

Spread: Is the data all close together or spread out?

Skewness

A distribution is Skewed Right if most of the data values are small and there is a “tail” of larger values to the right.

A distribution is Skewed Left if most of the data values are large and there is a “tail” of smaller values to the left.

Symmetric Distributions

A distribution is symmetric if the left-hand side is roughly the mirror image of the right-hand side.

How Many Mounds

A Unimodal distribution has one mound.

A Bimodal distribution has two mounds.

A Multimodal distribution has more than two mounds.

Normal Distributions

A Normal distribution has the following properties: Symmetric. Unimodal. Mound or Bell Shaped.

Outliers

An Outlier is a data value that is either much smaller or much larger than the rest of the data.

Some reasons for outliers:

Error in data collection.

No error. For example, the owner’s salary could be an outlier if the rest of the employees are all low-wage workers.

Center

What is a typical value?

The center is not a typical value for bimodal or skewed distributions.

Variability

Variability describes how spread out the data values are.

Summary of Describing a Distribution

What is the shape? Is it Symmetric, Skewed, or Neither? Unimodal, Bimodal, or Multimodal? Normal? Are there outliers?

Where is the center? Is the center a typical value?

Is there low or high variability?

2.3

Visualizing Variation in Categorical Variables

Two Types of Charts

A Bar Chart is like a histogram, but the horizontal axis can represent categorical data. The categories may not have a natural order.

A Pie Chart is a circle cut into slices where the size of each slice is proportional to the frequency of the outcome that it represents.

Example: Categorical Data

The frequency table below shows the ranks of a group of army members who live in the barracks.

Rank       Private  Corporal  Sergeant  Major
Frequency  65       22        49        16

Bar Chart

A graphical summary for categorical data.

Each category is represented by a bar.

The height of each bar is proportional to the frequency for that category.

There can be more than one choice of ordering the categories.

Bar Chart for Army Ranks

Pareto Chart

A Pareto Chart is a bar chart that orders the categories from largest to smallest frequency.
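
As an illustration, the army ranks from the frequency table in this chapter can be put in Pareto order with a short Python sketch. The function name `pareto_order` and the text bars are choices made for this example only.

```python
# Frequencies from the army ranks table in this chapter.
rank_counts = {"Private": 65, "Corporal": 22, "Sergeant": 49, "Major": 16}

def pareto_order(counts):
    """Return (category, frequency) pairs from largest to smallest frequency."""
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)

for rank, freq in pareto_order(rank_counts):
    # One '#' per 5 people keeps the text bars short.
    print(f"{rank:<9} {freq:>3} {'#' * (freq // 5)}")
```

Sorting by frequency puts Sergeant ahead of Corporal, which differs from the order of the table.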

Differences Between Bar Charts and Histograms

A histogram displays numerical data. A bar chart can display categorical data.

The bar widths of a histogram are meaningful and must all be the same size. The bar widths for a bar chart are meaningless.

The bars of a histogram must touch each other. For a bar chart, there are gaps between bars.

For a histogram there is only one choice for the order of the bars, ascending along the horizontal axis, while there are many choices of order for a bar chart.

Pie Charts

Graphical summary for categorical data.

A circle is cut into several slices. The size of each slice is proportional to the frequency of the category that it represents.

Often used to display how much of a share each category has of the whole.

If f is the frequency and n is the sample size, the angle of each slice is

$$\text{Angle} = \frac{f}{n} \times 360^\circ$$
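
To make the slice-angle rule concrete, the sketch below applies it to the army ranks frequency table from this chapter (n = 152). The name `slice_angles` is made up for this example.

```python
# Frequencies from the army ranks table in this chapter.
rank_counts = {"Private": 65, "Corporal": 22, "Sergeant": 49, "Major": 16}

def slice_angles(counts):
    """Angle in degrees of each slice: (f / n) * 360."""
    n = sum(counts.values())
    return {category: f / n * 360 for category, f in counts.items()}

angles = slice_angles(rank_counts)
for rank, angle in angles.items():
    print(f"{rank:<9} {angle:6.1f} degrees")
```

The four angles add up to 360 degrees, which is a quick check that no category was left out.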

Pie Chart of Army Ranks

2.4

Summarizing Categorical Distributions

Description of Numerical Distributions vs. Categorical Distributions

Numerical Distributions: Shape, Center, Spread

Categorical Distributions: Mode, Variability or Diversity

Example of a Bar Chart with Pop as the Mode

Mode

The Mode is the category that occurs with the highest frequency.

The mode is thought of as the typical outcome.

If there is a close tie between two categories for most frequently occurring, the distribution is called bimodal.

If more than two categories have roughly the tallest bars, the distribution is called multimodal.
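
Finding the mode of a categorical variable can be sketched as follows. The slides do not say how close a "close tie" has to be, so the `tolerance` parameter is an assumption added for illustration, and the drink counts are hypothetical.

```python
def modes(counts, tolerance=0):
    """Categories whose frequency is within `tolerance` of the highest frequency."""
    top = max(counts.values())
    return [category for category, f in counts.items() if top - f <= tolerance]

# Hypothetical drink-name counts, not data from the slides.
drinks = {"Pop": 42, "Soda": 40, "Coke": 12, "Other": 6}

print(modes(drinks))               # strict mode only
print(modes(drinks, tolerance=3))  # treats a close tie as two modes
```

With a tolerance of 0 only the single most frequent category is returned. A small tolerance lets two nearly equal categories both count, which matches the informal idea of a bimodal distribution.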

Bimodal Distribution

Multimodal Bar Chart

Variability

If the distribution has a lot of diversity (many observations in many different categories), then variability is high.

If the distribution has only a little diversity (many of the observations fall into the same category), then variability is low.

Caution: Variability is about many different categories, not many frequencies.

High Variability

Low Variability

Side-by-Side Bar Chart

2.5

Interpreting Graphs

Ways to Mislead with Graphs: Don’t Do Any of These!

Have the frequency scale not begin at 0 to create the illusion of greater differences.

Use symbols other than bars that hide or accentuate the real differences.

Use unequal width bars.

Scale Not Starting at 0

The left bar chart misleads by making the differences seem greater than they are.
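
The size of this distortion can be worked out numerically. In the sketch below the two frequencies (50 and 55) and the axis starting point (45) are hypothetical values, not read from the slide's chart, and `drawn_height_ratio` is a name made up for this example.

```python
def drawn_height_ratio(a, b, axis_start=0):
    """Ratio of the drawn bar heights for values a and b when the axis starts at axis_start."""
    return (b - axis_start) / (a - axis_start)

# Hypothetical frequencies, not taken from the slide's chart.
a, b = 50, 55

print(drawn_height_ratio(a, b))                 # axis starts at 0
print(drawn_height_ratio(a, b, axis_start=45))  # axis starts at 45
```

With the axis starting at 0, the second bar is drawn only 10% taller than the first. With the axis starting at 45, it is drawn twice as tall, even though the data have not changed.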

Scale Unclear

This is misleading because we cannot see the frequencies.

[Figure: Homes Sold by Year, 2005 to 2008]

Scale Unclear

The scale is by area and not just by height.

[Figure: Homes Sold by Year, 2005 to 2008]

Other Creative Charting Techniques

The Internet and computers allow for additional effects. Examples: Analysis of State of the Union Speeches. World Population Changes.

Chapter 2

Case Study

Class Sizes: Private vs. Public

Using raw data is ineffective for this comparison

Private Colleges’ Student-to-Teacher Ratio

Typical ratio between 10 and 11. Skewed right. Outlier: a student-to-teacher ratio of 54. Large variation.

Public Colleges’ Student-to-Teacher Ratio

Typical ratio between 16 and 20. Generally symmetric. Outlier of fractional student-to-teacher ratio. Less Variation.

Comparing the Histograms

It is much easier to describe the data when they are displayed using histograms compared to just the raw data table.

Chapter 2

Guided Exercise

Eating Out for Students With Full Time Jobs vs. Part Time Jobs

Full time jobs: 5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7

Part time jobs: 1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1

Create a Dot Plot

Full time jobs: 5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7

Part time jobs: 1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1
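
A rough text version of the two dot plots can be produced from the data above with the Python sketch below. It is a stand-in for a real chart: the dots run sideways instead of stacking upward, and `dot_plot` is a helper name invented for this example.

```python
from collections import Counter

# Number of times eating out per week, copied from the exercise data.
full_time = [5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7]
part_time = [1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1]

def dot_plot(values, low, high):
    """Return one text row per value from low to high, with one dot per observation."""
    counts = Counter(values)
    return [f"{v} | {'.' * counts[v]}" for v in range(low, high + 1)]

for label, data in (("Full time", full_time), ("Part time", part_time)):
    print(label)
    # Use the same 1-to-7 axis for both groups so the plots are comparable.
    print("\n".join(dot_plot(data, 1, 7)))
```

Plotting both groups on the same axis makes it easier to compare their shapes, centers, and spreads by eye.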

Examine Shapes

Full time jobs: Relatively mound shaped.

Part time jobs: Slightly skewed right.

Examine Center

Full time jobs: Typically eat out 5 times per week.

Part time jobs: Typically eat out 2 times per week.

Examine Variation

Full time jobs: Larger variation, from once to 7 times per week.

Part time jobs: Smaller variation, from once to 5 times per week.

Check for Outliers

Full time jobs: No gaps, so no clear outliers.

Part time jobs: No gaps, so no clear outliers.
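
The center, variation, and outlier checks in this exercise can be reproduced with a short Python sketch. Two assumptions are made here that the slides do not state: the most common value stands in for the "typical" value, and a gap is any whole number between the smallest and largest values that never occurs. The name `describe` is invented for this example.

```python
from collections import Counter

# Number of times eating out per week, copied from the exercise data.
full_time = [5, 3, 4, 4, 4, 2, 1, 5, 6, 5, 6, 3, 3, 2, 4, 5, 2, 3, 7, 5, 5, 1, 4, 6, 7]
part_time = [1, 1, 5, 1, 4, 2, 2, 3, 3, 2, 3, 2, 4, 2, 1, 2, 3, 2, 1, 3, 3, 2, 4, 2, 1]

def describe(values):
    """Most common value, smallest and largest values, and any gaps in between."""
    counts = Counter(values)
    typical = counts.most_common(1)[0][0]  # assumption: mode as the typical value
    low, high = min(values), max(values)
    # A gap is a whole number between low and high that never occurs.
    gaps = [v for v in range(low, high + 1) if counts[v] == 0]
    return {"typical": typical, "low": low, "high": high, "gaps": gaps}

print("Full time:", describe(full_time))
print("Part time:", describe(part_time))
```

Both groups come back with an empty list of gaps, which agrees with the conclusion that there are no clear outliers.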

Summarize

The typical part time worker eats out less often than the typical full time worker. Full time workers show wider variation in how often they eat out than part time workers do. The distribution for full time workers is approximately mound shaped, while the distribution for part time workers is slightly skewed right.
