
Chapter 7: Statistics Describing Data

November 14, 2016


Categorical Data

Four ways to display categorical data:

1. Frequency and Relative Frequency Table
2. Bar graph (Pareto chart)
3. Pie chart
4. Pictogram


Frequency Table

Definition (Frequency Table)

A frequency table is a table with two columns. One column lists the categories, and the other lists the frequency of each category (how many items fit into it).

Favorite colors in a class of kindergarteners: Red, Red, Yellow, Green, Blue, Blue, Blue, Blue, Pink, Pink, Pink, Purple

Color    Frequency
Red      2
Yellow   1
Green    1
Blue     4
Pink     3
Purple   1
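The tally above can be reproduced mechanically. The following is a minimal sketch, assuming the twelve answers are stored in a Python list; the variable names are illustrative only.

```python
from collections import Counter

# Favorite colors from the example above, in the order they were recorded.
colors = ["Red", "Red", "Yellow", "Green", "Blue", "Blue", "Blue", "Blue",
          "Pink", "Pink", "Pink", "Purple"]

# Counter tallies how many times each category occurs.
frequency = Counter(colors)

# Print one row per category, in order of first appearance.
print(f"{'Color':<8}Frequency")
for color, count in frequency.items():
    print(f"{color:<8}{count}")
```

The frequencies always add up to the number of items recorded, here 12.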


Relative Frequency Table

Definition (Relative Frequency Table)

A relative frequency table is a frequency table with a column of fractions or percents describing the relative frequency of each category.

You roll a die 25 times. The rolls are:

4,4,2,2,1,6,6,6,5,1,4,2,1,4,6,5,5,5,3,2,1,2,2,4,1.

Roll   Frequency   Relative Frequency
1      5           5/25
2      6           6/25
3      1           1/25
4      5           5/25
5      4           4/25
6      4           4/25
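Each relative frequency is the category's frequency divided by the total number of rolls. A short sketch of that calculation, using the 25 rolls listed above (the names are illustrative, not from the slides):

```python
from collections import Counter
from fractions import Fraction

# The 25 die rolls from the example above.
rolls = [4, 4, 2, 2, 1, 6, 6, 6, 5, 1, 4, 2, 1, 4, 6, 5, 5, 5, 3, 2, 1, 2, 2, 4, 1]

frequency = Counter(rolls)
total = len(rolls)

# Relative frequency = frequency of the category / total number of observations.
relative = {face: Fraction(frequency[face], total) for face in sorted(frequency)}

# Show each relative frequency as an unreduced fraction and as a percent.
for face, share in relative.items():
    print(f"{face}: {frequency[face]}/{total} = {float(share):.0%}")
```

Note that `Fraction` reduces automatically (5/25 is stored as 1/5), and the relative frequencies sum to 1.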


Bar Graphs

Definition (Bar graph)

A bar graph is a graph that displays a bar for each category with the length of each bar indicating the frequency of that category.

Definition (Pareto chart)

A Pareto chart is a bar graph ordered from highest to lowest frequency.
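The only difference between a bar graph and a Pareto chart is the ordering of the bars. As a rough sketch, the kindergarten color data from the Frequency Table slide can be put in Pareto order and drawn as a text bar graph; the text rendering is just an illustration, not how the slides produce their charts.

```python
from collections import Counter

# Favorite colors from the Frequency Table example.
colors = ["Red", "Red", "Yellow", "Green", "Blue", "Blue", "Blue", "Blue",
          "Pink", "Pink", "Pink", "Purple"]

# most_common() returns (category, count) pairs from highest to lowest count;
# categories with equal counts keep the order in which they first appeared.
pareto_order = Counter(colors).most_common()

# One text bar per category, with length equal to the frequency.
for category, count in pareto_order:
    print(f"{category:<8}{'#' * count} {count}")
```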


Pareto Chart Example

[Figure: Pareto chart labeled "Vehicle color involved in total-loss collision". Bars in order: Green, Red, Black, White, Blue, Grey. Vertical axis: Frequency (%), with ticks at 0, 20, and 40.]


Pie Chart

Definition (Pie Chart)

A pie chart is a circle with wedges of varying sizes marked out like slices of pie or pizza. The relative sizes of the wedges correspond to the relative frequencies of the categories.

[Figure: pie chart with wedges labeled Chrome 46.6%, Internet Explorer 24.6%, Firefox 20.4%, Safari 5.1%, Opera 1.3%, Other 2.0%.]
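Since wedge size tracks relative frequency, each wedge's central angle is its share of 360 degrees. The sketch below works this out for the percentages in the chart above. It assumes each percentage belongs to the label that follows it in the extracted text, which is an assumption about the original figure.

```python
# Shares in percent, as read from the pie chart labels (pairing assumed).
shares = {
    "Chrome": 46.6,
    "Internet Explorer": 24.6,
    "Firefox": 20.4,
    "Safari": 5.1,
    "Opera": 1.3,
    "Other": 2.0,
}

# Central angle of a wedge = relative frequency * 360 degrees.
angles = {name: pct / 100 * 360 for name, pct in shares.items()}

for name, angle in angles.items():
    print(f"{name:<18}{angle:6.1f} degrees")
```

The shares sum to 100%, so the angles sum to a full circle.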


Pictogram

Definition (Pictogram)

A pictogram is a statistical graphic in which the size of the picture is intended to represent the frequencies or size of the values being represented.


Quantitative Data

Graphical Summaries of Quantitative Data:

1. Histogram
2. Frequency Polygon


Histogram

Definition (Histogram)

A histogram is a graph that displays a rectangle for each numerical class interval, with the height of each rectangle indicating the frequency of values in the interval. A histogram is similar to a bar graph, but the horizontal axis is a number line. All class intervals must have equal width.

Interval   Frequency
120-134    4
135-149    14
150-164    16
165-179    28
180-194    12
195-209    8
210-224    7
225-239    6
240-254    2
255-269    3

[Figure: histogram of the weight data. Horizontal axis: Weights (pounds), marked every 15 from 120 to 270. Vertical axis: Frequency, from 0 to 30.]
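The raw weights behind the table are not given, but the binning step itself is easy to sketch. The function below counts values into equal-width class intervals; `bin_counts` and the sample weights are hypothetical, chosen only to illustrate the 15-pound intervals starting at 120.

```python
def bin_counts(values, start, width, n_bins):
    """Count how many values fall into each of n_bins equal-width intervals.

    Interval i covers start + i*width up to (but not including)
    start + (i+1)*width. Values outside all intervals are ignored.
    """
    counts = [0] * n_bins
    for v in values:
        index = int((v - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts


# Hypothetical weights in pounds (NOT the data behind the table above).
weights = [121, 134, 135, 149, 150, 163, 178, 179, 182, 268]
counts = bin_counts(weights, start=120, width=15, n_bins=10)

# Text histogram: one row per class interval.
for i, count in enumerate(counts):
    low = 120 + 15 * i
    print(f"{low}-{low + 14}: {'#' * count}")
```

Using one fixed `width` for every interval is what enforces the equal-width requirement in the definition.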


Frequency Polygon

Definition (Frequency polygon)

A frequency polygon is similar to a histogram, but instead of drawing a bar, a point is placed at the midpoint of each interval at a height equal to the frequency. Typically the points are connected with straight lines to emphasize the distribution of the data.

[Figure: frequency polygon. Horizontal axis: Heights (in), from 50 to 75. Vertical axis: Frequency, from 0 to 40.]
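The height data behind this figure is not given, so the sketch below reuses the weight intervals from the Histogram slide to show how the points of a frequency polygon are obtained. It assumes the midpoint is taken between the listed lower and upper class limits; a different boundary convention would shift every midpoint by the same amount.

```python
# (lower limit, upper limit, frequency) rows from the Histogram slide.
intervals = [
    (120, 134, 4), (135, 149, 14), (150, 164, 16), (165, 179, 28),
    (180, 194, 12), (195, 209, 8), (210, 224, 7), (225, 239, 6),
    (240, 254, 2), (255, 269, 3),
]

# One point per interval: x = midpoint of the interval, y = frequency.
points = [((low + high) / 2, freq) for low, high, freq in intervals]

for x, y in points:
    print(f"({x}, {y})")
```

Connecting consecutive points with straight line segments gives the polygon.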


Measures of Central Tendency

Definition (Mean)

The mean of a set of data is the sum of the data values divided by the number of values.

Definition (Median)

The median of a set of data is the value in the middle when the data is in order. If there are two middle values, the median is their mean.

Definition (Mode)

The mode is the element of the data set that occurs most frequently.


Small Example

3,5,5,6,7,9

Mean = (3 + 5 + 5 + 6 + 7 + 9) / 6 = 35/6 ≈ 5.83

Median = (5 + 6) / 2 = 5.5

Mode = 5
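These three values can be checked with Python's standard `statistics` module. This is only a verification sketch of the small example, not part of the original slides.

```python
import statistics

data = [3, 5, 5, 6, 7, 9]

mean = statistics.mean(data)      # sum of the values divided by how many there are
median = statistics.median(data)  # averages the two middle values when the count is even
mode = statistics.mode(data)      # most frequently occurring value

print(f"Mean   = {mean:.2f}")
print(f"Median = {median}")
print(f"Mode   = {mode}")
```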


Large Example

Data value   Frequency
17           11
18           94
19           76
20           19

Mean = (17 · 11 + 18 · 94 + 19 · 76 + 20 · 19) / (11 + 94 + 76 + 19)
     = (187 + 1692 + 1444 + 380) / 200
     = 3703 / 200
     = 18.515
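When data comes as a frequency table, each value is weighted by its frequency. A minimal sketch of the computation above, assuming the table is held in a dictionary mapping data value to frequency:

```python
# Frequency table from the large example: data value -> frequency.
table = {17: 11, 18: 94, 19: 76, 20: 19}

# Numerator: each data value multiplied by how often it occurs, then summed.
total = sum(value * freq for value, freq in table.items())

# Denominator: the number of data values, i.e. the sum of the frequencies.
count = sum(table.values())

mean = total / count
print(f"Mean = {total} / {count} = {mean}")
```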



Median is the average of the 100th and 101st data values. Both are 18, so Median = 18.

Mode = 18
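With 200 values it is impractical to write the data out, but the 100th and 101st values can be located by accumulating frequencies. The helper `value_at` below is a hypothetical name introduced for this sketch.

```python
def value_at(table, position):
    """Return the data value at a 1-based position in the sorted data."""
    running = 0
    for value in sorted(table):
        running += table[value]  # cumulative frequency so far
        if position <= running:
            return value
    raise IndexError("position is beyond the end of the data")


# Frequency table from the large example: data value -> frequency.
table = {17: 11, 18: 94, 19: 76, 20: 19}
n = sum(table.values())

# n is even here, so average the two middle values (the 100th and 101st).
median = (value_at(table, n // 2) + value_at(table, n // 2 + 1)) / 2

# The mode is the data value with the largest frequency.
mode = max(table, key=table.get)

print(f"Median = {median}, Mode = {mode}")
```

The values in positions 12 through 105 are all 18, which is why both middle values equal 18.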


Measures of Spread

Definition (Range)

The range is the difference between the maximum value and the minimum value of the data set.

Definition (Standard deviation)

The standard deviation is a measure of variation based on measuring how far each data value deviates, or is different, from the mean.

Definition (Quartiles)

Quartiles are values that divide the data in quarters. The first quartile (Q1) is the value so that 25% of the data values are below it; the third quartile (Q3) is the value so that 75% of the data values are below it. The second quartile is the same as the median, since the median is the value so that 50% of the data values are below it.



Example

1,2,6,6,7,9,18

Range = 18 − 1 = 17

Standard Deviation:

Mean = (1 + 2 + 6 + 6 + 7 + 9 + 18) / 7 = 49/7 = 7

Data value   Deviation      Deviation squared
1            1 − 7 = −6     36
2            2 − 7 = −5     25
6            6 − 7 = −1     1
6            6 − 7 = −1     1
7            7 − 7 = 0      0
9            9 − 7 = 2      4
18           18 − 7 = 11    121
Sum                         188

Standard deviation = √(188 / 6) = √31.333 ≈ 5.6


Standard Deviation

1. Calculate the mean.
2. Calculate the deviation from the mean for each data value.
3. Square each deviation.
4. Add the squared deviations.
5. Divide by one less than the number of data values.
6. Take the square root.
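The six steps translate directly into code. The sketch below follows them one by one on the example data 1, 2, 6, 6, 7, 9, 18; the function name `sample_std_dev` is illustrative. Because step 5 divides by one less than the number of values, the result matches `statistics.stdev` (the sample standard deviation) rather than `statistics.pstdev`.

```python
import math
import statistics


def sample_std_dev(data):
    """Standard deviation following the six steps listed above."""
    mean = sum(data) / len(data)              # 1. calculate the mean
    deviations = [x - mean for x in data]     # 2. deviation of each value from the mean
    squared = [d ** 2 for d in deviations]    # 3. square each deviation
    total = sum(squared)                      # 4. add the squared deviations
    variance = total / (len(data) - 1)        # 5. divide by one less than the count
    return math.sqrt(variance)                # 6. take the square root


data = [1, 2, 6, 6, 7, 9, 18]
s = sample_std_dev(data)
print(f"Standard deviation = {s:.1f}")

# Cross-check against the standard library's sample standard deviation.
print(math.isclose(s, statistics.stdev(data)))
```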


Example

1,2,6,6,7,9,18

Range = 18 − 1 = 17

Standard Deviation ≈ 5.6

First Quartile, Q1 = 2

Third Quartile, Q3 = 9
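The slides do not say which convention is used to locate the quartiles, and several exist. One common choice that reproduces Q1 = 2 and Q3 = 9 here is to take the median of the values below the middle position and the median of the values above it. The sketch below assumes that convention.

```python
import statistics


def quartiles(data):
    """Return (Q1, median, Q3) using the median-of-halves convention."""
    ordered = sorted(data)
    n = len(ordered)
    lower = ordered[: n // 2]        # values before the middle position
    upper = ordered[(n + 1) // 2:]   # values after the middle position
    return (statistics.median(lower),
            statistics.median(ordered),
            statistics.median(upper))


q1, q2, q3 = quartiles([1, 2, 6, 6, 7, 9, 18])
print(f"Q1 = {q1}, median = {q2}, Q3 = {q3}")
```

For this data the lower half is 1, 2, 6 and the upper half is 7, 9, 18, so the middle value 6 belongs to neither half.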


5-number summary

Definition (Five number summary)

The five number summary takes this form:

Minimum, Q1, Median, Q3, Maximum

1,2,6,6,7,9,18

5-number summary: 1,2,6,9,18
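A five-number summary can also be assembled with the standard library. This sketch assumes Python 3.8 or later, where `statistics.quantiles` is available. Its default "exclusive" method happens to agree with the quartiles above for this data set, but other data or other quartile conventions may give slightly different values.

```python
import statistics


def five_number_summary(data):
    """Return (minimum, Q1, median, Q3, maximum)."""
    # n=4 asks for the three cut points that split the data into quarters.
    q1, median, q3 = statistics.quantiles(data, n=4)
    return min(data), q1, median, q3, max(data)


summary = five_number_summary([1, 2, 6, 6, 7, 9, 18])
print(summary)
```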


Box plot

Definition (Box plot)

A box plot is a graphical representation of a five-number summary. To create a box plot, a number line is first drawn. A box is drawn from the first quartile to the third quartile, and a line is drawn through the box at the median. “Whiskers” are extended out to the minimum and maximum values.

[Figure: box plot drawn above a number line marked every 2 units from 0 to 18.]
