correctionkey=nl-a;ca-a correctionkey=nl-c;ca … 22 1096 lesson 2 do not edit--changes must be made...


Upload: hanguyet

Post on 15-Jun-2018


[Line plot: Birth Month, number line from 1 to 12]

© Houghton Mifflin Harcourt Publishing Company • Image Credits: ©Brooklyn Production/Corbis

Explore 1 Seeing the Shape of a Distribution

“Raw” data values are simply presented in an unorganized list. Organizing the data values by using the frequency with which they occur results in a distribution of the data. A distribution may be presented as a frequency table or as a data display. Data displays for numerical data, such as line plots, histograms, and box plots, involve a number line, while data displays for categorical data, such as bar graphs and circle graphs, do not. Data displays reveal the shape of a distribution.

The table gives data about a random sample of 20 babies born at a hospital.

Make a line plot for the distribution of birth months.

Baby   Birth Month   Birth Weight (kg)   Mother’s Age
1      5             3.3                 28
2      7             3.6                 31
3      11            3.5                 33
4      2             3.4                 35
5      10            3.7                 39
6      3             3.4                 30
7      1             3.5                 29
8      4             3.2                 30
9      7             3.6                 31
10     6             3.4                 32
11     9             3.6                 33
12     10            3.5                 29
13     11            3.4                 31
14     1             3.7                 29
15     6             3.5                 34
16     5             3.8                 30
17     8             3.5                 32
18     9             3.6                 30
19     12            3.3                 29
20     2             3.5                 28
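Organizing the raw birth months by frequency can also be sketched in code. The Python sketch below is not part of the lesson; the function names `frequency_table` and `text_line_plot` are illustrative assumptions, and the data are the birth months from the table, listed in baby order 1 to 20.

```python
from collections import Counter

# Birth months of the 20 sampled babies, in baby order 1..20 (from the table).
birth_months = [5, 7, 11, 2, 10, 3, 1, 4, 7, 6,
                9, 10, 11, 1, 6, 5, 8, 9, 12, 2]


def frequency_table(values):
    """Return a dict mapping each distinct value to how often it occurs."""
    return dict(sorted(Counter(values).items()))


def text_line_plot(values, low, high):
    """Build one text row per number-line position, with one 'x' per data value."""
    counts = Counter(values)
    return [f"{v:>2} | {'x' * counts[v]}" for v in range(low, high + 1)]


month_counts = frequency_table(birth_months)
for row in text_line_plot(birth_months, 1, 12):
    print(row)
```

Every month ends up with a stack of one or two x's, which is the "fairly level" shape that the lesson goes on to call a uniform distribution.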

Module 22, Lesson 2

22.2 Shape, Center, and Spread

Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution?

Common Core Math Standards

The student is expected to:

S-ID.4

Use the mean and standard deviation of a data set to fit it to a normal distribution and to estimate population percentages. Recognize that there are data sets for which such a procedure is not appropriate. Use calculators, spreadsheets, and tables to estimate areas under the normal curve.

Mathematical Practices

MP.2 Reasoning

Language Objective

Have students work in pairs to fill in a table showing the shape of distributions of data.

HARDCOVER PAGES 803-814

Turn to these pages to find this lesson in the hardcover student edition.

Shape, Center, and Spread

ENGAGE

Essential Question: Which measures of center and spread are appropriate for a normal distribution, and which are appropriate for a skewed distribution?

For a normal distribution, it’s appropriate to use either a combination of mean and standard deviation or a combination of median and IQR. For a skewed distribution, only a combination of median and IQR should be used because mean and standard deviation are sensitive to the data values in the tail of the distribution.

PREVIEW: LESSON PERFORMANCE TASK

View the online Engage. Discuss what kinds of data might be collected and compared with the normally distributed data when a pet visits the veterinarian. Then preview the Lesson Performance Task.


S-ID.4 For the full text of this standard, see the table starting on page CA2 of Volume 1.


B Make a line plot for the distribution of birth weights.

[Line plot: Birth Weight (kg), number line from 3.0 to 4.0]

C Make a line plot for the distribution of mothers’ ages.

[Line plot: Mother’s Age, number line from 27 to 40]

Reflect

1. Describe the shape of the distribution of birth months.

2. Describe the shape of the distribution of birth weights.

3. Describe the shape of the distribution of mothers’ ages.

Explore 2 Relating Measures of Center and Spread to the Shape of a Distribution

As you saw in the previous Explore, data distributions can have various shapes. Some of these shapes are given names in statistics.

• A distribution whose shape is basically level (that is, it looks like a rectangle) is called a uniform distribution.

• A distribution that is mounded in the middle with symmetric “tails” at each end (that is, it looks bell-shaped) is called a normal distribution.

• A distribution that is mounded but not symmetric because one “tail” is much longer than the other is called a skewed distribution. When the longer “tail” is on the left, the distribution is said to be skewed left. When the longer “tail” is on the right, the distribution is said to be skewed right.

The figures show the general shapes of normal and skewed distributions.

[Figure: three distribution shapes, labeled Skewed left, Symmetric, and Skewed right]

Answers (Reflect 1 to 3)

1. The distribution is fairly level; that is, the data are more or less evenly distributed.

2. The distribution is mounded and symmetric.

3. The distribution is mounded and asymmetric; that is, it trails off more to the right than to the left.

PROFESSIONAL DEVELOPMENT

Integrate Mathematical Practices

The Explore activities in this lesson provide opportunities to address Mathematical Practice MP.2, which asks students to “reason abstractly and quantitatively.” Students review various ways to display data, and they learn to recognize various shapes of data distributions. They also calculate measures of center and spread and relate them to the shapes of the distributions. Finally, they learn that certain measures of center and spread are better statistics for non-normal distributions.

EXPLORE 1 Seeing the Shape of a Distribution

INTEGRATE TECHNOLOGY

Students have the option of completing the activity either in the book or online.

QUESTIONING STRATEGIES

What values are on the number lines for line plots, histograms, and box plots? The data values.

What does the height of each vertical stack of X’s in the line plots represent? The number of times each value occurs, or frequencies.

EXPLORE 2 Relating Measures of Center and Spread to the Shape of a Distribution

AVOID COMMON ERRORS

When using multiple lists in a calculator, students can confuse them or end up with mismatched data. Students should write down what set of data is in each list to help them avoid confusion.

QUESTIONING STRATEGIES

What information does a data display give you? The shape of the data distribution.

What information do the mean and median give you? The center of the distribution.

What information do the standard deviation and interquartile range give you? The spread of the distribution.

Which types of distributions are relatively balanced about the measure of central tendency and which are not? Uniform and normal distributions are balanced; skewed distributions are not.

Shape is one way of characterizing a data distribution. Another way is by identifying the distribution’s center and spread. You should already be familiar with the following measures of center and spread:

• The mean of n data values is the sum of the data values divided by n. If $x_1, x_2, \ldots, x_n$ are data values from a sample, then the mean $\bar{x}$ is given by:

$$\bar{x} = \frac{x_1 + x_2 + \cdots + x_n}{n}$$

• The median of n data values written in ascending order is the middle value if n is odd, and is the mean of the two middle values if n is even.

• The standard deviation of n data values is the square root of the mean of the squared deviations from the distribution’s mean. If $x_1, x_2, \ldots, x_n$ are data values from a sample, then the standard deviation s is given by:

$$s = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}$$

• The interquartile range, or IQR, of data values written in ascending order is the difference between the median of the upper half of the data, called the third quartile or $Q_3$, and the median of the lower half of the data, called the first quartile or $Q_1$. So, IQR = $Q_3 - Q_1$.

To distinguish a population mean from a sample mean, statisticians use the Greek letter mu, written µ, instead of $\bar{x}$. Similarly, they use the Greek letter sigma, written σ, instead of s to distinguish a population standard deviation from a sample standard deviation.

Use a graphing calculator to compute the measures of center and the measures of spread for the distribution of baby weights and the distribution of mothers’ ages from the previous Explore. Begin by entering the two sets of data into two lists on a graphing calculator as shown.

Calculate the “1-Variable Statistics” for the distribution of baby weights. Record the statistics listed. (Note: Your calculator may report the standard deviation with a denominator of n - 1 as “$s_x$” and the standard deviation with a denominator of n as “$\sigma_x$.” In statistics, when you want to use a sample’s standard deviation as an estimate of the population’s standard deviation, you use $s_x$, which is sometimes called the “corrected” sample standard deviation. Otherwise, you can just use $\sigma_x$, which you should do in this lesson.)

$\bar{x}$ = 3.5          Median = 3.5

s ≈ 0.14          IQR = $Q_3 - Q_1$ = 0.2

Calculate the “1-Variable Statistics” for the distribution of mothers’ ages. Record the statistics listed.

$\bar{x}$ = 31.15          Median = 30.5

s ≈ 2.6          IQR = $Q_3 - Q_1$ = 3.5
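For readers without a graphing calculator, the same statistics can be sketched in Python. This is an illustration rather than part of the lesson: the function names `quartiles_by_halves` and `summarize` are assumptions, and the quartiles follow the lesson’s definition (medians of the lower and upper halves of the sorted data). Leaving the middle value out of both halves when n is odd is an assumed convention; it does not matter here because n = 20.

```python
import statistics

# Data from the Explore 1 table, in baby order 1..20.
birth_weights = [3.3, 3.6, 3.5, 3.4, 3.7, 3.4, 3.5, 3.2, 3.6, 3.4,
                 3.6, 3.5, 3.4, 3.7, 3.5, 3.8, 3.5, 3.6, 3.3, 3.5]
mothers_ages = [28, 31, 33, 35, 39, 30, 29, 30, 31, 32,
                33, 29, 31, 29, 34, 30, 32, 30, 29, 28]


def quartiles_by_halves(values):
    """Return (Q1, Q3) as the medians of the lower and upper halves.

    Assumes at least two data values. For odd n, the middle value is
    excluded from both halves (an assumed convention).
    """
    ordered = sorted(values)
    half = len(ordered) // 2
    return statistics.median(ordered[:half]), statistics.median(ordered[-half:])


def summarize(values):
    """Collect the measures of center and spread used in this lesson."""
    q1, q3 = quartiles_by_halves(values)
    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "sigma_x": statistics.pstdev(values),  # denominator n (used in this lesson)
        "s_x": statistics.stdev(values),       # denominator n - 1 ("corrected")
        "iqr": q3 - q1,
    }


weight_stats = summarize(birth_weights)
age_stats = summarize(mothers_ages)
print(weight_stats)
print(age_stats)
```

Rounded to the precision used in the lesson, `sigma_x` matches the recorded values of s (about 0.14 and 2.6), while `s_x` is slightly larger in both cases because it divides by n - 1.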


COLLABORATIVE LEARNING

Whole Class Activity

Have students work in pairs. Assign pairs to find either the number of United States Representatives or the number of electoral votes for each state, or provide this information. Have each pair work together to make a poster with a histogram and an analysis of the data.

Gather together as a class to compare the two types of posters to see how the two posters are related. Have students describe the similarities and differences.


Reflect

4. What do you notice about the mean and median for the symmetric distribution (baby weights) as compared with the mean and median for the skewed distribution (mothers’ ages)? Explain why this happens.

5. The standard deviation and IQR for the skewed distribution are significantly greater than the corresponding statistics for the symmetric distribution. Explain why this makes sense.

6. Which measures of center and spread would you report for the symmetric distribution? For the skewed distribution? Explain your reasoning.

Explain 1 Making and Analyzing a Histogram

You can use a graphing calculator to create a histogram of numerical data using the viewing window settings Xmin (the least x-value), Xmax (the greatest x-value), and Xscl (the width of an interval on the x-axis, which becomes the width of each histogram bar).

Example 1 Use a graphing calculator to make a histogram of the given data and then analyze the graph.

a. Make a histogram of the baby weights from Explore 1. Based on the shape of the distribution, identify what type of distribution it is.

Begin by turning on a statistics plot, selecting the histogram option, and entering the list where the data are stored.

Set the viewing window. To obtain a histogram that looks very much like the line plot that you drew for this data set, use the values shown. Xscl determines the width of each bar, so when Xscl = 0.1 and Xmin = 3.15, the first bar covers the interval 3.15 ≤ x < 3.25, which captures the weight 3.2 kg.
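The way Xmin and Xscl determine the bars can be sketched in Python. This is an illustration only: the function name `histogram_counts` and the choice of 7 bars are assumptions (the calculator’s Xmax setting appears only in a screenshot that did not survive here), while Xmin = 3.15 and Xscl = 0.1 come from the text.

```python
import math

# Birth weights (kg) from the Explore 1 table, in baby order 1..20.
birth_weights = [3.3, 3.6, 3.5, 3.4, 3.7, 3.4, 3.5, 3.2, 3.6, 3.4,
                 3.6, 3.5, 3.4, 3.7, 3.5, 3.8, 3.5, 3.6, 3.3, 3.5]


def histogram_counts(values, xmin, xscl, n_bars):
    """Count the data values in each bar.

    Bar k covers xmin + k*xscl <= x < xmin + (k+1)*xscl, for k = 0..n_bars-1.
    Values outside every bar are ignored.
    """
    counts = [0] * n_bars
    for v in values:
        k = math.floor((v - xmin) / xscl)
        if 0 <= k < n_bars:
            counts[k] += 1
    return counts


# n_bars=7 is an assumed setting that covers 3.15 <= x < 3.85.
bar_heights = histogram_counts(birth_weights, xmin=3.15, xscl=0.1, n_bars=7)
print(bar_heights)
```

Because each recorded weight sits in the middle of its bar, the bar heights equal the stack heights in the line plot for birth weights.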

Answers (Reflect 4 to 6)

4. The mean and median for a symmetric distribution are equal, but the mean and median for a skewed distribution are not. This happens because the mean is pulled toward the data values in the longer tail, but the median is not.

5. The data are more spread out in the skewed distribution, so the measures of spread should be greater.

6. Report either the mean and standard deviation or the median and IQR for the symmetric distribution, but use only the median and IQR for the skewed distribution because the mean and standard deviation are too sensitive to the data values in the long tail.


DIFFERENTIATE INSTRUCTION

Kinesthetic Experience

Have students make flash cards for the vocabulary, and add them to their flash cards from the previous lesson. Group students in pairs. Have students shuffle their own cards. One student in each pair puts the cards down with the terms showing; the other student puts them down with definitions showing. Students take turns placing a card with a definition on top of a card with the matching term. Students get a point for each correct match. Have each turn begin with the student who is about to match cards either accepting or challenging the previous match. If the student correctly challenges a match, he or she earns a point and the opponent loses a point. If the challenging student is incorrect, he or she loses a point and a turn.

EXPLAIN 1 Making and Analyzing a Histogram

QUESTIONING STRATEGIES

How do you set Xmin and Xscl when using a graphing calculator to make a histogram? One method is to set Xmin at or below the least value and choose Xscl so that each data value in L1 is a separate bar.

How are the shapes of the line plot and the histogram for birth weight related? The shapes are the same; both represent a normal distribution.

INTEGRATE MATHEMATICAL PRACTICES

Focus on Math Connections

MP.1 Remind students that they might think of the standard deviation as the expected, or usual, variation of the data from the mean.

INTEGRATE MATHEMATICAL PRACTICES

Focus on Technology

MP.5 Encourage students to examine the features of their graphing calculators. Some have Zoom options such as ZoomStat that give a good initial window for a histogram.

CONNECT VOCABULARY Write the names of the types of distributions and the measures of center on the board, and leave them there throughout the lesson. This will keep students who might have trouble memorizing the terms from having to flip back to the first page of the lesson while they are working.


Draw the histogram by pressing GRAPH. You can obtain the heights of the bars by pressing TRACE and using the arrow keys.

The distribution has a central mound and symmetric tails, so it is a normal distribution.

b. By examining the histogram, determine the percent of the data that are within 1 standard deviation (s ≈ 0.14) of the mean ($\bar{x}$ = 3.5). That is, determine the percent of the data in the interval 3.5 - 0.14 < x < 3.5 + 0.14, or 3.36 < x < 3.64. Explain your reasoning.

The bars for x-values that satisfy 3.36 < x < 3.64 have heights of 4, 6, and 4, so 14 data values out of 20, or 70% of the data, are in the interval.

c. Suppose one of the baby weights is chosen at random. By examining the histogram, determine the probability that the weight is more than 1 standard deviation above the mean. That is, determine the probability that the weight is in the interval x > 3.5 + 0.14, or x > 3.64. Explain your reasoning.

The bars for x-values that satisfy x > 3.64 have heights of 2 and 1, so the probability that the weight is in the interval is $\frac{3}{20}$ = 0.15, or 15%.
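The counts read from the histogram in parts b and c can be checked directly against the data. The Python sketch below is an illustration, with the assumed helper names `fraction_within` and `fraction_above`. It uses the unrounded mean and standard deviation (denominator n) rather than the rounded 3.5 and 0.14, which gives the same counts for this data set.

```python
import statistics

# Birth weights (kg) from the Explore 1 table, in baby order 1..20.
birth_weights = [3.3, 3.6, 3.5, 3.4, 3.7, 3.4, 3.5, 3.2, 3.6, 3.4,
                 3.6, 3.5, 3.4, 3.7, 3.5, 3.8, 3.5, 3.6, 3.3, 3.5]


def fraction_within(values, k):
    """Fraction of values strictly within k standard deviations of the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)  # denominator n, as in this lesson
    inside = [v for v in values if mean - k * sd < v < mean + k * sd]
    return len(inside) / len(values)


def fraction_above(values, k):
    """Fraction of values more than k standard deviations above the mean."""
    mean = statistics.mean(values)
    sd = statistics.pstdev(values)
    return sum(1 for v in values if v > mean + k * sd) / len(values)


within_one_sd = fraction_within(birth_weights, 1)
above_one_sd = fraction_above(birth_weights, 1)
print(within_one_sd, above_one_sd)
```

Choosing a weight at random from the sample makes each of the 20 values equally likely, so the fraction of values in an interval is also the probability of landing in that interval.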

The table gives the lengths (in inches) of the random sample of 20 babies from Explore 1.

Baby   Baby Length (in.)     Baby   Baby Length (in.)     Baby   Baby Length (in.)
1      17                    8      18                    15     20
2      21                    9      21                    16     23
3      20                    10     19                    17     20
4      19                    11     21                    18     21
5      22                    12     20                    19     18
6      19                    13     19                    20     20
7      20                    14     22

a. Make a histogram of the baby lengths. Based on the shape of the distribution, identify what type of distribution it is.

The distribution has a central mound and symmetric tails, so it is a normal distribution.

b. By examining the histogram, determine the percent of the data that are within 2 standard deviations (s ≈ 1.4) of the mean ($\bar{x}$ = 20). Explain your reasoning.

The interval for data that are within 2 standard deviations of the mean is 17.2 < x < 22.8. The bars for x-values that satisfy 17.2 < x < 22.8 have heights of 2, 4, 6, 4, and 2, so 18 data values out of 20, or 90% of the data, are in the interval.


LANGUAGE SUPPORT

Communicate Math

Have students work with a partner to fill in a table like the one below using both words and labeled drawings or pictures. Students should be able to label and describe the meaning of each part of a normal distribution graph with respect to the data.

Type of Distribution     Graph     Description
Normal distribution
Skewed right
Skewed left
Uniform distribution


c. Suppose one of the baby lengths is chosen at random. By examining the histogram, determine the probability that the length is less than 2 standard deviations below the mean. Explain your reasoning.

The interval for data that are less than 2 standard deviations below the mean is x < 17.2. The only bar for x-values that satisfy x < 17.2 has a height of 1, so the probability that the length is in the interval is $\frac{1}{20}$ = 0.05, or 5%.

Your Turn

7. The table lists the test scores of a random sample of 22 students who are taking the same math class.

Student   Math test scores     Student   Math test scores     Student   Math test scores
1         86                   9         90                   16        83
2         78                   10        85                   17        83
3         95                   11        83                   18        70
4         83                   12        99                   19        73
5         83                   13        81                   20        79
6         81                   14        75                   21        85
7         87                   15        85                   22        83
8         81

a. Use a graphing calculator to make a histogram of the math test scores. Based on the shape of the distribution, identify what type of distribution it is.

The distribution has a central mound and symmetric tails, so it is a normal distribution.

b. By examining the histogram, determine the percent of the data that are within 2 standard deviations (s ≈ 6.3) of the mean ($\bar{x}$ ≈ 83). Explain your reasoning.

The interval for data that are within 2 standard deviations of the mean is 70.4 < x < 95.6. The bars for x-values that satisfy 70.4 < x < 95.6 have heights of 1, 1, 1, 1, 3, 6, 3, 1, 1, 1, and 1, so 20 data values out of 22, or about 91% of the data, are in the interval.

c. Suppose one of the math test scores is chosen at random. By examining the histogram, determine the probability that the test score is less than 2 standard deviations below the mean. Explain your reasoning.

The interval for data that are less than 2 standard deviations below the mean is x < 70.4. The only bar for x-values that satisfy x < 70.4 has a height of 1, so the probability that the test score is in the interval is $\frac{1}{22}$ ≈ 0.05.

Explain 2 Making and Analyzing a Box Plot

A box plot, also known as a box-and-whisker plot, is based on five key numbers: the minimum data value, the first quartile of the data values, the median (second quartile) of the data values, the third quartile of the data values, and the maximum data value. A graphing calculator will automatically compute these values when drawing a box plot. A graphing calculator also gives you two options for drawing box plots: one that shows outliers and one that does not. For this lesson, choose the second option.


EXPLAIN 2 Making and Analyzing a Box Plot

AVOID COMMON ERRORS

Students may use the mean as the measure of central tendency in a box plot. Emphasize that box plots involve dividing the data into four groups, with the median dividing the data into two groups of equal number and the quartiles further dividing each of those groups into two groups.

QUESTIONING STRATEGIES

How do you find the interquartile range (IQR)? Calculate $Q_3 - Q_1$, which is the median of the upper half of the data minus the median of the lower half of the data.

Describe what the IQR corresponds to on the box plot. The width of the box.

INTEGRATE MATHEMATICAL PRACTICES

Focus on Math Connections

MP.1 Emphasize that extreme values usually do not affect the interquartile range, but may greatly affect the range.

CONNECT VOCABULARY The concept of spread arises frequently in data analysis. Make sure students understand that when you use the word spread, you are referring to how the data are spread as well as how they are displayed.


Example 2 Use a graphing calculator to make a box plot of the given data and then analyze the graph.

a. Make a box plot of the mothers’ ages from Explore 1. How does the box plot show that this skewed distribution is skewed right?

Begin by turning on a statistics plot, selecting the second box plot option, and entering the list where the data are stored.

Set the viewing window. Use the values shown.

Draw the box plot by pressing GRAPH. You can obtain the box plot’s five key values by pressing TRACE and using the arrow keys.

The part of the box to the right of the median is slightly wider than the part to the left, and the “whisker” on the right is much longer than the one on the left, so the distribution is skewed right.

b. Suppose one of the mothers’ ages is chosen at random. Based on the box plot and not the original set of data, what can you say is the approximate probability that the age falls between the median, 30.5, and the third quartile, 32.5? Explain your reasoning.

The probability is about 25%, or 0.25, because $Q_1$, the median, and $Q_3$ divide the data into four almost-equal parts.
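The five key values behind this box plot can be sketched in Python as a check on the calculator output. This is an illustration only: the function name `five_number_summary` is an assumption, and the quartiles are computed as medians of the lower and upper halves of the sorted data, with the middle value left out of both halves when n is odd (an assumed convention that does not arise for n = 20).

```python
import statistics

# Mothers' ages from the Explore 1 table, in baby order 1..20.
mothers_ages = [28, 31, 33, 35, 39, 30, 29, 30, 31, 32,
                33, 29, 31, 29, 34, 30, 32, 30, 29, 28]


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum). Assumes at least two values."""
    ordered = sorted(values)
    half = len(ordered) // 2
    q1 = statistics.median(ordered[:half])   # median of the lower half
    q3 = statistics.median(ordered[-half:])  # median of the upper half
    return ordered[0], q1, statistics.median(ordered), q3, ordered[-1]


minimum, q1, median, q3, maximum = five_number_summary(mothers_ages)

# Share of the data between the median and the third quartile.
between = sum(1 for age in mothers_ages if median < age <= q3) / len(mothers_ages)
print(minimum, q1, median, q3, maximum, between)
```

The right whisker (from 32.5 to 39) is much longer than the left one (from 28 to 29), which is the numerical counterpart of the skewed-right shape described in part a.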

The list gives the ages of a random sample of 16 people who visited a doctor’s office one day.

80, 52, 78, 64, 70, 80, 78, 35, 78, 74, 82, 73, 80, 75, 62, 80

a. Make a box plot of the ages. How does the box plot show that this skewed distribution is skewed left?

The part of the box to the left of the median is slightly wider than the part to the right, and the “whisker” on the left is much longer than the one on the right, so the distribution is skewed left.


b. Suppose one of the ages is chosen at random. Based on the box plot and not the original set of data, what can you say is the approximate probability that the age falls between the first quartile, 67, and the third quartile, 80? Explain your reasoning.

The probability is about 50%, or 0.5, because Q1, the median, and Q3 divide the data into four almost-equal parts and there are two parts that each represent about 25% of the data between the first and third quartiles.

Your Turn

8. The list gives the starting salaries (in thousands of dollars) of a random sample of 18 positions at a large company. Use a graphing calculator to make a box plot and then analyze the graph.

40, 32, 27, 40, 34, 25, 37, 39, 40, 37, 28, 39, 35, 39, 40, 43, 30, 35

a. Make a box plot of the starting salaries. How does the box plot show that this skewed distribution is skewed left?

b. Suppose one of the starting salaries is chosen at random. Based on the box plot and not the original set of data, what can you say is the approximate probability that the salary is less than the third quartile, 40? Explain your reasoning.

Elaborate

9. Explain the difference between a normal distribution and a skewed distribution.

10. Discussion Describe how you can use a line plot, a histogram, and a box plot of a set of data to answer questions about the percent of the data that fall within a specified interval.

The part of the box to the left of the median is slightly wider than the part to the right, and the “whisker” on the left is longer than the one on the right, so the distribution is skewed left.

The probability is about 75%, or 0.75, because Q1, the median, and Q3 divide the data into four almost-equal parts and there are three parts that each represent about 25% of the data from the minimum value to the third quartile.

A normal distribution is mound-shaped with two symmetric tails. The mean and median of a normal distribution are equal or almost equal. A skewed distribution is also mound-shaped but one tail is noticeably longer than the other. The mean is usually greater than the median in a right-skewed distribution and less than the median in a left-skewed distribution.

All three types of data displays organize numerical data using a number line. A line plot preserves the data values, and you can simply count the number of data values that fall within a specified interval. A histogram uses the heights of bars to indicate the number of data values in intervals of equal width, and you can sum the heights of the bars whose intervals are part of the specified interval. A box plot shows the points that divide the data into four almost-equal parts, so the specified interval must be expressed in terms of the dividing points (quartiles), which means that the corresponding percent of the data will be a multiple of 25%.


ELABORATE

INTEGRATE MATHEMATICAL PRACTICES
Focus on Critical Thinking
MP.3 Challenge students to find a way to “flip” a skewed box plot so that it keeps the same minimum and maximum values. Subtracting each value from the sum of these two values will produce this result.
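The flip described in this note can be tried on the ages from the doctor's office example. This is a minimal sketch: the variable names `pivot` and `flipped` are illustrative, and comparing the mean with the median is used here only as a rough indicator of the direction of skew.

```python
from statistics import mean, median

ages = [80, 52, 78, 64, 70, 80, 78, 35, 78, 74, 82, 73, 80, 75, 62, 80]

pivot = min(ages) + max(ages)          # 35 + 82 = 117
flipped = [pivot - x for x in ages]    # reflect each value about the midrange

# The extremes are preserved, but the long tail moves to the other side.
print(min(flipped), max(flipped))
print(mean(ages) - median(ages))        # negative: mean pulled toward the left tail
print(mean(flipped) - median(flipped))  # positive: mean pulled toward the right tail
```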

AVOID COMMON ERRORS Students who depend on technology to calculate standard deviation need to know whether the result is σ, the population standard deviation, or s, the sample standard deviation. The sample standard deviation will be slightly larger, because the divisor in the calculation is n – 1 rather than n. For large values of n there will be little difference between the values.
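Students working in Python rather than on a calculator face the same choice. As a sketch, the standard library's `statistics` module provides `pstdev` (divides by n) and `stdev` (divides by n – 1); the data here are the ages from the doctor's office example.

```python
from statistics import pstdev, stdev

ages = [80, 52, 78, 64, 70, 80, 78, 35, 78, 74, 82, 73, 80, 75, 62, 80]

sigma = pstdev(ages)  # population formula: divides the sum of squared deviations by n
s = stdev(ages)       # sample formula: divides by n - 1

print(round(sigma, 2), round(s, 2))
print(s > sigma)      # the sample value is the larger of the two
```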

SUMMARIZE THE LESSON How can you use shape, center, and spread to characterize a data distribution? The shape of the distribution of a data set can be shown by displaying the data as a line plot, as a histogram, or as a box plot. The measures of center and spread can be used to summarize the distribution. Normal distributions can be summarized using either the mean and standard deviation or the median and IQR, while skewed distributions should be summarized using only the median and IQR.


[Line plots for Exercise 1: a. ages of children, number line from 3 to 12; b. scores on a test, number line from 66 to 80; c. salaries in thousands of dollars, number line from 32 to 40]


11. Essential Question Check-In Why are the mean and standard deviation not appropriate statistics to use with a skewed distribution?

Because the mean and standard deviation are calculated using every data value, data values that are much larger or smaller than the other data values can have a significant impact on those statistics. So, the mean and standard deviation may be too sensitive to the data values in the longer tail of a skewed distribution.

Evaluate: Homework and Practice

1. Make a line plot of the data. Based on the shape of the distribution, identify what type of distribution it is.

a. Ages of children: 4, 9, 12, 8, 7, 8, 7, 10, 8, 9, 6, 8

The distribution has a central mound and symmetric tails, so it is a normal distribution.

b. Scores on a test: 80, 78, 70, 77, 75, 77, 76, 66, 77, 76, 75, 77

The distribution has a central mound and asymmetric tails with the longer tail on the left, so it is a left-skewed distribution.

c. Salaries (in thousands of dollars) of employees: 35, 35, 36, 40, 37, 36, 37, 35, 35, 38, 36, 34

The distribution has a central mound and asymmetric tails with the longer tail on the right, so it is a right-skewed distribution.
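The pull of a long tail on the mean can be seen numerically in the three data sets of Exercise 1. The sketch below is only an illustration; the dictionary keys are labels chosen here, and the sign of mean minus median is used as a quick check of which way the tail pulls.

```python
from statistics import mean, median

samples = {
    "ages": [4, 9, 12, 8, 7, 8, 7, 10, 8, 9, 6, 8],
    "scores": [80, 78, 70, 77, 75, 77, 76, 66, 77, 76, 75, 77],
    "salaries": [35, 35, 36, 40, 37, 36, 37, 35, 35, 38, 36, 34],
}

# Mean minus median: about zero when symmetric, negative when the long
# tail is on the left, positive when it is on the right.
gaps = {name: mean(values) - median(values) for name, values in samples.items()}
for name, gap in gaps.items():
    print(f"{name}: mean - median = {gap:+.2f}")
```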


Exercise   Depth of Knowledge (D.O.K.)   Mathematical Practices
1          1 Recall of Information       MP.3 Logic
2          1 Recall of Information       MP.5 Using Tools
3          2 Skills/Concepts             MP.4 Modeling
4–5        2 Skills/Concepts             MP.3 Logic
6–7        1 Recall of Information       MP.5 Using Tools
8–10       3 Strategic Thinking          MP.3 Logic

EVALUATE

ASSIGNMENT GUIDE

Concepts and Skills                                                                  Practice
Explore 1: Seeing the Shape of a Distribution                                        Exercise 1
Explore 2: Relating Measures of Center and Spread to the Shape of a Distribution     Exercises 4, 7
Example 1: Making and Analyzing a Histogram                                          Exercises 2–3
Example 2: Making and Analyzing a Box Plot                                           Exercises 5–6

CONNECT VOCABULARY Remind students that the word normal, as used in mathematics, does not mean the same thing as good or healthy, as it might, for example, in the sentence “the patient’s temperature is normal.” A normal curve simply represents one kind of distribution with certain characteristics.

INTEGRATE MATHEMATICAL PRACTICES
Focus on Critical Thinking
MP.3 Have students analyze a set of data that has a small interquartile range but a large range, and have them describe how the data are distributed. The data are clustered around the median and have some extreme values.


In Exercises 2–3, use the data in the table. The table gives the heights and weights of a random sample of 14 college baseball players.

Height (in.)   Weight (lb)
70             160
69             165
72             170
70             170
68             150
71             175
70             160
69             165
71             165
70             170
67             155
69             165
71             165
73             185

2. a. Find the mean, median, standard deviation, and IQR of the height data.

b. Use a graphing calculator to make a histogram of the height data. Based on the shape of the distribution, identify what type of distribution it is.

c. By examining the histogram of the height distribution, determine the percent of the data that fall within 1 standard deviation of the mean. Explain your reasoning.

x̄ = 70, Median = 70
s ≈ 1.5, IQR = Q3 - Q1 = 2

The distribution has a central mound and fairly symmetric tails, so it is a normal distribution.

The interval for data that are within 1 standard deviation of the mean is 68.5 < x < 71.5. The bars for x-values that satisfy 68.5 < x < 71.5 have heights of 3, 4, and 3, so 10 data values out of 14, or about 71% of the data, are in the interval.
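The count in part c can be reproduced directly from the height data. This is a sketch under one assumption: it uses the population formula (`pstdev`), because that is the version that rounds to the printed value 1.5 for these data; the sample formula gives about 1.57, which leads to the same count here. The function name `share_within` is hypothetical.

```python
from statistics import mean, pstdev

def share_within(values, k=1):
    """Fraction of values strictly within k standard deviations of the mean."""
    center = mean(values)
    spread = pstdev(values)  # assumption: population formula, which rounds to 1.5 here
    low, high = center - k * spread, center + k * spread
    inside = [x for x in values if low < x < high]
    return len(inside) / len(values)

heights = [70, 69, 72, 70, 68, 71, 70, 69, 71, 70, 67, 69, 71, 73]
print(round(share_within(heights), 2))   # share of heights within 1 standard deviation
```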


AUDITORY CUES The word quartile sounds like quarter, which suggests fourths.

AVOID COMMON ERRORS Students may assume that if one side of a box plot is longer than the other, the longer side represents more data. Remind students that a longer side indicates the same amount of data, but a wider spread, showing that the values are farther apart from each other.


d. Suppose one of the heights is chosen at random. By examining the histogram, determine the probability that the height is more than 1 standard deviation above the mean. Explain your reasoning.

3. a. Find the mean, median, standard deviation, and IQR of the weight data.

b. Use a graphing calculator to make a histogram of the weight data. Based on the shape of the distribution, identify what type of distribution it is.

c. By examining the histogram, determine the percent of the weight data that are within 2 standard deviations of the mean. Explain your reasoning.

d. Suppose one of the weights is chosen at random. By examining the histogram, determine the probability that the weight is less than 1 standard deviation above the mean. Explain your reasoning.

The interval for data that are more than 1 standard deviation above the mean is x > 71.5. The bars for x-values that satisfy x > 71.5 have heights of 1 and 1, so the probability that a height is in the interval is 2/14 ≈ 0.14.

x̄ ≈ 166, Median = 165
s ≈ 8.2, IQR = Q3 - Q1 = 10

The distribution has a central mound and fairly symmetric tails, so it is a normal distribution.

The interval for data that are within 2 standard deviations of the mean is 149.6 < x < 182.4. The bars for x-values that satisfy 149.6 < x < 182.4 have heights of 1, 1, 2, 5, 3, and 1, so 13 data values out of 14, or about 93% of the data, are in the interval.

The interval for data that are less than 1 standard deviation above the mean is x < 174.2. The bars for x-values that satisfy x < 174.2 have heights of 1, 1, 2, 5, and 3, so the probability that a weight is in the interval is 12/14 ≈ 0.86 or 86%.
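The bar heights quoted for the weight data amount to a frequency table, which can be built with `collections.Counter`. In this sketch the cutoff 174.2 is taken from the answer above rather than recomputed, and each distinct weight is treated as its own bar.

```python
from collections import Counter

weights = [160, 165, 170, 170, 150, 175, 160, 165, 165, 170, 155, 165, 165, 185]

freq = Counter(weights)   # bar height for each distinct weight
cutoff = 174.2            # mean + 1 standard deviation, as given in the answer above

below = sum(count for value, count in freq.items() if value < cutoff)
print(sorted(freq.items()))
print(below, "of", len(weights), "->", round(below / len(weights), 2))
```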


INTEGRATE MATHEMATICAL PRACTICES
Focus on Math Connections
MP.1 Emphasize that the advantage of using a box plot instead of a line plot is that it is easier to see the range, median, and other features of the data. The disadvantage of a box plot is that the values of the individual data cannot be seen.


[Line plot for Exercise 4: Resting Heart Rate, number line from 55 to 90 in steps of 5]

Image Credits: ©Nathan Marx/iStockPhoto.com

4. The line plot shows a random sample of resting heart rates (in beats per minute) for 24 adults.

a. Find the mean, median, standard deviation, and IQR of the heart rates.

b. By examining the line plot, determine the percent of the data that are within 1 standard deviation of the mean. Explain your reasoning.

c. Suppose one of the heart rates is chosen at random. By examining the line plot, determine the probability that the heart rate is more than 1 standard deviation below the mean. Explain your reasoning.

5. The list gives the prices (in thousands of dollars) of a random sample of houses for sale in a large town.

175, 400, 325, 350, 500, 375, 350, 375, 400, 375, 250, 400, 200, 375, 400, 400, 375, 325, 400, 350

a. Find the mean, median, standard deviation, and IQR of the house prices. How do these statistics tell you that the distribution is not symmetric?

b. Use a graphing calculator to make a box plot of the house prices. How does the box plot show that this skewed distribution is skewed left?

c. Suppose one of the house prices is chosen at random. Based on the box plot and not the original set of data, what can you say is the approximate probability that the price falls between the first and the third quartiles? Explain your reasoning.

x̄ ≈ 72, Median = 70
s ≈ 8.4, IQR = Q3 - Q1 = 12.5

The interval for data that are within 1 standard deviation of the mean is 63.6 < x < 80.4. This interval includes heart rates of 65, 70, 75, and 80, which have frequencies of 4, 6, 5, and 3, respectively. So, 18 data values out of 24, or 75% of the data, are in the interval.

The interval for data that are greater than the value 1 standard deviation below the mean is x > 63.6. This interval includes heart rates of 65, 70, 75, 80, 85, and 90, which have frequencies of 4, 6, 5, 3, 2, and 1, respectively. So, the probability that a heart rate is in the interval is 21/24 = 0.875 or 87.5%.

x̄ = 355, Median = 375
s ≈ 72.28, IQR = Q3 - Q1 = 62.5
The mean and median are significantly different, which indicates that the distribution is not symmetric.

The part of the box to the left of the median is slightly wider than the part to the right, and the “whisker” on the left is longer than the one on the right, so the distribution is skewed left.

The probability is about 50%, or 0.50, because Q1, the median, and Q3 divide the data into four almost-equal parts and there are two parts that each represent about 25% of the data between the first and third quartiles.


INTEGRATE MATHEMATICAL PRACTICES
Focus on Technology
MP.5 Remind students that the TRACE feature of a graphing calculator can be used to find the quartiles for a data set graphed as a box plot.


[Line plot for Exercise 6: Time Spent With Customer, number line from 0 to 7]

Image Credits: ©wavebreakmedia/Shutterstock

6. The line plot shows a random sample of the amounts of time (in minutes) that an employee at a call center spent on the phone with customers.

a. Do you expect the mean to be equal to, less than, or greater than the median? Explain.

b. Find the mean, median, standard deviation, and IQR of the time data. Do these statistics agree with your answer for part a?

c. Use a graphing calculator to make a box plot of the time data. How does the box plot show that the distribution is skewed right?

d. Suppose one of the times spent with a customer is chosen at random. Based on the box plot and not the original set of data, what can you say is the approximate probability that the time is greater than the third quartile? Explain your reasoning.

7. Classify each description as applying to a normal distribution or a skewed distribution.
A. Histogram is mound-shaped with two symmetric tails.   Normal   Skewed
B. Mean and median are equal or almost equal.   Normal   Skewed
C. Box plot has one “whisker” longer than the other.   Normal   Skewed
D. Histogram is mounded with one tail longer than the other.   Normal   Skewed
E. Box plot is symmetric with respect to the median.   Normal   Skewed
F. Mean and median are significantly different.   Normal   Skewed

The mean should be greater than the median because the distribution is right-skewed and the mean will be pulled toward the right tail.

x̄ = 2.875, Median = 2.5
s ≈ 1.6, IQR = Q3 - Q1 = 2

Yes, the mean is greater than the median.

The part of the box to the right of the median is wider than the part to the left, and the “whisker” on the right is longer than the one on the left, so the distribution is skewed right.

The probability is about 25%, or 0.25, because Q1, the median, and Q3 divide the data into four almost-equal parts, and one of those parts, representing about 25% of the data, lies above the third quartile.


Image Credits: ©Moodboard/Corbis

H.O.T. Focus on Higher Order Thinking

8. Explain the Error A student was given the following data and asked to determine the percent of the data that fall within 1 standard deviation of the mean.

20, 21, 21, 22, 22, 22, 22, 23, 23, 23, 23, 24, 24, 24, 24, 24, 25, 25, 25, 26, 26, 26, 27, 27, 28

The student gave this answer: “The interval for data that are within 1 standard deviation of the mean is 24 - 3.5 < x < 24 + 3.5, or 21.5 < x < 27.5. The bars for x-values that satisfy 21.5 < x < 27.5 have heights of 4, 4, 5, 3, 3, and 2, so 21 data values out of 25, or about 84% of the data, are in the interval.” Find and correct the student’s error.

9. Analyze Relationships The list gives the number of siblings that a child has from a random sample of 10 children at a daycare center.

5, 2, 3, 1, 0, 2, 3, 1, 2, 1

a. Use a graphing calculator to create a box plot of the data. Does the box plot indicate that the distribution is normal or skewed? Explain.

b. Find the mean, median, standard deviation, and IQR of the sibling data. What is the relationship between the mean and median?

c. Suppose that an 11th child at the daycare is included in the random sample, and that child has 1 sibling. How does the box plot change? How does the relationship between the mean and median change?

d. Suppose that a 12th child at the daycare is included in the random sample, and that child also has 1 sibling. How does the box plot change? How does the relationship between the mean and median change?

The median of the data is 24, and the IQR is 3.5, so the student used the median and IQR, rather than the mean and standard deviation, to define the interval for data that are within 1 standard deviation of the mean. Since the mean of the data is 23.88 and the standard deviation is about 2.03, the student should have used the interval 23.88 - 2.03 < x < 23.88 + 2.03, or 21.85 < x < 25.91. The bars for x-values that satisfy 21.85 < x < 25.91 have heights of 4, 4, 5, and 3, so 16 data values out of 25, or about 64% of the data, are in the interval.

Although the two parts of the box in the box plot have the same width, the box plot shows that the distribution is skewed right because the right “whisker” is longer than the left one.

x̄ = 2, Median = 2
s ≈ 1.3, IQR = Q3 - Q1 = 2
The mean and median are equal.

The box plot does not change.
x̄ ≈ 1.9, Median = 2
s ≈ 1.3, IQR = Q3 - Q1 = 2
The mean is less than the median.

The part of the box to the right of the median is now wider than the part to the left, and the right “whisker” is now even longer than the left one.
x̄ ≈ 1.8, Median = 1.5
s ≈ 1.3, IQR = Q3 - Q1 = 1.5
The mean is greater than the median.


AVOID COMMON ERRORS Students may decide to divide the number of data by 4 to get the lower quartile and upper quartile. Reinforce that these quartiles are defined as the medians of the upper and lower halves of the data as determined by the median.


e. What is the general rule about the relationship between the mean and median when a distribution is skewed right? What has your investigation of the sibling data demonstrated about this rule?

10. Draw Conclusions Recall that a graphing calculator may give two versions of the standard deviation. The population standard deviation, which you can also use for the “uncorrected” sample standard deviation, is

$$\sigma_x = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}.$$

The “corrected” sample standard deviation is

$$s_x = \sqrt{\frac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}.$$

Write and simplify the ratio $\frac{\sigma_x}{s_x}$. Then determine what this ratio approaches as n increases without bound. What does this result mean in terms of finding standard deviations of samples?

Lesson Performance Task

The table gives data about a random sample of 16 cats brought to a veterinarian’s office during one week.

a. Find the mean, median, standard deviation, and IQR of the weight data. Do the same for the age data.

b. Use a graphing calculator to make a histogram of the weight data and a separate histogram of the age data. Based on the shape of each distribution, identify what type of distribution it is. Explain your reasoning.

Sex      Weight (pounds)   Age (years)
Male     12                11
Female   9                 2
Female   8                 12
Male     10                15
Female   10                10
Male     11                10
Male     10                11
Male     11                7
Female   9                 5
Male     12                8
Female   7                 13
Male     11                11
Female   10                13
Male     13                9
Female   8                 12
Female   9                 16

$$
\frac{\sigma_x}{s_x}
= \frac{\sqrt{\dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}}{\sqrt{\dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}}
= \sqrt{\frac{\dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n}}{\dfrac{(x_1 - \bar{x})^2 + (x_2 - \bar{x})^2 + \cdots + (x_n - \bar{x})^2}{n - 1}}}
= \sqrt{\frac{n - 1}{n}}
$$

As n increases without bound, $\frac{n - 1}{n}$ approaches 1, so $\frac{\sigma_x}{s_x}$ approaches $\sqrt{1}$, or 1. This means that as the sample size increases, the two forms of the standard deviation get closer to being equal.
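The simplified ratio can also be checked numerically. The sketch below compares the ratio of Python's `pstdev` and `stdev` for the sibling data of Exercise 9 with the closed form, then tabulates the closed form for a few sample sizes chosen only for illustration.

```python
from math import isclose, sqrt
from statistics import pstdev, stdev

siblings = [5, 2, 3, 1, 0, 2, 3, 1, 2, 1]
n = len(siblings)

ratio = pstdev(siblings) / stdev(siblings)
print(isclose(ratio, sqrt((n - 1) / n)))   # the two agree up to rounding

# The closed form climbs toward 1 as the sample size grows.
for size in (2, 10, 100, 10_000):
    print(size, sqrt((size - 1) / size))
```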

The general rule is that when a distribution is skewed right, the mean is greater than the median because the mean is increased by the data in the tail on the right. The investigation of the sibling data shows that this rule doesn’t always apply.
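The way the relationship shifts as children are added in Exercise 9 can be traced in a few lines. This is a sketch only; the `results` list and its tuple layout (sample size, mean rounded to 2 places, median) are choices made here for display.

```python
from statistics import mean, median

siblings = [5, 2, 3, 1, 0, 2, 3, 1, 2, 1]

results = []
for extra in range(3):                    # 0, 1, then 2 added children with 1 sibling each
    sample = siblings + [1] * extra
    results.append((len(sample), round(mean(sample), 2), median(sample)))

for size, avg, med in results:
    print(size, avg, med)
```

In all three samples the right whisker of the box plot is the longer one, yet the mean is equal to, then less than, then greater than the median.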


PEER-TO-PEER DISCUSSION Group students in pairs. Ask one student in each pair to give values for the quartiles, range, and interquartile range. Then have the other student determine a set of data that fits these values and graph the data as a box plot. As an extension, have the students write a problem that fits the data.

JOURNAL Have students write a journal entry in which they draw a normal distribution and a skewed distribution. Next to the drawings, have students list the measures of center and spread that are used for each type of distribution and explain why.


c. By examining the histogram of the weight distribution, determine the percent of the data that fall within 1 standard deviation of the mean. Explain your reasoning.

d. For the age data, Q1 = 8.5 and Q3 = 12.5. By examining the histogram of the age distribution, find the probability that the age of a randomly chosen cat falls between Q1 and Q3. Why does this make sense?

e. Investigate whether being male or female has an impact on a cat’s weight and age. Do so by calculating the mean weight and age of female cats and the mean weight and age of male cats. For which variable, weight or age, does being male or female have a greater impact? How much of an impact is there?

a. Weight data:
x̄ = 10, Median = 10
s ≈ 1.6, IQR = Q3 - Q1 = 2

Age data:
x̄ ≈ 10.3, Median = 11
s ≈ 3.5, IQR = Q3 - Q1 = 4

b. The weight distribution has a central mound and symmetric tails, so it is a normal distribution. The age distribution has a central mound and a left tail that is longer than the right tail, so it is a skewed-left distribution.

c. The interval for weight data that are within 1 standard deviation of the mean is 8.4 < x < 11.6. The bars for x-values that satisfy 8.4 < x < 11.6 have heights of 3, 4, and 3, so 10 data values out of 16, or 62.5% of the weight data, are in the interval.

d. The bars for x-values that satisfy 8.5 < x < 12.5 have heights of 1, 2, 3, and 2, so the probability that a randomly chosen cat is between 8.5 and 12.5 years old is 8/16 = 0.5 or 50%. This makes sense because Q1, the median, and Q3 divide the data into four almost-equal parts and there are two parts that each represent about 25% of the data between the first and third quartiles.

e. For female cats, the mean weight is 8.75 pounds, and the mean age is about 10.4 years. For male cats, the mean weight is 11.25 pounds, and the mean age is about 10.3 years. Being male or female appears to have an impact on weight but not on age. On average, male cats weigh 2.5 pounds more than female cats.
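The group means in part e can be checked with a short sketch. The tuple layout, the constants `WEIGHT` and `AGE`, and the helper `group_mean` are hypothetical names introduced here; the rows are transcribed from the table in this task.

```python
from statistics import mean

# (sex, weight in pounds, age in years), transcribed from the table in this task
cats = [
    ("Male", 12, 11), ("Female", 9, 2), ("Female", 8, 12), ("Male", 10, 15),
    ("Female", 10, 10), ("Male", 11, 10), ("Male", 10, 11), ("Male", 11, 7),
    ("Female", 9, 5), ("Male", 12, 8), ("Female", 7, 13), ("Male", 11, 11),
    ("Female", 10, 13), ("Male", 13, 9), ("Female", 8, 12), ("Female", 9, 16),
]

WEIGHT, AGE = 1, 2  # positions of the two numeric columns in each tuple

def group_mean(sex, column):
    """Mean of one numeric column for the cats of the given sex."""
    return mean(row[column] for row in cats if row[0] == sex)

weight_gap = group_mean("Male", WEIGHT) - group_mean("Female", WEIGHT)
age_gap = group_mean("Male", AGE) - group_mean("Female", AGE)
print(weight_gap, age_gap)
```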


EXTENSION ACTIVITY

Have students compare the formulas for the sample standard deviation, s, and the population standard deviation, σ. They should calculate both for very small samples and for very large samples. Then have students research the reasons for dividing one summation by n and the other by n – 1.

Note that dividing by n – 1 makes the sample variance, s², an unbiased estimate of the variance of the population; the idea is based on the concept of degrees of freedom. One degree of freedom is lost because the deviations are measured from the sample mean rather than from the unknown population mean, so they are constrained to sum to 0. Dividing by n – 1 compensates for the resulting tendency to underestimate the spread of the population.

QUESTIONING STRATEGIES Why might you be interested only in the sample standard deviation and not in the population standard deviation? Possible answer: You might be interested only in the subset of the population because the rest of the population will not be affected by any decisions made from the data.

Why do we square the differences between data values and the mean? Why not just add the differences from the mean? The sum of the differences is 0, so the mean difference will always be 0. This provides no insight with respect to the data.
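A quick check with the height data from Exercises 2–3 illustrates the point. This is a minimal sketch; the heights are chosen because their mean is a whole number, so the arithmetic stays exact.

```python
from statistics import mean

heights = [70, 69, 72, 70, 68, 71, 70, 69, 71, 70, 67, 69, 71, 73]
center = mean(heights)

deviations = [x - center for x in heights]
print(sum(deviations))                   # positive and negative deviations cancel
print(sum(d * d for d in deviations))    # squared deviations cannot cancel
```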

INTEGRATE MATHEMATICAL PRACTICES
Focus on Technology
MP.5 In calculating the distribution of data, what does each of the six symbols on the calculator screen represent? x̄ represents the mean. ∑x represents the sum of the data. ∑x² represents the sum of squares of the data. Sx represents the sample standard deviation. σx represents the population standard deviation. n represents the total number of items entered in the list of data.
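The six quantities can be reproduced by hand in a few lines, which may help students connect the calculator screen to the formulas. This is only a sketch: the function name `one_var_stats` and the dictionary keys are hypothetical labels, not calculator syntax, and the data are the cat weights from the Lesson Performance Task.

```python
from math import sqrt

def one_var_stats(values):
    """Compute the six one-variable statistics listed above."""
    n = len(values)
    sum_x = sum(values)
    sum_x2 = sum(x * x for x in values)
    x_bar = sum_x / n
    ss = sum_x2 - n * x_bar ** 2          # sum of squared deviations from the mean
    return {
        "x_bar": x_bar,
        "sum_x": sum_x,
        "sum_x2": sum_x2,
        "Sx": sqrt(ss / (n - 1)),         # sample standard deviation
        "sigma_x": sqrt(ss / n),          # population standard deviation
        "n": n,
    }

cat_weights = [12, 9, 8, 10, 10, 11, 10, 11, 9, 12, 7, 11, 10, 13, 8, 9]
stats = one_var_stats(cat_weights)
for name, value in stats.items():
    print(name, value)
```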

Scoring Rubric
2 points: Student correctly solves the problem and explains his/her reasoning.
1 point: Student shows good understanding of the problem but does not fully solve or explain his/her reasoning.
0 points: Student does not demonstrate understanding of the problem.
