NCTM HSSI 2016: MP’S THROUGH A STATISTICAL LENS, PART 2: SIMULATIONS

Session 2: Experiments and Simulation as a way to practice statistical thinking

Goals for this session:

• To better understand the role of random assignment in an experiment.
• To better understand the role of simulation in making inferences about the effectiveness of imposing treatments on subjects.
• To better understand the value of non-traditional activities in helping students practice statistical thinking.

Part 1: Planning for our experiment

Question: Can we reduce somebody’s ABILITY to make shots by distracting them?

In a few minutes we’ll conduct an experiment, and our subject will take 40 shots. We’ll flip a coin to determine which treatment we give during each shot (distract the subject or do not distract). About half will be taken with distractions, half without.

1. Imagine that our subject made 28 of 40 shots during the experiment.

a) Suppose that our results show no evidence that distractions hurt her ability to make shots. How many shots would you expect her to make when she is distracted? When she is not distracted?

b) Fill out the rest of the table.

c) For this situation, what would the difference in proportion of successes be? To find this, compute (prop. made when not distracted – prop. made when distracted).

| | Not Distracted | Distracted | Total |
|---|---|---|---|
| Made | | | 28 |
| Missed | | | 12 |
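As a rough check on question 1, the sketch below computes the counts we would expect if distraction made no difference. It assumes exactly 20 shots under each treatment (the split shown in the later tables of this handout); the function name `expected_table` is made up for illustration.

```python
from fractions import Fraction

def expected_table(total_made, total_shots, n_not_distracted, n_distracted):
    """Expected (makes, misses) per treatment if distraction has no effect.

    With no effect, the overall success rate applies to both treatments.
    """
    rate = Fraction(total_made, total_shots)
    made_not = rate * n_not_distracted
    made_dis = rate * n_distracted
    return {
        "not_distracted": (made_not, n_not_distracted - made_not),
        "distracted": (made_dis, n_distracted - made_dis),
    }

# Assumption: 20 shots under each treatment, 28 makes out of 40 overall.
table = expected_table(28, 40, 20, 20)
made_not, _ = table["not_distracted"]
made_dis, _ = table["distracted"]
difference = made_not / 20 - made_dis / 20
print(table, float(difference))
```

Using `Fraction` keeps the arithmetic exact, so the difference in proportions comes out as exactly zero rather than a rounding artifact.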


2. a) Now, suppose that our results show strong evidence that distractions hurt her ability to make shots. Complete the rest of the table below to show a possible example of this case.

| | Not Distracted | Distracted | Total |
|---|---|---|---|
| Made | | | 28 |
| Missed | | | 12 |
| Total | 20 | 20 | 40 |

b) For this situation, what would the difference in proportion of successes be? To find this, compute (prop. made when not distracted – prop. made when distracted).

3. a) Now, suppose that our results show strong evidence that distractions hurt her ability to make shots. Complete the rest of the table below to show a possible example of this case.

| | Not Distracted | Distracted | Total |
|---|---|---|---|
| Made | | | 28 |
| Missed | | | 12 |
| Total | 20 | 20 | 40 |

b) For this situation, what would the difference in proportion of successes be? To find this, compute (prop. made when not distracted – prop. made when distracted).

4. Think about it/discuss: In a set of 40 shots, how large a difference in success rates would provide you with convincing evidence that distraction hurts our subject’s ability to make shots?


Distraction Experiment: Waste Paper Basketball

1. Select one person to be our test subject. This person will be shooting crumpled balls of paper at a small wastepaper basket.
2. Select a location for the basket and a location for the shooter. Place the locations so there’s a reasonable chance of both making and missing baskets (don’t make it too far away or too close). Give the shooter a few minutes to figure out a good distance.
3. Before each shot, flip a fair coin: heads = player shoots WITH distractions; tails = player shoots WITHOUT distractions.
4. Ground rules for distracting the shooter:
   4.1. Distractors should stand in front of shooter, behind the basket.
   4.2. Distractors should not have the potential to touch the shooter or block the path of any shots.
   4.3. Distractors may shout, use noisemakers (no air horns), wave hands, or do anything you might see at an actual basketball game (within reason).
   4.4. Please refrain from anything disparaging, disrespectful, or unprofessional when distracting the shooter.
5. Ground rules for NOT distracting the shooter:
   5.1. Spectators do not make noise.
   5.2. Spectators should stand in front of shooter, behind the basket, but make no noise and create no visual distraction.
   5.3. Spectators take no action that may be perceived as an attempted distraction.
   5.4. The shooter should not be potentially distracted by any actions from the spectators.
6. Select one person to record data. For each shot you will record two things:
   • Did you distract the shooter? (y or n)
   • Did the shooter make the shot? (y or n)
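For groups that want to plan the coin flips in step 3 ahead of time, here is a minimal sketch that generates a treatment for each of 40 shots. The function name `assign_treatments` and the seed value are illustrative assumptions, not part of the original activity.

```python
import random

def assign_treatments(n_shots=40, seed=None):
    """Flip a fair coin before each shot: heads -> distract, tails -> do not."""
    rng = random.Random(seed)
    return ["distract" if rng.random() < 0.5 else "no distraction"
            for _ in range(n_shots)]

plan = assign_treatments(40, seed=2016)
n_distract = plan.count("distract")
print(f"{n_distract} shots with distraction, {40 - n_distract} without")
```

Because each shot gets its own coin flip, the two treatments usually end up with about half the shots each, but the split need not be exactly 20 and 20.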


Data Recording Sheet

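| Trial | Distracted? (Y or N) | Made the Attempt? (Y or N) | Trial | Distracted? (Y or N) | Made the Attempt? (Y or N) |
|---|---|---|---|---|---|
| 1 | | | 21 | | |
| 2 | | | 22 | | |
| 3 | | | 23 | | |
| 4 | | | 24 | | |
| 5 | | | 25 | | |
| 6 | | | 26 | | |
| 7 | | | 27 | | |
| 8 | | | 28 | | |
| 9 | | | 29 | | |
| 10 | | | 30 | | |
| 11 | | | 31 | | |
| 12 | | | 32 | | |
| 13 | | | 33 | | |
| 14 | | | 34 | | |
| 15 | | | 35 | | |
| 16 | | | 36 | | |
| 17 | | | 37 | | |
| 18 | | | 38 | | |
| 19 | | | 39 | | |
| 20 | | | 40 | | |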


Summarizing the Results:

| | With Distractions | Without Distractions | Total |
|---|---|---|---|
| Make | | | |
| Miss | | | |
| Total | | | 40 |

Compute: % made without distractions:
Compute: % made with distractions:
Compute the difference in success rates (without – with):

Discuss/Speculate: Do you think the results of this experiment provide convincing evidence (beyond a reasonable doubt) that our distractions impact our shooter’s ABILITY to make free throws? Why (not)?

Do you think that chance variation alone could plausibly be responsible for the difference in success rates for our test subject? Why (not)?

In order to get a better understanding of how much impact “pure chance” has on the results of our experiment, we are going to run a simulation of our experiment, where we simulate the assumption that the long-term chance of making a free throw does not change when we distract a shooter.
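The summary computations in this section can also be done from the recording sheet with a few lines of code. The sketch below is only an illustration: the function name `summarize` is an assumption, and the ten records are hypothetical data, not results from a real session.

```python
def summarize(records):
    """records: list of (distracted, made) pairs, each 'y' or 'n'.

    Returns (prop. made without distraction, prop. made with distraction,
    difference = without - with).
    """
    with_d = [made for distracted, made in records if distracted == "y"]
    without_d = [made for distracted, made in records if distracted == "n"]
    p_with = with_d.count("y") / len(with_d)
    p_without = without_d.count("y") / len(without_d)
    return p_without, p_with, p_without - p_with

# Hypothetical data for illustration only:
# 5 distracted shots (2 made) and 5 undistracted shots (4 made).
example = [("y", "y"), ("y", "n"), ("y", "n"), ("y", "y"), ("y", "n"),
           ("n", "y"), ("n", "y"), ("n", "n"), ("n", "y"), ("n", "y")]
p_without, p_with, diff = summarize(example)
print(f"without: {p_without:.0%}, with: {p_with:.0%}, difference: {diff:+.0%}")
```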


Directions for simulating 40 free throws:

1. Each group should have a deck of cards (maybe index cards), a pencil, and a calculator/computer.

2. Your deck of 40 cards represents every attempt taken by our test subject. We want to simulate what might happen in a sample of 40 shots if distractions DON’T affect a person’s ability to make their shots.

3. You will now assign a meaning to each index card. Each card will represent a MADE shot (marked on one side with a check mark) or a MISSED shot (left blank).

   Total number of Makes:
   Total number of Misses:

   Mark your 40 cards now.

4. In order to simulate the idea that “Distraction doesn’t change ABILITY to make shots,” we will shuffle the cards. We will not look at the check marks while shuffling. Shuffle well.

5. How many shots were taken without distraction? Deal out that many cards; these represent the shots taken without distraction. The remaining cards represent the shots taken with distraction.

6. In each pile, record the proportion of attempts that were successful (“makes”).

7. After each simulation, compute the difference in proportions between the two piles:

   (prop. made WITHOUT distraction) – (prop. made WITH distraction)

8. Record this difference and place it on the group’s dot plot of simulated differences.

9. Repeat steps 4-8 until the group has a sufficient number of simulations.
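The card-shuffling steps above can also be sketched in code once the hand simulation is understood. This is a minimal sketch, not the session's official tool: the function name `simulate_differences` is hypothetical, and the example numbers (28 makes, 12 misses, 20 shots without distraction) are assumptions borrowed from the planning questions, not real results.

```python
import random

def simulate_differences(n_makes, n_misses, n_without, trials=1000, seed=None):
    """Card-shuffling simulation under 'distraction does not change ABILITY'.

    Returns a list of simulated differences:
    (prop. made WITHOUT distraction) - (prop. made WITH distraction).
    """
    rng = random.Random(seed)
    deck = [1] * n_makes + [0] * n_misses   # 1 = check mark (make), 0 = blank (miss)
    n_with = len(deck) - n_without
    differences = []
    for _ in range(trials):
        rng.shuffle(deck)                   # step 4: shuffle well
        without_pile = deck[:n_without]     # step 5: deal the 'without' pile
        with_pile = deck[n_without:]        # the remaining cards are the 'with' pile
        diff = sum(without_pile) / n_without - sum(with_pile) / n_with  # steps 6-7
        differences.append(diff)            # step 8: record the difference
    return differences

# Assumed numbers for illustration: 28 makes, 12 misses, 20 shots without distraction.
diffs = simulate_differences(28, 12, n_without=20, trials=1000, seed=1)
print(min(diffs), max(diffs), sum(diffs) / len(diffs))
```

Shuffling the same deck every time keeps the total number of makes fixed, so only the split between the two piles varies from one simulated experiment to the next.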


Analyzing the results of our simulation:

1. Our dot plot of simulated “difference in proportions” is centered around which value?

2. What about the design of our simulation would allow us to predict where the simulation results were centered?

3. What values for “difference in proportion” would you call…

a. “typical?”

b. “reasonable to expect once in a while?”

c. “very unlikely to occur?”

4. Consider the difference in proportions that occurred in our actual experiment. Use the dot plot to determine whether we have convincing evidence that distractions impacted our subject’s ability to make free throws.


Reflection on our work:

1. What role does the simulation play in helping us determine the strength of our evidence that distractions impact a shooter’s ability?

2. Why is it important to assign the treatment for each shot with a coin flip? In other words, why can’t we simply take 20 shots with a distraction and 20 shots without a distraction?

3. How might we re-design this experiment for future subjects to improve our ability to detect any possible impact of distraction on a shooter?

TEACHER HATS BACK ON!

4. Take a look at the “Mathematical Practices through a Statistical Lens” on the following pages. Identify moments in this activity where your team were demonstrating one of the Mathematical / Statistical practices.


Resources for this session: located at http://roughlynormal.com

My slides
A pdf of this handout
Lessons, some with teacher notes:

• A year’s work of inquiry from day one
• Graphs from Questions, Questions from Graphs
• … a few probability lessons involving two way tables
• … plus more examples of lessons I’ve done using simulations.

Frameworks for Statistical Thinking:
ASA: GAISE Report: http://www.amstat.org/education/gaise/
ASA: The Statistical Education of Teachers: http://www.amstat.org/education/SET/SET.pdf
Georgia Performance Standards in Mathematics, Statistical Reasoning. Contains Mathematical Practices Through a Statistical Lens:

My Information:
William Thill
Department of Mathematics
Harvard-Westlake
3700 Coldwater Canyon Avenue
Studio City, CA 91604
email: [email protected]
email: [email protected]
http://roughlynormal.com
Twitter: @roughlynormal


Unit 1: Getting Started, “Chokes and Clutches”

Statistics Goals for this unit:

• Understand the difference between categorical and quantitative variables.
• Understand and explain the difference between ABILITY and PERFORMANCE in context.
• Create and interpret a simulation to assess whether a PERFORMANCE gives convincing evidence that athletic ABILITY has changed.
• Use new skills to answer questions about whether “choking” or “clutch” performances give evidence of a change in athletic ability in different contexts.

Research Project Goals for this unit:

• Learn about at least two or three useful databases for potential sports research topics.
• Begin the process of collecting and organizing databases, articles, research papers, and Internet resources for the project.
• Learn to summarize the essential information in a research article.

| Day | Topic In Class | Due Next Time |
|---|---|---|
| 1 | Intro, pp 1-3 | HW: 1-6 |
| 2 | Tasks 2-3, pp 4-8 | HW: thru #18 |
| 3 | Working Day: Discovering Sites | Complete “Show/Tell” |
| 4 | Explore resources, Work Day | HW: thru #27 |
| 5 | Project Investigation | HW: thru #36 |
| 6 | Project Work Day | HW: thru #40 |
| 7 | Project Work Day | Investigation 1 Complete |

| Task | How Graded | Point Value |
|---|---|---|
| Homework | * Your two HW were submitted to the Google Doc by the due date, with acceptable effort + revisions (with commentary about mistakes). * You were able to show the instructor acceptable work on any HW checks (done at teacher’s discretion, notice not required). See the class CH 1 Google Doc for HW information. | 4 pts |
| In-class work / synopses | Clear evidence in packet that you completed activities thoughtfully, wrote down answers, engaged in the discussion of the task / problem, and responded to discussion questions in the HW HUB when asked to. | 4 pts |
| Show and Tell | You gave citation of two potentially valuable resources for people to look through, consider for topics, or to access data. | 4 pts |
| Investigation | Investigation: Choke and Clutch. You performed an investigation according to the guidelines. Your work was assessed based on the rubric given to you. | 10 pts |


HW Problem Sign-up

| Problem # | Topic | Work posted on Google Doc by | Due Date |
|---|---|---|---|
| 2 | Numerical or Categorical? | MUS | |
| 3 | Bar Graphs, A-Rod, NYY | MUS | |
| 4 | Bar Graphs, Patrick Roy, Hockey | JB | |
| 5 | Bar Graphs, Candace Parker, LA Sparks | JT | |
| 6 | Bar Graphs, Ben Roethlisberger, Steelers | AR | |
| 7 | Describe when Performance > Ability. | JL | |
| 8 | Describe when Performance < Ability. | RT | |
| 11 and 13* | Peyton Manning, simulation | CG | |
| 12 and 14* | Annika Sorenstam, simulation | OCS | |
| 15 | Free Throws and sample size | AB | |
| 17 and 18* | Tim Wakefield's knuckleballs | JB | |
| 21* | Manny Ramirez, testing for "clutch" | MUS | |
| 22* | Eun-Hee Ji, testing for "clutch" performance | OCS | |
| 23* | Miguel Cabrera post DUI: choker? | JT | |
| 25* | Volleyball players choking? | AR | |
| 27* | Tennis Serves clutch? | CG | |
| 29 | Improve this graph: A-Rod | JT | |
| 33 | Categorical/Numerical? | JL | |
| 35 | Did Swede Risberg throw the 1919 World Series? | OCS | |
| 36* | Does Romo choke in December? | JL | |
| 39* | Kissing Study: Doing it the Right Way. | AB | |
| 40* | Outliers: Hockey Birthdays | RT | |

Each student must pick two problems, one of which is starred. All work must be posted in the Google Doc. I don’t care what format you use (typing, picture upload, etc.), but keep problems in order.


Task 1: Looking at LeBron James’ Performance, 2008 NBA Playoffs

Background: Before the 2013 NBA Finals, some folks believed that LeBron James did not perform as well in the playoffs as he did in the regular season. In other words, they believed his ABILITY to play well went down when he got into a playoff situation. Let’s focus on a single measurement to compare his PERFORMANCE.

1. Go to http://www.basketball-reference.com to find the following information for LeBron James:

| | 3-PTS made | 3-PTS missed |
|---|---|---|
| 2007-2008 | | |
| 2008 Playoffs | | |

How did the performances compare?

2. These bar charts were created to help us compare the PERFORMANCE of LeBron James in the 2008 playoffs to his PERFORMANCE in the 2007-2008 regular season. List some specific problems with using each graph to make comparisons.

3. Create a more appropriate bar chart to make a good comparison. You may want to refer to Appendix B: Making Graphs in Excel at the book’s website: http://bcs.whfreeman.com/sris/#t_730892____


Task 2: Modeling Athletic Performance

ABILITY: How a player or team would do in the long term if given an infinite number of opportunities in the same context.

PERFORMANCE: An observed value from the data that describes how a player or team actually did in a specific context.

CENTRAL IDEA TO ANY ANALYSIS of SPORTS DATA:

PERFORMANCE = ABILITY + RANDOM CHANCE

Activity: HEAD FLIPPING Tryouts!

Every member of the class is trying out for our new Varsity Coin Flipping team. You get ten coin flips, and your goal is to flip “heads” as often as possible. As you flip, record:

b) Discuss. Be sure that you and your classmates reach substantive conclusions for each question below:

1. Record your results with the rest of the class. Based on these results, who should be on the team? Justify.
2. What would happen if we extended tryouts to 100 flips?
3. What does the “coin flipping” activity have to do with the LeBron James example?
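To explore the tryouts without a room full of coins, the sketch below simulates a class in which every student has the same ABILITY (a fair coin). The class size of 25, the seed, and the function name `tryouts` are assumptions made for illustration.

```python
import random

def tryouts(n_students, n_flips, seed=None):
    """Simulate tryouts: every student has the same ABILITY (a fair coin).

    Returns each student's PERFORMANCE as the proportion of heads.
    """
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips
            for _ in range(n_students)]

# Assumption for illustration: a class of 25 students.
for n_flips in (10, 100):
    results = tryouts(25, n_flips, seed=3)
    print(f"{n_flips} flips: best = {max(results):.2f}, worst = {min(results):.2f}")
```

Comparing the best and worst PERFORMANCE for 10 flips and for 100 flips is one way to start the discussion of question 2.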


Task 3: Some questions to check your understanding.

Here are the results of the first ten 3-pointers shot by a simulated NBA player during an unspecified season.

a) Explain the meaning of the dot at the end of this line scatter plot. In other words, what do we know about this player?

b) The statement in part a): Is this a statement about PERFORMANCE or ABILITY?

c) The simulation was created so that the player’s ABILITY was the same throughout. Here’s a line plot of the proportion of 3-point shots made by the same player as the season progressed: after 40 shots and after 500 shots. What conclusions can we make about the player’s ABILITY based on the PERFORMANCES tracked here?


d) Two different schools (School A and School B) had coin-flipping tryouts last week. At each school, 200 students tried out. Which school (A or B) gave each student 30 coin flips to prove their skills at getting “heads?” Which school gave each student 150 coin flips? How can you tell?
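The plots for Task 3 can be explored with a short simulation. The sketch below tracks a simulated player's running proportion of makes and compares how spread out tryout results are for 30 flips versus 150 flips. The ABILITY value of 0.35, the seeds, and both function names are assumptions chosen for illustration; they are not taken from the original plots.

```python
import random
import statistics

def running_proportions(ability, n_shots, checkpoints, seed=None):
    """Simulate shots with a fixed ABILITY; report PERFORMANCE at checkpoints."""
    rng = random.Random(seed)
    makes = 0
    report = {}
    for shot in range(1, n_shots + 1):
        makes += rng.random() < ability      # True counts as 1 make
        if shot in checkpoints:
            report[shot] = makes / shot
    return report

def spread_of_tryouts(n_students, n_flips, seed=None):
    """Standard deviation of the proportion of heads across students."""
    rng = random.Random(seed)
    props = [sum(rng.random() < 0.5 for _ in range(n_flips)) / n_flips
             for _ in range(n_students)]
    return statistics.pstdev(props)

# Assumption: the simulated player's ABILITY is 0.35 (a made-up value).
print(running_proportions(0.35, 500, {10, 40, 500}, seed=7))
# 200 students per school, as in part d).
print(spread_of_tryouts(200, 30, seed=7), spread_of_tryouts(200, 150, seed=7))
```

The second line of output gives one number per school; the school whose results are more tightly clustered around 0.5 is the one that gave each student more flips.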


Exercises:

Task 4: Simulating Random Outcomes.

Let’s think back to the LeBron James example: His ABILITY (long-run success rate) on 3-pointers was somewhere near 31.5% in the regular season, but his PERFORMANCE in the playoffs was only 25.7% (18 made out of 70 shots). There are two possible explanations for LeBron’s poor PERFORMANCE:

Explanation 1: “LeBron didn’t choke.” In other words, LeBron’s ABILITY to shoot three-pointers is the same as it was in the 2007-2008 regular season, and his PERFORMANCE was simply due to bad luck, or RANDOM CHANCE.

Explanation 2: “LeBron Choked!” In other words, LeBron’s ABILITY to shoot three-pointers went down in the 2008 playoffs, and his PERFORMANCE reflects this.

Let’s recap his performance in the 2007-2008 regular season and the 2008 playoffs:

| | 3-pt success rate |
|---|---|
| 2007-2008 | 31.5% = 0.315 (over many shots) |
| 2008 Playoffs | 18 of 70 = 0.257 |

Discuss: Can we rule out explanation #1 beyond a reasonable doubt? Let’s go with our gut. Let’s assume LeBron’s ABILITY to shoot 3-pointers is 0.315.

a) Circle the best prediction of how many 3-pointers LeBron would make out of 70, with an assumed ABILITY to make 31.5% of all 3-pointers in the long run.

b) Circle PERFORMANCES you think you might see by luck.


Creating the Simulation: Let’s use simulation to determine what’s plausible in a set of 70 3-point attempts and a 0.315 ABILITY to make 3-pointers.

Directions for our simulation:

Results of the simulation: Create a dot plot to record results of each simulation.
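One possible way to carry out this simulation by computer is sketched below. The class directions are left for the group to write, so this is only one option: it treats every attempt as an independent shot with a 0.315 chance of going in, and the function name `simulate_makes` and the seed are assumptions.

```python
import random
from collections import Counter

def simulate_makes(ability=0.315, attempts=70, runs=100, seed=None):
    """Each run: shoot `attempts` 3-pointers with a fixed ABILITY; count makes."""
    rng = random.Random(seed)
    return [sum(rng.random() < ability for _ in range(attempts))
            for _ in range(runs)]

results = simulate_makes(seed=2008)
# A text "dot plot": one asterisk per simulated run.
for makes, count in sorted(Counter(results).items()):
    print(f"{makes:2d} | {'*' * count}")
```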


Interpreting the results of a simulation: Below is a dot plot of 100 runs of the simulation.

Recall that LeBron made only 18/70 = 0.257 of his 3-pointers in the playoffs. Do we have convincing evidence that his ABILITY went down (he “choked”)? What evidence in the dot plot justifies your conclusion?

Suppose LeBron made only 15/70 = 21.4% of his 3-pointers during the playoffs. Would this convince you that he choked? Explain.
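As a cross-check on what the dot plot suggests, the chance of a PERFORMANCE this low can also be computed exactly instead of simulated. The sketch below swaps in a binomial calculation, which assumes each of the 70 shots is independent with the same 0.315 ABILITY; the function name `binomial_tail_at_most` is made up for illustration.

```python
from math import comb

def binomial_tail_at_most(k, n, p):
    """P(X <= k) when X counts makes in n independent shots with ABILITY p."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

for makes in (18, 15):
    prob = binomial_tail_at_most(makes, 70, 0.315)
    print(f"P(at most {makes} makes out of 70) = {prob:.3f}")
```

The proportion of dots at or below 18 (or 15) in a simulated dot plot should land reasonably close to these values when the simulation has many runs.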


Graded Tasks for this chapter:

| Task | How Graded | Point Value |
|---|---|---|
| Homework | * Your two HW were submitted to the Google Doc by the due date, with acceptable effort + revisions (with commentary about mistakes). * You were able to show the instructor acceptable work on any HW checks (done at teacher’s discretion, notice not required). See the class CH 1 Google Doc for HW information. | 4 pts |
| In-class work / synopses | Clear evidence in packet that you completed activities thoughtfully, wrote down answers, engaged in the discussion of the task / problem, and responded to discussion questions in the HW HUB when asked to. | 4 pts |
| Show and Tell | You gave citation of two potentially valuable resources for people to look through, consider for topics, or to access data. | 4 pts |
| Investigation | Investigation: Choke and Clutch. You performed an investigation according to the guidelines. Your work was assessed based on the rubric given to you. | 6 pts |


William Thill from Statistical Reasoning in Sports, 1 ed. Unit 2

Unit 2: Comparing Two Proportions: Gaining an Advantage

Statistical Goals:

• State a null hypothesis and an alternate hypothesis when deciding whether a difference in athletic PERFORMANCE gives convincing evidence of a difference in athletic ABILITY.
• Understand the structure and logic of a hypothesis test.
• Use simulation to test a claim about a difference in ABILITY.
• Understand the meaning of a test statistic and a p-value.
• Understand the difference between an observational study and an experiment.
• Understand the components of a randomized experiment.
• Explain why random assignment is needed to make valid cause-effect claims.
• Explain how to control the influence of outside variables in an experiment.
• Explain the effect of changing the number of observations on the strength of statistical evidence.

| Graded Assignment | Worth | Type of work | Due Date |
|---|---|---|---|
| Unit 2 Exercises | 8 pts | Individual | |
| Unit 2 Investigation | 20 pts | Paired Submit | |


Is there a home field advantage in the NFL? Testing Hypotheses

The Super Bowl Champions Baltimore Ravens are known for being a great team… when playing at home. (One such article: http://espn.go.com/blog/afcnorth/post/_/id/58762/ravens-show-why-they-need-home-field-advantage)

Let’s test a theory: Is the Baltimore Ravens’ ABILITY to win games stronger at their home stadium than away? To answer this, go to this URL, and determine the number of wins and losses when the Ravens were at home and when they were on the road. Put the results from both seasons together, and fill out the table below. Use only regular season games: no playoffs or pre-season.

http://en.wikipedia.org/wiki/2012_Baltimore_Ravens_season

| | Home | Road | Total |
|---|---|---|---|
| Wins | | | |
| Losses | | | |
| Total | | | |

Construct a segmented bar chart:

Compute the difference between the Ravens’ home PERFORMANCE and their road PERFORMANCE.


Recapping PERFORMANCE vs. ABILITY:

Describe what we mean by the Ravens’ ABILITY to win at home and on the road.

Describe what we mean by the Ravens’ PERFORMANCE in 2012.

There are two explanations for why the Ravens performed better at home in 2012 than on the road.

Claim 1 (H0, the null hypothesis): The Ravens have the same ABILITY to win at home and on the road.

Claim 2 (HA, the alternate hypothesis): The Ravens have greater ABILITY to win at home than on the road.

To have convincing evidence that claim 2 is correct (what we wish to prove) we must show that it is not plausible for RANDOM CHANCE alone to produce such a large difference in PERFORMANCE.

The null hypothesis, denoted H0, describes an initial belief (assumption) that there has been no change in ABILITY, or no difference in ABILITY in two different contexts.

The alternative hypothesis, denoted HA, describes what we suspect to be true: that there has been a change in ABILITY, or there is a difference in ABILITY in two different contexts.


Testing Hypotheses with simulation

Compute the difference in PERFORMANCE at home and away games for the Baltimore Ravens in 2012:

This value will be our test statistic: A test statistic is a measure calculated from PERFORMANCE data to be used as evidence in a hypothesis test.

Speculate: If there really is no home field advantage for the Ravens (we assume H0 is true: the same ABILITY to win games at home and away), could we get this large of a test statistic by chance?

Let’s not speculate, let’s simulate.

Activity:

1. Get 16 index cards.

2. On 10 cards, write a “W” on one side, and on the remaining 6, write an “L.”
   Question: What do these letters represent?

3. Shuffle the cards thoroughly, and deal out 8 cards in one pile to represent simulated PERFORMANCE at home games. The remaining 8 cards represent the simulated PERFORMANCE at road games.
   Question: Why does this process assume there’s no difference in ABILITY at home or on the road?
   Question: How does this process ensure that every simulated 2012 PERFORMANCE results in a 10-6 season?

4. Record the results of one simulated season. Be sure to compute the value of our test statistic (home winning percentage – road winning percentage).


5. We’re going to simulate many 10-6 seasons, assuming that there was no home field advantage for the Baltimore Ravens in 2012. Repeat steps 3-4 a number of times. As a class we will record results on a dot plot.

Results of simulations (by hand):

Generating many results via computer simulation: Go to one of the following websites to let a computer automate this simulation. The Textbook website: use the applet “Difference in Proportions,” http://bcs.whfreeman.com/sris/#730892__752214__ StatKey: http://lock5stat.com/statkey/randomization_2_cat/randomization_2_cat.html

(both are acceptable for this task, but work a little differently). When you have run 100 trials of the correct simulation record the following: Question: What was the lowest simulated difference in winning percentages? What was the highest simulated difference? What values for the difference were most common? Present the results in an appropriate table or visual display.
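For readers who prefer code to an applet, the index-card procedure can be sketched as follows. This is not how either website works internally; it is a plain shuffle-and-deal under the assumption of a 10-6 season with 8 home games, and the function name `simulate_seasons` and the seed are hypothetical.

```python
import random

def simulate_seasons(wins=10, losses=6, home_games=8, trials=100, seed=None):
    """Shuffle W/L cards and deal a 'home' pile, assuming no home field advantage.

    Returns simulated test statistics:
    (home winning proportion) - (road winning proportion).
    """
    rng = random.Random(seed)
    cards = ["W"] * wins + ["L"] * losses
    road_games = len(cards) - home_games
    stats = []
    for _ in range(trials):
        rng.shuffle(cards)
        home, road = cards[:home_games], cards[home_games:]
        stats.append(home.count("W") / home_games - road.count("W") / road_games)
    return stats

stats = simulate_seasons(trials=100, seed=2012)
print("lowest:", min(stats), "highest:", max(stats))
```

Every simulated season still contains exactly 10 wins and 6 losses; only the way they fall between home and road games changes.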


Here are the results of 100 trials of the simulation. This graph shows the approximate distribution of the test statistic.

What does one dot at “50” represent in this situation?

What assumptions about Baltimore’s ABILITY to win games did we make when we designed this simulation?

In the actual PERFORMANCE data, our test statistic was 75% – 50% = 25%. Circle all simulated values that are at least as high as 25%.

Using our simulated distribution of the test statistic, estimate how likely it is to get a test statistic at least as high as 25% by RANDOM CHANCE if there is no home field advantage. This likelihood is called the p-value of the test.

Based on our p-value, do you think we have convincing evidence that the Ravens have a greater ABILITY to win at home than on the road? Explain.

What kinds of p-values (higher/lower) would have given more convincing evidence that the Ravens have a greater ABILITY to win at home than on the road? Why?

Have we proven that there is no home field advantage for the Ravens? Explain.
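The simulated p-value can be compared with an exact count of the possible deals. The sketch below is a swapped-in technique (counting with combinations rather than simulating), and it assumes the observed record behind the 75% and 50% figures is 6 home wins and 4 road wins in 8 games each; the function name `exact_p_value` is made up for illustration.

```python
from math import comb

def exact_p_value(wins=10, losses=6, home_games=8, observed_home_wins=6):
    """Chance that a random deal puts at least `observed_home_wins` W cards
    in the home pile, when every deal of the 16 cards is equally likely."""
    total_deals = comb(wins + losses, home_games)
    favorable = sum(
        comb(wins, h) * comb(losses, home_games - h)
        for h in range(observed_home_wins, min(wins, home_games) + 1)
    )
    return favorable / total_deals

# 6 home wins out of 8 (75%) and 4 road wins out of 8 (50%) give the 25% difference.
print(round(exact_p_value(), 3))  # → 0.304
```

A p-value estimated from 100 simulated seasons will bounce around this exact value, which is itself a useful point for discussion.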


Suppose we had more data about the Ravens’ PERFORMANCES from more seasons …

Activity: Let’s find more relevant data for the Ravens, and run the hypothesis test again. Here are the steps (see pages 47-48 as a guide):

a. Create a two-way table to organize PERFORMANCE data. Show the number of wins and losses at home and away.

b. Construct a bar graph to compare the Ravens’ PERFORMANCES at home and on the road. Briefly describe what the graph suggests.

c. State the hypotheses we are testing.

d. What is the value of our test statistic?

e. Describe a procedure to simulate the distribution of the test statistic with index cards (we’ll do this by computer).


f. Use the textbook website or StatKey to run 100 trials of the simulation. Sketch your results below. Mark the value of the test statistic found in the actual PERFORMANCE data. Pages 49-51 from our textbook give you some help.

g. Using the distribution of the test statistic, estimate and interpret the p-value.

h. Using the p-value, state an appropriate conclusion. Do we have convincing evidence that the Ravens have better ABILITY to win at home than on the road?

i. Why was the conclusion different here than the first time we ran the analysis?

j. Why do you think a greater ABILITY to win at home exists for the Ravens? Name some possible causes. [1]

k. Do the data provide evidence about which variables are causing this to happen?

[1] Note: For more about what might be causing this difference, go here: http://www.freakonomics.com/2011/12/18/football-freakonomics-how-advantageous-is-home-field-advantage-and-why/


Experiment: Is it harder to shoot free throws with distractions?

We’re going to pick one student to be our “test subject.” Our test subject will undergo two different treatments: shooting free throws with distraction and shooting free throws without distraction. The remaining class members will serve as the “distraction.”

We have to write up a set of specific procedures so that anybody can implement the experiment as we intended. This set of directions is called the design for our experiment.

We want to design this experiment so that:

• Conditions that may affect free throw success are as similar as possible when our subject is shooting with distractions or without distractions.

• Neither treatment gets an advantage during the shooting process.

• We have enough PERFORMANCE data to get convincing evidence of a difference in ABILITY (if there is one).

In your groups: Write down a set of directions on how we will execute this experiment. We will then discuss our ideas and agree on a design for our experiment. Assume you have 45 minutes to get to the gym, execute the experiment, and head to your next class.


Design of our experiment: What are the explanatory and response variables in this experiment?

What are the treatments in our experiment?

What two hypotheses are we testing in this experiment?

Explain why it is important to randomly determine the order in which our subject receives distractions or not.

What other variables are important to keep the same (control) in this experiment? How will you do it?

How many shots should our subject take? Explain how you arrived at your answer.


Data collection: As our subject shoots, tally results in this table.

                 With Distraction    Without Distraction    Total
Made Shots
Missed Shots
Total

Compute the difference in the percentage of shots made under each treatment. Also create a bar graph to show this difference.

Compute our test statistic:

Speculate: Do you think our test subject’s PERFORMANCE during the treatments gives convincing evidence that our subject’s ABILITY to make free throws goes down when being distracted?


Describe a procedure to simulate the distribution of the test statistic with index cards (we’ll do this by computer).

Use the textbook website or StatKey to run 100 trials of the simulation. Sketch your results below. Mark the value of the test statistic found in the actual PERFORMANCE data.

Using the distribution of the test statistic, estimate and interpret the p-value.

Using the p-value, state an appropriate conclusion. Do we have convincing evidence that our subject has weaker ABILITY to make free throws when there are distractions?

What are the possible causes of any difference in ability? Why?

What about with the Baltimore Ravens data? Do we know what is causing the difference in ABILITY? What could be causing the difference? [2]

[2] Note: For more about what might be causing this difference, go here: http://www.freakonomics.com/2011/12/18/football-freakonomics-how-advantageous-is-home-field-advantage-and-why/


Investigation: Look at the Investigation Guide for the directions for your Unit 2 Investigation. Also notice the rubric for how your work will be graded.

Resources for Data: We will be collecting and compiling these as a class. Check the Course HUB site for the information.

Good Examples from our text:

Due Date:

How to turn in my work:

Ground Rules:


Lab Activity: Hiring discrimination

Scenario: An airline has just finished training 25 pilots—15 male and 10 female—to become captains. Unfortunately, only eight captain positions are available right now. Airline managers announce that they will use a lottery to determine which pilots will fill the available positions. The names of all 25 pilots will be written on identical slips of paper, which will be placed in a hat, mixed thoroughly, and drawn out one at a time until all eight captains have been identified.

A day later, managers announce the results of the lottery. Of the 8 captains chosen, 5 are female and 3 are male. Some of the male pilots who weren’t selected suspect that the lottery was not carried out fairly. One of these pilots asks your statistics class for advice about whether to file a grievance with the pilots’ union.

a) Could these results have happened just by chance? Discuss.

b) To find out, you and your classmates will simulate the lottery process that airline managers said they used.

i. Separate the cards into piles by suit. Use ten cards from one suit to represent the female pilots. To represent the 15 male pilots, you’ll need all 13 cards of another suit plus two extra cards from a third suit. A second student or group can use the leftover cards from the deck to set up their simulation in a similar way.

ii. Shuffle your stack of 25 cards thoroughly and deal 8 cards. Count the number of female pilots selected. Record this value in a table like the one shown below.

Return the 8 cards to your stack. Shuffle and deal four more times so that you have a total of five simulated lottery results.

c) Your teacher will draw and label axes for a class dotplot. Each student should plot the number of females obtained in each of the five simulation trials on the graph.

d) Discuss: Does it seem believable that airline managers carried out a fair lottery? What advice would you give the male pilot who contacted you?

e) Discuss: Would your advice change if the lottery had chosen 6 female (and 2 male) pilots? Explain.
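For classes that want more than five trials per student, the card procedure above can be sketched in Python. This is a minimal sketch, not part of the original activity: the function names `simulate_lottery` and `exact_probability_at_least` are hypothetical, and the exact calculation is an added check based on counting equally likely draws of 8 names from 25.

```python
import random
from math import comb


def simulate_lottery(trials, seed=None):
    """Repeat a fair lottery and return the number of females chosen each time."""
    rng = random.Random(seed)
    pilots = ["F"] * 10 + ["M"] * 15      # 10 female and 15 male pilots
    counts = []
    for _ in range(trials):
        chosen = rng.sample(pilots, 8)    # draw 8 names without replacement
        counts.append(chosen.count("F"))
    return counts


def exact_probability_at_least(k):
    """Exact chance that a fair lottery picks at least k females among 8 captains."""
    favorable = sum(comb(10, f) * comb(15, 8 - f) for f in range(k, 9))
    return favorable / comb(25, 8)


counts = simulate_lottery(trials=1000, seed=2016)
simulated = sum(c >= 5 for c in counts) / len(counts)
print("Simulated chance of 5 or more females:", simulated)
print("Exact chance of 5 or more females:", round(exact_probability_at_least(5), 3))
print("Exact chance of 6 or more females:", round(exact_probability_at_least(6), 3))
```

The exact values are about 0.128 for five or more females and about 0.022 for six or more, which gives students a benchmark for parts d) and e): the first result is not especially surprising under a fair lottery, while the second is much harder to attribute to chance alone.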


What we’ll do this year:

Our ability to make inferences from sample data about a population is determined by how the data are produced.

Chapters 1-3 discuss how to find patterns in data and summarize data with appropriate numbers and mathematical models. In this activity we did this with our simple dot plot.

Chapter 4 discusses the two primary methods of data production — sampling and experiments — and the types of conclusions that can be drawn from each.

As the Activity illustrates, the logic of inference rests on asking, “What are the chances?” Probability, the study of chance behavior, is the topic of Chapters 5 through 7.

We’ll introduce the most common inference techniques in Chapters 8 through 12.


Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report

A Pre-K–12 Curriculum Framework

Christine Franklin, University of Georgia

Gary Kader, Appalachian State University

Denise Mewborn, University of Georgia

Jerry Moreno, John Carroll University

Roxy Peck, California Polytechnic State University, San Luis Obispo

Mike Perry, Appalachian State University

Richard Scheaffer, University of Florida

Endorsed by the American Statistical Association, August 2005


Library of Congress Cataloging-in-Publication Data

Guidelines for assessment and instruction in statistics education (GAISE) report: a pre-k–12 curriculum framework / Authors, Christine Franklin … [et al.].
p. cm.
Includes bibliographical references.
ISBN-13: 978-0-9791747-1-1 (pbk.)
ISBN-10: 0-9791747-1-6 (pbk.)
1. Statistics–Study and teaching (Early childhood)–Standards. 2. Statistics–Study and teaching (Elementary)–Standards. 3. Statistics–Study and teaching (Secondary)–Standards. I. Franklin, Christine A

QA276.18.G85 2007
519.5071–dc22
2006103096

Advisors

Susan Friel, The University of North Carolina

Landy Godbold, Westminster Schools

Brad Hartlaub, Kenyon College

Peter Holmes, Nottingham Trent University

Cliff Konold, University of Massachusetts/Amherst

Production Team

Christine Franklin, University of Georgia

Nicholas Horton, Smith College

Gary Kader, Appalachian State University

Jerry Moreno, John Carroll University

Megan Murphy, American Statistical Association

Valerie Snider, American Statistical Association

Daren Starnes, Fountain Valley School of Colorado

Special Thanks

The authors extend a special thank you to the American Statistical Association Board of Directors for funding the writing process of GAISE as a strategic initiative and to the ASA/NCTM Joint Committee for funding the production of the GAISE Framework.

© 2007 by American Statistical Association, Alexandria, VA 22314

Also available online at www.amstat.org/education/gaise. All rights reserved. No part of this book may be reproduced, in any form or by any means, without permission in writing from the publisher.

Printed in the United States of America

10 9 8 7 6 5 4 3 2 1

978-0-9791747-1-1
0-9791747-1-6

Cover photo by Andres Rodriguez

Book design by Valerie Snider


Contents

Introduction

Framework
The Role of Variability in the Problem-Solving Process
Maturing over Levels
The Framework Model
Illustrations
Detailed Descriptions of Each Level

Level A
Example 1: Choosing the Band for the End of the Year Party – Conducting a Survey
Comparing Groups
The Simple Experiment
Example 2: Growing Beans – A Simple Comparative Experiment
Making Use of Available Data
Describing Center and Spread
Looking for an Association
Example 3: Purchasing Sweat Suits – The Role of Height and Arm Span
Understanding Variability
The Role of Probability
Misuses of Statistics
Summary of Level A

Level B
Example 1, Level A Revisited: Choosing a Band for the School Dance
Connecting Two Categorical Variables
Questionnaires and Their Difficulties
Measure of Location – The Mean as a Balance Point
A Measure of Spread – The Mean Absolute Deviation
Representing Data Distributions – The Frequency Table and Histogram
Comparing Distributions – The Boxplot
Measuring the Strength of Association between Two Quantitative Variables
Modeling Linear Association
The Importance of Random Selection
Comparative Experiments
Time Series
Misuses of Statistics
Summary of Level B

Level C
An Introductory Example – Obesity in America
The Investigatory Process at Level C
Example 1: The Sampling Distribution of a Sample Proportion
Example 2: The Sampling Distribution of a Sample Mean
Example 3: A Survey of Music Preferences
Example 4: An Experiment on the Effects of Light on the Growth of Radish Seedlings
Example 5: Estimating the Density of the Earth – A Classical Study
Example 6: Linear Regression Analysis – Height vs. Forearm Length
Example 7: Comparing Mathematics Scores – An Observational Study
Example 8: Observational Study – Toward Establishing Causation
The Role of Probability in Statistics
Summary of Level C

Appendix for Level A

Appendix for Level B

Appendix for Level C

References


Index of Tables and Figures

Table 1: The Framework
Table 2: Frequency Count Table
Table 3: Frequencies and Relative Frequencies
Table 4: Two-Way Frequency Table
Table 5: Grouped Frequency and Grouped Relative Frequency Distributions
Table 6: Hat Size Data
Table 7: Five-Number Summaries for Sodium Content
Table 8: Height and Arm Span Data
Table 9: Five-Number Summaries
Table 10: Live Birth Data
Table 11: Two-Way Frequency Table
Table 12: Lengths of Radish Seedlings
Table 13: Treatment Summary Statistics
Table 14: Heights vs. Forearm Lengths
Table 15: NAEP 2000 Scores in Mathematics
Table 16: Cigarette Smoking and Lung Cancer
Table 17: Level of Cigarette Smoking and Lung Cancer
Table 18: Family Size Distribution
Table 19: 2x2 Two-Way Frequency Table
Table 20: Two-Way Frequency Table
Table 21: Two-Way Frequency Table
Table 22: Two-Way Frequency Table
Table 23: Result of Lifestyle Question
Table 24: Pulse Data
Table 25: Pulse Data in Matched Pairs
Table 26: U.S. Population (in 1,000s)
Table 27: U.S. Death Rates (Deaths per 100,000 of Population)
Table 28: Enrollment Data

Figure 1: Picture Graph of Music Preferences
Figure 2: Bar Graph of Music Preferences
Figure 3: Stem and Leaf Plot of Jumping Distances
Figure 4: Dotplot of Environment vs. Height
Figure 5: Parallel Dotplot of Sodium Content
Figure 6: Scatterplot of Arm Span vs. Height
Figure 7: Timeplot of Temperature vs. Time
Figure 8: Comparative Bar Graph for Music Preferences
Figure 9: Dotplot for Pet Count
Figure 10: Dotplot Showing Pets Evenly Distributed
Figure 11: Dotplot with One Data Point Moved
Figure 12: Dotplot with Two Data Points Moved
Figure 13: Dotplot with Different Data Points Moved
Figure 14: Dotplot Showing Distance from 5
Figure 15: Dotplot Showing Original Data and Distance from 5
Figure 16: Stemplot of Head Circumference
Figure 17: Relative Frequency Histogram
Figure 18: Boxplot for Sodium Content
Figure 19: Scatterplot of Arm Span vs. Height
Figure 20: Scatterplot Showing Means
Figure 21: Eyeball Line
Figure 22: Eighty Circles
Figure 23: Boxplot for Memory Data
Figure 24: Time Series Plot of Live Births
Figure 25: Histogram of Sample Proportions
Figure 26: Histogram of Sample Means
Figure 27: Dotplot of Sample Proportions from a Hypothetical Population in Which 50% Like Rap Music
Figure 28: Dotplot of Sample Proportions from a Hypothetical Population in Which 40% Like Rap Music
Figure 29: Dotplot Showing Simulated Sampling Distribution
Figure 30: Seed Experiment
Figure 31: Boxplot Showing Growth under Different Conditions
Figure 32: Dotplot Showing Differences of Means
Figure 33: Dotplot Showing Differences of Means
Figure 34: Histogram of Earth Density Measurements
Figure 35: Scatterplot and Residual Plot
Figure 36: Random Placement of Names
Figure 37: Names Clustered by Length
Figure 38: Preliminary Dotplot
Figure 39: Computer-Generated Dotplot
Figure 40: Student-Drawn Graphs
Figure 41: Initial Sorting of Candies
Figure 42: Bar Graph of Candy Color
Figure 43: Scatterplot of Arm Span/Height Data
Figure 44: Dotplot Showing Association
Figure 45: Dotplot Showing Differences in Sample Proportions
Figure 46: Dotplot of Randomized Differences in Means
Figure 47: Dotplot of Randomized Pair Difference Means
Figure 48: Scatterplot of Death Rates
Figure 49: Scatterplot of Actual Deaths
Figure 50: Distorted Graph
Figure 51: Plot of African-American vs. Total Enrollments
Figure 52: Plot of African-American Enrollments Only
Figure 53: Ratio of African-American to Total Enrollments


Introduction

The ultimate goal: statistical literacy. Every morning, the newspaper and other media confront us with statistical information on topics ranging from the economy to education, from movies to sports, from food to medicine, and from public opinion to social behavior. Such information guides decisions in our personal lives and enables us to meet our responsibilities as citizens. At work, we may be presented with quantitative information on budgets, supplies, manufacturing specifications, market demands, sales forecasts, or workloads. Teachers may be confronted with educational statistics concerning student performance or their own accountability. Medical scientists must understand the statistical results of experiments used for testing the effectiveness and safety of drugs. Law enforcement professionals depend on crime statistics. If we consider changing jobs and moving to another community, then our decision can be affected by statistics about cost of living, crime rate, and educational quality.

Our lives are governed by numbers. Every high-school graduate should be able to use sound statistical reasoning to intelligently cope with the requirements of citizenship, employment, and family and to be prepared for a healthy, happy, and productive life.

Citizenship

Public opinion polls are the most visible examples of a statistical application that has an impact on our lives.

In addition to directly informing individual citizens, polls are used by others in ways that affect us. The political process employs opinion polls in several ways. Candidates for office use polling to guide campaign strategy. A poll can determine a candidate’s strengths with voters, which can, in turn, be emphasized in the campaign. Citizens also might be suspicious that poll results might influence candidates to take positions just because they are popular.

A citizen informed by polls needs to understand that the results were determined from a sample of the population under study, that the reliability of the results depends on how the sample was selected, and that the results are subject to sampling error. The statistically literate citizen should understand the behavior of “random” samples and be able to interpret a “margin of sampling error.”

The federal government has been in the statistics business from its very inception. The U.S. Census was established in 1790 to provide an official count of the population for the purpose of allocating representatives to Congress. Not only has the role of the U.S. Census Bureau greatly expanded to include the collection of a broad spectrum of socioeconomic data, but other federal departments also produce extensive “official” statistics concerned with agriculture, health, education, environment, and commerce. The information gathered by these agencies influences policy making and helps to determine priorities for government spending. It is also available for general use by individuals or private groups. Thus, statistics compiled by government agencies have a tremendous impact on the life of the ordinary citizen.

Personal Choices

Statistical literacy is required for daily personal choices. Statistics provides information about the nutritional quality of foods and thus informs our choices at the grocery store. Statistics helps to establish the safety and effectiveness of drugs, which aids physicians in prescribing a treatment. Statistics also helps to establish the safety of toys to assure our children are not at risk. Our investment choices are guided by a plethora of statistical information about stocks and bonds. The Nielsen ratings help determine which shows will survive on television, thus affecting what is available. Many products have a statistical history, and our choices of products can be affected by awareness of this history. The design of an automobile is aided by anthropometrics (the statistics of the human body) to enhance passenger comfort. Statistical ratings of fuel efficiency, safety, and reliability are available to help us select a vehicle.

The Workplace and Professions

Individuals who are prepared to use statistical thinking in their careers will have the opportunity to advance to more rewarding and challenging positions.

A statistically competent work force will allow the United States to compete more effectively in the global marketplace and to improve its position in the international economy. An investment in statistical literacy is an investment in our nation’s economic future, as well as in the well-being of individuals.

The competitive marketplace demands quality. Efforts to improve quality and accountability are prominent among the many ways that statistical thinking and tools can be used to enhance productivity. Quality-control practices, such as the statistical monitoring of design and manufacturing processes, identify where improvement can be made and lead to better product quality. Systems of accountability can help produce more effective employees and organizations, but many accountability systems now in place are not based on sound statistical principles and may, in fact, have the opposite effect. Good accountability systems require proper use of statistical tools to determine and apply appropriate criteria.

Science

Life expectancy in the US almost doubled during the 20th century; this rapid increase in life span is the consequence of science. Science has enabled us to improve medical care and procedures, food production, and the detection and prevention of epidemics. Statistics plays a prominent role in this scientific progress.


The U.S. Food and Drug Administration requires extensive testing of drugs to determine effectiveness and side effects before they can be sold. A recent advertisement for a drug designed to reduce blood clots stated, “PLAVIX, added to aspirin and your current medications, helps raise your protection against heart attack or stroke.” But the advertisement also warned, “The risk of bleeding may increase with PLAVIX...”

Statistical literacy involves a healthy dose of skepticism about “scientific” findings. Is the information about side effects of PLAVIX treatment reliable? A statistically literate person should ask such questions and be able to intelligently answer them. A statistically literate high-school graduate will be able to understand the conclusions from scientific investigations and offer an informed opinion about the legitimacy of the reported results. According to Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001), such knowledge “empowers people by giving them tools to think for themselves, to ask intelligent questions of experts, and to confront authority confidently. These are skills required to survive in the modern world.”

Statistical literacy is essential in our personal lives as consumers, citizens, and professionals. Statistics plays a role in our health and happiness. Sound statistical reasoning skills take a long time to develop. They cannot be honed to the level needed in the modern world through one high-school course. The surest way to help students attain the necessary skill level is to begin the statistics education process in the elementary grades and keep strengthening and expanding students’ statistical thinking skills throughout the middle- and high-school years. A statistically literate high-school graduate will know how to interpret the data in the morning newspaper and will ask the right questions about statistical claims. He or she will be comfortable handling quantitative decisions that come up on the job, and will be able to make informed decisions about quality-of-life issues.

The remainder of this document lays out a curriculum framework for pre-K–12 educational programs that is designed to help students achieve statistical literacy.

The Case for Statistics Education

Over the past quarter century, statistics (often labeled data analysis and probability) has become a key component of the pre-K–12 mathematics curriculum. Advances in technology and modern methods of data analysis in the 1980s, coupled with the data richness of society in the information age, led to the development of curriculum materials geared toward introducing statistical concepts into the school curriculum as early as the elementary grades. This grassroots effort was given sanction by the National Council of Teachers of Mathematics (NCTM) when their influential document, Curriculum and Evaluation Standards for School Mathematics (NCTM, 1989), included “Data Analysis and Probability” as one of the five content strands. As this document and its 2000 replacement, Principles and Standards for School Mathematics (NCTM, 2000), became the basis for reform of mathematics curricula in many states, the acceptance of and interest in statistics as part of mathematics education gained strength. In recent years, many mathematics educators and statisticians have devoted large segments of their careers to improving statistics education materials and pedagogical techniques.

NCTM is not the only group calling for improved statistics education beginning at the school level. The National Assessment of Educational Progress (NAEP, 2005) was developed around the same content strands as the NCTM Standards, with data analysis and probability questions playing an increasingly prominent role on the NAEP exam. In 2006, the College Board released its College Board Standards for College Success™: Mathematics and Statistics, which includes “Data and Variation” and “Chance, Fairness, and Risk” among its list of eight topic areas that are “central to the knowledge and skills developed in the middle-school and high-school years.” An examination of the standards recommended by this document reveals a consistent emphasis on data analysis, probability, and statistics at each course level.

The emerging quantitative literacy movement calls for greater emphasis on practical quantitative skills that will help assure success for high-school graduates in life and work; many of these skills are statistical in nature. To quote from Mathematics and Democracy: The Case for Quantitative Literacy (Steen, 2001):

“Quantitative literacy, also called numeracy, is the natural tool for comprehending information in the computer age. The expectation that ordinary citizens be quantitatively literate is primarily a phenomenon of the late twentieth century. …Unfortunately, despite years of study and life experience in an environment immersed in data, many educated adults remain functionally illiterate. …Quantitative literacy empowers people by giving them tools to think for themselves [sic], to ask intelligent questions of experts, and to confront authority confidently. These are the skills required to thrive in the modern world.”

A recent study from the American Diploma Project, titled Ready or Not: Creating a High School Diploma That Counts (www.amstat.org/education/gaise/1), recommends “must-have” competencies needed for high-school graduates “to succeed in postsecondary education or in high-performance, high-growth jobs.” These include, in addition to algebra and geometry, aspects of data analysis, statistics, and other applications that are vitally important for other subjects, as well as for employment in today’s data-rich economy.

Statistics education as proposed in this Framework can promote the “must-have” competencies for graduates to “thrive in the modern world.”


NCTM Standards and the Framework

The main objective of this document is to provide a conceptual Framework for K–12 statistics education. The foundation for this Framework rests on the NCTM Principles and Standards for School Mathematics (2000).

The Framework is intended to complement the recommendations of the NCTM Principles and Standards, not to supplant them.

The NCTM Principles and Standards describes the statistics content strand as follows:

Data Analysis and Probability

Instructional programs from pre-kindergarten through grade 12 should enable all students to:

→ formulate questions that can be addressed with data and collect, organize, and display relevant data to answer them;

→ select and use appropriate statistical methods to analyze data;

→ develop and evaluate inferences and predictions that are based on data; and

→ understand and apply basic concepts of probability.

The “Data Analysis and Probability” standard recommends that students formulate questions that can be answered using data and address what is involved in wisely gathering and using that data. Students should learn how to collect data, organize their own or others’ data, and display the data in graphs and charts that will be useful in answering their questions. This standard also includes learning methods for analyzing data and ways of making inferences and drawing conclusions from data. The basic concepts and applications of probability also are addressed, with an emphasis on the way probability and statistics are related.

The NCTM Principles and Standards elaborates on these themes somewhat and provides examples of the types of lessons and activities that might be used in a classroom. More complete examples can be found in the NCTM Navigation Series on Data Analysis and Probability (2002–2004). Statistics, however, is a relatively new subject for many teachers, who have not had an opportunity to develop sound knowledge of the principles and concepts underlying the practices of data analysis that they now are called upon to teach. These teachers do not clearly understand the difference between statistics and mathematics. They do not see the statistics curriculum for grades pre-K–12 as a cohesive and coherent curriculum strand. These teachers may not see how the overall statistics curriculum provides a developmental sequence of learning experiences.

This Framework provides a conceptual structure for statistics education that gives a coherent picture of the overall curriculum.

The Difference between Statistics and Mathematics

“Statistics is a methodological discipline. It exists not for itself, but rather to offer to other fields of study a coherent set of ideas and tools for dealing with data. The need for such a discipline arises from the omnipresence of variability.” (Moore and Cobb, 1997)

A major objective of statistics education is to help students develop statistical thinking. Statistical thinking, in large part, must deal with this omnipresence of variability; statistical problem solving and decision making depend on understanding, explaining, and quantifying the variability in the data.

It is this focus on variability in data that sets apart statistics from mathematics.

The Nature of Variability

There are many sources of variability in data. Some of the important sources are described below.

Measurement Variability: Repeated measurements on the same individual vary. Sometimes two measurements vary because the measuring device produces unreliable results, such as when we try to measure a large distance with a small ruler. At other times, variability results from changes in the system being measured. For example, even with a precise measuring device, your recorded blood pressure could differ from one moment to the next.

Natural Variability: Variability is inherent in nature. Individuals are different. When we measure the same quantity across several individuals, we are bound to get differences in the measurements. Although some of this may be due to our measuring instrument, most of it is simply due to the fact that individuals differ. People naturally have different heights, different aptitudes and abilities, and different opinions and emotional responses. When we measure any one of these traits, we are bound to get variability in the measurements. Different seeds for the same variety of bean will grow to different sizes when subjected to the same environment because no two seeds are exactly alike; there is bound to be variability from seed to seed in the measurements of growth.

Induced Variability: If we plant one pack of bean seeds in one field, and another pack of seeds in another location with a different climate, then an observed difference in growth among the seeds in one location with those in the other might be due to inherent differences in the seeds (natural variability), or the observed difference might be due to the fact that the locations are not the same. If one type of fertilizer is used on one field and another type on the other, then observed differences might be due to the difference in fertilizers. For that matter, the observed difference might be due to a factor we haven’t even thought about. A more carefully designed experiment can help us determine the effects of different factors.

This one basic idea, comparing natural variability to the variability induced by other factors, forms the heart of modern statistics. It has allowed medical science to conclude that some drugs are effective and safe, whereas others are ineffective or have harmful side effects. It has been employed by agricultural scientists to demonstrate that a variety of corn grows better in one climate than another, that one fertilizer is more effective than another, or that one type of feed is better for beef cattle than another.

Sampling Variability: In a political poll, it seems reasonable to use the proportion of voters surveyed (a sample statistic) as an estimate of the unknown proportion of all voters who support a particular candidate. But if a second sample of the same size is used, it is almost certain that there would not be exactly the same proportion of voters in the sample who support the candidate. The value of the sample proportion will vary from sample to sample. This is called sampling variability. So what is to keep one sample from estimating that the true proportion is .60 and another from saying it is .40? This is possible, but unlikely, if proper sampling techniques are used. Poll results are useful because these techniques and an adequate sample size can ensure that unacceptable differences among samples are quite unlikely.
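The idea of sampling variability can be explored with a short simulation. The sketch below is only an illustration under stated assumptions: the true level of support (0.50), the sample size (1,000), and the number of repeated polls (500) are hypothetical values chosen for the example, and each poll is treated as a simple random sample.

```python
import random

def sample_proportion(true_p, n, rng):
    """Proportion of supporters in one simulated random sample of n voters."""
    supporters = sum(1 for _ in range(n) if rng.random() < true_p)
    return supporters / n

rng = random.Random(2016)  # fixed seed so the run is repeatable
TRUE_P = 0.50              # hypothetical true proportion of supporters
N = 1000                   # hypothetical sample size

# Repeat the poll 500 times and record each sample proportion.
proportions = [sample_proportion(TRUE_P, N, rng) for _ in range(500)]

print("smallest sample proportion:", min(proportions))
print("largest sample proportion: ", max(proportions))

# Share of the simulated polls that land within 0.05 of the true value.
close = sum(abs(p - TRUE_P) <= 0.05 for p in proportions) / len(proportions)
print("share within 0.05 of the truth:", close)
```

The sample proportions differ from poll to poll, but with a sample of this size they cluster tightly around the true value, so one poll reporting .60 and another reporting .40 would be very unusual.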

An excellent discussion on the nature of variability is given in Seeing Through Statistics (Utts, 1999).

The Role of Context

“The focus on variability naturally gives statistics a particular content that sets it apart from mathematics, itself, and from other mathematical sciences, but there is more than just content that distinguishes statistical thinking from mathematics. Statistics requires a different kind of thinking, because data are not just numbers, they are numbers with a context. In mathematics, context obscures structure. In data analysis, context provides meaning.” (Moore and Cobb, 1997)

Many mathematics problems arise from applied contexts, but the context is removed to reveal mathematical patterns. Statisticians, like mathematicians, look for patterns, but the meaning of the patterns depends on the context.

A graph that occasionally appears in the business section of newspapers shows a plot of the Dow Jones Industrial Average (DJIA) over a 10-year period. The variability of stock prices draws the attention of an investor. This stock index may go up or down over intervals of time, and may fall or rise sharply over a short period. In context, the graph raises questions. A serious investor is not only interested in when or how rapidly the index goes up or down, but also why. What was going on in the world when the market went up; what was going on when it went down? Now strip away the context. Remove time (years) from the horizontal axis and call it “X,” remove stock value (DJIA) from the vertical axis and call it “Y,” and there remains a graph of very little interest or mathematical content!

Probability

Probability is a tool for statistics.

Probability is an important part of any mathematical education. It is a part of mathematics that enriches the subject as a whole by its interactions with other uses of mathematics. Probability is an essential tool in applied mathematics and mathematical modeling. It is also an essential tool in statistics.

The use of probability as a mathematical model and the use of probability as a tool in statistics employ not only different approaches, but also different kinds of reasoning. Two problems and the nature of the solutions will illustrate the difference.

Problem 1:

Assume a coin is “fair.”

Question: If we toss the coin five times, how many heads will we get?

Problem 2:

You pick up a coin.

Question: Is this a fair coin?

Problem 1 is a mathematical probability problem. Problem 2 is a statistics problem that can use the mathematical probability model determined in Problem 1 as a tool to seek a solution.

The answer to neither question is deterministic. Coin tossing produces random outcomes, which suggests that the answer is probabilistic. The solution to Problem 1 starts with the assumption that the coin is fair and proceeds to logically deduce the numerical probabilities for each possible number of heads: 0, 1,…, 5.
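The deduction in Problem 1 can be carried out exactly. The sketch below assumes the usual binomial model for independent tosses of a fair coin; the function name heads_distribution is a hypothetical helper introduced only for this illustration.

```python
from fractions import Fraction
from math import comb

def heads_distribution(n_tosses, p_heads=Fraction(1, 2)):
    """Exact probability of each possible number of heads under the binomial model."""
    return {
        k: comb(n_tosses, k) * p_heads**k * (1 - p_heads)**(n_tosses - k)
        for k in range(n_tosses + 1)
    }

model = heads_distribution(5)
for k, prob in model.items():
    print(f"P({k} heads) = {prob} = {float(prob):.5f}")
```

Using exact fractions keeps the deduction free of rounding, so the six probabilities add up to exactly 1.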

The solution to Problem 2 starts with an unfamiliar coin; we don’t know if it is fair or biased. The search for an answer is experimental: toss the coin and see what happens. Examine the resulting data to see if it looks as if it came from a fair coin or a biased coin. There are several possible approaches, including the following: toss the coin five times and record the number of heads. Then, do it again: toss the coin five times and record the number of heads. Repeat 100 times. Compile the frequencies of outcomes for each possible number of heads. Compare these results to the frequencies predicted by the mathematical model for a fair coin in Problem 1. If the empirical frequencies from the experiment are quite dissimilar from those predicted by the mathematical model for a fair coin and are not likely to be caused by random variation in coin tosses, then we conclude that the coin is not fair. In this case, we induce an answer by making a general conclusion from observations of experimental results.
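The experimental approach to Problem 2 can be imitated on a computer. The sketch below rests on several assumptions that are not part of the text: the physical coin is replaced by a simulated one, the "biased" coin is given a hypothetical 0.8 chance of heads, and the comparison with the fair-coin model uses a chi-square-style sum of squared differences, which is one possible measure of dissimilarity among many.

```python
import random
from collections import Counter
from math import comb

def toss_and_count(n_tosses, p_heads, rng):
    """Number of heads in n_tosses tosses of a coin with P(heads) = p_heads."""
    return sum(rng.random() < p_heads for _ in range(n_tosses))

def run_experiment(p_heads, repetitions=100, n_tosses=5, seed=1):
    """Repeat 'toss five times, record the heads' and tally the outcomes."""
    rng = random.Random(seed)
    return Counter(toss_and_count(n_tosses, p_heads, rng) for _ in range(repetitions))

def fair_coin_expected(repetitions=100, n_tosses=5):
    """Frequencies predicted by the fair-coin model from Problem 1."""
    return {k: repetitions * comb(n_tosses, k) / 2**n_tosses
            for k in range(n_tosses + 1)}

def discrepancy(observed, expected):
    """Sum of (observed - expected)^2 / expected over all outcomes."""
    return sum((observed.get(k, 0) - e) ** 2 / e for k, e in expected.items())

expected = fair_coin_expected()
fair_counts = run_experiment(p_heads=0.5)    # a coin that really is fair
biased_counts = run_experiment(p_heads=0.8)  # hypothetical biased coin

print("heads  expected  fair  biased")
for k in range(6):
    print(f"{k:5d}  {expected[k]:8.3f}  {fair_counts.get(k, 0):4d}  {biased_counts.get(k, 0):6d}")
print("discrepancy, fair coin:  ", round(discrepancy(fair_counts, expected), 2))
print("discrepancy, biased coin:", round(discrepancy(biased_counts, expected), 2))
```

A small discrepancy is what random variation alone would be expected to produce; a very large one suggests the data did not come from a fair coin.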

Probability and Chance Variability

Two important uses of “randomization” in statistical work occur in sampling and experimental design. When sampling, we “select at random,” and in experiments, we randomly assign individuals to different treatments. Randomization does much more than remove bias in selections and assignments. Randomization leads to chance variability in outcomes that can be described with probability models.

The probability of an event tells us approximately what percent of the time the event is expected to happen when the basic process is repeated over and over again. Probability theory does not say very much about one toss of a coin; it makes predictions about the long-run behavior of many coin tosses.

Probability tells us little about the consequences of random selection for one sample, but describes the variation we expect to see in samples when the sampling process is repeated a large number of times. Probability tells us little about the consequences of random assignment for one experiment, but describes the variation we expect to see in the results when the experiment is replicated a large number of times.

When randomness is present, the statistician wants to know if the observed result is due to chance or something else. This is the idea of statistical significance.
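One way to ask whether an observed result could be "due to chance" is to repeat the random assignment many times on a computer and see how often chance alone produces a difference as large as the one observed. The sketch below is an illustration only: the growth measurements (in centimeters) for two groups of five plants are hypothetical, and the function randomization_test is a name introduced for this example.

```python
import random
from statistics import mean

def randomization_test(group_a, group_b, shuffles=2000, seed=0):
    """Share of random re-assignments whose difference in means is at least
    as large as the observed difference (group_a minus group_b)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    at_least_as_large = 0
    for _ in range(shuffles):
        rng.shuffle(pooled)  # re-deal the same measurements at random
        diff = mean(pooled[:n_a]) - mean(pooled[n_a:])
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / shuffles

# Hypothetical six-week growth (cm) for plants by and away from a window.
window = [12.1, 10.4, 11.8, 13.0, 11.2]
away = [9.5, 10.1, 8.7, 10.9, 9.2]

observed, share = randomization_test(window, away)
print("observed difference in means:", round(observed, 2))
print("share of shuffles at least that large:", share)
```

If only a small share of the shuffles matches or exceeds the observed difference, chance alone is an unconvincing explanation for the result.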

The Role of Mathematics in Statistics Education

The evidence that statistics is different from mathematics is not presented to argue that mathematics is not important to statistics education or that statistics education should not be a part of mathematics education. To the contrary, statistics education becomes increasingly mathematical as the level of understanding goes up. But data collection design, exploration of data, and the interpretation of results should be emphasized in statistics education for statistical literacy. These are heavily dependent on context, and, at the introductory level, involve limited formal mathematics.

Probability plays an important role in statistical analysis, but formal mathematical probability should have its own place in the curriculum. Pre-college statistics education should emphasize the ways probability is used in statistical thinking; an intuitive grasp of probability will suffice at these levels.

In This Section

→ The Role of Variability in the Problem-Solving Process

→ Maturing over Levels

→ The Framework Model

→ Illustrations

I. Formulate Questions

Word Length Example

Popular Music Example

Height and Arm Span Example

Plant Growth Example

II. Collect Data

Word Length Example

Plant Growth Example

III. Analyze Data

Popular Music Example

Height and Arm Span Example

IV. Interpret Results

Word Length Example

Plant Growth Example

Nature of Variability

Variability within a Group

Variability within a Group and Variability between Groups

Covariability

Variability in Model Fitting

Induced Variability

Sampling Variability

Chance Variability from Sampling

Chance Variability Resulting from Assignment to Groups in Experiments

→ Detailed Descriptions of Each Level

The Framework

Statistical problem solving is an investigative process that involves four components:

I. Formulate Questions

→ clarify the problem at hand

→ formulate one (or more) questions that can be answered with data

II. Collect Data

→ design a plan to collect appropriate data

→ employ the plan to collect the data

III. Analyze Data

→ select appropriate graphical and numerical methods

→ use these methods to analyze the data

IV. Interpret Results

→ interpret the analysis

→ relate the interpretation to the original question

The Role of Variability in the Problem-Solving Process

I. Formulate Questions

Anticipating Variability: Making the Statistics Question Distinction

The formulation of a statistics question requires an understanding of the difference between a question that anticipates a deterministic answer and a question that anticipates an answer based on data that vary.

The question, “How tall am I?” will be answered with a single height. It is not a statistics question. The question “How tall are adult men in the USA?” would not be a statistics question if all these men were exactly the same height! The fact that there are differing heights, however, implies that we anticipate an answer based on measurements of height that vary. This is a statistics question.

The poser of the question, “How does sunlight affect the growth of a plant?” should anticipate that the growth of two plants of the same type exposed to the same sunlight will likely differ. This is a statistics question.

The anticipation of variability is the basis for understanding the statistics question distinction.

II. Collect Data

Acknowledging Variability: Designing for Differences

Data collection designs must acknowledge variability in data, and frequently are intended to reduce variability. Random sampling is intended to reduce the differences between sample and population. The sample size influences the effect of sampling variability (error). Experimental designs are chosen to acknowledge the differences between groups subjected to different treatments. Random assignment to the groups is intended to reduce differences between the groups due to factors that are not manipulated in the experiment.

Some experimental designs pair subjects so they are similar. Twins frequently are paired in medical experiments so that observed differences might be more likely attributed to the difference in treatments, rather than differences in the subjects.

The understanding of data collection designs that acknowledge differences is required for effective collection of data.

III. Analyze Data

Accounting of Variability: Using Distributions

The main purpose of statistical analysis is to give an accounting of the variability in the data. When results of an election poll state “42% of those polled support a particular candidate with margin of error +/- 3% at the 95% confidence level,” the focus is on sampling variability. The poll gives an estimate of the support among all voters. The margin of error indicates how far the sample result (42% +/- 3%) might differ from the actual percent of all voters who support the candidate. The confidence level tells us how often estimates produced by the method employed will produce correct results. This analysis is based on the distribution of estimates from repeated random sampling.
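A margin of error like the one in the poll statement can be approximated with a common large-sample formula, z × sqrt(p(1 − p)/n). The sketch below is a hedged illustration: the poll statement does not report a sample size, so n = 1000 is an assumption, as is the simple-random-sampling model behind the formula; real polls often use more elaborate methods.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(p_hat, n, confidence=0.95):
    """Approximate margin of error for a sample proportion
    (normal approximation, simple random sample assumed)."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    return z * sqrt(p_hat * (1 - p_hat) / n)

p_hat = 0.42  # sample result quoted in the text
n = 1000      # hypothetical sample size, not given in the text

moe = margin_of_error(p_hat, n)
print(f"margin of error: +/- {100 * moe:.1f} percentage points")
print(f"interval: {100 * (p_hat - moe):.1f}% to {100 * (p_hat + moe):.1f}%")
```

Under these assumptions, a sample of about 1,000 voters gives a margin of error close to the 3 percentage points quoted in the example.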

When test scores are described as “normally distributed with mean 450 and standard deviation 100,” the focus is on how the scores differ from the mean. The normal distribution describes a bell-shaped pattern of scores, and the standard deviation indicates the level of variation of the scores from the mean.
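The description "normally distributed with mean 450 and standard deviation 100" can be turned into concrete statements about how scores vary. The sketch below uses Python's standard-library NormalDist; the particular cutoffs (one and two standard deviations, and a score of 600) are chosen only for illustration.

```python
from statistics import NormalDist

scores = NormalDist(mu=450, sigma=100)

# Share of scores within one and two standard deviations of the mean.
within_one_sd = scores.cdf(550) - scores.cdf(350)
within_two_sd = scores.cdf(650) - scores.cdf(250)

# Share of scores above 600 (1.5 standard deviations above the mean).
above_600 = 1 - scores.cdf(600)

print(f"between 350 and 550: {within_one_sd:.3f}")
print(f"between 250 and 650: {within_two_sd:.3f}")
print(f"above 600:           {above_600:.3f}")
```

Under this model, roughly two-thirds of the scores fall within 100 points of the mean, which is the sense in which the standard deviation indicates the level of variation.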

Accounting for variability with the use of distributions is the key idea in the analysis of data.

IV. Interpret Results

Allowing for Variability: Looking beyond the Data

Statistical interpretations are made in the presence of variability and must allow for it.

The result of an election poll must be interpreted as an estimate that can vary from sample to sample. The generalization of the poll results to the entire population of voters looks beyond the sample of voters surveyed and must allow for the possibility of variability of results among different samples. The results of a randomized comparative medical experiment must be interpreted in the presence of variability due to the fact that different individuals respond differently to the same treatment and the variability due to randomization. The generalization of the results looks beyond the data collected from the subjects who participated in the experiment and must allow for these sources of variability.

Looking beyond the data to make generalizations must allow for variability in the data.

Maturing over Levels

The mature statistician understands the role of variability in the statistical problem-solving process. At the point of question formulation, the statistician anticipates the data collection, the nature of the analysis, and the possible interpretations, all of which involve possible sources of variability. In the end, the mature practitioner reflects upon all aspects of data collection and analysis as well as the question, itself, when interpreting results. Likewise, he or she links data collection and analysis to each other and the other two components.

Beginning students cannot be expected to make all of these linkages. They require years of experience and training. Statistical education should be viewed as a developmental process. To meet the proposed goals, this report provides a framework for statistical education over three levels. If the goal were to produce a mature practicing statistician, there certainly would be several levels beyond these. There is no attempt to tie these levels to specific grade levels.

The Framework uses three developmental Levels: A, B, and C. Although these three levels may parallel grade levels, they are based on development in statistical literacy, not age. Thus, a middle-school student who has had no prior experience with statistics will need to begin with Level A concepts and activities before moving to Level B. This holds true for a secondary student as well. If a student hasn’t had Level A and B experiences prior to high school, then it is not appropriate for that student to jump into Level C expectations. The learning is more teacher-driven at Level A, but becomes student-driven at Levels B and C.

The Framework Model

The conceptual structure for statistics education is provided in the two-dimensional model shown in Table 1. One dimension is defined by the problem-solving process components plus the nature of the variability considered and how we focus on variability. The second dimension is composed of the three developmental levels.

Each of the first four rows describes a process component as it develops across levels. The fifth row indicates the nature of the variability considered at a given level. It is understood that work at Level B assumes and develops further the concepts from Level A; likewise, Level C assumes and uses concepts from the lower levels.

Reading down a column will describe a complete problem investigation for a particular level along with the nature of the variability considered.

Table 1: The Framework

I. Formulate Question

Level A:
→ Beginning awareness of the statistics question distinction
→ Teachers pose questions of interest
→ Questions restricted to the classroom

Level B:
→ Increased awareness of the statistics question distinction
→ Students begin to pose their own questions of interest
→ Questions not restricted to the classroom

Level C:
→ Students can make the statistics question distinction
→ Students pose their own questions of interest
→ Questions seek generalization

II. Collect Data

Level A:
→ Do not yet design for differences
→ Census of classroom
→ Simple experiment

Level B:
→ Beginning awareness of design for differences
→ Sample surveys; begin to use random selection
→ Comparative experiment; begin to use random allocation

Level C:
→ Students make design for differences
→ Sampling designs with random selection
→ Experimental designs with randomization

III. Analyze Data

Level A:
→ Use particular properties of distributions in the context of a specific example
→ Display variability within a group
→ Compare individual to individual
→ Compare individual to group
→ Beginning awareness of group to group
→ Observe association between two variables

Level B:
→ Learn to use particular properties of distributions as tools of analysis
→ Quantify variability within a group
→ Compare group to group in displays
→ Acknowledge sampling error
→ Some quantification of association; simple models for association

Level C:
→ Understand and use distributions in analysis as a global concept
→ Measure variability within a group; measure variability between groups
→ Compare group to group using displays and measures of variability
→ Describe and quantify sampling error
→ Quantification of association; fitting of models for association

IV. Interpret Results

Level A:
→ Students do not look beyond the data
→ No generalization beyond the classroom
→ Note difference between two individuals with different conditions
→ Observe association in displays

Level B:
→ Students acknowledge that looking beyond the data is feasible
→ Acknowledge that a sample may or may not be representative of the larger population
→ Note the difference between two groups with different conditions
→ Aware of distinction between observational study and experiment
→ Note differences in strength of association
→ Basic interpretation of models for association
→ Aware of the distinction between association and cause and effect

Level C:
→ Students are able to look beyond the data in some contexts
→ Generalize from sample to population
→ Aware of the effect of randomization on the results of experiments
→ Understand the difference between observational studies and experiments
→ Interpret measures of strength of association
→ Interpret models of association
→ Distinguish between conclusions from association studies and experiments

Nature of Variability

Level A:
→ Measurement variability
→ Natural variability
→ Induced variability

Level B:
→ Sampling variability

Level C:
→ Chance variability

Focus on Variability

Level A:
→ Variability within a group

Level B:
→ Variability within a group and variability between groups
→ Covariability

Level C:
→ Variability in model fitting

Illustrations

All four steps of the problem-solving process are used at all three levels, but the depth of understanding and sophistication of methods used increases across Levels A, B, and C. This maturation in understanding the problem-solving process and its underlying concepts is paralleled by an increasing complexity in the role of variability. The illustrations of learning activities given here are intended to clarify the differences across the developmental levels for each component of the problem-solving process. Later sections will give illustrations of the complete problem-solving process for learning activities at each level.

I. Formulate Questions

Word Length Example

Level A: How long are the words on this page?

Level B: Are the words in a chapter of a fifth-grade book longer than the words in a chapter of a third-grade book?

Level C: Do fifth-grade books use longer words than third-grade books?

Popular Music Example

Level A: What type of music is most popular among students in our class?

Level B: How do the favorite types of music compare among different classes?

Level C: What type of music is most popular among students in our school?

Height and Arm Span Example

Level A: In our class, are the heights and arm spans of students approximately the same?

Level B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?

Level C: Is height a useful predictor of arm span for the students in our school?

Plant Growth Example

Level A: Will a plant placed by the window grow taller than a plant placed away from the window?

Level B: Will five plants placed by the window grow taller than five plants placed away from the window?

Level C: How does the level of sunlight affect the growth of plants?

II. Collect Data

Word Length Example

Level A: How long are the words on this page?

The length of every word on the page is determined and recorded.

Level B: Are the words in a chapter of a fifth-grade book longer than the words in a chapter of a third-grade book?

A simple random sample of words from each chapter is used.

Level C: Do fifth-grade books use longer words than third-grade books?

Different sampling designs are considered and compared, and some are used. For example, rather than selecting a simple random sample of words, a simple random sample of pages from the book is selected and all the words on the chosen pages are used for the sample.

Note: At each level, issues of measurement should be addressed. The length of word depends on the definition of “word.” For instance, is a number a word? Consistency of definition helps reduce measurement variability.
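The two sampling designs mentioned for Level C can be compared in a small sketch. Everything concrete here is an assumption made for illustration: the "book" is artificial (40 pages of randomly generated word lengths), and the helper names simple_random_sample and page_cluster_sample are hypothetical.

```python
import random
from statistics import mean

rng = random.Random(5)  # fixed seed so the run is repeatable

# Hypothetical book: 40 pages, each a list of word lengths (1 to 10 letters).
book = [[rng.randint(1, 10) for _ in range(rng.randint(80, 120))]
        for _ in range(40)]
all_words = [length for page in book for length in page]

def simple_random_sample(words, n, rng):
    """Design 1: a simple random sample of n individual words."""
    return rng.sample(words, n)

def page_cluster_sample(pages, n_pages, rng):
    """Design 2: a simple random sample of pages, using every word on them."""
    chosen = rng.sample(pages, n_pages)
    return [length for page in chosen for length in page]

srs = simple_random_sample(all_words, 200, rng)
cluster = page_cluster_sample(book, 2, rng)

print("mean word length, whole book:    ", round(mean(all_words), 2))
print("mean word length, word sample:   ", round(mean(srs), 2))
print("mean word length, sample of pages:", round(mean(cluster), 2))
```

In a real book, words on the same page tend to resemble one another, so a sample of pages may vary more from sample to sample than a sample of individual words of similar size; the artificial book above does not capture that feature.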

Plant Growth Example

Level A: Will a plant placed by the window grow taller than a plant placed away from the window?

A seedling is planted in a pot that is placed on the window sill. A second seedling of the same type and size is planted in a pot that is placed away from the window sill. After six weeks, the change in height for each is measured and recorded.

Level B: Will five plants of a particular type placed by the window grow taller than five plants of the same type placed away from the window?

Five seedlings of the same type and size are planted in a pan that is placed on the window sill. Five seedlings of the same type and size are planted in a pan that is placed away from the window sill. Random numbers are used to decide which plants go in the window. After six weeks, the change in height for each seedling is measured and recorded.

Level C: How does the level of sunlight affect the growth of plants?

Fifteen seedlings of the same type and size are selected. Three pans are used, with five of these seedlings planted in each. Fifteen seedlings of another variety are selected to determine if the effect of sunlight is the same on different types of plants. Five of these are planted in each of the three pans. The three pans are placed in locations with three different levels of light. Random numbers are used to decide which plants go in which pan. After six weeks, the change in height for each seedling is measured and recorded.

Note: At each level, issues of measurement should be addressed. The method of measuring change in height must be clearly understood and applied in order to reduce measurement variability.
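The use of random numbers to decide which plants go in which pan, as in the Level C design, might be sketched as follows. The seedling labels, the pan names, and the helper assign_to_pans are hypothetical; the sketch simply shuffles each variety separately so that every pan receives five seedlings of each variety.

```python
import random

def assign_to_pans(seedlings, pans, rng):
    """Shuffle the seedlings and deal them out evenly across the pans."""
    shuffled = list(seedlings)
    rng.shuffle(shuffled)
    per_pan = len(shuffled) // len(pans)
    return {pan: shuffled[i * per_pan:(i + 1) * per_pan]
            for i, pan in enumerate(pans)}

rng = random.Random(42)  # fixed seed so the assignment can be reproduced
pans = ["low light", "medium light", "high light"]  # hypothetical labels
variety_1 = [f"V1-{i}" for i in range(1, 16)]       # fifteen seedlings
variety_2 = [f"V2-{i}" for i in range(1, 16)]       # fifteen of another variety

# Randomize within each variety so each pan gets five of each.
layout = {pan: [] for pan in pans}
for variety in (variety_1, variety_2):
    for pan, group in assign_to_pans(variety, pans, rng).items():
        layout[pan].extend(group)

for pan, seedlings in layout.items():
    print(pan, sorted(seedlings))
```

Randomizing within each variety keeps the two varieties balanced across light levels, so a difference between pans is less likely to reflect which kind of seedling happened to land where.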

III. Analyze Data

Popular Music Example

Level A: What type of music is most popular among students in our class?

A bar graph is used to display the number of students who choose each music category.

Level B: How do the favorite types of music compare among different classes?

For each class, a bar graph is used to display the percent of students who choose each music category. The same scales are used for both graphs so that they can easily be compared.

Level C: What type of music is most popular among students in our school?

A bar graph is used to display the percent of students who choose each music category. Because a random sample is used, an estimate of the margin of error is given.

Note: At each level, issues of measurement should be addressed. A questionnaire will be used to gather students’ music preferences. The design and wording of the questionnaire must be carefully considered to avoid possible bias in the responses. The choice of music categories also could affect results.

Height and Arm Span Example

Level A: In our class, are the heights and arm spans of students approximately the same?

The difference between height and arm span is determined for each individual. An X-Y plot (scatterplot) is constructed with X = height, Y = arm span. The line Y = X is drawn on this graph.

Level B: Is the relationship between arm span and height for the students in our class the same as the relationship between arm span and height for the students in another class?

For each class, an X-Y plot is constructed with X = height, Y = arm span. An “eye ball” line is drawn on each graph to describe the relationship between height and arm span. The equation of this line is determined. An elementary measure of association is computed.

Level C: Is height a useful predictor of arm span for the students in our school?

The least squares regression line is determined and assessed for use as a prediction model.

Note: At each level, issues of measurement should be addressed. The methods used to measure height and arm span must be clearly understood and applied in order to reduce measurement variability. For instance, do we measure height with shoes on or off?
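The Level C analysis can be illustrated by computing a least squares line directly from its defining formulas. The heights and arm spans below (in centimeters) are hypothetical data invented for this sketch, and least_squares_line is a helper name introduced here.

```python
from statistics import mean

def least_squares_line(xs, ys):
    """Slope and intercept of the least squares line for predicting y from x."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return slope, intercept

# Hypothetical measurements for seven students (cm).
heights = [150, 155, 160, 165, 170, 175, 180]
arm_spans = [148, 156, 159, 166, 172, 174, 183]

slope, intercept = least_squares_line(heights, arm_spans)
predicted = [intercept + slope * x for x in heights]
residuals = [y - p for y, p in zip(arm_spans, predicted)]

print(f"arm span = {intercept:.2f} + {slope:.2f} * height")
print("largest prediction error (cm):", round(max(abs(r) for r in residuals), 2))
```

Examining the residuals is one simple way to assess whether the line is useful as a prediction model: small residuals with no obvious pattern suggest that height predicts arm span reasonably well for these students.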

IV. Interpret Results

Word Length Example

Level A: How long are the words on this page?

The dotplot of all word lengths is examined and summarized. In particular, students will note the longest and shortest word lengths, the most common and least common lengths, and the length in the middle.

Level B: Are the words in a chapter of a fifth-grade book longer than the words in a chapter of a third-grade book?

Students interpret a comparison of the distribution of a sample of word lengths from the fifth-grade book with the distribution of word lengths from the third-grade book using a boxplot to represent each of these. The students also acknowledge that samples are being used that may or may not be representative of the complete chapters.

The boxplot for a sample of word lengths from the fifth-grade book is placed beside the boxplot of the sample from the third-grade book.

Level C: Do fifth-grade books use longer words than third-grade books?

The interpretation at Level C includes the interpretation at Level B, but also must consider generalizing from the books included in the study to a larger population of books.

Plant Growth Example

Level A: Will a plant placed by the window grow taller than a plant placed away from the window?

In this simple experiment, the interpretation is just a matter of comparing one measurement of change in size to another.

Level B: Will five plants placed by the window grow taller than five plants placed away from the window?

In this experiment, the student must interpret a comparison of one group of five measurements with another group. If a difference is noted, then the student acknowledges it is likely caused by the difference in light conditions.

Level C: How does the level of sunlight affect the growth of plants?

There are several comparisons of groups possible with this design. If a difference is noted, then the student acknowledges it is likely caused by the difference in light conditions or the difference in types of plants. It also is acknowledged that the randomization used in the experiment can result in some of the observed differences.

Nature of Variability

The focus on variability grows increasingly more sophisticated as students progress through the developmental levels.

Variability within a Group

This is the only type considered at Level A. In the word length example, differences among word lengths on a single page are considered; this is variability within a group of word lengths. In the popular music example, differences in how many students choose each category of music are considered; this is variability within a group of frequencies.

Variability within a Group and Variability between Groups

At Level B, students begin to make comparisons of groups of measurements. In the word length example, a group of words from a fifth-grade book is compared to a group from a third-grade book. Such a comparison not only notes how much word lengths differ within each group, but must also take into consideration the differences between the two groups, such as the difference between median or mean word lengths.

Covariability

At Level B, students also begin to investigate the “statistical” relationship between two variables. The nature of this statistical relationship is described in terms of how the two variables “co-vary.” In the height and arm span example, for instance, if the heights of two students differ by two centimeters, then we would like our model of the relationship to tell us by how much we might expect their arm spans to differ.

Variability in Model Fitting

At Level C, students assess how well a regression line will predict values of one variable from values of another variable using residual plots. In the height and arm span example, for instance, this assessment is based on examining whether differences between actual arm spans and the arm spans predicted by the model randomly vary about the horizontal line of “no difference” in the residual plot. Inference about a predicted value of y for a given value of x is valid only if the values of y vary at random according to a normal distribution centered on the regression line. Students at Level C learn to estimate this variability about the regression line using the estimated standard deviation of the residuals.
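As a rough illustration of the assessment described above, the sketch below fits a least squares line, computes the residuals, and estimates their standard deviation. The height and arm span values are hypothetical (they are not the document's data), and dividing by n - 2 is the usual convention for simple linear regression, assumed here rather than taken from the text.

```python
import math

# Hypothetical (height, arm span) measurements in centimeters.
# These values are invented for illustration only.
heights = [155.0, 160.0, 165.0, 170.0, 175.0, 180.0, 185.0]
arm_spans = [153.0, 161.0, 163.0, 172.0, 174.0, 182.0, 184.0]


def least_squares(xs, ys):
    """Return (intercept, slope) of the least squares line y = a + b * x."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


intercept, slope = least_squares(heights, arm_spans)

# Residual = actual arm span minus the arm span predicted by the line.
residuals = [y - (intercept + slope * x) for x, y in zip(heights, arm_spans)]

# Estimated standard deviation of the residuals (n - 2 degrees of freedom).
resid_sd = math.sqrt(sum(r ** 2 for r in residuals) / (len(heights) - 2))

print(f"predicted arm span = {intercept:.1f} + {slope:.3f} * height")
print("residuals:", [round(r, 2) for r in residuals])
print(f"residual standard deviation: {resid_sd:.2f} cm")
```

Plotting the residuals against height and checking that they scatter without pattern around zero corresponds to the "no difference" line mentioned in the text.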

Induced Variability

In the plant growth example at Level B, the experiment is designed to determine if there will be a difference between the growth of plants in sunlight and of plants away from sunlight. We want to determine if an imposed difference on the environments will induce a difference in growth.

Sampling Variability

In the word length example at Level B, samples of words from a chapter are used. Students observe that two samples will produce different groups of word lengths. This is sampling variability.
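A small simulation can make this observation concrete. The sketch below draws several random samples of words from a short passage and reports the mean word length of each. The passage is a made-up stand-in for a chapter, and the sample size of 8 and the seed are arbitrary choices.

```python
import random

# Hypothetical stand-in for a chapter of a book; any passage would do.
chapter = (
    "the quick brown fox jumps over the lazy dog while a curious "
    "elephant wanders slowly through an extraordinary garden"
).split()
word_lengths = [len(word) for word in chapter]


def sample_mean_length(lengths, size, rng):
    """Draw a simple random sample (without replacement) and return its mean."""
    sample = rng.sample(lengths, size)
    return sum(sample) / size


rng = random.Random(2016)  # fixed seed so the run is repeatable
means = [sample_mean_length(word_lengths, 8, rng) for _ in range(5)]

for i, m in enumerate(means, start=1):
    print(f"sample {i}: mean word length = {m:.2f}")
```

Each sample comes from the same passage, yet the sample means generally differ from one another. That sample-to-sample difference is the sampling variability described above.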

Chance Variability from Sampling

When random selection is used, differences between samples will be due to chance. Understanding this chance variation is what leads to the predictability of results. In the popular music example, at Level C, this chance variation is not only considered, but is also the basis for understanding the concept of margin of error.

Chance Variability Resulting from Assignment to Groups in Experiments

In the plant growth example at Level C, plants are randomly assigned to groups. Students consider how this chance variation in random assignments might produce differences in results, although a formal analysis is not done.
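One informal way to explore this idea is to simulate the random assignment many times. The sketch below assumes ten hypothetical growth measurements (invented for illustration) and assumes the treatment has no effect at all. It then repeatedly splits the plants at random into two groups of five and records the difference in group means.

```python
import random

# Hypothetical growth measurements (cm) for ten plants; not data from the document.
growth = [4.1, 5.3, 3.8, 6.0, 5.5, 4.7, 6.2, 5.0, 4.4, 5.8]


def random_assignment_diff(values, rng):
    """Randomly split values into two equal groups and return the
    difference between the two group means."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    group_a, group_b = shuffled[:half], shuffled[half:]
    return sum(group_a) / len(group_a) - sum(group_b) / len(group_b)


rng = random.Random(1)  # fixed seed so the run is repeatable
diffs = [random_assignment_diff(growth, rng) for _ in range(1000)]

largest = max(abs(d) for d in diffs)
print(f"average difference over 1000 assignments: {sum(diffs) / len(diffs):.3f}")
print(f"largest difference produced by chance alone: {largest:.2f} cm")
```

Because the same ten measurements are reused in every split, any difference between the two groups comes from the random assignment alone. Comparing an observed difference with this spread of chance differences is the reasoning behind simulation-based inference.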

Detailed Descriptions of Each Level

As this document transitions into detailed descriptions of each level, it is important to note that the examples selected for illustrating key concepts and the problem-solving process of statistical reasoning are based on real data and real-world contexts. Those of you reading this document are stakeholders, and will need to be flexible in adapting these examples to fit your instructional circumstances.

In This Section

→ Example 1: Choosing the Band for the End of the Year Party - Conducting a Survey

→ Comparing Groups

→ The Simple Experiment

→ Example 2: Growing Beans - A Simple Comparative Experiment

→ Making Use of Available Data

→ Describing Center and Spread

→ Looking for an Association

→ Example 3: Purchasing Sweatsuits - The Role of Height and Arm Span

→ Understanding Variability

→ The Role of Probability

→ Misuses of Statistics

→ Summary of Level A

Children are surrounded by data. They may think of data as a tally of students’ preferences, such as favorite type of music, or as measurements, such as students’ arm spans and number of books in school bags.

It is in Level A that children need to develop data sense, an understanding that data are more than just numbers. Statistics changes numbers into information.

Students should learn that data are generated with respect to particular contexts or situations and can be used to answer questions about the context or situation.

Opportunities should be provided for students to generate questions about a particular context (such as their classroom) and determine what data might be collected to answer these questions. Students also should learn how to use basic statistical tools to analyze the data and make informal inferences in answering the posed questions.

Finally, students should develop basic ideas of probability in order to support their later use of probability in drawing inferences at Levels B and C.

It is preferable that students actually collect data, but not necessary in every case. Teachers should take advantage of naturally occurring situations in which students notice a pattern about some data and begin to raise questions. For example, when taking daily attendance one morning, students might note that many students are absent. The teacher could capitalize on this opportunity to have the students formulate questions that could be answered with attendance data.

Specifically, Level A recommendations in the Investigative Process include:

I. Formulate the Question

→ Teachers help pose questions (questions in contexts of interest to the student).
→ Students distinguish between statistical solution and fixed answer.

II. Collect Data to Answer the Question

→ Students conduct a census of the classroom.
→ Students understand individual-to-individual natural variability.
→ Students conduct simple experiments with nonrandom assignment of treatments.
→ Students understand induced variability attributable to an experimental condition.

III. Analyze the Data

→ Students compare individual to individual.
→ Students compare individual to a group.
→ Students become aware of group to group comparison.
→ Students understand the idea of a distribution.
→ Students describe a distribution.
→ Students observe association between two variables.

Level A

→ Students use tools for exploring distributions and association, including:
▪ Bar Graph
▪ Dotplot
▪ Stem and Leaf Plot
▪ Scatterplot
▪ Tables (using counts)
▪ Mean, Median, Mode, Range
▪ Modal Category

IV. Interpret Results

→ Students infer to the classroom.
→ Students acknowledge that results may be different in another class or group.
→ Students recognize the limitation of scope of inference to the classroom.

Example 1: Choosing the Band for the End of the Year Party - Conducting a Survey

Children at Level A may be interested in the favorite type of music among students at a certain grade level. An end of the year party is being planned and there is only enough money to hire one musical group. The class might investigate the question: What type of music is most popular among students?

This question attempts to measure a characteristic in the population of children at the grade level that will have the party. The characteristic, favorite music type, is a categorical variable: each child in that grade would be placed in a particular non-numerical category based on his or her favorite music type. The resulting data often are called categorical data.

The Level A class would most likely conduct a census of the students in a particular classroom to gauge what the favorite music type might be for the whole grade. At Level A, we want students to recognize that there will be individual-to-individual variability.

For example, a survey of 24 students in one of the classrooms at a particular grade level is taken. The data are summarized in the frequency table below. This frequency table is a tabular representation that takes Level A students to a summative level for categorical data. Students might first use tally marks to record the measurements of categorical data before finding frequencies (counts) for each category.

Table 2: Frequency Count Table

Favorite    Frequency or Count
Country     8
Rap         12
Rock        4

A Level A student might first use a picture graph to represent the tallies for each category. A picture graph uses a picture of some sort (such as a type of musical band) to represent each individual. Thus, each child who favors a particular music type would put a cut-out of that type of band directly onto the graph the teacher has created on the board. Instead of a picture of a band, another representation, such as a picture of a guitar, an X, or a colored square, can be used to represent each individual preference. A child who prefers “country” would go to the board and place a guitar, dot, X, or color in a square above the column labeled “country.” In both cases, there is a deliberate recording of each data value, one at a time.
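The move from individual responses to a frequency table and picture graph can be sketched in a few lines. The individual responses below are reconstructed from the counts in Table 2 (8 country, 12 rap, 4 rock). Using one X per child and laying the graph out in horizontal rows are choices made for this illustration.

```python
from collections import Counter

# One entry per child, reconstructed from the counts in Table 2.
responses = ["Country"] * 8 + ["Rap"] * 12 + ["Rock"] * 4

# Tallying the responses gives the frequency (count) for each category.
counts = Counter(responses)

# A text picture graph: one X stands for exactly one child.
for category in ["Country", "Rap", "Rock"]:
    print(f"{category:<8} {'X' * counts[category]}  ({counts[category]})")

# The category with the largest count is the modal category.
modal_category, modal_count = counts.most_common(1)[0]
print(f"Most popular: {modal_category} ({modal_count} of {len(responses)} students)")
```

Each X is recorded for a single child, which mirrors the one-at-a-time recording that distinguishes a picture graph from a summary such as a bar graph.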

Note that a picture graph refers to a graph where an object, such as a construction paper cut-out, is used to represent one individual on the graph. (A cut-out of a tooth might be used to record how many teeth were lost by children in a kindergarten class each month.) The term pictograph often is used to refer to a graph in which a picture or symbol is used to represent several items that belong in the same category. For example, on a graph showing the distribution of car riders, walkers, and bus riders in a class, a cut-out of a school bus might be used to represent five bus riders. Thus, if the class had 13 bus riders, there would be approximately 2.5 buses on the graph.

This type of graph requires a basic understanding of proportional or multiplicative reasoning, and for this reason we do not advocate its use at Level A. Similarly, circle graphs require an understanding of proportional reasoning, so we do not advocate their use at Level A.
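The proportional reasoning a pictograph demands can be seen in the bus example, where one bus symbol stands for five riders. The short sketch below works out the number of symbols for 13 riders. The variable names are illustrative only.

```python
riders_per_symbol = 5   # one bus cut-out stands for five bus riders
bus_riders = 13         # bus riders in the class

# Exact number of symbols needed: 13 / 5 = 2.6
symbols_needed = bus_riders / riders_per_symbol

# Two whole buses cover 10 riders; 3 riders remain for a partial bus.
whole_symbols, riders_left_over = divmod(bus_riders, riders_per_symbol)

print(f"{symbols_needed} symbols")
print(f"{whole_symbols} whole symbols, plus a partial symbol for {riders_left_over} riders")
```

The exact value is 2.6 symbols, which would be drawn as roughly two and a half buses. Reading such a graph means scaling each symbol back up by five, which is the multiplicative step that makes pictographs harder than picture graphs.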

Figure 1: Picture graph of music preferences (horizontal axis: Type of Music, with categories Country, Rap, and Rock; vertical axis: Number of People Who Like This Kind of Music, marked 1 to 12)

A bar graph takes the student to the summative level with the data summarized from some other representation, such as a picture graph or a frequency count table. The bar on a bar graph is drawn as a rectangle, reaching up to the desired number on the y-axis.

A bar graph of students’ music preferences is displayed below for the census taken of the classroom represented in the above frequency count table and picture graph.

Figure 2: Bar graph of music preferences (horizontal axis: Type of Music, with categories Country, Rap, and Rock; vertical axis: Frequency, marked 0 to 12 in steps of 2)

Students at Level A should recognize the mode as a way to describe a “representative” or “typical” value for the distribution.

The mode is most useful for categorical data. Students should understand that the mode is the category that contains the most data points, often referred to as the modal category. In our favorite music example, rap music was preferred by more children, thus the mode or modal category of the data set is rap music. Students could use this information to help the teachers in seeking a musical group for the end of the year party that specializes in rap music.

The vertical axis on the bar graph in Figure 2 could be scaled in terms of the proportion or percent of the sample for each category. As this involves proportional reasoning, converting frequencies to proportions (or percentages) will be developed in Level B.

Because most of the data collected at Level A will involve a census of the students’ classroom, the first stage is for students to learn to read and interpret at a simple level what the data show about their own class. Reading and interpreting comes before inference. It is important to consider the question:

What might have caused the data to look like this?

It is also important for children to think about if and how their findings would “scale up” to a larger group, such as the entire grade level, the whole school, all children in the school system, all children in the state, or all people in the nation. They should note variables (such as age or geographic location) that might affect the data in the larger set. In the music example above, students might speculate that if they collected data on music preference from their teachers, the teachers might prefer a different type of music. Or, what would happen if they collected music preference from middle-school students in their school system? Level A students should begin recognizing the limitations of the scope of inference to a specific classroom.

Comparing Groups

Students at Level A may be interested in comparing two distinct groups with respect to some characteristic of those groups. For example, is there a difference between two groups, boys and girls, with respect to student participation in sports? The characteristic “participation in sports” is categorical (yes or no). The resulting categorical data for each gender may be analyzed using a frequency count table or bar graph. Another question Level A students might ask is whether there is a difference between boys and girls with respect to the distance they can jump, an example of taking measurements on a numerical variable. Data on numerical variables are obtained from situations that involve taking measurements, such as heights or temperatures, or situations in which objects are counted (e.g., determining the number of letters in your first name, the number of pockets on clothing worn by children in the class, or the number of siblings each child has). Such data often are called numerical data.

Returning to the question of comparing boys and girls with respect to jumping distance, students may measure the jumping distance for all of their classmates. Once the numerical data are gathered, the children might compare the lengths of girls’ and boys’ jumps using a back-to-back ordered stem and leaf plot, such as the one below.

Figure 3: Stem and leaf plot of jumping distances (Inches Jumped in the Standing Broad Jump)

            Girls |   | Boys
                  | 8 |
                  | 7 |
                  | 6 | 1
                  | 5 | 2 6 9
            9 7 2 | 4 | 1 3 5 5 5
    5 5 3 3 3 2 1 | 3 | 1 1 2 5 6 7
9 8 7 7 6 4 4 3 2 | 2 | 2 3 4 6
                  | 1 |

From the stem and leaf plot, students can get a sense of shape (more symmetric for the boys than for the girls) and of the fact that boys tend to have longer jumps. Looking ahead to Level C, the previous examples of data collection design will be more formally discussed as examples of observational studies. The researcher has no control over which students go into the boy and girl groups (the pre-existing condition of gender defines the groups). The researcher then merely observes and collects measurements on characteristics within each group.
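For readers who want to build such a plot from raw measurements, the following is a minimal sketch. The jump distances are this editor's reading of Figure 3 (tens digit as stem, ones digit as leaf), so treat the two lists as an assumption. The function names are illustrative.

```python
from collections import defaultdict

# Jump distances in inches, as read from Figure 3 (an assumed reading of the plot).
girls = [49, 47, 42, 35, 35, 33, 33, 33, 32, 31,
         29, 28, 27, 27, 26, 24, 24, 23, 22]
boys = [61, 52, 56, 59, 41, 43, 45, 45, 45,
        31, 31, 32, 35, 36, 37, 22, 23, 24, 26]


def leaves_by_stem(values):
    """Map each tens digit (stem) to its ones digits (leaves), in increasing order."""
    table = defaultdict(list)
    for value in sorted(values):
        table[value // 10].append(value % 10)
    return table


def back_to_back(left, right, stems):
    """Return the text lines of a back-to-back ordered stem and leaf plot."""
    left_table = leaves_by_stem(left)
    right_table = leaves_by_stem(right)
    lines = []
    for stem in stems:
        # Left-hand leaves grow outward from the stem, so they are printed
        # in decreasing order from left to right.
        left_leaves = " ".join(str(d) for d in reversed(left_table[stem]))
        right_leaves = " ".join(str(d) for d in right_table[stem])
        lines.append(f"{left_leaves:>17} | {stem} | {right_leaves}")
    return lines


plot_lines = back_to_back(girls, boys, range(8, 0, -1))
print(f"{'Girls':>17} |   | Boys")
print("\n".join(plot_lines))
```

Ordering the leaves within each stem is what makes this an ordered plot, and it lets students read off the smallest and largest values directly.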

The Simple Experiment

Another type of design for collecting data appropriate at Level A is a simple experiment, which consists of taking measurements on a particular condition or group. Level A students may be interested in timing the swing of a pendulum or seeing how far a toy car runs off the end of a slope from a fixed starting position (future Pinewood Derby participants?). Also, measuring the same thing several times and finding a mean helps to lay the foundation for the fact that the mean has less variability as an estimate of the true mean value than does a single reading. This idea will be developed more fully at Level C.

Example 2: Growing Beans - A Simple Comparative Experiment

A simple comparative experiment is like a science experiment in which children compare the results of two or more conditions. For example, children might plant dried beans in soil and let them sprout, and then compare which one grows fastest: the one in the light or the one in the dark. The children decide which beans will be exposed to a particular type of lighting. The conditions to be compared here are the two types of lighting environments, light and dark. The type of lighting environment is an example of a categorical variable. Measurements of the plants’ heights can be taken at the end of a specified time period to answer the question of whether one lighting environment is better for growing beans. The collected heights are an example of numerical data. In Level C, the concept of an experiment (where conditions are imposed by the researcher) will be more fully developed.

Another appropriate graphical representation for numerical data on one variable (in addition to the stem and leaf plot) at Level A is a dotplot. Both the dotplot and stem and leaf plot can be used to easily compare two or more similar sets of numerical data. In creating a dotplot, the x-axis should be labeled with a range of values that the numerical variable can assume. The x-axis for any one-variable graph conventionally is the axis representing the values of the variable under study. For example, in the bean growth experiment, children might use a dotplot to record the heights of beans (in centimeters) that were grown in the dark (labeled D) and in the light (labeled L).

Figure 4: Dotplot of environment vs. height (horizontal axis: Height (cm), marked 2 to 10; vertical axis: Environment, with rows D and L)

It is obvious from the dotplot that the plants in the light environment tend to have greater heights than the plants in the dark environment.

Looking for clusters and gaps in the distribution helps students identify the shape of the distribution. Students should develop a sense of why a distribution takes on a particular shape for the context of the variable being considered.
→ Does the distribution have one main cluster (or mound) with smaller groups of similar size on each side of the cluster? If so, the distribution might be described as symmetric.
→ Does the distribution have one main cluster with smaller groups on each side that are not the same size? Students may classify this as “lopsided,” or may use the term asymmetrical.
→ Why does the distribution take this shape? Using the dotplot from above, students will recognize both groups have distributions that are “lopsided,” with the main cluster on the lower end of the distributions and a few values to the right of the main mound.

Making Use of Available Data

Most children love to eat hot dogs, but are aware that too much sodium is not necessarily healthy. Is there a difference in the sodium content of beef hot dogs (labeled B in Figure 5) and poultry hot dogs (labeled P in Figure 5)? To investigate this question, students can make use of available data. Using data from the June 1993 issue of Consumer Reports magazine, parallel dotplots can be constructed.

Figure 5: Parallel dotplot of sodium content (title: B&P Hot Dogs; horizontal axis: Sodium (mg), marked 250 to 650 in steps of 50; vertical axis: Type, with rows B and P)

Students will notice that the distribution of the poultry hot dogs has two distinct clusters. What might explain the gap and two clusters? It could be another variable, such as the price of the poultry hot dogs, with more expensive hot dogs having less sodium. It can also be observed that the beef sodium amounts are more spread out (or vary more) than the poultry hot dogs. In addition, it appears the center of the distribution for the poultry hot dogs is higher than the center for the beef hot dogs.

As students advance to Level B, considering the shape of a distribution will lead to an understanding of what measures are appropriate for describing center and spread.

Describing Center and Spread

Students should understand that the median describes the center of a numerical data set in terms of how many data points are above and below it. The same number of data points (approximately half) lie to the left of the median and to the right of the median. Children can create a human graph to show how many letters are in their first names. All the children with two-letter names can stand in a line, with all of the children having three-letter names standing in a parallel line. Once all children are assembled, the teacher can ask one child from each end of the graph to sit down, repeating this procedure until one child is left standing, representing the median. With Level A students, we advocate using an odd number of data points so the median is clear until students have mastered the idea of a midpoint.
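The sitting-down procedure can be written out as a short routine. The sketch below assumes an odd number of children, as the text recommends, and uses hypothetical name lengths for a class of nine. The function name is illustrative.

```python
# Hypothetical first-name lengths for a class of nine children.
name_lengths = [4, 6, 3, 5, 7, 5, 4, 8, 5]


def median_by_sitting_down(values):
    """Find the median the way the human graph does: line everyone up in
    order, then have one child from each end sit down until a single
    child is left standing. Requires an odd number of values."""
    if len(values) % 2 == 0:
        raise ValueError("this procedure assumes an odd number of data points")
    standing = sorted(values)
    while len(standing) > 1:
        standing = standing[1:-1]  # one child from each end sits down
    return standing[0]


median_length = median_by_sitting_down(name_lengths)
print(f"median name length: {median_length} letters")
```

Each round removes one value below and one value above the middle, so the value left standing always has the same number of data points on either side.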

Students should understand the mean as a fair share measure of center at Level A. In the name length example, the mean would be interpreted as “How long would our names be if they were all the same length?” This can be illustrated in small groups by having children take one snap cube for each letter in their name. In small groups, have students put all the cubes in the center of the table and redistribute them one at a time so each child has the same number. Depending on the children’s experiences with fractions, they may say the mean name length is 4 R 2 or 4 1/2 or 4.5. Another example would be for the teacher to collect eight pencils of varying lengths from children and lay them end-to-end on the chalk rail. Finding the mean will answer the question “How long would each pencil be if they were all the same length?” That is, if we could glue all the pencils together and cut them into eight equal sections, how long would each section be? This can be modeled using adding machine tape (or string), by tearing off a piece of tape that is the same length as all eight pencils laid end-to-end. Then, fold the tape in half three times to get eighths, showing the length of one pencil out of eight pencils of equal length. Both of these demonstrations can be mapped directly onto the algorithm for finding the mean: combine all data values (put all cubes in the middle, lay all pencils end-to-end and measure, add all values) and share fairly (distribute the cubes, fold the tape, and divide by the number of data values). Level A students should master the computation (by hand or using appropriate technology) of the mean so more sophisticated interpretations of the mean can be developed at Levels B and C.
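The "combine, then share fairly" algorithm maps onto a few lines of arithmetic. The sketch below uses a hypothetical group of four children whose name lengths were chosen so the result matches the "4 R 2", "4 1/2", and "4.5" answers mentioned in the text.

```python
# Hypothetical name lengths for a small group of four children
# (one snap cube per letter in each child's name).
cubes_per_child = [3, 5, 4, 6]

# Step 1: combine all data values (put every cube in the middle of the table).
total_cubes = sum(cubes_per_child)
children = len(cubes_per_child)

# Step 2: share fairly. Dealing out whole cubes leaves a remainder.
fair_share, remainder = divmod(total_cubes, children)

# Splitting the leftover cubes as well gives the mean.
mean_length = total_cubes / children

print(f"{total_cubes} cubes shared among {children} children")
print(f"whole-cube answer: {fair_share} R {remainder}")
print(f"mean name length: {mean_length}")
```

The whole-cube answer and the decimal answer describe the same fair share. Which form a child gives depends on whether the leftover cubes are themselves split.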

The mean and median are measures of location for describing the center of a numerical data set. Determining the maximum and minimum values of a numerical data set assists children in describing the position of the smallest and largest value in a data set. In addition to describing the center of a data set, it is useful to know how the data vary or how spread out the data are.

One measure of spread for a distribution is the range, which is the difference between the maximum and minimum values. Measures of spread only make sense with numerical data.

In looking at the stem and leaf plot formed for the jumping distances (Figure 3), the range differs for boys (range = 39 inches) and girls (range = 27 inches). Girls are more consistent in their jumping distances than boys.
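The two ranges quoted above can be checked directly. The values below are this editor's reading of the stem and leaf plot in Figure 3 (tens digit as stem, ones digit as leaf), so the lists themselves are an assumption. The resulting ranges agree with the 39 and 27 inches stated in the text.

```python
# Jump distances in inches, as read from Figure 3 (an assumed reading of the plot).
girls = [49, 47, 42, 35, 35, 33, 33, 33, 32, 31,
         29, 28, 27, 27, 26, 24, 24, 23, 22]
boys = [61, 52, 56, 59, 41, 43, 45, 45, 45,
        31, 31, 32, 35, 36, 37, 22, 23, 24, 26]


def data_range(values):
    """Range = maximum value minus minimum value."""
    return max(values) - min(values)


boys_range = data_range(boys)
girls_range = data_range(girls)

print(f"boys:  {min(boys)} to {max(boys)} inches, range = {boys_range}")
print(f"girls: {min(girls)} to {max(girls)} inches, range = {girls_range}")
```

Because the range uses only the two most extreme values, a single unusually long or short jump can change it considerably, which is worth keeping in mind when comparing groups.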

Looking for an Association

Students should be able to look at the possible association of a numerical variable and a categorical variable by comparing dotplots of a numerical variable disaggregated by a categorical variable. For example, using the parallel dotplots showing the growth habits of beans in the light and dark, students should look for similarities within each category and differences between the categories. As mentioned earlier, students should readily recognize from the dotplot that the beans grown in the light environment have grown taller overall, and therefore reason that it is best for beans to have a light environment. Measures of center and spread also can be compared. For example, students could calculate or make a visual estimate of the mean height of the beans grown in the light and the beans grown in the dark to substantiate their claim that light conditions are better for beans. They also might note that the range for plants grown in the dark is 4 cm, and 5 cm for plants grown in the light. Putting that information together with the mean should enable students to further solidify their conclusions about the advantages of growing beans in the light.

Considering the hot dog data, one general impression from the dotplot is that there is more variation in the sodium content for beef hot dogs. For beef hot dogs, the sodium content is between 250 mg and 650 mg, while for poultry hot dogs, the sodium content is between 350 mg and 600 mg. Neither the centers nor the shapes for the distributions are obvious from the dotplots. It is interesting to note the two apparent clusters of data for poultry hot dogs. Nine of the 17 poultry hot dogs have sodium content between 350 mg and 450 mg, while eight of the 17 poultry hot dogs have sodium content between 500 mg and 600 mg. A possible explanation for this division is that some poultry hot dogs are made from chicken, while others are made from turkey.

Example 3: Purchasing Sweat Suits - The Role of Height and Arm Span

What about the association between two numerical variables? Parent-teacher organizations at elementary schools often sell “spirit wear,” such as sweatshirts and sweatpants with the school name and mascot, as a popular fundraiser. The organizers need to have some guidelines about how many of each size garment to order. Should they offer the shirt and pants separately, or offer the sweatshirt and sweatpants as one outfit? Are the heights and arm spans of elementary students closely related, or do they differ considerably due to individual growing patterns of children? Thus, some useful questions to answer are:

Is there an association between height and arm span?

How strong is the association between height and arm span?

A scatterplot can be used to graphically represent data when values of two numerical variables are obtained from the same individual or object. Can we use height to predict a person’s arm span? Students can measure each other’s heights and arm spans, and then construct a scatterplot to look for a relationship between these two numerical variables. Data on height and arm span are measured (in centimeters) for 26 students. The data presented below are for college students and are included for illustrative purposes.

Figure 6: Scatterplot of arm span vs. height (horizontal axis: Height (cm), marked 155 to 190; vertical axis: Arm Span (cm), marked 150 to 190)

With the use of a scatterplot, Level A students can visually look for trends and patterns.

For example, in the arm span versus height scatterplot above, students should be able to identify the consistent relationship between the two variables: generally as one gets larger, so does the other. Based on these data, the organizers might feel comfortable ordering some complete outfits of sweatshirt and sweatpants based on sizes. However, some students may need to order the sweatshirt and sweatpants separately based on sizes. Another important question the organizers will need to ask is whether this sample is representative of all the students in the school. How was the sample chosen?

Students at Level A also can use a scatterplot to look graphically at how the values of a numerical variable change over time, referred to as a time plot. For example, children might chart the outside temperature at various times during the day by recording the values themselves or by using data from a newspaper or the internet.

Figure 7: Timeplot of temperature vs. time (horizontal axis: Time, marked every three hours from 12:00 a.m. to 9:00 p.m.; vertical axis: Temperature (F), marked 0 to 70)

When students advance to Level B, they will quantify these trends and patterns with measures of association.

Understanding Variability

Students should explore possible reasons data look the way they do and differentiate between variation and error. For example, in graphing the colors of candies in a small packet, children might expect the colors to be evenly distributed (or they may know from prior experience that they are not). Children could speculate about why certain colors appear more or less frequently due to variation (e.g., cost of dyes, market research on people’s preferences, etc.). Children also could identify possible places where errors could have occurred in their handling of the data/candies (e.g., dropped candies, candies stuck in bag, eaten candies, candies given away to others, colors not recorded because they don’t match personal preference, miscounting). Teachers should capitalize on naturally occurring “errors” that happen when collecting data in the classroom and help students speculate about the impact of these errors on the final results. For example, when asking students to vote for their favorite food, it is common for students to vote twice, to forget to vote, to record their vote in the wrong spot, to misunderstand what is being asked, to change their mind, or to want to vote for an option that is not listed. Counting errors are also common among young children, which can lead to incorrect tallies of data points in categories. Teachers can help students think about how these events might affect the final outcome if only one person did this, if several people did it, or if many people did it. Students can generate additional examples of ways errors might occur in a particular data-gathering situation.

The notions of error and variability should be used to explain the outliers, clusters, and gaps students observe in the graphical representations of the data. An understanding of error versus natural variability will help students interpret whether an outlier is a legitimate data value that is unusual or whether the outlier is due to a recording error.

At Level A, it is imperative that students begin to understand the concept of variability. As students move from Level A to Level B to Level C, it is important to always keep at the forefront that understanding variability is the essence of developing data sense.

The Role of Probability

Level A students need to develop basic ideas of probability in order to support their later use of probability in drawing inferences at Levels B and C.

At Level A, students should understand that probability is a measure of the chance that something will happen. It is a measure of certainty or uncertainty. Events should be seen as lying on a continuum from impossible to certain, with less likely, equally likely, and more likely lying in between. Students learn to informally assign numbers to the likelihood that something will occur.


An example of assigning numbers on a number line is given below:

0    Impossible
¼    Unlikely or less likely
½    Equally likely to occur and not occur
¾    Likely or more likely
1    Certain

Students should have experiences estimating probabilities using empirical data. Through experimentation (or simulation), students should develop an explicit understanding of the notion that the more times you repeat a random phenomenon, the closer the results will be to the expected mathematical model. At Level A, we are considering only simple models based on equally likely outcomes or, at the most, something based on this, such as the sum of the faces on two number cubes. For example, very young children can state that a penny should land on heads half the time and on tails half the time when flipped. The student has given the expected model and probability for tossing a head or tail, assuming that the coin is “fair.”

If a child flips a penny 10 times to obtain empirical data, it is quite possible he or she will not get five heads and five tails. However, if the child flips the coin hundreds of times, we would expect to see that results will begin stabilizing to the expected probabilities of .5 for heads and .5 for tails. This is known as the Law of Large Numbers. Thus, at Level A, probability experiments should focus on obtaining empirical data to develop relative frequency interpretations that children can easily translate to models with known and understandable “mathematical” probabilities. The classic flipping coins, spinning simple spinners, and tossing number cubes are reliable tools to use in helping Level A students develop an understanding of probability. The concept of relative frequency interpretations will be important at Level B when the student works with proportional reasoning, going from counts or frequencies to proportions or percentages.
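The stabilizing of relative frequencies described above can be explored with a short simulation. The following is a minimal sketch in Python, not part of the original activity; the function name `proportion_heads`, the seed, and the chosen numbers of flips are illustrative assumptions.

```python
import random

def proportion_heads(n_flips, rng):
    """Flip a simulated fair coin n_flips times and return the proportion of heads."""
    heads = sum(1 for _ in range(n_flips) if rng.random() < 0.5)
    return heads / n_flips

# A fixed seed (an arbitrary choice) makes the run repeatable.
rng = random.Random(2016)

# With few flips the proportion can be far from 0.5;
# with many flips it tends to settle near 0.5.
for n_flips in (10, 100, 1000, 10000):
    print(n_flips, round(proportion_heads(n_flips, rng), 3))
```

Running the loop with different seeds gives different results for 10 flips, while the results for 10,000 flips stay close to .5, which mirrors what students see when they pool their coin-flip data.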

As students work with results from repeating random phenomena, they can develop an understanding for the concept of randomness. They will see that when flipping a coin 10 times, although we would expect five heads and five tails, the actual results will vary from one student to the next. They also will see that if a head results on one toss, that doesn’t mean the next flip will result in a tail. Because coin tossing is a random experiment, there is always uncertainty as to how the coin will land from one toss to the next. However, at Level A, students can begin to develop the notion that although we have uncertainty and variability in our results, by examining what happens to the random process in the long run, we can quantify the uncertainty and variability with probabilities, giving a predictive number for the likelihood of an outcome in the long run. At Level B, students will see the role probability plays in the development of the concept of the simple random sample and the role probability plays with randomness.

Misuses of Statistics

The Level A student should learn that proper use of statistical terminology is as important as the proper use of statistical tools. In particular, the proper use of the mean and median should be emphasized. These numerical summaries are appropriate for describing numerical variables, not categorical variables. For example, when collecting categorical data on favorite type of music, the number of children in the sample who prefer each type of music is summarized as a frequency. It is easy to confuse categorical and numerical data in this case and try to find the mean or median of the frequencies for favorite type of music. However, one cannot use the frequency counts to compute a mean or median for a categorical variable. The frequency counts are the numerical summary for the categorical variable.

Another common mistake for the Level A student is the inappropriate use of a bar graph with numerical data. A bar graph is used to summarize categorical data. If a variable is numerical, the appropriate graphical display with bars is called a histogram, which is introduced in Level B. At Level A, appropriate graphical displays for numerical data are the dotplot and the stem and leaf plot.

Summary of Level A

If students become comfortable with the ideas and concepts described above, they will be prepared to further develop and enhance their understanding of the key concepts for data sense at Level B.

It is also important to recognize that helping students develop data sense at Level A allows mathematics instruction to be driven by data. The traditional mathematics strands of algebra, functions, geometry, and measurement all can be developed with the use of data. Making sense of data should be an integrated part of the mathematics curriculum, starting in pre-kindergarten.


In This Section

→ Example 1, Level A Revisited: Choosing a Band for the School Dance

→ Connecting Two Categorical Variables

→ Questionnaires and Their Difficulties

→ Measure of Location—The Mean as a Balance Point

→ A Measure of Spread—The Mean Absolute Deviation

→ Representing Data Distributions— The Frequency Table and Histogram

→ Comparing Distributions— The Boxplot

→ Measuring the Strength of Association between Two Quantitative Variables

→ Modeling Linear Association

→ The Importance of Random Selection

→ Comparative Experiments

→ Time Series

→ Misuses of Statistics

→ Summary of Level B


Instruction at Level B should build on the statistical base developed at Level A and set the stage for statistics at Level C. Instructional activities at Level B should continue to emphasize the four main components in the investigative process and have the spirit of genuine statistical practice. Students who complete Level B should see statistical reasoning as a process for solving problems through data and quantitative reasoning.

At Level B, students become more aware of the statistical question distinction (a question with an answer based on data that vary versus a question with a deterministic answer). They also should make decisions about what variables to measure and how to measure them in order to address the question posed.

Students should use and expand the graphical, tabular, and numerical summaries introduced at Level A to investigate more sophisticated problems. Also, when selecting a sample, students should develop a basic understanding of the role probability plays in random selection, and in random assignment when conducting an experiment.

At Level B, students investigate problems with more emphasis placed on possible associations among two or more variables and understand how a more sophisticated collection of graphical, tabular, and numerical summaries is used to address these questions. Finally, students recognize ways in which statistics is used or misused in their world.

Specifically, Level B recommendations in the Investigative Process include:

I. Formulate Questions

→ Students begin to pose their own questions.

→ Students address questions involving a group larger than their classroom and begin to recognize the distinction among a population, a census, and a sample.

II. Collect Data

→ Students conduct censuses of two or more classrooms.

→ Students design and conduct nonrandom sample surveys and begin to use random selection.

→ Students design and conduct comparative experiments and begin to use random assignment.

III. Analyze Data

→ Students expand their understanding of a data distribution.

→ Students quantify variability within a group.

→ Students compare two or more distributions using graphical displays and numerical summaries.

Level B


→ Students use more sophisticated tools for summarizing and comparing distributions, including:
  ▪ Histograms
  ▪ The IQR (Interquartile Range) and MAD (Mean Absolute Deviation)
  ▪ Five-Number Summaries and boxplots

→ Students acknowledge sampling error.

→ Students quantify the strength of association between two variables, develop simple models for association between two numerical variables, and use expanded tools for exploring association, including:
  ▪ Contingency tables for two categorical variables
  ▪ Time series plots
  ▪ The QCR (Quadrant Count Ratio) as a measure of strength of association
  ▪ Simple lines for modeling association between two numerical variables

IV. Interpret Results

→ Students describe differences between two or more groups with respect to center, spread, and shape.

→ Students acknowledge that a sample may not be representative of a larger population.

→ Students understand basic interpretations of measures of association.

→ Students begin to distinguish between an observational study and a designed experiment.

→ Students begin to distinguish between “association” and “cause and effect.”

→ Students recognize sampling variability in summary statistics, such as the sample mean and the sample proportion.

Example 1, Level A Revisited: Choosing a Band for the School Dance

Many of the graphical, tabular, and numerical summaries introduced at Level A can be enhanced and used to investigate more sophisticated problems at Level B. Let’s revisit the problem of planning for the school dance introduced in Level A, in which, by conducting a census of the class, a Level A class investigated the question:

What type of music is most popular among students?

Recall that the class was considered to be the entire population, and data were collected on every member of the population. A similar investigation at Level B would include recognition that one class may not be representative of the opinions of all students at the school. Level B students might want to compare the opinions of their class with the opinions of other classes from their school. A Level B class might investigate the questions:

What type of music is most popular among students at our school?


How do the favorite types of music differ between classes?

As class sizes may be different, results should be summarized with relative frequencies or percents in order to make comparisons. Percentages are useful in that they allow us to think of having comparable results for groups of size 100. Level B students will see more emphasis on proportional reasoning throughout the mathematics curriculum, and they should be comfortable summarizing and interpreting data in terms of percents or fractions.

The results from two classes are summarized in Table 3 using both frequency and relative frequency (percents).

The bar graph below compares the percent of each favorite music category for the two classes.

Students at Level B should begin to recognize that there is not only variability from one individual to another within a group, but also in results from one group to another. This second type of variability is illustrated by the fact that the most popular music is rap music in Class 1, while it is rock music in Class 2. That is, the mode for Class 1 is rap music, while the mode for Class 2 is rock music.

Table 3: Frequencies and Relative Frequencies

Class 1
Favorite    Frequency    Relative Frequency Percentage
Country     8            33%
Rap         12           50%
Rock        4            17%
Total       24           100%

Class 2
Favorite    Frequency    Relative Frequency Percentage
Country     5            17%
Rap         11           37%
Rock        14           47%
Total       30           101%

[Figure 8: Comparative bar graph for music preferences. Bars show the percent (vertical axis, 0 to 60) choosing Country, Rap, and Rock in Class 1 and Class 2.]

The results from the two samples might be combined in order to have a larger sample of the entire school. The combined results indicate rap music is the favorite type of music for 43% of the students, rock music is preferred by 33%, while only 24% of the students selected country music as their favorite. Level B students should recognize that although this is a larger sample, it still may not be representative of the entire population (all students at their school). In statistics, randomness and probability are incorporated into the sample selection procedure in order to provide a method that is “fair” and to improve the chances of selecting a representative sample. For example, if the class decides to select what is called a simple random sample of 54 students, then each possible sample of 54 students has the same probability of being selected. This application illustrates one of the roles of probability in statistics. Although Level B students may not actually employ a random selection procedure when collecting data, issues related to obtaining representative samples should be discussed at this level.
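The percentages for each class and for the combined sample can be checked with a few lines of code. This is a minimal sketch using the counts from Table 3; the names `class_1`, `class_2`, and `relative_frequencies` are illustrative and not part of the original text.

```python
# Frequencies of favorite music type, taken from Table 3.
class_1 = {"Country": 8, "Rap": 12, "Rock": 4}
class_2 = {"Country": 5, "Rap": 11, "Rock": 14}

def relative_frequencies(counts):
    """Convert category counts to whole-number percentages of the group total."""
    total = sum(counts.values())
    return {category: round(100 * count / total) for category, count in counts.items()}

# Combine the two classes by adding the counts category by category.
combined = {category: class_1[category] + class_2[category] for category in class_1}

print(relative_frequencies(class_1))
print(relative_frequencies(class_2))
print(relative_frequencies(combined))
```

Because each percentage is rounded separately, the percentages for a group need not add to exactly 100, which is why the Class 2 total in Table 3 is 101%.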

Connecting Two Categorical Variables

As rap was the most popular music for the two combined classes, the students might argue for a rap group for the dance. However, more than half of those surveyed preferred either rock or country music. Will these students be unhappy if a rap band is chosen? Not necessarily, as many students who like rock music also may like rap music. To investigate this problem, students might explore two additional questions:

Do students who like rock music tend to like or dislike rap music?

Do students who like country music tend to like or dislike rap music?

To address these questions, the survey should ask students not only their favorite type of music, but also whether they like rap, rock, and country music.

The two-way frequency table (or contingency table) below provides a way to investigate possible connections between two categorical variables.

According to these results, of the 33 students who liked rock music, 27 also liked rap music. That is, 82% (27/33) of the students who like rock music also like rap music. This indicates that students who like rock music tend to like rap music as well. Once again, notice the use of proportional reasoning in interpreting these results. A similar analysis could be performed to determine if students who like country tend to like or dislike rap music. A more detailed discussion of this example and a measure of association between two categorical variables is given in the Appendix for Level B.

Table 4: Two-Way Frequency Table

                         Like Rap Music?
Like Rock Music?    Yes    No    Row Totals
Yes                 27     6     33
No                  4      17    21
Column Totals       31     23    54
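The proportional reasoning applied to Table 4 amounts to computing a percentage within each row. The sketch below is one way to do this, assuming the counts in Table 4; the dictionary layout and the function name `percent_liking_rap` are illustrative choices.

```python
# Counts from Table 4, organized by row (rock preference) and column (rap preference).
counts = {
    "likes rock": {"likes rap": 27, "dislikes rap": 6},
    "dislikes rock": {"likes rap": 4, "dislikes rap": 17},
}

def percent_liking_rap(rock_group):
    """Percent of students in the given rock-preference row who like rap music."""
    row = counts[rock_group]
    return 100 * row["likes rap"] / sum(row.values())

for rock_group in counts:
    print(rock_group, round(percent_liking_rap(rock_group)))
```

Comparing the two row percentages (27 of 33 versus 4 of 21) shows how differently the two groups feel about rap music.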


Questionnaires and Their Difficulties

At Level B, students should begin to learn about surveys and the many pitfalls to avoid when designing and conducting them. One issue involves the wording of questions. Questions must be unambiguous and easy to understand. For example, the question:

Are you against the school implementing a no-door policy on bathroom stalls?

is worded in a confusing way. An alternative way to pose this question is:

The school is considering implementing a no-door policy on bathroom stalls. What is your opinion regarding this policy?

Strongly Oppose    Oppose    No Opinion    Support    Strongly Support

Questions should avoid leading the respondent to an answer. For example, the question: Since our football team hasn’t had a winning season in 20 years and is costing the school money, rather than generating funds, do you feel we should concentrate more on another sport, such as soccer or basketball?

is worded in a way that is biased against the football team.

The responses to questions with coded responses should include all possible answers, and the answers should not overlap. For example, for the question:

How much time do you spend studying at home on a typical night?

the responses:

none    1 hour or less    1 hour or more

would confuse a student who spends one hour a night studying.

There are many other considerations about question formulation and conducting sample surveys that can be introduced at Level B. Two such issues are how the interviewer asks the questions and how accurately the responses are recorded. It is important for students to realize that the conclusions from their study depend on the accuracy of their data.

Measure of Location—The Mean as a Balance Point

Another idea developed at Level A that can be expanded at Level B is the mean as a numerical summary of center for a collection of numerical data. At Level A, the mean is interpreted as the “fair share” value for data. That is, the mean is the value you would get if all the data from subjects are combined and then evenly redistributed so each subject’s value is the same. Another interpretation of the mean is that it is the balance point of the corresponding data distribution. Here is an outline of an activity that illustrates the notion of the mean as a balance point. Nine students were asked:

How many pets do you have?

The resulting data were 1, 3, 4, 4, 4, 5, 7, 8, 9. These data are summarized in the dotplot shown in Figure 9. Note that in the actual activity, stick-on notes were used as “dots” instead of Xs.


[Figure 9: Dotplot for pet count. On a horizontal axis from 1 to 9, there is one X over each of 1, 3, 5, 7, 8, and 9, and three Xs over 4.]

If the pets are combined into one group, there are a total of 45 pets. If the pets are evenly redistributed among the nine students, then each student would get five pets. That is, the mean number of pets is five. The dotplot representing the result that all nine students have exactly five pets is shown below:

[Figure 10: Dotplot showing pets evenly distributed. All nine Xs are stacked over 5.]

It is hopefully obvious that if a pivot is placed at the value 5, then the horizontal axis will “balance” at this pivot point. That is, the “balance point” for the horizontal axis for this dotplot is 5. What is the balance point for the dotplot displaying the original data?

We begin by noting what happens if one of the dots over 5 is removed and placed over the value 7, as shown below:

[Figure 11: Dotplot with one data point moved. Eight Xs over 5 and one X over 7.]

Clearly, if the pivot remains at 5, the horizontal axis will tilt to the right. What can be done to the remaining dots over 5 to “rebalance” the horizontal axis at the pivot point? Since 7 is two units above 5, one solution is to move a dot two units below 5 to 3, as shown below:

[Figure 12: Dotplot with two data points moved. One X over 3, seven Xs over 5, and one X over 7.]


The horizontal axis is now rebalanced at the pivot point. Is this the only way to rebalance the axis at 5? No. Another way to rebalance the axis at the pivot point would be to move two dots from 5 to 4, as shown below:

[Figure 13: Dotplot with different data points moved. Two Xs over 4, six Xs over 5, and one X over 7.]

The horizontal axis is now rebalanced at the pivot point. That is, the “balance point” for the horizontal axis for this dotplot is 5. Replacing each “X” (dot) in this plot with the distance between the value and 5, we have:

[Figure 14: Dotplot showing distance from 5. Each X is replaced by its distance from 5: a 1 for each of the two values at 4, a 0 for each of the six values at 5, and a 2 for the value at 7.]

Notice that the total distance for the two values below the 5 (the two 4s) is the same as the total distance for the one value above the 5 (the 7). For this reason, the balance point of the horizontal axis is 5. Replacing each value in the dotplot of the original data by its distance from 5 yields the following plot:

[Figure 15: Dotplot showing original data and distance from 5. The distances are 4 over 1; 2 over 3; 1, 1, and 1 over 4; 0 over 5; 2 over 7; 3 over 8; and 4 over 9.]

The total distance for the values below 5 is 9, the same as the total distance for the values above 5. For this reason, the mean (5) is the balance point of the horizontal axis.

Both the mean and median often are referred to as measures of central location. At Level A, the median also was introduced as the quantity that has the same number of data values on each side of it in the ordered data. This “sameness of each side” is the reason the median is a measure of central location. The previous activity demonstrates that the total distance for the values below the mean is the same as the total distance for the values above the mean, and illustrates why the mean also is considered to be a measure of central location.
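The balance-point property can be verified numerically for the pet data. The following is a small sketch, assuming the nine data values given above; the function name `distances_about` is an illustrative choice rather than anything defined in the original activity.

```python
pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]

def distances_about(data, pivot):
    """Return (total distance of values below the pivot, total distance of values above it)."""
    below = sum(pivot - x for x in data if x < pivot)
    above = sum(x - pivot for x in data if x > pivot)
    return below, above

mean = sum(pets) / len(pets)  # 45 pets shared among 9 students

# At the mean the two totals agree, so the axis balances.
print(mean, distances_about(pets, mean))

# At another pivot, such as the median (4), the totals differ and the axis tilts.
print(distances_about(pets, 4))
```

Trying several pivots shows that the totals below and above match only at the mean, which is what the stick-on note activity demonstrates by hand.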


A Measure of Spread—The Mean Absolute Deviation

Statistics is concerned with variability in data. One important idea is to quantify how much variability exists in a collection of numerical data. Quantities that measure the degree of variability in data are called measures of spread. At Level A, students are introduced to the range as a measure of spread in numerical data. At Level B, students should be introduced to the idea of comparing data values to a central value, such as the mean or median, and quantifying how different the data are from this central value.

In the number of pets example, how different are the original data values from the mean? One way to measure the degree of variability from the mean is to determine the total distance of all values from the mean. Using the final dotplot from the previous example, the total distance the nine data values are from the mean of 5 pets is 18 pets. The magnitude of this quantity depends on several factors, including the number of measurements. To adjust for the number of measurements, the total distance from the mean is divided by the number of measurements. The resulting quantity is called the Mean Absolute Deviation, or MAD. The MAD is the average distance of each data value from the mean. That is:

MAD = (Total Distance from the Mean for All Values) / (Number of Data Values)

The MAD for the data on number of pets from the previous activity is:

MAD = 18/9 = 2

The MAD indicates that the actual number of pets for the nine students differs from the mean of five pets by two pets, on average. Kader (1999) gives a thorough discussion of this activity and the MAD.

The MAD is an indicator of spread based on all the data and provides a measure of average variation in the data from the mean. The MAD also serves as a precursor to the standard deviation, which will be developed at Level C.
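The MAD calculation for the pet data can be written out directly from its definition. This is a minimal sketch; the function name `mean_absolute_deviation` is an illustrative choice.

```python
def mean_absolute_deviation(data):
    """Average distance of the data values from their mean."""
    mean = sum(data) / len(data)
    total_distance = sum(abs(x - mean) for x in data)
    return total_distance / len(data)

pets = [1, 3, 4, 4, 4, 5, 7, 8, 9]

# Total distance from the mean of 5 is 18, spread over 9 values.
print(mean_absolute_deviation(pets))
```

A data set in which every student has the same number of pets would have a MAD of 0, since no value differs from the mean.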

Representing Data Distributions—The Frequency Table and Histogram

At Level B, students should develop additional tabular and graphical devices for representing data distributions of numerical variables. Several of these build upon representations developed at Level A. For example, students at Level B might explore the problem of placing an order for hats. To prepare an order, one needs to know which hat sizes are most common and which occur least often. To obtain information about hat sizes, it is necessary to measure head circumferences. European hat sizes are based on the metric system. For example, a European hat size of 55 is designed to fit a person with a head circumference of between 550 mm and 559 mm. In planning an order for adults, students might collect preliminary data on the head circumferences of their parents, guardians, or other adults. Such data would be the result of a nonrandom sample. The data summarized in the following stemplot (also known as stem and leaf plot) are head circumferences measured in millimeters for a sample of 55 adults.

51 | 3
52 | 5
53 | 133455
54 | 2334699
55 | 12222345
56 | 0133355588
57 | 113477
58 | 02334458
59 | 1558
60 | 13
61 | 28

51 | 3 means 513 mm

Figure 16: Stemplot of head circumference

Based on the stemplot, some head sizes do appear to be more common than others. Head circumferences in the 560s are most common. Head circumferences fall off in a somewhat symmetric manner on both sides of the 560s, with very few smaller than 530 mm or larger than 600 mm.

In practice, a decision of how many hats to order would be based on a much larger sample, possibly hundreds or even thousands of adults. If a larger sample was available, a stemplot would not be a practical device for summarizing the data distribution. An alternative to the stemplot is to form a distribution based on dividing the data into groups or intervals. This method can be illustrated through a smaller data set, such as the 55 head circumferences, but is applicable for larger data sets as well. The grouped frequency and grouped relative frequency distributions and the relative frequency histogram that correspond to the above stemplot are:

[Figure 17: Relative frequency histogram. Horizontal axis: Head Circumference (mm), from 510 to 610; vertical axis: relative frequency (%), from 5 to 20.]
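The grouping of the 55 head circumferences into 10 mm intervals can be sketched in code. The example below rebuilds the data from the stemplot and tallies each interval; the variable names and the use of `collections.Counter` are illustrative choices, not part of the original activity.

```python
from collections import Counter

# Stems (in tens of mm) and leaves (units digit), copied from the stemplot.
stemplot = {
    51: "3", 52: "5", 53: "133455", 54: "2334699", 55: "12222345",
    56: "0133355588", 57: "113477", 58: "02334458", 59: "1558",
    60: "13", 61: "28",
}

# Rebuild each measurement, e.g. stem 51 with leaf 3 is 513 mm.
circumferences = [10 * stem + int(leaf)
                  for stem, leaves in stemplot.items() for leaf in leaves]

# Assign each value to the lower limit of its 10 mm interval, then count.
frequency = Counter(10 * (x // 10) for x in circumferences)
n = len(circumferences)
relative_frequency = {lower: round(100 * count / n, 1)
                      for lower, count in frequency.items()}

for lower in sorted(frequency):
    print(f"{lower}-<{lower + 10}", frequency[lower], relative_frequency[lower])
```

The same approach works unchanged for a sample of hundreds or thousands of measurements, where a stemplot would no longer be practical.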

Page 84: NCTM HSSI 2016: MP’S THROUGH A STATISTICAL …...2 NCTM HSSI 2016: MP’S THROUGH A STATISTICAL LENS, PART 2: SIMULATIONS 2. a) Now, suppose that our results show strong evidence

46

If the hat manufacturer requires that orders be in multiples of 250 hats, then based on the above re-sults, how many hats of each size should be ordered? Using the relative frequency distribution, the num-ber of hats of each size for an order of 250 hats is shown in Table 6.

Once again, notice how students at Level B would uti-lize proportional reasoning to determine the number of each size to order. Kader and Perry (1994) give a detailed description of “The Hat Shop” problem.

Stem    Limits on Recorded Measurements   Interval of Actual      Frequency   Relative
        on Head Circumference             Head Circumferences                 Frequency (%)
51      510–519                           510–<520                1           1.8
52      520–529                           520–<530                1           1.8
53      530–539                           530–<540                6           10.9
54      540–549                           540–<550                7           12.7
55      550–559                           550–<560                8           14.5
56      560–569                           560–<570                10          18.2
57      570–579                           570–<580                6           10.9
58      580–589                           580–<590                8           14.5
59      590–599                           590–<600                4           7.3
60      600–609                           600–<610                2           3.6
61      610–619                           610–<620                2           3.6
Total                                                             55          99.8

Table 5: Grouped Frequency and Grouped Relative Frequency Distributions

Hat Size   Number to Order
51         5
52         5
53         27
54         32
55         36
56         46
57         27
58         36
59         18
60         9
61         9

Table 6: Hat Size Data

Comparing Distributions—The Boxplot

Problems that require comparing distributions for two or more groups are common in statistics. For example, at Level A students compared the amount of sodium in beef and poultry hot dogs by examining parallel dotplots. At Level B, more sophisticated representations should be developed for comparing distributions. One of the most useful graphical devices for comparing distributions of numerical data is the boxplot. The boxplot (also called a box-and-whiskers plot) is a graph based on a division of the ordered data into four groups, with the same number of data values in each group (approximately one-fourth). The four groups are determined from the Five-Number Summary (the minimum data value, the first quartile, the median, the third quartile, and the maximum data value). The Five-Number Summaries and comparative boxplots for the data on sodium content for beef (labeled B) and poultry (labeled P) hot dogs introduced in Level A are given in Table 7 and Figure 18.

                 Beef Hot Dogs (n = 20)   Poultry Hot Dogs (n = 17)
Minimum          253                      357
First Quartile   320.5                    379
Median           380.5                    430
Third Quartile   478                      535
Maximum          645                      588

Table 7: Five-Number Summaries for Sodium Content

Figure 18: Boxplot for sodium content (comparative boxplots titled “B&P Hot Dogs”; horizontal axis: Sodium (mg), 250 to 650; vertical axis: Type, B and P)

Interpreting results based on such an analysis requires comparisons based on global characteristics of each distribution (center, spread, and shape). For example, the median sodium content for poultry hot dogs is 430 mg, almost 50 mg more than the median sodium content for beef hot dogs. The medians indicate that a typical value for the sodium content of poultry hot dogs is greater than a typical value for beef hot dogs. The range for the beef hot dogs is 392 mg, versus 231 mg for the poultry hot dogs. The ranges indicate that, overall, there is more spread (variation) in the sodium content of beef hot dogs than poultry hot dogs.

Another measure of spread that should be introduced at Level B is the interquartile range, or IQR. The IQR is the difference between the third and first quartiles, and indicates the range of the middle 50% of the data. The IQRs for sodium content are 157.5 mg for beef hot dogs and 156 mg for poultry hot dogs. The IQRs suggest that the spread within the middle half of data for beef hot dogs is similar to the spread within the middle half of data for poultry hot dogs. The boxplots also suggest that each distribution is somewhat skewed right. That is, each distribution appears to have somewhat more variation in the upper half. Considering the degree of variation in the data and the amount of overlap in the boxplots, a difference of 50 mg between the medians is not really that large. Finally, it is interesting to note that more than 25% of beef hot dogs have less sodium than all poultry hot dogs. On the other hand, the highest sodium levels are for beef hot dogs.
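The range, IQR, and median comparisons quoted above can be recomputed directly from the Five-Number Summaries in Table 7. The following is a minimal sketch; the dictionary layout and the function name `spread_measures` are illustrative choices, not part of the original material.

```python
# Five-number summaries for sodium content (mg), copied from Table 7.
summaries = {
    "beef":    {"min": 253, "q1": 320.5, "median": 380.5, "q3": 478, "max": 645},
    "poultry": {"min": 357, "q1": 379,   "median": 430,   "q3": 535, "max": 588},
}


def spread_measures(s):
    """Return (range, IQR) for one five-number summary."""
    return s["max"] - s["min"], s["q3"] - s["q1"]


for name, s in summaries.items():
    data_range, iqr = spread_measures(s)
    print(f"{name:8s} median={s['median']}  range={data_range}  IQR={iqr}")

median_gap = summaries["poultry"]["median"] - summaries["beef"]["median"]
print(f"difference in medians (poultry - beef): {median_gap} mg")

# The first quartile for beef lies below the poultry minimum, so roughly the
# lowest quarter (or more) of beef hot dogs has less sodium than every
# poultry hot dog.
beef_q1_below_poultry_min = summaries["beef"]["q1"] < summaries["poultry"]["min"]
print("beef Q1 below poultry minimum:", beef_q1_below_poultry_min)
```

Working from the summaries alone is enough here because every claim in the comparison (center, range, IQR, and the quartile-versus-minimum observation) depends only on those five values per group.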

Note that there are several variations of boxplots. At Level C, performing an analysis using boxplots might include a test for outliers (values that are extremely large or small when compared to the variation in the majority of the data). If outliers are identified, they often are detached from the “whiskers” of the plot. Outlier analysis is not recommended at Level B, so whiskers extend to the minimum and maximum data values. However, Level B students may encounter outliers when using statistical software or graphing calculators.

Measuring the Strength of Association between Two Quantitative Variables

At Level B, more sophisticated data representations should be developed for the investigation of problems that involve the examination of the relationship between two numeric variables. At Level A, the problem of packaging sweat suits (shirt and pants together or separate) was examined through a study of the relationship between height and arm span. There are several statistical questions related to this problem that can be addressed at Level B with a more in-depth analysis of the height/arm span data. For example:

How strong is the association between height and arm span?

Is height a useful predictor of arm span?

Height   Arm Span   Height   Arm Span
155      151        173      170
162      162        175      166
162      161        176      171
163      172        176      173
164      167        178      173
164      155        178      166
165      163        181      183
165      165        183      181
166      167        183      178
166      164        183      174
168      165        183      180
171      164        185      177
171      168        188      185

Table 8: Height and Arm Span Data


Table 8 provides data on height and arm span (measured in centimeters) for 26 students. For convenience, the data on height have been ordered.

The height and arm span data are displayed in Figure 19. The scatterplot suggests a fairly strong increasing relationship between height and arm span. In addition, the relationship appears to be quite linear.

Measuring the strength of association between two variables is an important statistical concept that should be introduced at Level B. The scatterplot in Figure 20 for the height/arm span data includes a vertical line drawn through the mean height (x̄ = 172.5) and a horizontal line drawn through the mean arm span (ȳ = 169.3).

The two lines divide the scatterplot into four regions (or quadrants). The upper right region (Quadrant 1) contains points that correspond to individuals with above average height and above average arm span. The upper left region (Quadrant 2) contains points that correspond to individuals with below average height and above average arm span. The lower left region (Quadrant 3) contains points that correspond to individuals with below average height and below average arm span. The lower right region (Quadrant 4) contains points that correspond to individuals with above average height and below average arm span.

Figure 19: Scatterplot of arm span vs. height (horizontal axis: Height, 155 to 190; vertical axis: Arm Span, 150 to 190)

Figure 20: Scatterplot showing means

Notice that most points in the scatterplot are in either Quadrant 1 or Quadrant 3. That is, most people with above average height also have above average arm span (Quadrant 1) and most people with below average height also have below average arm span (Quadrant 3). One person has below average height with above average arm span (Quadrant 2) and two people have above average height with below average arm span (Quadrant 4). These results indicate that there is a positive association between the variables height and arm span. Generally stated, two numeric variables are positively associated when above average values of one variable tend to occur with above average values of the other and when below average values of one variable tend to occur with below average values of the other. Negative association between two numeric variables occurs when below average values of one variable tend to occur with above average values of the other and when above average values of one variable tend to occur with below average values of the other.

A correlation coefficient is a quantity that measures the direction and strength of an association between two variables. Note that in the previous example, points in Quadrants 1 and 3 contribute to the positive association between height and arm span, and there is a total of 23 points in these two quadrants. Points in Quadrants 2 and 4 do not contribute to the positive association between height and arm span, and there is a total of three points in these two quadrants. One correlation coefficient between height and arm span is given by the QCR (Quadrant Count Ratio):

QCR = (23 – 3) / 26 = .77

A QCR of .77 indicates that there is a fairly strong positive association between the two variables height and arm span. This indicates that a person’s height is a useful predictor of his/her arm span.

In general, the QCR is defined as:

QCR = [(Number of Points in Quadrants 1 and 3) – (Number of Points in Quadrants 2 and 4)] / (Number of Points in all Four Quadrants)

The QCR has the following properties:

→ The QCR is unitless.
→ The QCR is always between –1 and +1 inclusive.

Holmes (2001) gives a detailed discussion of the QCR. A similar correlation coefficient for 2x2 contingency tables is described in Conover (1999) and discussed in the Appendix for Level B. The QCR is a measure of the strength of association based on only the number of points in each quadrant and, like most summary measures, has its shortcomings. At Level C, the shortcomings of the QCR can be addressed and used as a foundation for developing Pearson’s correlation coefficient.
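The quadrant counts and the QCR for the height/arm span data can be checked with a short Python sketch. The data are the 26 pairs from Table 8; the function name `quadrant_count_ratio` and the handling of points that fall exactly on a mean line are assumptions made for illustration (no such point occurs in these data).

```python
# Height and arm span (cm) for the 26 students in Table 8.
data = [
    (155, 151), (162, 162), (162, 161), (163, 172), (164, 167), (164, 155),
    (165, 163), (165, 165), (166, 167), (166, 164), (168, 165), (171, 164),
    (171, 168), (173, 170), (175, 166), (176, 171), (176, 173), (178, 173),
    (178, 166), (181, 183), (183, 181), (183, 178), (183, 174), (183, 180),
    (185, 177), (188, 185),
]


def quadrant_count_ratio(points):
    """Return (count in Q1 and Q3, count in Q2 and Q4, QCR)."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    agree = 0      # Quadrants 1 and 3: both above or both below the means
    disagree = 0   # Quadrants 2 and 4: one above, one below
    for x, y in points:
        product = (x - mean_x) * (y - mean_y)
        if product > 0:
            agree += 1
        elif product < 0:
            disagree += 1
        # Assumption: a point lying exactly on a mean line is counted in
        # neither group (this does not happen with the Table 8 data).
    return agree, disagree, (agree - disagree) / n


agree, disagree, qcr = quadrant_count_ratio(data)
print(f"Quadrants 1 and 3: {agree}, Quadrants 2 and 4: {disagree}, QCR = {qcr:.2f}")
```

Using the sign of the product of deviations is simply a compact way to tell the two pairs of quadrants apart; it is the same counting described in the text.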

Modeling Linear Association

The height/arm span data were collected at Level A in order to study the problem of packaging sweat suits. Should a shirt and pants be packaged separately or together? A QCR of .77 suggests a fairly strong positive association between height and arm span, which indicates that height is a useful predictor of arm span and that a shirt and pants could be packaged together. If packaged together, how can a person decide which size sweat suit to buy? Certainly, the pant-size of a sweat suit depends on a person’s height and the shirt-size depends on a person’s arm span. As many people know their height, but may not know their arm span, can height be used to help people decide which size sweat suit they wear? Specifically:

Can the relationship between height and arm span be described using a linear function?

Students at Level B will study linear relationships in other areas of their mathematics curriculum. The degree to which these ideas have been developed will determine how we might proceed at this point. For example, if students have not yet been introduced to the equation of a line, then they simply might draw a line through the “center of the data” as shown in Figure 21.

This line can be used to predict a person’s arm span if his or her height is known. For example, to predict the arm span for a person who is 170 cm tall, a vertical segment is drawn up from the X-axis at Height = 170. At the point where this vertical segment intersects the line, a horizontal segment is drawn to the Y-axis. The value where this horizontal segment intersects the Y-axis is the predicted arm span. Based on the graph in Figure 21, it appears that we would predict an arm span of approximately 167 cm for a person who is 170 cm tall.

Figure 21: Eyeball line

If students are familiar with the equation for a line and know how to find the equation from two points, then they might use the Mean – Mean line, which is determined as follows. Order the data according to the X-coordinates and divide the data into two “halves” based on this ordering. If there is an odd number of measurements, remove the middle point from the analysis. Determine the means for the X-coordinates and Y-coordinates in each half and find the equation of the line that passes through these two points. Using the previous data, the means for the two halves give the points (164.8, 163.4) and (180.2, 175.2).

The equation of the line that goes through these two points is Predicted Arm Span ≈ 37.1 + .766(Height). This equation can be used to predict a person’s arm span more accurately than an eyeball line. For example, if a person is 170 cm tall, then we would predict his/her arm span to be approximately 37.1 + .766(170) = 167.3 cm. A more sophisticated approach (least squares) to determine a “best-fitting” line through the data will be introduced in Level C.
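The Mean – Mean procedure can be sketched in Python using the Table 8 data. The function name `mean_mean_line` is hypothetical. Note one assumption that affects the digits: the sketch uses the unrounded half-means, which gives roughly 37.3 + .765(Height), whereas the coefficients 37.1 and .766 quoted in the text follow from the half-means rounded to one decimal place. The predicted arm span at 170 cm differs by only about 0.1 cm.

```python
# Height and arm span (cm) for the 26 students in Table 8.
data = [
    (155, 151), (162, 162), (162, 161), (163, 172), (164, 167), (164, 155),
    (165, 163), (165, 165), (166, 167), (166, 164), (168, 165), (171, 164),
    (171, 168), (173, 170), (175, 166), (176, 171), (176, 173), (178, 173),
    (178, 166), (181, 183), (183, 181), (183, 178), (183, 174), (183, 180),
    (185, 177), (188, 185),
]


def mean_mean_line(points):
    """Return (intercept, slope) of the line through the two half-means."""
    ordered = sorted(points)          # order by X (height)
    half = len(ordered) // 2
    lower = ordered[:half]
    upper = ordered[-half:]           # skips the middle point when n is odd

    def centroid(group):
        xs = [x for x, _ in group]
        ys = [y for _, y in group]
        return sum(xs) / len(xs), sum(ys) / len(ys)

    (x1, y1), (x2, y2) = centroid(lower), centroid(upper)
    slope = (y2 - y1) / (x2 - x1)
    intercept = y1 - slope * x1
    return intercept, slope


intercept, slope = mean_mean_line(data)
predicted = intercept + slope * 170
print(f"Predicted Arm Span = {intercept:.1f} + {slope:.3f}(Height)")
print(f"Predicted arm span at 170 cm: {predicted:.1f} cm")
```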

Means for each half of the height/arm span data, used for the Mean – Mean line:

                 Lower Half (13 Points)   Upper Half (13 Points)
Mean Height      164.8                    180.2
Mean Arm Span    163.4                    175.2

The Importance of Random Selection

In statistics, we often want to extend results beyond a particular group studied to a larger group, the population. We are trying to gain information about the population by examining a portion of the population, called a sample. Such generalizations are valid only if the data are representative of that larger group. A representative sample is one in which the relevant characteristics of the sample members are generally the same as those of the population. Improper or biased sample selection tends to systematically favor certain outcomes, and can produce misleading results and erroneous conclusions.

Random sampling is a way to remove bias in sample selection, and tends to produce representative samples. At Level B, students should experience the consequences of nonrandom selection and develop a basic understanding of the principles involved in random selection procedures. Following is a description of an activity that allows students to compare sample results based on personal (nonrandom) selection versus sample results based on random selection.

Consider the 80 circles in Figure 22. What is the average diameter for these 80 circles? Each student should take about 15 seconds and select five circles that he/she thinks best represent the sizes of the 80 circles. After selecting the sample, each student should find the average diameter for the circles in her/his personal sample. Note that the diameter is 1 cm for the small circles, 2 cm for the medium-sized circles, and 3 cm for the large circles.

Next, each student should number the circles from one to 80 and use a random digit generator to select a random sample of size five. Each student should find the average diameter for the circles in his/her random sample. The sample mean diameters for the entire class can be summarized for the two selection procedures with back-to-back stemplots.

Figure 22: Eighty circles

How do the means for the two sample selection procedures compare with the true mean diameter of 1.25 cm? Personal selection usually will tend to yield sample means that are larger than 1.25. That is, personal selection tends to be biased with a systematic favoring toward the larger circles and an overestimation of the population mean. Random selection tends to produce some sample means that underestimate the population mean and some that overestimate the population mean, such that the sample means cluster somewhat evenly around the population mean value (i.e., random selection tends to be unbiased).

In the previous example, the fact that the sample means vary from one sample to another illustrates an idea that was introduced earlier in the favorite music type survey. This is the notion of sampling variability. Imposing randomness into the sampling procedure allows us to use probability to describe the long-run behavior in the variability of the sample means resulting from random sampling. The variation in results from repeated sampling is described through what is called the sampling distribution. Sampling distributions will be explored in more depth at Level C.
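The random-selection half of the circle activity can be imitated with a small simulation. This is a sketch under a loudly stated assumption: the text gives only the three diameters and the true mean of 1.25 cm, so the counts of small, medium, and large circles used below (64, 12, and 4) are hypothetical values chosen to reproduce that mean. The function name `random_sample_means` and the seed are also illustrative.

```python
import random
import statistics

# Hypothetical population of 80 circle diameters (cm). The source gives only
# the three sizes and the true mean of 1.25 cm; the counts below are an
# assumption chosen to reproduce that mean.
population = [1] * 64 + [2] * 12 + [3] * 4

true_mean = statistics.mean(population)


def random_sample_means(pop, sample_size, repetitions, seed):
    """Means of repeated simple random samples drawn without replacement."""
    rng = random.Random(seed)
    return [statistics.mean(rng.sample(pop, sample_size))
            for _ in range(repetitions)]


sample_means = random_sample_means(population, sample_size=5,
                                   repetitions=2000, seed=1)

print(f"true mean diameter: {true_mean} cm")
print(f"average of {len(sample_means)} sample means: "
      f"{statistics.mean(sample_means):.3f} cm")
print(f"smallest and largest sample mean: "
      f"{min(sample_means)} and {max(sample_means)}")
```

Individual sample means vary from sample to sample, but their long-run average stays close to the population mean, which is the sense in which random selection is unbiased.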

Comparative Experiments

Another important statistical method that should be introduced at Level B is comparative experimental studies. Comparative experimental studies involve comparisons of the effects of two or more treatments (experimental conditions) on some response variable. At Level B, studies comparing two treatments are adequate. For example, students might want to study the effects of listening to rock music on one’s ability to memorize. Before undertaking a study such as this, it is important for students to have the opportunity to identify and, as much as possible, control for any potential extraneous sources that may interfere with our ability to interpret the results. To address these issues, the class needs to develop a design strategy for collecting appropriate experimental data.

One simple experiment would be to randomly divide the class into two equal-sized (or near equal-sized) groups. Random assignment provides a fair way to assign students to the two groups because it tends to average out differences in student ability and other characteristics that might affect the response. For example, suppose a class has 28 students. The 28 students are randomly assigned into two groups of 14. One way to accomplish this is to place 28 pieces of paper in a box—14 labeled “M” and 14 labeled “S.” Mix the contents in the box well and have each student randomly choose a piece of paper. The 14 Ms will listen to music and the 14 Ss will have silence.
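The slips-of-paper procedure has a direct software analogue. The sketch below is one possible way to do it, with a hypothetical roster of numbered students and an illustrative function name `assign_treatments`; shuffling the labels plays the role of mixing the box.

```python
import random

# Hypothetical roster: 28 students identified only by number.
students = [f"Student {i}" for i in range(1, 29)]


def assign_treatments(roster, seed=None):
    """Randomly split a roster into music (M) and silence (S) groups."""
    n_music = len(roster) // 2
    labels = ["M"] * n_music + ["S"] * (len(roster) - n_music)
    rng = random.Random(seed)
    rng.shuffle(labels)          # plays the role of mixing the slips in the box
    return dict(zip(roster, labels))


assignment = assign_treatments(students, seed=2024)
music = [s for s, label in assignment.items() if label == "M"]
silence = [s for s, label in assignment.items() if label == "S"]
print(len(music), "students listen to music;", len(silence), "have silence")
```

Fixing the number of M and S labels in advance, rather than flipping a coin for each student, guarantees the two groups come out equal-sized, as in the box of 28 slips.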


Each student will be shown a list of words. Rules for how long students have to study the words and how long they have to reproduce the words must be determined. For example, students may have two minutes to study the words, a one-minute pause, and then two minutes to reproduce (write down) as many words as possible. The number of words remembered under each condition (listening to music or silence) is the response variable of interest.

The Five-Number Summaries and comparative boxplots for a hypothetical set of data are shown in Table 9 and Figure 23. These results suggest that students generally memorize fewer words when listening to music than when there is silence. With the exception of the maximum value in the music group (which is classified as an outlier), all summary measures for the music group (labeled M in Figure 23) are lower than the corresponding summary measures for the silence group (labeled S in Figure 23). Without the outlier, the degree of variation in the scores appears to be similar for both groups. Distribution S appears to be reasonably symmetric, while distribution M is slightly right-skewed. Considering the degree of variation in the scores and the separation in the boxplots, a difference of three between the medians is quite large.

                 Music   Silence
Minimum          3       6
First Quartile   6       8
Median           7       10
Third Quartile   9       12
Maximum          15      14

Table 9: Five-Number Summaries

Figure 23: Boxplot for memory data (comparative boxplots titled “Memory Experiment”; horizontal axis: Score, 2 to 16; vertical axis: Group, M and S)

Time Series

Another important statistical tool that should be introduced at Level B is a time series plot. Problems that explore trends in data over time are quite common. For example, the populations of the United States and the world continue to grow, and there are several factors that affect the size of a population, such as the number of births and the number of deaths per year. One question we ask is:

How has the number of live births changed over the past 30 years?

The U.S. Census Bureau publishes vital statistics in its annual Statistical Abstract of the United States. The data below are from The Statistical Abstract of the United States (2004–2005) and represent the number of live births per year (in thousands) for residents of the United States since 1970. Note that, in 1970, the value 3,731 represents 3,731,000 live births.

The time series plot in Figure 24 shows the number of live births over time. This graph indicates that:

→ from 1970 to 1975, the number of live births generally declined
→ from 1976 to 1990, the number of live births generally increased
→ from 1991 to 1997, the number of live births generally declined

And it appears that the number of live births may have started to increase since 1997.
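The trends read from the plot can be cross-checked against the live birth data in Table 10. The following sketch computes the net change over each period named above; the helper name `net_change` and the dictionary `by_year` are illustrative.

```python
# Live births per year in thousands, 1970 through 1999, from Table 10.
births = [
    3731, 3556, 3258, 3137, 3160, 3144, 3168, 3327, 3333, 3494,
    3612, 3629, 3681, 3639, 3669, 3761, 3757, 3809, 3910, 4041,
    4158, 4111, 4065, 4000, 3979, 3900, 3891, 3881, 3942, 3959,
]
by_year = dict(zip(range(1970, 2000), births))


def net_change(start, end):
    """Change in births (thousands) from the start year to the end year."""
    return by_year[end] - by_year[start]


for start, end in [(1970, 1975), (1976, 1990), (1991, 1997), (1997, 1999)]:
    print(f"{start}-{end}: net change {net_change(start, end):+d} thousand")

low_year = min(by_year, key=by_year.get)
high_year = max(by_year, key=by_year.get)
print(f"lowest: {low_year} ({by_year[low_year]} thousand)")
print(f"highest: {high_year} ({by_year[high_year]} thousand)")
```

A net change summarizes only the endpoints of each period, which is why the text says "generally": small year-to-year reversals (for example, 1973 to 1974) are still visible in the table.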

Figure 24: Time series plot of live births (horizontal axis: Year, 1970 to 2000; vertical axis: Births (in thousands), 3000 to 4200)

Year   Births (x 1,000)   Year   Births (x 1,000)
1970   3,731              1985   3,761
1971   3,556              1986   3,757
1972   3,258              1987   3,809
1973   3,137              1988   3,910
1974   3,160              1989   4,041
1975   3,144              1990   4,158
1976   3,168              1991   4,111
1977   3,327              1992   4,065
1978   3,333              1993   4,000
1979   3,494              1994   3,979
1980   3,612              1995   3,900
1981   3,629              1996   3,891
1982   3,681              1997   3,881
1983   3,639              1998   3,942
1984   3,669              1999   3,959

Table 10: Live Birth Data

Misuses of Statistics

The introduction of this document points out that data govern our lives. Because of this, every high-school graduate deserves to have a solid foundation in statistical reasoning. Along with identifying proper uses of statistics in questionnaires and graphs, the Level B student should become aware of common misuses of statistics.

Proportional reasoning allows the Level B student to interpret data summarized in a variety of ways. One type of graph that often is misused for representing data is the pictograph. For example, suppose the buying power of a dollar today is 50% of what it was 20 years ago. How would one represent that in a pictograph? Let the buying power of a dollar 20 years ago be represented by the following dollar bill:

If the buying power today is half what it was 20 years ago, one might think of reducing both the width and height of this dollar by one-half, as illustrated in the pictograph below:

Today’s dollar should look half the size of the dollar of 20 years ago. Does it? Since both the length and the width were cut in half, the area of today’s dollar shown above is one-fourth the original area, not one-half.

The two pictographs below show the correct reduction in area. The one on top changes only one dimension, while the other changes both dimensions, but in correct proportion so that the area is one-half the area of the original representation. This example provides the Level B student with an excellent exercise in proportional reasoning.
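The proportional reasoning behind the three pictographs can be made concrete with a short calculation. In this sketch the width and height of the original picture are hypothetical (only the ratios matter), and `area_ratio` is an illustrative helper. The key step is that scaling both sides by the same factor k multiplies the area by k², so halving the area requires k = √(1/2) ≈ 0.707, not k = 1/2.

```python
import math

# Hypothetical dimensions for the original picture of the dollar; only the
# ratios matter, so any positive width and height would do.
width, height = 6.0, 2.5
original_area = width * height


def area_ratio(width_scale, height_scale):
    """Fraction of the original area left after scaling each dimension."""
    return (width * width_scale) * (height * height_scale) / original_area


# Misleading version: both dimensions halved, so the area is quartered.
both_halved = area_ratio(0.5, 0.5)

# Correct version 1: only the length is halved.
one_dimension = area_ratio(0.5, 1.0)

# Correct version 2: both dimensions scaled by the same factor k, k * k = 1/2.
k = math.sqrt(0.5)
proportional = area_ratio(k, k)

print(f"both dimensions halved: area is {both_halved:.2f} of the original")
print(f"one dimension halved:   area is {one_dimension:.2f} of the original")
print(f"both scaled by {k:.3f}:   area is {proportional:.2f} of the original")
```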

Pictograph captions, in order of appearance:

Today’s dollar at “half” size, representing that it buys only half of what it did 20 years ago.

Today’s dollar at half size, with 50% taken from the length.

Today’s dollar at half size, with sides in correct proportion to the original.

Poorly designed statistical graphs are commonly found in newspapers and other popular media. Several examples of bad graphs, including the use of an unwarranted third dimension in bar graphs and circle graphs, can be found at www.amstat.org/education/gaise/2, a web site managed by Carl Schwarz at Simon Fraser University. Students at Level B should be given opportunities to identify graphs that incorrectly represent data and then draw, with the aid of statistical computer software, the correct versions. This gives them excellent practice in calculating areas and volumes.

There are many famous misuses of data analysis in the literature, and three are mentioned here. The magazine Literary Digest erred in 1936 when it projected that Alf Landon would defeat Franklin Delano Roosevelt by a 57 to 43 percent margin based on responses to its survey. Each survey included a subscription form to the magazine, and more than 2.3 million were returned. Unfortunately, even large voluntary response surveys are generally not representative of the entire population, and Roosevelt won with 62% of the vote. George Gallup correctly projected the winner, and thereby began a very successful career in using random sampling techniques for conducting surveys. Learning what Gallup did right and the Literary Digest did wrong gives the Level B student valuable insight into survey design and analysis. A more detailed discussion of this problem can be found in Hollander and Proschan (1984).

The 1970 Draft Lottery provides an example of incorrectly applying randomness. In the procedure that was used, capsules containing birth dates were placed in a large box. Although there was an effort to mix the capsules, it was insufficient to overcome the fact that the capsules were placed in the box in order from January to December. This resulted in young men with birth dates in the latter months being more likely to have their dates selected sooner than birth dates elsewhere in the year. Hollander and Proschan (1984) give an excellent discussion of this problem.

The 25th flight of NASA’s space shuttle program took off on January 28, 1986. Just after liftoff, a puff of gray smoke could be seen coming from the right solid rocket booster. Seventy-three seconds into the flight, the Challenger exploded, killing all seven astronauts aboard. The cause of the explosion was determined to be an O-ring failure, due to cold weather. The disaster possibly could have been avoided had available data been displayed in a simple scatterplot and correctly interpreted. The Challenger disaster has become a case study in the possible catastrophic consequences of poor data analysis.

Summary of Level B

Understanding the statistical concepts of Level B enables students to begin to appreciate that data analysis is an investigative process consisting of formulating their own questions, collecting appropriate data through various sources (censuses, nonrandom and random sample surveys, and comparative experiments with random assignment), analyzing data through graphs and simple summary measures, and interpreting results with an eye toward inference to a population based on a sample. As they begin to formulate their own questions, students become aware that the world around them is filled with data that affect their own lives, and they begin to appreciate that statistics can help them make decisions based on data, investigation, and sound reasoning.


In This Section

→ An Introductory Example—Obesity in America
→ The Investigatory Process at Level C
    Formulating Questions
    Collecting Data—Types of Statistical Studies
    Sample Surveys
    Experiments
    Observational Studies
    Analyzing Data
→ Example 1: The Sampling Distribution of a Sample Proportion
→ Example 2: The Sampling Distribution of a Sample Mean
    Interpreting Results
    Generalizing from Samples
    Generalizing from Experiments
→ Example 3: A Survey of Music Preferences
→ Example 4: An Experiment on the Effects of Light on the Growth of Radish Seedlings
→ Example 5: Estimating the Density of the Earth—A Classical Study
→ Example 6: Linear Regression Analysis—Height vs. Forearm Length
→ Example 7: Comparing Mathematics Scores—An Observational Study
→ Example 8: Observational Study—Toward Establishing Causation
→ The Role of Probability in Statistics
→ Summary of Level C

Level C

Level C is designed to build on the foundation developed in Levels A and B. In particular, Levels A and B introduced students to statistics as an investigatory process, the importance of using data to answer appropriately framed questions, types of variables (categorical versus numerical), graphical displays (including bar graph, dotplot, stemplot, histogram, boxplot, and scatterplot), tabular displays (including two-way frequency tables for categorical data and both ungrouped and grouped frequency/relative frequency tables for numerical data), and numerical summaries (including counts, proportions, mean, median, range, quartiles, interquartile range, MAD, and QCR).

Additionally, Levels A and B covered common study designs (including census, simple random sample, and randomized designs for experiments), the process of drawing conclusions from data, and the role of probability in statistical investigations.

At Level C, all of these ideas are revisited, but the types of studies emphasized are of a deeper statistical nature. Statistical studies at this level require students to draw on basic concepts from earlier work, extend the concepts to cover a wider scope of investigatory issues, and develop a deeper understanding of inferential reasoning and its connection to probability. Students also should have increased ability to explain statistical reasoning to others.

At Level C, students develop additional strategies for producing, interpreting, and analyzing data to help answer questions of interest. In general, students should be able to formulate questions that can be answered with data; devise a reasonable plan for collecting appropriate data through observation, sampling, or experimentation; draw conclusions and use data to support these conclusions; and understand the role random variation plays in the inference process.

Specifically, Level C recommendations include:

I. Formulate Questions

→ Students should be able to formulate questions and determine how data can be collected and analyzed to provide an answer.

II. Collect Data

→ Students should understand what constitutes good practice in conducting a sample survey.
→ Students should understand what constitutes good practice in conducting an experiment.
→ Students should understand what constitutes good practice in conducting an observational study.
→ Students should be able to design and implement a data collection plan for statistical studies, including observational studies, sample surveys, and simple comparative experiments.

III. Analyze Data

→ Students should be able to identify appropriate ways to summarize numerical or categorical data using tables, graphical displays, and numerical summary statistics.
→ Students should understand how sampling distributions (developed through simulation) are used to describe the sample-to-sample variability of sample statistics.
→ Students should be able to recognize association between two categorical variables.
→ Students should be able to recognize when the relationship between two numerical variables is reasonably linear, know that Pearson’s correlation coefficient is a measure of the strength of the linear relationship between two numerical variables, and understand the least squares criterion in line fitting.

IV. Interpret Results

→ Students should understand the meaning of statistical significance and the difference between statistical significance and practical significance.
→ Students should understand the role of p-values in determining statistical significance.
→ Students should be able to interpret the margin of error associated with an estimate of a population characteristic.

An Introductory Example–Obesity in America

Data and the stories that surround the data must be of interest to students! It is important to remember this when teaching data analysis. It is also important to choose data and stories that have enough depth to demonstrate the need for statistical thinking. The following example illustrates this.

Students are interested in issues that affect their lives, and issues of health often fall into that category. News items are an excellent place to look for stories of current interest, including items on health. One health-related topic making lots of news lately is obesity. The following paragraph relates to a news story that is rich enough to provide a context for many of the statistical topics to be covered at Level C.

A newspaper article that appeared in 2004 begins with the following lines: “Ask anyone: Americans are getting fatter and fatter. Advertising campaigns say they are. So do federal officials and the scientists they rely on. … In 1991, 23% of Americans fell into the obese category; now 31% do, a more than 30% increase. But Dr. Jeffrey Friedman, an obesity researcher at Rockefeller University, argues that contrary to popular opinion, national data do not show Americans growing uniformly fatter. Instead, he says, the statistics demonstrate clearly that while the very fat are getting fatter, thinner people have remained pretty much the same. …The average weight of the population has increased by just seven to 10 pounds.” The discussion in the article refers to adults.

The following are suggested questions to explore with students who have a Level B background in statistics, but are moving on to Level C.

→ Sketch a histogram showing what you think a distribution of weights of American adults might have looked like in 1991. Adjust the sketch to show what the distribution of weights might have looked like in 2002, the year of the reported study. Before making your sketches, think about the shape, center, and spread of your distributions. Will the distribution be skewed or symmetric? Will the median be smaller than, larger than, or about the same size as the mean? Will the spread increase as you move from the 1991 distribution to the 2002 distribution?
→ Which sounds more newsworthy: “Obesity has increased by more than 30%” or “On the average, the weight of Americans has increased by fewer than 10 pounds”? Explain your reasoning.
→ The title of the article is The Fat Epidemic: He Says It’s an Illusion. [See New York Times, June 8, 2004, or CHANCE, Vol. 17, No. 4, Fall 2004, p. 3 for the complete article.] Do you think this is a fair title? Explain your reasoning.
→ The data on which the percentages are based come from the National Center for Health Statistics, National Health and Nutrition Examination Survey 2002. This is a survey of approximately 5,800 residents of the United States. Although the survey design is more complicated than a simple random sample, the margin of error calculated as if it were a simple random sample is a reasonable approximation. What is an approximate margin of error associated with the 31% estimate of obesity for 2004? Interpret this margin of error for a newspaper reader who never studied statistics.

For the curious, information about how obesity is defined can be found at www.amstat.org/education/gaise/3.

In answering these questions, students at Level C should realize that a distribution of weights is going to be skewed toward the larger values. This generally produces a situation in which the mean is larger than the median. Because 8% shifted over the obesity line between 1991 and 2002, but the average weight (or center) did not shift very much, the upper tail of the distribution must have gotten “fatter,” indicating a larger spread for the 2002 data. Students will have a variety of interesting answers for the second and third questions. The role of the teacher is to help students understand whether their answers are supported by the facts. The last question gets students thinking about an important estimation concept studied at Level C.
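For the last question, one possible calculation is sketched below. It assumes the rule of thumb that the margin of error for a sample proportion is about two standard deviations of its sampling distribution, and it treats the survey as a simple random sample of 5,800, as the question permits. The function name is illustrative only.

```python
import math

def approx_margin_of_error(p_hat, n):
    """Approximate 95% margin of error for a sample proportion: 2 * sqrt(p(1 - p) / n)."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

# 31% obese in a survey of roughly 5,800 residents (figures from the article and survey description)
moe = approx_margin_of_error(0.31, 5800)
print(f"margin of error: about {100 * moe:.1f} percentage points")
```

Under these assumptions the margin of error is a little over one percentage point, so a newspaper reader could be told that the true obesity rate is plausibly within about a point of 31%.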


The Investigatory Process at Level C

Because Level C revisits many of the same topics addressed at Levels A and B, but at a deeper and more sophisticated level, we begin by describing how the investigatory process looks at Level C. This general discussion is followed by several examples.

Formulating Questions

As stated at the beginning of Level A, data are more than just numbers. Students need to understand the types of questions that can be answered with data. For example, the question “Is the overall health of high-school students declining in this country?” is too big a question to answer with a statistical investigation (or even many statistical investigations). Certain aspects of the health of students, however, can be investigated by formulating more specific questions, such as “What is the rate of obesity among high-school students?”; “What is the average daily caloric intake for high-school seniors?”; “Is a three-day-a-week exercise regimen enough to maintain heart rate and weight within acceptable limits?” Question formulation, then, becomes the starting point for a statistical investigation.

Collecting Data—Types of Statistical Studies

Most questions that can be answered through data collection and interpretation require data from a designed study, either a sample survey or an experiment.

These two types of statistical investigations have some common elements: each requires randomization, both to reduce bias and to build a foundation for statistical inference, and each makes use of the common inference mechanisms of margin of error in estimation and p-value in hypothesis testing (both to be explained later). But these two types of investigations have very different objectives and requirements. Sample surveys are used to estimate or make decisions about characteristics (parameters) of populations. A well-defined, fixed population is the main ingredient of such a study. Experiments are used to estimate or compare the effects of different experimental conditions (treatments), and require well-defined treatments and experimental units on which to study those treatments.

Estimating the proportion of residents of a city that would support an increase in taxes for education requires a sample survey. If the selection of residents is random, then the results from the sample can be extended to represent the population from which the sample was selected. A measure of sampling error (margin of error) can be calculated to ascertain how far the estimate is likely to be from the true value.

Testing to see if a new medication to improve breathing for asthma patients produces greater lung capacity than a standard medication requires an experiment in which a group of patients who have consented to participate in the study are randomly assigned to either the new or the standard medication. With this type of randomized comparative design, an investigator can determine, with a measured degree of uncertainty, whether the new medication caused an improvement in lung capacity. Randomized experiments are, in fact, the only type of statistical study capable of establishing cause and effect relationships. Any generalization extends only to the types of units used in the experiment, however, as the experimental units are not usually randomly sampled from a larger population. To generalize to a larger class of experimental units, more experiments would have to be conducted. That is one reason why replication is a hallmark of good science.

Studies that have no random selection of sampling units or random assignment of treatments to experimental units are called observational studies in this document. A study of how many students in your high school have asthma and how this breaks down among gender and age groups would be of this type. Observational studies are not amenable to statistical inference in the usual sense of the term, but they can provide valuable insight into the distribution of measured values and the types of associations among variables that might be expected.

At Level C, students should understand the key features of both sample surveys and experimental designs, including how to set up simple versions of both types of investigations, how to analyze the data appropriately (as the correct analysis is related to the design), and how to clearly and precisely state conclusions for these designed studies. Key elements of the design and implementation of data collection plans for these types of studies follow.

Sample Surveys

Students should understand that obtaining good results from a sample survey depends on four basic features: the population, the sample, the randomization process that connects the two, and the accuracy of the measurements made on the sampled elements. For example, to investigate a question on health of students, a survey might be planned for a high school. What is the population to be investigated? Is it all the students in the school (which changes on a daily basis)? Perhaps the questions of interest involve only juniors and seniors. Once the population is defined as precisely as possible, one must determine an appropriate sample size and a method for randomly selecting a sample of that size. Is there, for example, a list of students who can then be numbered for random selection? Once the sampled students are found, what questions will be asked? Are the questions fair and unbiased (as far as possible)? Can or will the students actually answer them accurately?

When a sample of the population is utilized, errors may occur for several reasons, including:
→ the sampling procedure is biased
→ the sample was selected from the wrong population
→ some of the units selected to be in the sample were unable (or unwilling) to participate
→ the questions were poorly written
→ the responses were ambiguous

These types of errors should be considered carefully before the study begins so plans can be made to reduce their chance of occurring as much as possible. One way to resolve the bias in the sampling procedure is to incorporate randomness into the selection process.

Two samples of size 50 from the same population of students will most likely not give the same result on, say, the proportion of students who eat a healthy breakfast. This variation from sample to sample is called sampling variability. When randomness is incorporated into the sampling procedure, probability provides a way to describe the “long-run” behavior of this sampling variability.
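Sampling variability can be seen directly with a small sketch such as the following. The school of 1,000 students and the 45% healthy-breakfast rate are invented for illustration; only the sample size of 50 comes from the text.

```python
import random

def sample_proportion(population, n, rng):
    """Draw a simple random sample of size n (without replacement) and return the proportion of 1s."""
    sample = rng.sample(population, n)
    return sum(sample) / n

rng = random.Random(2024)  # fixed seed so the run can be repeated

# Hypothetical population: 1 = eats a healthy breakfast, 0 = does not (45% is an assumed value)
population = [1] * 450 + [0] * 550

first = sample_proportion(population, 50, rng)
second = sample_proportion(population, 50, rng)
print("first sample:", first, "second sample:", second)
```

Running the last three lines repeatedly, with different seeds, typically gives proportions that differ from sample to sample while clustering around the population value.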

Experiments

At Level C, students should understand that obtaining good results from an experiment depends upon four basic features: well-defined treatments, appropriate experimental units to which these treatments can be assigned, a sound randomization process for assigning treatments to experimental units, and accurate measurements of the results of the experiment. Experimental units generally are not randomly selected from a population of possible units. Rather, they are the ones that happen to be available for the study. In experiments with human subjects, the people involved are often volunteers who have to sign an agreement stating they are willing to participate in the experimental study. In experiments with agricultural crops, the experimental units are the field plots that happen to be available. In an industrial experiment on process improvement, the units may be the production lines in operation during a given week.

As in a sample survey, replicating an experiment will produce different results. Once again, random assignment of experimental units to treatments (or vice versa) allows the use of probability to predict the behavior in the resulting values of summary statistics from a large number of replications of the experiment. Randomization in experiments is important for another reason. Suppose a researcher decides to assign treatment A only to patients over the age of 60 and treatment B only to patients under the age of 50. If the treatment responses differ, it is impossible to tell whether the difference is due to the treatments or the ages of the patients. (This kind of bias in experiments and other statistical studies is called confounding.) The randomization process, if properly done, will usually balance treatment groups so this type of bias is minimized.
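A minimal sketch of random assignment, under the assumption of 25 hypothetical patients with made-up ages and two treatments labeled A and B, might look like this. The helper name and the patient records are illustrative and not part of the original discussion.

```python
import random

def randomly_assign(units, treatments, rng):
    """Shuffle the units, then deal them out to the treatments in turn,
    so group sizes differ by at most one."""
    shuffled = list(units)
    rng.shuffle(shuffled)
    groups = {t: [] for t in treatments}
    for i, unit in enumerate(shuffled):
        groups[treatments[i % len(treatments)]].append(unit)
    return groups

rng = random.Random(7)  # fixed seed so the assignment can be reproduced

# Hypothetical patients with ages 30, 32, ..., 78
patients = [{"id": i, "age": age} for i, age in enumerate(range(30, 80, 2))]

groups = randomly_assign(patients, ["A", "B"], rng)
for name, members in groups.items():
    mean_age = sum(p["age"] for p in members) / len(members)
    print(name, len(members), round(mean_age, 1))
```

Because age plays no role in who receives which treatment, the two groups tend to have similar age distributions, which is the balancing effect described above.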

Observational Studies

At Level C, students should understand that observational studies are useful for suggesting patterns in data and relationships between variables, but do not provide a strong foundation for estimating population parameters or establishing differences among treatments. Asking the students in one classroom whether they eat a healthy breakfast is not going to help you establish the proportion of healthy breakfast-eaters in the school, as the students in one particular classroom may not be representative of the students in the school. Random sampling is the only way to be confident of a representative sample for statistical purposes. Similarly, feeding your cats Diet A and your neighbor’s cats Diet B is not going to allow you to claim that one diet is better than the other in terms of weight control, because there was no random assignment of experimental units (cats) to treatments (diets). As a consequence, confounding may result. Studies of the type suggested above are merely observational; they may suggest patterns and relationships, but they are not a reliable basis for statistical inference.

Analyzing Data

When analyzing data from well-designed sample surveys, students at Level C should understand that an appropriate analysis is one that can lead to justifiable inferential statements about population parameters based on estimates from sample data. The ability to draw conclusions about the population using information from a sample depends on information provided by the sampling distribution of the sample statistic being used to summarize the sample data. At Level C, the two most common parameters of interest are the population proportion for categorical data and the population mean for numerical data. The appropriate sample statistics used to estimate these parameters are the sample proportion and the sample mean, respectively. At Level C, the sample-to-sample variability, as described by the sampling distribution for each of these two statistics, is addressed in more depth.

Exploring how the information provided by a sampling distribution is used for generalizing from a sample to the larger population enables students at Level C to draw more sophisticated conclusions from statistical studies. At Level C, it is recommended that the sampling distributions of a sample proportion and of a sample mean be developed through simulation. More formal treatment of sampling distributions can be left to AP Statistics and college-level introductory statistics courses.

Because the sampling distribution of a sample statistic is a topic with which many teachers may not be familiar, several examples are included here to show how simulation can be used to obtain an approximate sampling distribution for a sample proportion and for a sample mean.

Example 1: The Sampling Distribution of a Sample Proportion

Properties of the sampling distribution for a sample proportion can be illustrated by simulating the process of selecting a random sample from a population using random digits as a device to model various populations.

For example, suppose a population is assumed to have 60% “successes” (p = .6) and we are to take a random sample of n = 40 cases from this population. How far can we expect the sample proportion of successes to deviate from the true population value of .60? This can be answered by determining an empirical sampling distribution for the sample proportion.

One way to model a population with 60% successes (and 40% failures) is to utilize the 10 digits 0, 1,…, 9. Label six of the 10 digits as “success” and the other four as “failures.” To simulate selecting a sample of size 40 from this population, select 40 random digits (with replacement). Record the number of successes out of the 40 digits selected and convert this count to the proportion of successes in the sample. Note that:

$$\text{Proportion of successes in the sample} = \frac{\text{Number of successes in the sample}}{\text{Sample size}}$$

Repeating this process a large number of times, and determining the proportion of successes for each sample, illustrates the idea of the sample-to-sample variability in the sample proportion.

Simulating the selection of 200 random samples of size 40 from a population with 60% successes and determining the proportion of successes for each sample resulted in the empirical distribution shown in Figure 25. This empirical distribution is an approximation to the true sampling distribution of the sample proportion for samples of size 40 from a population in which the actual proportion is .60.

[Figure 25: Histogram of sample proportions. Horizontal axis: Proportion (0.4 to 0.8); vertical axis: Count (5 to 30).]

Summarizing the above distribution based on its shape, center, and spread, one can state that this empirical sampling distribution has a mound shape (approximately normal). The mean and standard deviation of the 200 sample proportions are .59 and .08, respectively, so these are the mean and standard deviation of the empirical distribution shown in Figure 25.

By studying this empirical sampling distribution, and others that can be generated in the same way, students will see patterns emerge. For example, students will observe that, when the sample size is reasonably large (and the population proportion of successes is not too near the extremes of 0 or 1), the shapes of the resulting empirical sampling distributions are approximately normal. Each of the empirical sampling distributions should be centered near the value of p, the population proportion of successes, and the standard deviation for each distribution should be close to:

$$\sqrt{\frac{p(1-p)}{n}}$$

Note that in Example 1, the mean of the empirical distribution is .59, which is close to .6, and the standard deviation is .08, which is close to:

$$\sqrt{\frac{.6(.4)}{40}} \approx .0775$$

A follow-up analysis of these empirical sampling distributions can show students that about 95% of the sample proportions lie within a distance of:

$$2\sqrt{\frac{.6(.4)}{40}} \approx 0.155$$

from the true value of p. This distance is called the margin of error.
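The random-digit simulation in Example 1 can be sketched in a few lines of Python. This is only one way to carry it out: treating digits 0 through 5 as “successes” is an arbitrary labeling choice, and the seed and function name are illustrative. The results will not match the .59 and .08 reported above exactly, because each run of 200 samples differs by chance.

```python
import math
import random
import statistics

def simulate_sample_proportions(success_digits, n, reps, rng):
    """Each sample is n random digits 0-9 drawn with replacement;
    digits below success_digits are counted as successes."""
    props = []
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.randrange(10) < success_digits)
        props.append(successes / n)
    return props

rng = random.Random(1)
props = simulate_sample_proportions(success_digits=6, n=40, reps=200, rng=rng)

theoretical_sd = math.sqrt(0.6 * 0.4 / 40)  # sqrt(p(1 - p) / n)
margin = 2 * theoretical_sd                 # approximate margin of error
within = sum(abs(x - 0.6) <= margin for x in props) / len(props)

print("mean of sample proportions:", round(statistics.mean(props), 3))
print("sd of sample proportions:  ", round(statistics.stdev(props), 3))
print("sqrt(p(1-p)/n):            ", round(theoretical_sd, 4))
print("fraction within margin:    ", within)
```

Changing `n` or `success_digits` lets students see how the center and spread of the empirical distribution respond to the sample size and the population proportion.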

Example 2: The Sampling Distribution of a Sample Mean

Properties of the sampling distribution for a sample mean can be illustrated in a way similar to that used for proportions in Example 1. Figure 26 shows the distribution of the sample mean when 200 samples of 30 random digits are selected (with replacement) and the sample mean is computed. This simulates sampling from a population that has a uniform distribution with equal numbers of 0s, 1s, 2s,…, 9s. Note that this population of numerical values has a mean, μ, of 4.5 and a standard deviation, σ, of 2.9.

[Figure 26: Histogram of sample means. Horizontal axis: Mean (3.0 to 6.0); vertical axis: Frequency of Mean (5 to 35).]

The empirical sampling distribution shown in Figure 26 can be described as approximately normal with a mean of 4.46 (the mean of the 200 sample means from the simulation) and a standard deviation of 0.5 (the standard deviation of the 200 sample means).

By studying this empirical sampling distribution, and others that can be generated in similar ways, students will see patterns emerge. For example, students will observe that, when the sample size is reasonably large, the shapes of the empirical sampling distributions are approximately normal. Each of the empirical sampling distributions should be centered near the value of μ, the population mean, and the standard deviation for each distribution should be close to:

$$\frac{\sigma}{\sqrt{n}}$$

Note that in Example 2, the mean of the empirical sampling distribution is 4.46, which is close to μ = 4.5, and the standard deviation (0.5) is close to:

$$\frac{\sigma}{\sqrt{n}} = \frac{2.9}{\sqrt{30}} \approx 0.53$$

The margin of error in estimating a population mean using the sample mean from a single random sample is approximately:

$$\frac{2\sigma}{\sqrt{n}}$$

The sample mean should be within this distance of the true population mean about 95% of the time in repeated random sampling.
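Example 2 can be reproduced approximately with the following sketch. The seed and variable names are illustrative, and a different run will give slightly different values from the 4.46 and 0.5 reported above.

```python
import math
import random
import statistics

digits = range(10)
mu = statistics.mean(digits)        # population mean of the digits 0-9
sigma = statistics.pstdev(digits)   # population standard deviation, about 2.9

rng = random.Random(2)
n, reps = 30, 200

# Each sample mean is the average of n random digits drawn with replacement
sample_means = [
    statistics.mean([rng.randrange(10) for _ in range(n)])
    for _ in range(reps)
]

print("population mean and sd:", mu, round(sigma, 2))
print("mean of sample means:  ", round(statistics.mean(sample_means), 2))
print("sd of sample means:    ", round(statistics.stdev(sample_means), 2))
print("sigma / sqrt(n):       ", round(sigma / math.sqrt(n), 2))
print("margin of error 2*sigma/sqrt(n):", round(2 * sigma / math.sqrt(n), 2))
```

Comparing the last two printed standard deviations gives students a quick check that the empirical spread of the sample means is close to σ/√n.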

Interpreting Results

Generalizing from Samples

The key to statistical inference is the sampling distribution of the sample statistic, which provides information about the population parameter being estimated. As described in the previous section, knowledge of the sampling distribution for a statistic, like a sample proportion or sample mean, leads to a margin of error that provides information about the maximum likely distance between a sample estimate and the population parameter being estimated. Another way to state this key concept of inference is that an estimator plus or minus the margin of error produces an interval of plausible values for the population parameter. Any one of these plausible values could have produced the observed sample result as a reasonably likely outcome.

Generalizing from Experiments

Do the effects of the treatments differ? In analyzing experimental data, this is one of the first questions asked. This question of difference is generally posed in terms of differences between the centers of the data distributions (although it could be posed as a difference between the 90th percentiles or any other measure of location in a distribution). Because the mean is the most commonly used statistic for measuring the center of a distribution, this question of differences is generally posed as a question about a difference in means. The analysis of experimental data, then, usually involves a comparison of means.

Unlike sample surveys, experiments do not depend on random samples from a fixed population. Instead, they require random assignment of treatments to pre-selected experimental units. The key question, then, is: “Could the observed difference in treatment means be due to the random assignment (chance) alone, or can it be attributed to the treatments administered?”

The following examples are designed to illustrate and further illuminate the important concepts at Level C by carefully considering the four phases of a statistical analysis—question, design, analysis, interpretation—in a variety of contexts.

Example 3: A Survey of Music Preferences

A survey of student music preferences was introduced at Level A, where the analysis consisted of making counts of student responses and displaying the data in a bar graph. At Level B, the analysis was expanded to consider relative frequencies of preferences and cross-classified responses for two types of music displayed in a two-way table. Suppose the survey included the following questions:

1. What kinds of music do you like?

   Do you like country music? Yes or No
   Do you like rap music? Yes or No
   Do you like rock music? Yes or No

2. Which of the following types of music do you like most? Select only one.

   Country    Rap/Hip Hop    Rock

In order to be able to generalize to all students at the school, a representative sample of students from the school is needed. This could be accomplished by selecting a simple random sample of 50 students from the school. The results can then be generalized to the school (but not beyond), and the Level C discussion will center on basic principles of generalization, or statistical inference.

A Level C analysis begins with a two-way table of counts that summarizes the data on two of the questions: “Do you like rock music?” and “Do you like rap music?” The table provides a way to separately examine the responses to each question and to explore possible connections (association) between the two categorical variables. Suppose the survey of 50 students resulted in the data summarized in Table 11.

As demonstrated at Level B, there are a variety of ways to interpret data summarized in a two-way table, such as Table 11. Some examples based on all 50 students in the survey include:
→ 25 of the 50 students (50%) liked both rap and rock music.
→ 29 of the 50 students (58%) liked rap music.
→ 19 of the 50 students (38%) did not like rock music.

One type of statistical inference relates to conjectures (hypotheses) made before the data were collected. Suppose a student says “I think more than 50% of the students in the school like rap music.” Because 58% of the students in the sample liked rap music (which is more than 50%), there is evidence to support the student’s claim. However, because we have only a sample of 50 students, it is possible that 50% of all students like rap (in which case, the student’s claim is not correct), but the variation due to random sampling might produce 58% (or even more) who like rap. The statistical question, then, is whether the sample result of 58% is reasonable from the variation we expect to occur when selecting a random sample from a population with 50% successes.

One way to arrive at an answer is to set up a hypothetical population that has 50% successes (such as even and odd digits produced by a random number generator) and repeatedly take samples of size 50 from it, each time recording the proportion of even digits.

The sampling distribution of proportions so generated will be similar to the one in Figure 27.

[Figure 27: Dotplot of sample proportions from a hypothetical population in which 50% like rap music. Horizontal axis: Proportion (0.30 to 0.70); movable line at 0.58.]

Based on this simulation, a sample proportion greater than or equal to the observed .58 occurred 12 times out of 100 just by chance variation alone when the actual population proportion is .50. This suggests the result of .58 is not a very unusual occurrence when sampling from a population with .50 as the “true” proportion of students who like rap music. So a population value of .50 is plausible based on what was observed in the sample, and the evidence in support of the student’s claim is not very strong. The fraction of times the observed result is matched or exceeded (.12 in this investigation) is called the approximate p-value. The p-value represents the chance of observing the result observed in the sample, or a result more extreme, when the hypothesized value is in fact correct. A small p-value would have supported the student’s claim, because this would have indicated that if the population proportion was .50, it would have been very unlikely that a sample proportion of .58 would have been observed.

Table 11: Two-Way Frequency Table

                        Like Rock Music?
Like Rap Music?      Yes     No     Row Totals
Yes                   25      4         29
No                     6     15         21
Column Totals         31     19         50

Suppose another student hypothesized that more than 40% of the students in the school like rap music. To test this student’s claim, samples of size 50 must now be repeatedly selected from a population that has 40% successes. Figure 28 shows the results of one such simulation. The observed result of .58 was reached only one time out of 100, and no samples produced a proportion greater than .58. Thus, the approximate p-value is .01, and it is not likely that a population in which 40% of the students like rap music would have produced a sample proportion of 58% in a random sample of size 50. This p-value provides very strong evidence in support of the student’s claim that more than 40% of the students in the entire school like rap music.

Another way of stating the above is that .5 is a plausible value for the true population proportion, based on the sample evidence, but .4 is not. A set of plausible values can be found by using the margin of error introduced in Example 1. As explained previously, the margin of error for a sample proportion is approximately:

$$2\sqrt{\frac{p(1-p)}{n}}$$

However, in this problem, the true value of p is unknown. Our sample proportion ($\hat{p} = .58$) is our “best estimate” for what p might be, so the margin of error can be estimated to be:

$$2\sqrt{\frac{\hat{p}(1-\hat{p})}{n}} = 2\sqrt{\frac{.58(.42)}{50}} \approx .14$$

Thus, any proportion between .58 − .14 = .44 and .58 + .14 = .72 can be considered a plausible value for the true proportion of students at the school who like rap music. Notice that .5 is well within this interval, but .4 is not.

[Figure 28: Dotplot of sample proportions from a hypothetical population in which 40% like rap music. Horizontal axis: Proportion (0.20 to 0.60); movable line at 0.58.]
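The two simulated tests and the interval of plausible values can be sketched as follows. This version uses 10,000 simulated samples rather than the 100 used above, so its approximate p-values will generally differ somewhat from .12 and .01; the function name, seed, and number of repetitions are illustrative choices.

```python
import math
import random

def approx_p_value(p_null, n, observed_count, reps, rng):
    """Fraction of simulated samples (size n, success probability p_null)
    whose success count is at least the observed count."""
    hits = 0
    for _ in range(reps):
        successes = sum(1 for _ in range(n) if rng.random() < p_null)
        if successes >= observed_count:
            hits += 1
    return hits / reps

rng = random.Random(11)
n, observed_count = 50, 29          # 29 of 50 students (58%) liked rap music

p_value_50 = approx_p_value(0.50, n, observed_count, 10_000, rng)
p_value_40 = approx_p_value(0.40, n, observed_count, 10_000, rng)

# Interval of plausible values: sample proportion plus or minus the estimated margin of error
p_hat = observed_count / n
margin = 2 * math.sqrt(p_hat * (1 - p_hat) / n)
low, high = p_hat - margin, p_hat + margin

print("approximate p-value if p = .50:", p_value_50)
print("approximate p-value if p = .40:", p_value_40)
print("plausible values:", round(low, 2), "to", round(high, 2))
```

The simulation and the interval lead to the same qualitative conclusion: a population proportion of .5 is consistent with the sample, while .4 is not.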

Another type of question that could be asked about the students’ music preferences is of the form “Do those who like rock music also tend to like rap music?” In other words, is there an association between liking rock music and liking rap music? The same data from the random sample of 50 students can be used to answer this question.

According to Table 11, a total of 31 students in the survey like rock music. Among those students, the proportion who also like rap music is 25/31 = .81. Among the 19 students who do not like rock music, 4/19 = .21 is the proportion who like rap music. The large difference between these two proportions (.60) suggests there may be a strong association between liking rock music and liking rap music. But could this association simply be due to chance (a consequence only of the random sampling)?

If there were no association between liking rock music and liking rap music, then the 31 students who like rock would behave as a random selection from the 50 in the sample. We would expect the proportion who like rap among these 31 students to be close to the proportion who like rap among the 19 students who don’t like rock. Essentially, this means that if there is no association, we expect the difference between these two proportions to be approximately 0. Because the difference in our survey is .6, this suggests that there is an association. Can the difference, .6, be explained by the random variation we expect when selecting a random sample?

To simulate this situation, we create a population of 29 1s (those who like rap) and 21 0s (those who do not like rap) and mix them together. Then, we select 31 (representing those who like rock) at random and see how many 1s (those who like rap) we get. It is this entry that goes into the (yes, yes) cell of the table, and from that data the difference in proportions can be calculated. Repeating the process 100 times produces a simulated sampling distribution for the difference between the two proportions, as shown in Figure 29.

Figure 29: Dotplot showing simulated sampling distribution of differences between proportions (horizontal axis: Difference, -0.4 to 0.6; movable line at 0.60)


The observed difference in proportions from the sample data, .6, was never reached in 100 trials, indicating that the observed difference cannot be attributed to chance alone. Thus, there is convincing evidence of a real association between liking rock music and liking rap music.
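The simulation described above can be sketched in a few lines of Python. This is only an illustration, not the software used to produce Figure 29: the function name `simulate_differences` and the random seed are assumptions, while the counts (29 who like rap, 21 who do not, 31 who like rock) come from the text.

```python
import random

def simulate_differences(n_trials=100, seed=1):
    """Simulate differences in proportions assuming no association.

    The seed and the function name are illustrative assumptions.
    """
    rng = random.Random(seed)
    # 29 ones (like rap) and 21 zeros (do not like rap), as in the text.
    population = [1] * 29 + [0] * 21
    n_rock = 31                             # students who like rock
    n_not_rock = len(population) - n_rock   # the remaining 19 students
    diffs = []
    for _ in range(n_trials):
        rng.shuffle(population)             # mix the 1s and 0s together
        rap_among_rock = sum(population[:n_rock])
        rap_among_not_rock = sum(population[n_rock:])
        diffs.append(rap_among_rock / n_rock - rap_among_not_rock / n_not_rock)
    return diffs

observed = 25 / 31 - 4 / 19                 # about .60 in the survey
diffs = simulate_differences()
count_as_extreme = sum(d >= observed for d in diffs)
print(f"observed difference: {observed:.2f}")
print(f"simulated differences at least as large: {count_as_extreme} of {len(diffs)}")
```

A different seed gives a different set of 100 simulated differences, but differences as large as .6 should remain very rare.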

Example 4: An Experiment on the Effects of Light on the Growth of Radish Seedlings

What is the effect of different durations of light and dark on the growth of radish seedlings? This question was posed to a class of biology students who then set about designing and carrying out an experiment to investigate the question. All possible relative durations of light to dark cannot possibly be investigated in one experiment, so the students decided to focus the question on three treatments: 24 hours of light, 12 hours of light and 12 hours of darkness, and 24 hours of darkness. This covers the extreme cases and one in the middle.

With the help of a teacher, the class decided to use plastic bags as growth chambers. The plastic bags would permit the students to observe and measure the germination of the seeds without disturbing them. Two layers of moist paper towel were put into a disposable plastic bag, with a line stapled about 1/3 of the way from the bottom of the bag (see Figure 30) to hold the paper towel in place and to provide a seam to hold the radish seeds.

Figure 30: Seed experiment (diagram labels: Seeds, Staples)

Although three growth chambers would be sufficient to examine the three treatments, this class made four growth chambers, with one designated for the 24 hours of light treatment, one for the 12 hours of light and 12 hours of darkness treatment, and two for the 24 hours of darkness treatment. One hundred twenty seeds were available for the study. Thirty of the seeds were chosen at random and placed along the stapled seam of the 24 hours of light bag. Thirty seeds were then chosen at random from the remaining 90 seeds and placed in the 12 hours of light and 12 hours of darkness bag. Finally, 30 of the remaining 60 seeds were chosen at random and placed in one of the 24 hours of darkness bags. The final 30 seeds were placed in the other 24 hours of darkness bag. After three days, the lengths of radish seedlings for the germinating seeds were measured and recorded. These data are provided in Table 12; the measurements are in millimeters. Notice that not all of the seeds in each group germinated.

A good first step in the analyses of numerical data such as these is to make graphs to look for patterns and any unusual departures from the patterns. Boxplots are ideal for comparing data from more than one treatment, as you can see in Figure 31. Both the centers and the spreads increase as the amount of darkness increases. There are three outliers (one at 20 mm and two at 21 mm) in the Treatment 1 (24 hours of light) data. Otherwise, the distributions are fairly symmetric, which is good for statistical inference.

In Figure 31, Treatment 1 is 24 hours of light; Treatment 2 is 12 hours of light and 12 of darkness; Treatment 3 is 24 hours of darkness.

The summary statistics for these data are shown in Table 13.

Table 12: Lengths of Radish Seedlings (mm)

| Treatment | Seedling lengths (mm) |
| --- | --- |
| Treatment 1 (24 light) | 2, 3, 5, 5, 5, 5, 5, 7, 7, 7, 8, 8, 8, 9, 10, 10, 10, 10, 10, 10, 10, 10, 14, 15, 15, 20, 21, 21 |
| Treatment 2 (12 light, 12 dark) | 3, 4, 5, 9, 10, 10, 10, 10, 10, 11, 13, 15, 15, 15, 17, 20, 20, 20, 20, 20, 21, 21, 22, 22, 23, 25, 25, 27 |
| Treatment 3 (24 dark) | 5, 5, 8, 8, 8, 8, 10, 10, 10, 10, 10, 11, 14, 14, 15, 15, 15, 15, 15, 15, 16, 20, 20, 20, 20, 20, 20, 20, 20, 20, 20, 22, 24, 25, 25, 25, 25, 25, 26, 29, 30, 30, 30, 30, 30, 30, 31, 33, 35, 35, 35, 35, 35, 35, 36, 37, 38, 40 |


Experiments are designed to compare treatment effects, usually by comparing means. The original question on the effect of different periods of light and dark on the growth of radish seedlings might be turned into two questions about treatment means. Is there evidence that the 12 hours of light and 12 hours of dark (Treatment 2) group has a significantly higher mean than the 24 hours of light (Treatment 1) group? Is there evidence that the 24 hours of dark (Treatment 3) group has a significantly higher mean than the 12 hours of light and 12 hours of dark (Treatment 2) group? Based on the boxplots and the summary statistics, it is clear that the sample means differ. Are these differences large enough to rule out chance variation as a possible explanation for the observed difference?

The Treatment 2 mean is 6.2 mm larger than the Treatment 1 mean. If there is no real difference between the two treatments in terms of their effect on seedling growth, then the observed difference must be due to the random assignment of seeds to the bags; that is, one bag was simply lucky enough to get a preponderance of good and lively seeds. But, if a difference this large (6.2 mm) is likely to be the result of randomization alone, then we should see differences of this magnitude quite often if we repeatedly re-randomize the measurements and calculate a new difference in observed means. This, however, is not the case, as one can see from Figure 32. This dotplot was produced by mixing the growth measurements from Treatments 1 and 2 together, randomly splitting them into two groups of 28 measurements, recording the difference in means for the two groups, and repeating the process 200 times.

The observed difference of 6.2 mm was exceeded only one time in 200 trials, for an approximate p-value of 1/200. This is very small, and gives extremely strong evidence to support the hypothesis that there is a statistically significant difference between the means for Treatments 1 and 2. The observed difference of 6.2 mm is very unlikely to be due simply to chance variation.

Table 13: Treatment Summary Statistics

| Treatment | n | Mean | Median | Std. Dev. |
| --- | --- | --- | --- | --- |
| 1 | 28 | 9.64 | 9.5 | 5.03 |
| 2 | 28 | 15.82 | 16.0 | 6.76 |
| 3 | 58 | 21.86 | 20.0 | 9.75 |

Figure 31: Boxplot showing growth under different conditions (title: Radish seedling lengths; horizontal axis: Length (mm), 0 to 45; one boxplot for each of Treatments 1, 2, and 3)

In a comparison of the means for Treatments 2 and 3, the same procedure is used, except that the combined measurements are split into groups of 28 and 58 each time. The observed difference of 6 mm was exceeded only one time out of 200 trials (see Figure 33), giving extremely strong evidence of a statistically significant difference between the means for Treatments 2 and 3. In summary, the three treatment groups show statistically significant differences in mean growth that cannot reasonably be explained by the random assignment of seeds to the bags. This gives us convincing evidence of a treatment effect: the more hours of darkness, the greater the growth of the seedling, at least for these three periods of light versus darkness.
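The re-randomization procedure used for both comparisons can be sketched as follows. This is a minimal illustration, not the software behind Figures 32 and 33: the function name `randomization_test` and the seed are assumptions, and the data are the Table 12 measurements. Because the shuffles are random, the count of simulated differences at least as large as the observed one will vary from run to run and need not equal the one-in-200 result reported above.

```python
import random
from statistics import mean

# Seedling lengths in mm from Table 12.
t1 = [2, 3, 5, 5, 5, 5, 5, 7, 7, 7, 8, 8, 8, 9,
      10, 10, 10, 10, 10, 10, 10, 10, 14, 15, 15, 20, 21, 21]
t2 = [3, 4, 5, 9, 10, 10, 10, 10, 10, 11, 13, 15, 15, 15,
      17, 20, 20, 20, 20, 20, 21, 21, 22, 22, 23, 25, 25, 27]
t3 = [5, 5, 8, 8, 8, 8, 10, 10, 10, 10, 10, 11, 14, 14,
      15, 15, 15, 15, 15, 15, 16, 20, 20, 20, 20, 20, 20, 20, 20,
      20, 20, 22, 24, 25, 25, 25, 25, 25, 26, 29, 30, 30, 30,
      30, 30, 30, 31, 33, 35, 35, 35, 35, 35, 35, 36, 37, 38, 40]

def randomization_test(group_a, group_b, n_trials=200, seed=1):
    """Return (observed difference, approximate p-value).

    The difference is mean(group_b) - mean(group_a). Each trial mixes
    all measurements and re-splits them into groups of the original sizes.
    """
    rng = random.Random(seed)
    observed = mean(group_b) - mean(group_a)
    pooled = group_a + group_b
    n_a = len(group_a)
    count = 0
    for _ in range(n_trials):
        rng.shuffle(pooled)
        diff = mean(pooled[n_a:]) - mean(pooled[:n_a])
        if diff >= observed:
            count += 1
    return observed, count / n_trials

obs12, p12 = randomization_test(t1, t2)   # groups of 28 and 28
obs23, p23 = randomization_test(t2, t3)   # groups of 28 and 58
print(f"Treatment 2 - Treatment 1: {obs12:.1f} mm, approximate p-value {p12}")
print(f"Treatment 3 - Treatment 2: {obs23:.1f} mm, approximate p-value {p23}")
```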

Figure 32: Dotplot showing differences of means (horizontal axis: Difference, -6 to 8; movable line at 6.2)

Figure 33: Dotplot showing differences of means (horizontal axis: Difference, -6 to 8; movable line at 6.0)

Students should be encouraged to delve more deeply into the interpretation, relating it to what is known about the phenomenon or issue under study. Why do the seedlings grow faster in the dark? Here is an explanation from a biology teacher. It seems to be an adaptation of plants to get the seedlings from the dark (under ground) where they germinate into the light (above ground) as quickly as possible. Obviously, the seedling cannot photosynthesize in the dark and is using up the energy stored in the seed to power the growth. Once the seedling is exposed to light, it shifts its energy away from growing in length to producing chlorophyll and increasing the size of its leaves. These changes allow the plant to become self-sufficient and begin producing its own food. Even though the growth in length of the stem slows, the growth in diameter of the stem increases and the size of the leaves increases. Seedlings that continue to grow in the dark are spindly and yellow, with small yellow leaves. Seedlings grown in the light are a rich, green color with large, thick leaves and short stems.

Example 5: Estimating the Density of the Earth (A Classical Study)

What is the density of the Earth? This is a question that intrigued the great scientist Henry Cavendish, who attempted to answer the question in 1798. Cavendish estimated the density of the Earth by using the crude tools available to him at the time. He did not literally take a random sample; he measured on different days and at different times, as he was able. But the density of the Earth does not change over time, so his measurements can be thought of as a random sample of all the measurements he could have taken on this constant. The variation in the measurements is due to his measurement error, not to changes in the Earth’s density. The Earth’s density is the constant that is being estimated.

This is a typical example of an estimation problem that occurs in science. There is no real “population” of measurements that can be sampled; rather, the sample data are assumed to be a random selection from the conceptual population of all measurements that could have been made. At this point, there may be some confusion between an “experiment” and a “sample survey” because Cavendish actually conducted a scientific investigation to get his measurements. The key, however, is that he conducted essentially the same investigation many times with a goal of estimating a constant, much like interviewing many people to estimate the proportion who favor a certain candidate for office. He did not randomly assign treatments to experimental units for the purpose of comparing treatment effects.

The famous Cavendish data set contains his 29 measurements of the density of the Earth, in grams per cubic centimeter. The data are shown below [Source: http://lib.stat.cmu.edu/DASL]:

5.50 5.57 5.42 5.61 5.53 5.47 4.88 5.62 5.63 4.07 5.29 5.34 5.26 5.44 5.46 5.55 5.34 5.30 5.36 5.79 5.75 5.29 5.10 5.86 5.58 5.27 5.85 5.65 5.39

One should look at the data before proceeding with an analysis. The histogram in Figure 34 shows the data to be roughly symmetric, with one unusually small value. If Cavendish were alive, you could ask him if he had made a mistake (and that is certainly what you should do for a current data set). The mean of the 29 measurements is 5.42 and the standard deviation is 0.339. Recall that the margin of error for the sample mean is:

$$\frac{2\sigma}{\sqrt{n}}$$

where σ is the population standard deviation. In this problem, the population standard deviation is not known; however, the sample standard deviation provides an estimate for the population standard deviation. Consequently, the margin of error can be estimated to be:

$$\frac{2s}{\sqrt{n}} = \frac{2(0.339)}{\sqrt{29}} = 0.126$$

The analysis shows that any value between 5.420 – 0.126 and 5.420 + 0.126, or in the interval (5.294, 5.546), is a plausible value of the density of the Earth. That is, any value in the interval is consistent with the data obtained by Cavendish. Now, the questionable low observation should be taken into account, as it will lower the mean and increase the standard deviation. If that measurement is regarded as a mistake and removed from the data set, the mean of the 28 remaining observations is 5.468 and the standard deviation is 0.222, producing a margin of error of 0.084 and an interval of plausible values of (5.384, 5.552).
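The margin-of-error arithmetic above can be checked with a short Python sketch. The helper name `margin_of_error` is an assumption made for illustration; the data are the 29 Cavendish measurements, and the formula is the estimated margin of error 2s/√n from the text.

```python
from math import sqrt
from statistics import mean, stdev

# Cavendish's 29 measurements of the Earth's density (g per cubic cm).
density = [5.50, 5.57, 5.42, 5.61, 5.53, 5.47, 4.88, 5.62, 5.63, 4.07,
           5.29, 5.34, 5.26, 5.44, 5.46, 5.55, 5.34, 5.30, 5.36, 5.79,
           5.75, 5.29, 5.10, 5.86, 5.58, 5.27, 5.85, 5.65, 5.39]

def margin_of_error(values):
    """Estimated margin of error for the sample mean: 2s / sqrt(n)."""
    return 2 * stdev(values) / sqrt(len(values))

center, moe = mean(density), margin_of_error(density)
print(f"all 29 values: {center:.3f} +/- {moe:.3f}")

# Same calculation with the questionable low observation removed.
trimmed = [x for x in density if x != 4.07]
center_trim, moe_trim = mean(trimmed), margin_of_error(trimmed)
print(f"without 4.07:  {center_trim:.3f} +/- {moe_trim:.3f}")
```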

Students now can check on how well Cavendish did; modern methods pretty much agree that the average density of the Earth is about 5.515 grams per cubic centimeter. The great 18th century scientist did well!

Figure 34: Histogram of Earth density measurements (horizontal axis: Density, 4.0 to 6.0; vertical axis: Count)

Example 6: Linear Regression Analysis (Height vs. Forearm Length)

Regression analysis refers to the study of relationships between variables. If the “cloud” of points in a scatterplot of paired numerical data has a linear shape, a straight line may be a realistic model of the relationship between the variables under study. The least squares line runs through the center (in some sense) of the cloud of points. Residuals are defined to be the deviations in the y direction between the points in the scatterplot and the least squares line; spread is now the variation around the least squares line, as measured by the standard deviation of the residuals. When using a fitted model to predict a value of y from x, the associated margin of error depends on the standard deviation of the residuals.

Relationships among various physical features, such as height versus arm span and neck size versus shoe size, can be the basis of many interesting questions for student investigation. If I were painting a picture of a person, how could I get the relative sizes of the body parts correct? This question prompted students to carry out an investigation of one of the possible relationships, that between forearm length and height.

The students responsible for the study sampled other students on which to make forearm and height measurements. Although the details of how the sample actually was selected are not clear, we will suppose that it is representative of students at the school and has the characteristics of a random sample. An important consideration here is to agree on the definition of “forearm” before beginning to take measurements. The data obtained by the students (in centimeters) are provided in Table 14.

A good first step in any analysis is to plot the data, as we have done in Figure 35. The linear trend in the plot is fairly strong. The scatterplot, together with Pearson’s correlation coefficient of .8, indicates that a line would be a reasonable model for summarizing the relationship between height and forearm length.

Table 14: Heights vs. Forearm Lengths

| Forearm (cm) | Height (cm) | Forearm (cm) | Height (cm) |
| --- | --- | --- | --- |
| 45.0 | 180.0 | 41.0 | 163.0 |
| 44.5 | 173.2 | 39.5 | 155.0 |
| 39.5 | 155.0 | 43.5 | 166.0 |
| 43.9 | 168.0 | 41.0 | 158.0 |
| 47.0 | 170.0 | 42.0 | 165.0 |
| 49.1 | 185.2 | 45.5 | 167.0 |
| 48.0 | 181.1 | 46.0 | 162.0 |
| 47.9 | 181.9 | 42.0 | 161.0 |
| 40.6 | 156.8 | 46.0 | 181.0 |
| 45.5 | 171.0 | 45.6 | 156.0 |
| 46.5 | 175.5 | 43.9 | 172.0 |
| 43.0 | 158.5 | 44.1 | 167.0 |

Figure 35: Scatterplot and residual plot (title: Height vs. forearm length; fitted line: Height = 2.76 Forearm + 45.8, r² = 0.64; horizontal axis: Forearm, 39 to 50; vertical axes: Height, 155 to 190, and Residual, -15 to 15)

The scatterplot includes a graph of the least squares line:

Predicted Height = 45.8 + 2.76(Forearm Length).

The plot below the scatterplot shows the residuals. There are a few large residuals but no unusual pattern in the residual plot. The slope (about 2.8) can be interpreted as an estimate of the average difference in heights for two persons whose forearms are 1 cm different in length. The intercept of 45.8 centimeters cannot be interpreted as the expected height of a person with a forearm zero centimeters long! However, the regression line can reasonably be used to predict the height of a person for whom the forearm length is known, as long as the known forearm length is in the range of the data used to develop the prediction equation (39 to 50 cm for these data). The margin of error for this type of prediction is approximately 2(standard deviation of the residuals). For these data, the standard deviation of the residuals is 5.8 (not shown here, but provided as part of the computer output), so the margin of error is 2(5.8) = 11.6 cm. The predicted height of someone with a forearm length of 42 cm would be:

Predicted Height = 45.8 + 2.76(42) = 161.7 cm

With 95% confidence, we would predict the height of people with forearm length 42 cm to be between 150.1 cm and 173.3 cm (161.7 ± 11.6).
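The least squares line, the residual standard deviation, and the prediction interval can be reproduced from the Table 14 data with the sketch below. The function name `least_squares` is an assumption, and so is the choice of n - 2 in the denominator of the residual standard deviation (it is the choice that reproduces the 5.8 reported from the computer output). Because the code keeps unrounded coefficients, its prediction at 42 cm is about 161.8 rather than the 161.7 obtained from the rounded equation.

```python
from math import sqrt

# Forearm lengths and heights in cm from Table 14.
forearm = [45.0, 44.5, 39.5, 43.9, 47.0, 49.1, 48.0, 47.9, 40.6, 45.5, 46.5, 43.0,
           41.0, 39.5, 43.5, 41.0, 42.0, 45.5, 46.0, 42.0, 46.0, 45.6, 43.9, 44.1]
height = [180.0, 173.2, 155.0, 168.0, 170.0, 185.2, 181.1, 181.9, 156.8, 171.0,
          175.5, 158.5, 163.0, 155.0, 166.0, 158.0, 165.0, 167.0, 162.0, 161.0,
          181.0, 156.0, 172.0, 167.0]

def least_squares(x, y):
    """Return slope, intercept, and residual standard deviation."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    residuals = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    # Assumption: divide by n - 2, the usual convention for a fitted line.
    resid_sd = sqrt(sum(r ** 2 for r in residuals) / (n - 2))
    return slope, intercept, resid_sd

slope, intercept, resid_sd = least_squares(forearm, height)
predicted = intercept + slope * 42          # forearm length of 42 cm
margin = 2 * resid_sd                       # approximate margin of error
print(f"Predicted Height = {intercept:.1f} + {slope:.2f}(Forearm Length)")
print(f"prediction at 42 cm: {predicted:.1f} +/- {margin:.1f} cm")
```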

Is the slope of 2.8 “real,” or simply a result of chance variation from the random selection process? This question can be investigated using simulation. A description of this simulation is included in the Appendix to Level C.

Example 7: Comparing Mathematics Scores (An Observational Study)

Data often are presented to us in a form that does not call for much analysis, but does require some insight into statistical principles for correct interpretation. Standardized test scores often fall into this category. Table 15 gives information about the state mean scores on the National Assessment of Educational Progress (NAEP) 2000 Grade 4 mathematics scores for Louisiana and Kentucky. Even though these scores are based on a sample of students, these are the scores assigned to the states, and consequently, they can be considered observational data from that point of view.

To see if students understand the table, it is informative to ask them to fill in a few omitted entries.

Table 15: NAEP 2000 Scores in Mathematics

| State | Overall Mean | Mean for Whites | Mean for Nonwhites | % White |
| --- | --- | --- | --- | --- |
| Louisiana | 217.96 | 229.51 | 204.94 | (omitted) |
| Kentucky | 220.99 | 224.17 | (omitted) | 87 |


→ Fill in the two missing entries in the table (53% and 199.71).

More substantive questions involve the seeming contradictions that may occur in data of this type. They might be phrased as follows.

→ For the two states, compare the overall means. Compare the means for whites. Compare the means for nonwhites. What do you observe?

→ Explain why the reversals in direction take place once the means are separated into racial groups.

It is genuinely surprising to students that data summaries (means in this case) can go in one direction in the aggregate but can go in the opposite direction for each subcategory when disaggregated. This phenomenon is called Simpson’s Paradox.
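The reversal can be traced to the fact that each overall mean is a weighted average of the two group means, with weights given by the group proportions. The sketch below uses that relationship to recover the two omitted entries of Table 15 and to display the reversal. The function name `overall_mean` is an assumption made for illustration.

```python
def overall_mean(mean_white, mean_nonwhite, frac_white):
    """Overall mean as a weighted average of the two group means."""
    return frac_white * mean_white + (1 - frac_white) * mean_nonwhite

# Louisiana: solve 217.96 = p(229.51) + (1 - p)(204.94) for p.
la_frac_white = (217.96 - 204.94) / (229.51 - 204.94)

# Kentucky: solve 220.99 = 0.87(224.17) + 0.13(m) for m.
ky_nonwhite = (220.99 - 0.87 * 224.17) / (1 - 0.87)

print(f"Louisiana % white: {100 * la_frac_white:.0f}")
print(f"Kentucky mean for nonwhites: {ky_nonwhite:.2f}")

# Louisiana is higher within each group, yet Kentucky is higher overall,
# because Kentucky's mean gives much more weight to the higher-scoring group.
print("whites:    Louisiana 229.51 vs Kentucky 224.17")
print(f"nonwhites: Louisiana 204.94 vs Kentucky {ky_nonwhite:.2f}")
print("overall:   Louisiana 217.96 vs Kentucky 220.99")
```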

Example 8: Observational Study (Toward Establishing Causation)

Observational studies are the only option for situations in which it is impossible or unethical to randomly assign treatments to subjects. Such situations are a common occurrence in the study of causes of diseases. A classical example from this field is the relationship between smoking and lung cancer, which prompted heated debates during the 1950s and 1960s. Society will not condone the notion of assigning some people to be smokers and others to be nonsmokers in an experiment to see if smoking causes lung cancer. So the evidence has to be gathered from observing the world as it is. The data collection process still can be designed in clever ways to obtain as much information as possible.

Here is an example from the smoking versus lung cancer debates. A group of 649 men with lung cancer was identified from a certain population in England. A control group of the same size was established by matching these patients with other men from the same population who did not have lung cancer. The matching was on background variables such as ethnicity, age, and socioeconomic status. (This is called a case-control study.) The objective, then, is to compare the rate of smoking among those with lung cancer to the rate for those without cancer.

Table 16: Cigarette Smoking and Lung Cancer

| | Lung Cancer Cases | Controls | Totals |
| --- | --- | --- | --- |
| Smokers | 647 | 622 | 1,269 |
| Non-smokers | 2 | 27 | 29 |

First, make sure students understand the nature of the data in Table 16. Does this show, for example, that there was a very high percentage of smokers in England around 1950? The rate of smoking in these groups was (647/649) = .997 for the cancer patients and (622/649) = .958 for the controls. If these data had resulted from a random assignment or selection, the difference of about 4 percentage points would be statistically significant (by methods discussed earlier), which gives the researcher reason to suspect there is an association here that cannot be attributed to chance alone. Another way to look at these data is to think about randomly selecting one person from among the smokers and one person from among the nonsmokers. The smoker has a chance of 647/1269 = .51 of being in the lung cancer column, while the nonsmoker has only a 2/29 = .07 chance of being there. This is evidence of strong association between smoking and lung cancer, but it is not conclusive evidence that smoking is, in fact, the cause of the lung cancer. (This is a good place to have students speculate about other possible causes that could have resulted in data like these.)
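The two ways of reading Table 16 differ only in which total is used as the denominator, as the short sketch below shows. The dictionary layout and variable names are assumptions made for illustration; the counts are those in Table 16.

```python
# Counts from Table 16.
cases = {"smokers": 647, "nonsmokers": 2}       # men with lung cancer
controls = {"smokers": 622, "nonsmokers": 27}   # matched men without lung cancer

# Reading down the columns: rate of smoking within each group.
smoking_rate_cases = cases["smokers"] / sum(cases.values())
smoking_rate_controls = controls["smokers"] / sum(controls.values())

# Reading across the rows: chance that a randomly selected smoker
# (or nonsmoker) in this study falls in the lung cancer column.
cancer_share_smokers = cases["smokers"] / (cases["smokers"] + controls["smokers"])
cancer_share_nonsmokers = cases["nonsmokers"] / (cases["nonsmokers"] + controls["nonsmokers"])

print(f"smoking rate, cases:    {smoking_rate_cases:.3f}")
print(f"smoking rate, controls: {smoking_rate_controls:.3f}")
print(f"cancer column share, smokers:    {cancer_share_smokers:.2f}")
print(f"cancer column share, nonsmokers: {cancer_share_nonsmokers:.2f}")
```

Because the study fixed the numbers of cases and controls in advance, the row shares describe this matched sample only; they are not estimates of lung cancer rates in the population.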

Another step in establishing association in observational studies is to see if the increase in exposure to the risk factor produces an increase in incidence of the disease. This was done with the same case-control study by looking at the level of smoking for each person, producing Table 17.

Table 17: Level of Cigarette Smoking and Lung Cancer

| Cigarettes/Day | Lung Cancer Cases | Controls | Probability |
| --- | --- | --- | --- |
| 0 | 2 | 27 | 0.07 |
| 1–14 | 283 | 346 | 0.45 |
| 15–24 | 196 | 190 | 0.51 |
| 25+ | 168 | 84 | 0.67 |

The term “probability” is used in the same sense as above. If a person is randomly selected from the 1–14 level, the chance that the person falls into the cancer column is .45, and so on for the other rows. The important result is that these “probabilities” increase with the level of smoking. This is evidence that an increase in the disease rate is associated with an increase in cigarette smoking.

Even with this additional evidence, students should understand that a cause and effect relationship cannot be established from an observational study. The main reason for this is that these observational studies are subject to bias in the selection of patients and controls. Another study of this type could have produced a different result. (As it turned out, many studies of this type produced remarkably similar results. That, coupled with laboratory experiments on animals that established a biological link between smoking and lung cancer, eventually settled the issue for most people.)

The Appendix to Level C contains more examples of the types discussed in this section.

The Role of Probability in Statistics

Teachers and students must understand that statistics and probability are not the same. Statistics uses probability, much as physics uses calculus, but only certain aspects of probability make their way into statistics. The concepts of probability needed for introductory statistics (with emphasis on data analysis) include relative frequency interpretations of data, probability distributions as models of populations of measurements, an introduction to the normal distribution as a model for sampling distributions, and the basic ideas of expected value and random variation. Counting rules, most specialized distributions and the development of theorems on the mathematics of probability should be left to areas of discrete mathematics and/or calculus.

Understanding the reasoning of statistical inference requires a basic understanding of some important ideas in probability. Students should be able to:

→ Understand probability as a long-run relative frequency;

→ Understand the concept of independence; and

→ Understand how probability can be used in making decisions and drawing conclusions.

In addition, because so many of the standard inferential procedures are based on the normal distribution, students should be able to evaluate probabilities using the normal distribution (preferably with the aid of technology).

Probability is an attempt to quantify uncertainty. The fact that the long-run behavior of a random process is predictable leads to the long-run relative frequency interpretation of probability. Students should be able to interpret the probability of an outcome as the long-run proportion of the time the outcome should occur if the random experiment is repeated a large number of times. This long-run relative frequency interpretation of probability also provides the justification for using simulation to estimate probabilities. After observing a large number of chance outcomes, the observed proportion of occurrence for the outcome of interest can be used as an estimate of the relevant probability.
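A small simulation illustrates the long-run relative frequency idea. The example of a fair coin, the function name `running_proportion`, the checkpoints, and the seed are all assumptions chosen for illustration; any random process with a known probability would serve equally well.

```python
import random

def running_proportion(n_flips=100_000, p_heads=0.5, seed=1):
    """Proportion of heads observed after 10, 100, ..., n_flips flips."""
    rng = random.Random(seed)
    checkpoints = {10, 100, 1_000, 10_000, 100_000}
    heads = 0
    proportions = {}
    for flip in range(1, n_flips + 1):
        if rng.random() < p_heads:   # one simulated coin flip
            heads += 1
        if flip in checkpoints:
            proportions[flip] = heads / flip
    return proportions

proportions = running_proportion()
for flips, prop in proportions.items():
    print(f"after {flips:>6} flips: proportion of heads = {prop:.4f}")
```

The early proportions can stray noticeably from 0.5, while the later ones settle close to it, which is why a large number of repetitions is needed before the observed proportion is used as an estimate.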

Students also need to understand the concept of independence. Two outcomes are independent if our assessment of the chance that one outcome occurs is not affected by knowledge that the other outcome has occurred. Particularly important to statistical inference is the notion of independence in sampling settings. Random selection (with replacement) from a population ensures the observations in a sample are independent. For example, knowing the value of the third observation does not provide any information about the value of the fifth (or any other) observation. Many of the methods used to draw conclusions about a population based on data from a sample require the observations in a sample to be independent.

Most importantly, the concepts of probability play a critical role in developing statistical methods that make it possible to make inferences based on sample data and to assess our confidence in such conclusions.

To clarify the connection between data analysis and probability, we will return to the key ideas presented in the inference section. Suppose an opinion poll shows 60% of sampled voters in favor of a proposed new law. A basic statistical question is, “How far might this sample proportion be from the true population proportion?” That the difference between the estimate and the truth is less than the margin of error approximately 95% of the time is based on a probabilistic understanding of the sampling distribution of sample proportions. For large random samples, this relative frequency distribution of sample proportions is approximately normal. Thus, students should be familiar with how to use appropriate technology to find areas under the normal curve.

Suppose an experimenter divides subjects into two groups, with one group receiving a new treatment for a disease and the other receiving a placebo. If the treatment group does better than the placebo group, a basic statistical question is, “Could the difference have been a result of chance variation alone?” The randomization allows us to determine the probability of a difference being greater than that observed under the assumption of no treatment effect. In turn, this probability allows us to draw a meaningful conclusion from the data. (A proposed model is rejected as implausible, not primarily because the probability of an observed outcome is small, but rather because it is in the tail of a distribution.) An adequate answer to the above question also requires knowledge of the context in which the question was asked and a sound experimental design. This reliance on context and design is one of the basic differences between statistics and mathematics.

As demonstrated earlier, the sampling distribution of a sample mean will be approximately normal under random sampling, as long as the sample size is reasonably large. The mean and standard deviation of this distribution usually are unknown (introducing the need for inference), but sometimes these parameter values can be determined from basic information about the population being sampled. To compute these parameter values, students will need some knowledge of expected values, as demonstrated next.

According to the March 2000 Current Population Survey of the U.S. Census Bureau, the distribution of family size is as given by Table 18. (A family is defined as two or more related people living together. The number “7” really is the category “7 or more,” but very few families are larger than 7.)

Table 18: Family Size Distribution

| Family Size, x | Proportion, p(x) |
| --- | --- |
| 2 | 0.437 |
| 3 | 0.223 |
| 4 | 0.201 |
| 5 | 0.091 |
| 6 | 0.031 |
| 7 | 0.017 |

Notice first the connection between data and probability: These proportions (really estimates from a very large sample survey) can be taken as approximate probabilities for the next survey. In other words, if someone randomly selects a U.S. family for a new survey, the probability that it will have three members is about .223.

Second, note that we now can find the mean and standard deviation of a random variable (call it X), defined as the number of people in a randomly selected family. The mean, sometimes called the expected value of X and denoted by E(X), is found using the formula:

$$E(X) = \sum_{\text{all possible } x \text{ values}} x \cdot p(x)$$

which turns out to be 3.11 for this distribution. If the next survey contains 100 randomly selected families, then the survey is expected to produce 3.11 members per family, on the average, for an estimated total of 311 people in the 100 families altogether.

The standard deviation of X, SD(X), is the square root of the variance of X, V(X), given by:

$$V(X) = \sum_{\text{all possible } x \text{ values}} [x - E(X)]^2 \cdot p(x)$$

For the family size data, V(X) = 1.54 and SD(X) = 1.24.

Third, these facts can be assembled to describe the expected sampling distribution of the mean family size in a random sample of 100 families yet to be taken. That sampling distribution will be approximately normal in shape, centering at 3.11 with a standard deviation of $1.24/\sqrt{100} = 0.124$. This would be useful information for the person designing the next survey.
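The expected value, variance, and standard deviation formulas above, together with the standard deviation of the sampling distribution, can be evaluated directly from Table 18. The sketch below is one way to do so; the variable names are assumptions made for illustration.

```python
from math import sqrt

# Family size distribution from Table 18: x -> p(x).
p = {2: 0.437, 3: 0.223, 4: 0.201, 5: 0.091, 6: 0.031, 7: 0.017}

# E(X): sum of x * p(x) over all possible x values.
expected = sum(x * px for x, px in p.items())

# V(X): sum of [x - E(X)]^2 * p(x) over all possible x values.
variance = sum((x - expected) ** 2 * px for x, px in p.items())
sd = sqrt(variance)

# Standard deviation of the mean family size in a random sample of 100.
n_families = 100
sd_of_mean = sd / sqrt(n_families)

print(f"E(X) = {expected:.2f}")
print(f"V(X) = {variance:.2f}, SD(X) = {sd:.2f}")
print(f"SD of the sample mean for n = {n_families}: {sd_of_mean:.3f}")
```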

In short, the relative frequency definition of probability, the normal distribution, and the concept of expected value are the keys to understanding sampling distributions and statistical inference.

Summary of Level C

Students at Level C should become adept at using statistical tools as a natural part of the investigative process. Once an appropriate plan for collecting data has been implemented and the resulting data are in hand, the next step usually is to summarize the data using graphical displays and numerical summaries. At Level C, students should be able to select summary techniques appropriate for the type of data available, produce these summaries, and describe in context the important characteristics of the data. Students will use the graphical and numerical summaries learned at Levels A and B, but should be able to provide a more sophisticated interpretation that integrates the context and objectives of the study.

At Level C, students also should be able to draw conclusions from data and support these conclusions using statistical evidence. Students should see statistics as providing powerful tools that enable them to answer questions and to make informed decisions. Students also should understand the limitations of conclusions based on data from sample surveys and experiments, and should be able to quantify uncertainty associated with these conclusions using margin of error and related properties of sampling distributions.

Appendix for Level A

What Are Common Name Lengths?

Formulate Questions

During the first week of school, a third-grade teacher is trying to help her students learn one another’s names by playing various games. During one of the games, a student named MacKenzie noticed she and her classmate Zacharius each have nine letters in their names. MacKenzie conjectured that their names were longer than everyone else’s names. The teacher decided that this observation by the student provided an excellent opening for a statistics lesson.

The next school day, the teacher reminds students of MacKenzie’s comment from the day before and asks the class what they would like to know about their classmates’ names. The class generates a list of questions, which the teacher records on the board as follows:

→ Who has the longest name? The shortest?
→ Are there more nine-letter names or six-letter names? How many more?
→ What’s the most common name length?
→ How many letters are in all of our names?
→ If you put all of the eight- and nine-letter names together, will there be as many as the five-letter names?

Collect Data

The statistics lesson begins with students writing their names on sticky notes and posting them on the white board at the front of the room. This is a census of the classroom because they are gathering data from all students in the class.

Given no direction about how to organize the notes, the students arbitrarily place them on the board.

Figure 36: Random placement of names (sticky notes, each showing a name and its number of letters): Sam 3, Patti 5, Haven 5, Connor 6, Faith 5, Ella 4, Alicia 6, Josh 4, Bryce 5, Landis 6, Qynika 6, Aaliyah 7, Christian 9, Nicholas 8, Katelin 7, Austin 6, Christina 9, Amber 5, Amanda 6, Zak 3, Marcas 6, Octavious 9, Ilonna 6, Ali 3, Mrs. Chrisp 9

In order to help students think about how to use graphical tools to analyze data, the teacher asks the students if they are easily able to answer any of the posed questions now by looking at the sticky notes, and the students say they cannot. The teacher then suggests that they think of ways to better organize the notes. A student suggests grouping the names according to how many letters are in each name.

Figure 37: Names clustered by length (the sticky notes from Figure 36, grouped by number of letters)

The teacher again asks if they can easily answer the questions that are posed. The students say they can answer some of the questions, but not easily. The teacher asks what they can do to make it easier to answer the questions. Because the students have been constructing graphs since kindergarten, they readily answer, “Make a graph!” The teacher then facilitates a discussion of what kind of graph they will make, and the class decides on a dotplot, given the fact that their names are already on sticky notes and given the available space on the board. Note that this display is not a bar graph because bar graphs are made when the data represent a categorical variable (such as favorite color). A dotplot is appropriate for a numerical variable, such as the number of letters in a name.

Figure 38: Preliminary dotplot (the sticky notes stacked in columns by number of letters)

The teacher then uses computer software to translate this information into a more abstract dotplot, as shown in Figure 39. This helps the students focus on the general shape of the data, rather than on the particular names of the students.

Interpret Results

The teacher then facilitates a discussion of each question posed by the students, using the data displayed in the graph to answer the questions. Students also add appropriate labels and titles to the graph. The teacher helps students use the word “mode” to answer the question about the most common name length. She introduces the term “range” to help students answer the questions about shortest and longest names. Students visualize from the dotplot that there is variability in name length from individual to individual. The range gives a sense of the amount of variability in name length within the class. Using the range, we know that if the names of any two students are compared, the name lengths cannot differ by more than the value of the range.

The teacher then tells the students that there is another useful question they can answer from these data. Sometimes it is helpful to know “about how long most names are.” For instance, if you were making place cards for a class lunch party, you might want to know how long the typical name is in order to decide which size of place cards to buy. The typical or average name length is called the mean. Another way to think of this is, “If all of our names were the same length, how long would they be?” To illustrate this new idea, the teacher has students work in groups of four, and each child takes a number of snap cubes equal to the number of letters in his/her name. Then all four children at one table put all of their snap cubes in a pile in the middle of the table. They count how many cubes they have in total. Then they share the cubes fairly, with each child taking one at a time until they are all gone or there are not enough left to share. They record how many cubes each child received. (Students at some tables are able to use fractions to show that, for example, when there are two cubes left, each person could get half a cube. At other tables, the students simply leave the remaining two cubes undistributed.) The teacher then helps the students symbolize what they have done by using addition to reflect putting all the cubes in the middle of the table and using division to reflect sharing the cubes fairly among everyone at the table. They attach the words “mean” and “average” to this idea.
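The summaries discussed above (mode, range, and the fair-share idea of the mean) can be sketched in a few lines of Python. The name lengths are transcribed from the sticky notes in Figure 36. The variable names, the `fair_share` helper, and the particular table of four students are assumptions made only for this illustration.

```python
from collections import Counter

# Name lengths as written on the sticky notes in Figure 36 (25 notes).
name_lengths = {
    "Sam": 3, "Patti": 5, "Haven": 5, "Connor": 6, "Faith": 5,
    "Ella": 4, "Alicia": 6, "Josh": 4, "Bryce": 5, "Landis": 6,
    "Qynika": 6, "Aaliyah": 7, "Christian": 9, "Nicholas": 8, "Katelin": 7,
    "Austin": 6, "Christina": 9, "Amber": 5, "Amanda": 6, "Zak": 3,
    "Marcas": 6, "Octavious": 9, "Ilonna": 6, "Ali": 3, "Mrs. Chrisp": 9,
}

lengths = list(name_lengths.values())
counts = Counter(lengths)                     # how many names of each length

mode_length = counts.most_common(1)[0][0]     # most common name length
length_range = max(lengths) - min(lengths)    # longest minus shortest
total_letters = sum(lengths)                  # all the cubes in one pile
mean_length = total_letters / len(lengths)    # equal share per person

def fair_share(cubes, people):
    """Share cubes one at a time; return (cubes per person, cubes left over)."""
    return divmod(cubes, people)

# Hypothetical table of four students (this grouping is not from the text).
table = ["Sam", "Patti", "Haven", "Faith"]
table_cubes = sum(name_lengths[name] for name in table)
per_child, left_over = fair_share(table_cubes, len(table))

print(mode_length, length_range, total_letters, mean_length)
print(per_child, left_over)
```

For the hypothetical table, two cubes are left over after each child takes four, which is the situation in which some students describe each share as four and a half cubes.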

Finally, the students are asked to transfer the data from the sticky notes on the board to their own graphs. The class helps the teacher generate additional questions about the data that can be answered for homework. Because the students’ graphs look different, the next day the teacher will lead a discussion about the features of the various graphs the students have constructed and the pros and cons of each.

Figure 39: Computer-generated dotplot (horizontal axis: Number of Letters in Name, from 3 to 9)

Valentine’s Day and Candy Hearts

Formulate Questions

As Valentine’s Day approaches, a teacher decides to plan a lesson in which children will analyze the characteristics of a bag of candy hearts. To begin the lesson, the teacher holds up a large bag of candy hearts and asks the children what they know about them from prior experience. The children know that the hearts are different colors and that they have words on them. The teacher asks the children what they wonder about the bag of hearts she is holding. The children want to know how many hearts are in the bag, what they say, and whether there are a lot of pink hearts, because most people like pink ones the best. The teacher tells the children that they will be able to answer some of those questions about their own bags of candy.

Collect Data

Each child receives a small packet of candy hearts. Students are asked how they can sort their hearts, and the students suggest sorting them by color—a categorical variable. The teacher asks students what question this will help them answer, and the students readily recognize that this will tell them which color candy appears most often in the bag.

Analyze Data

After sorting the candies into piles and counting and recording the number of candies in each pile, the teacher guides the students to make a bar graph with their candies on a blank sheet of paper. The children construct individual bar graphs by lining up all of their pink candies, all of their white candies, etc. The teacher then provides a grid with color labels on the x-axis and numerical labels on the y-axis so the students can transfer their data from the actual candies to a more permanent bar graph.

Figure 40: Student-drawn graphs

Interpret Results

After students construct their individual graphs, the teacher distributes a recording sheet on which each student records what color occurred the most frequently (the modal category) and how many of each color they had. This is followed by a class discussion in which the teacher highlights issues of variability. First, the students recognize that the number of each color varies within a package. Students also recognize that their packets of candy are not identical, noting that some students had no green hearts while others had no purple hearts. Some students had more pink hearts than any other color, while other students had more white hearts. At Level A, students are acknowledging variability between packages, the concept of between-group variability that will be explored in more detail at Level B. The students hypothesize that these variations in packages were due to how the candies were packed by machines. The students also noted differences in the total number of candies per packet, but found this difference to be small. The student with the fewest candies had 12, while the student with the greatest number of candies had 15. The teacher asked students if they had ever read the phrase “packed by weight, not by volume” on the side of a package. The class then discussed what this meant and how it might relate to the number of candies in a bag.

Figure 41: Initial sorting of candies (sheet titled “Candy Heart Color Sort,” with piles labeled purple, yellow, white, pink, orange, and green)

Figure 42: Bar graph of candy color (sheet titled “Candy Heart Color Graph,” with categories Purple, Pink, Orange, Green, White, and Yellow)

(Note: Images in this example were adapted from www.littlegiraffes.com/valentines.html.)

Appendix for Level B

Many questionnaires ask for a “Yes” or “No” response. For example, in the Level B document, we explored connections between whether students like rap music and whether they like rock music. To investigate possible connections between these two categorical variables, the data were summarized in the following two-way frequency table, or contingency table.

Table 4: Two-Way Frequency Table

                     Like Rap Music?
Like Rock Music?     Yes     No      Row Totals
Yes                  27      6       33
No                   4       17      21
Column Totals        31      23      54

Since 82% (27/33) of the students who like rock music also like rap music, students who like rock music tend to like rap music as well. Because students who like rock music tend to like rap music, there is an association between liking rock music and liking rap music.

At Level B, we explored the association between height and arm span by examining the data in a scatterplot, and we measured the strength of the association with the Quadrant Count Ratio, or QCR. For the height/arm span problem, both variables are numerical. It also is possible to measure the strength and direction of association between certain types of categorical variables. Recall that two numerical variables are positively associated when above-average values of one variable tend to occur with above-average values of the other and when below-average values of one variable tend to occur with below-average values of the other. Two numerical variables are negatively associated when below-average values of one variable tend to occur with above-average values of the other and when above-average values of one variable tend to occur with below-average values of the other.

The scatterplot below for the height/arm span data includes a vertical line (x = 172.8) drawn through the mean height and a horizontal line (y = 169.3) drawn through the mean arm span.

Figure 43: Scatterplot of arm span/height data

Page 134: NCTM HSSI 2016: MP’S THROUGH A STATISTICAL …...2 NCTM HSSI 2016: MP’S THROUGH A STATISTICAL LENS, PART 2: SIMULATIONS 2. a) Now, suppose that our results show strong evidence

96

An alternative way to summarize the data would have been to ask each student the following two questions:

Is your height above average?
Is your arm span above average?

Note that for these data, the response to each question is either “Yes” or “No.”

The 12 individuals in the scatterplot with below-average height and below-average arm span (Quadrant 3) responded “No” to both questions. Because their responses to both questions are the same, these 12 responses are in agreement. The 11 individuals in the scatterplot with above-average height and above-average arm span (Quadrant 1) responded “Yes” to both questions. Since their responses to both questions are the same, these 11 responses are in agreement. When the responses to two “Yes/No” questions are the same (No/No) or (Yes/Yes), the responses are in agreement.

The one individual with below-average height and above-average arm span (Quadrant 2) responded “No” to the first question and “Yes” to the second question (No/Yes). Since her/his responses to the two questions are different, these two responses are in disagreement. The two individuals with above-average height and below-average arm span (Quadrant 4) responded “Yes” to the first question and “No” to the second question (Yes/No). Since their responses to the two questions are different, their responses are in disagreement. When the responses to two “Yes/No” questions are different (No/Yes) or (Yes/No), the responses are in disagreement.

For the data in the scatterplot in Figure 43, the results to the above two questions can be summarized in the following 2x2 two-way frequency table:

Table 19: 2x2 Two-Way Frequency Table

                            Height above Average?
Arm Span above Average?     No      Yes     Row Totals
No                          12      2       14
Yes                         1       11      12
Column Totals               13      13      26

Notice that there are a total of 23 responses in agreement (12 No/No and 11 Yes/Yes to the height/arm span questions), and that these correspond to the points in Quadrants 3 and 1, respectively, in the scatterplot. Also, there are a total of three responses in disagreement (two Yes/No and one No/Yes), and these correspond to the points in Quadrants 4 and 2, respectively. Recall that the QCR is determined as follows:

$$\text{QCR} = \frac{(\text{Number of Points in Quadrants 1 and 3}) - (\text{Number of Points in Quadrants 2 and 4})}{\text{Number of Points in all Four Quadrants}}$$

Restated in terms of Table 19:

$$\text{QCR} = \frac{(\text{Number of Points in Agreement}) - (\text{Number of Points in Disagreement})}{\text{Number of Points in all Four Quadrants}}$$

Based on this, we can say that two “Yes/No” categorical variables are positively associated when the responses tend to be in agreement; the more observations in agreement, the stronger the positive association. Negative association between two “Yes/No” categorical variables occurs when the responses tend to be in disagreement; the more observations in disagreement, the stronger the negative association.

The responses to two “Yes/No” questions can be summarized as follows in a two-way frequency table:

Table 20: Two-Way Frequency Table

                  Question 1
Question 2        No            Yes           Row Totals
No                a             b             r1 = a + b
Yes               c             d             r2 = c + d
Column Totals     c1 = a + c    c2 = b + d    T = a + b + c + d

Note: a = the number who respond No/No; b = the number who respond Yes/No; c = the number who respond No/Yes; d = the number who respond Yes/Yes.

Conover (1999) suggests the following measure of association based on a 2x2 table summarized as above.

$$\frac{(a + d) - (b + c)}{T}$$

Let’s call this measure the Agreement-Disagreement Ratio (ADR). Note that this measure of association is analogous to the QCR correlation coefficient for two numerical variables.

The ADR for the height/arm span data is:

$$\text{ADR} = \frac{(12 + 11) - (2 + 1)}{26} = .77$$

An ADR of .77 indicates a strong positive association between height and arm span measurements.

Recall the music example data, which were summarized as follows:

Table 21: Two-Way Frequency Table

                     Like Rap Music?
Like Rock Music?     No      Yes     Row Totals
No                   17      4       21
Yes                  6       27      33
Column Totals        23      31      54

The ADR for the rap/rock data is:

$$\text{ADR} = \frac{(17 + 27) - (4 + 6)}{54} = .63$$
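The two ADR calculations can be checked with a short Python sketch. The function name `adr` and its argument order are assumptions for this illustration; the cell labels a, b, c, and d follow the note under the general 2x2 table.

```python
def adr(a, b, c, d):
    """Agreement-Disagreement Ratio for a 2x2 table.

    a = No/No, b = Yes/No, c = No/Yes, d = Yes/Yes,
    so a + d counts agreements and b + c counts disagreements.
    """
    total = a + b + c + d
    return ((a + d) - (b + c)) / total

# Cell counts copied from the height/arm span and rock/rap tables.
adr_height_arm_span = adr(a=12, b=2, c=1, d=11)
adr_rock_rap = adr(a=17, b=4, c=6, d=27)

print(round(adr_height_arm_span, 2), round(adr_rock_rap, 2))
```

Because agreements and disagreements enter only through their counts, swapping the roles of the two questions leaves the ADR unchanged.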

An ADR of .63 indicates a fairly strong association between liking rock and liking rap music.

Another question presented in Level B was:

Do students who like country music tend to like or dislike rap music?

Data collected on 54 students are summarized in the following two-way frequency table:

Table 22: Two-Way Frequency Table

                        Like Rap Music?
Like Country Music?     No      Yes     Row Totals
No                      10      22      32
Yes                     13      9       22
Column Totals           23      31      54

For these data,

$$\text{ADR} = \frac{(10 + 9) - (22 + 13)}{54} = -.30$$

An ADR of –.30 indicates a negative association between liking country music and liking rap music.

The QCR and the ADR are additive in nature, in that they are based on “how many” data values are in each quadrant or cell. Conover (1999) suggests the phi coefficient as another possible measure of association for data summarized in a 2x2 table.

$$\text{Phi} = \frac{ad - bc}{\sqrt{r_1 r_2 c_1 c_2}}$$

Conover points out that Phi is analogous to Pearson’s correlation coefficient for numerical data. Both Phi and Pearson’s correlation coefficient are multiplicative, and Pearson’s correlation coefficient is based on “how far” the points in each quadrant are from the center point.

Appendix for Level C

Recall that in Example 6 of Level C, students investigated the relationship between height and forearm length. The observed data are shown again here as Table 14, and the resulting plots and regression analysis are given in Figure 35.

Table 14: Heights vs. Forearm Lengths

Forearm (cm)    Height (cm)    Forearm (cm)    Height (cm)
45.0            180.0          41.0            163.0
44.5            173.2          39.5            155.0
39.5            155.0          43.5            166.0
43.9            168.0          41.0            158.0
47.0            170.0          42.0            165.0
49.1            185.2          45.5            167.0
48.0            181.1          46.0            162.0
47.9            181.9          42.0            161.0
40.6            156.8          46.0            181.0
45.5            171.0          45.6            156.0
46.5            175.5          43.9            172.0
43.0            158.5          44.1            167.0

Regression Analysis: Height versus Forearm

The regression equation is:

Predicted Height = 45.8 + 2.76 (Forearm)

Figure 35: Scatterplot and residual plot (title: Height vs. forearm length; fitted line: Height = 2.76Forearm + 45.8, r2 = 0.64; horizontal axis: Forearm, from 39 to 50; vertical axes: Height and Residual)

Is the slope of 2.8 “real,” or simply a result of the chance variation from the random selection process? This question can be investigated using simulation.

If there were no real relationship between height and forearm length, then any of the height values could be paired with any of the forearm values with no loss of information. In the spirit of the comparison of means in the radish experiment, you could then randomly mix up the heights (while leaving the forearm lengths as-is), calculate a new slope, and repeat this process many times to see if the observed slope could be generated simply by randomization. The results of 200 such randomizations are shown in Figure 44. A slope as large as 2.8 is never reached by randomization, which provides strong evidence that the observed slope is not due simply to chance variation. An appropriate conclusion is that there is significant evidence of a linear relationship between forearm length and height.
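The randomization procedure described above can be sketched in Python as follows. The data pairs are transcribed from Table 14. The helper name `slope`, the seed, and the number of repetitions (1,000 rather than the 200 used in the text) are assumptions chosen for this illustration, so the counts will not match Figure 44 exactly.

```python
import random

# (forearm cm, height cm) pairs transcribed from Table 14.
data = [
    (45.0, 180.0), (44.5, 173.2), (39.5, 155.0), (43.9, 168.0),
    (47.0, 170.0), (49.1, 185.2), (48.0, 181.1), (47.9, 181.9),
    (40.6, 156.8), (45.5, 171.0), (46.5, 175.5), (43.0, 158.5),
    (41.0, 163.0), (39.5, 155.0), (43.5, 166.0), (41.0, 158.0),
    (42.0, 165.0), (45.5, 167.0), (46.0, 162.0), (42.0, 161.0),
    (46.0, 181.0), (45.6, 156.0), (43.9, 172.0), (44.1, 167.0),
]

def slope(xs, ys):
    """Least-squares slope of ys regressed on xs."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

forearms = [f for f, _ in data]
heights = [h for _, h in data]
observed_slope = slope(forearms, heights)

rng = random.Random(1)   # fixed seed so a run can be repeated (arbitrary choice)
reps = 1000              # assumption; the text used 200 randomizations
count = 0
for _ in range(reps):
    shuffled = heights[:]        # mix up the heights ...
    rng.shuffle(shuffled)        # ... while leaving the forearms as-is
    if slope(forearms, shuffled) >= observed_slope:
        count += 1
p_estimate = count / reps

print(round(observed_slope, 2), p_estimate)
```

A very small `p_estimate` corresponds to the observation in the text that randomization essentially never produces a slope as large as the observed one.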

Figure 44: Dotplot showing association (dotplot of the 200 randomized slopes; horizontal axis: Slope, from -2 to 3; movable line at 2.8)

Example 1: A Survey of Healthy Lifestyles

A high-school class interested in healthy lifestyles carried out a survey to investigate various questions they thought were related to that issue. A random sample of 50 students selected from those attending a high school on a particular day were asked a variety of health-related questions, including these two:

Do you think you have a healthy lifestyle?
Do you eat breakfast at least three times a week?

The data are given in Table 23.

Table 23: Result of Lifestyle Question

                      Eat Breakfast
Healthy Lifestyle     Yes     No      Total
Yes                   19      15      34
No                    5       11      16
Total                 24      26      50

From these data, collected in a well-designed sample survey, it is possible to estimate the proportion of students in the school who think they have a healthy lifestyle and the proportion who eat breakfast at least three times a week. It also is possible to assess the degree of association between these two categorical variables.

For example, in the lifestyle survey previously described, 24 students in a random sample of 50 students attending a particular high school reported they eat breakfast at least three times per week. Based on this sample survey, it is estimated that the proportion of students at this school who eat breakfast at least three times per week is 24/50 = .48 with a margin of error of:

$$2\sqrt{\frac{(.48)(.52)}{50}} = .14$$

Using the margin of error result from above (.14), the interval of plausible values for the population proportion of students who eat breakfast at least three times a week is (0.34, 0.62). Any population proportion in this interval is consistent with the sample data in the sense that the sample result could reasonably have come from a population having this proportion of students eating breakfast.
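The margin of error and the interval of plausible values can be reproduced with a few lines of Python. The function name `margin_of_error` is an assumption for this sketch; the formula is the one given in the text.

```python
import math

def margin_of_error(p_hat, n):
    """Approximate margin of error for a proportion: 2 * sqrt(p(1 - p) / n)."""
    return 2 * math.sqrt(p_hat * (1 - p_hat) / n)

p_hat = 24 / 50                    # sample proportion who eat breakfast
moe = margin_of_error(p_hat, 50)
interval = (p_hat - moe, p_hat + moe)

print(round(moe, 2), tuple(round(v, 2) for v in interval))
```

The same function can be reused for the healthy-lifestyle proportion (34/50) by changing `p_hat`.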

To see if the answers to the breakfast and lifestyle questions are associated with each other, you can compare the proportions of yes answers to the healthy lifestyle question for those who regularly eat breakfast with those who do not, much like the comparison of means for a randomized experiment. In fact, if a 1 is recorded for each yes answer and a 0 for each no answer, the sample proportion of yes answers is precisely the sample mean. For the observed data, there is a total of 34 1s and 16 0s. Re-randomizing these 50 observations to the groups of size 24 and 26 (corresponding to the yes and no groups on the breakfast question) and calculating the difference in the resulting proportions gave the results in Figure 45. The observed difference in sample proportions (19/24) – (15/26) = 0.21 was matched or exceeded 13 times out of 200 times, for an estimated p-value of 0.065. This is moderately small, so there is some evidence that the difference between the two proportions might not be a result of chance variation. In other words, the responses to the healthy lifestyle question and the eating breakfast question appear to be related in the sense that those who think they have a healthy lifestyle also have a tendency to eat breakfast regularly.
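The re-randomization of 1s and 0s described above might be sketched in Python as follows. The seed and the number of repetitions (2,000 instead of the 200 in the text) are assumptions for this illustration, so the estimated p-value will differ somewhat from 0.065 and will vary from one seed to another.

```python
import random

# 1 = "yes" to the healthy-lifestyle question, 0 = "no" (34 ones, 16 zeros).
responses = [1] * 34 + [0] * 16
n_breakfast_yes, n_breakfast_no = 24, 26
observed_diff = 19 / 24 - 15 / 26        # about 0.21

rng = random.Random(2)   # arbitrary seed, used only to make a run repeatable
reps = 2000              # assumption; the text reports 200 re-randomizations
hits = 0
for _ in range(reps):
    shuffled = responses[:]
    rng.shuffle(shuffled)
    group_yes = shuffled[:n_breakfast_yes]    # re-randomized breakfast "yes" group
    group_no = shuffled[n_breakfast_yes:]     # re-randomized breakfast "no" group
    diff = sum(group_yes) / n_breakfast_yes - sum(group_no) / n_breakfast_no
    if diff >= observed_diff - 1e-12:         # matched or exceeded
        hits += 1
p_estimate = hits / reps

print(round(observed_diff, 2), p_estimate)
```

Each pass shuffles the 50 recorded answers and splits them into groups of 24 and 26, which mirrors dealing the answers at random to the two breakfast groups.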

Figure 45: Dotplot showing differences in sample proportions (title: Healthy lifestyle differences; horizontal axis: Mean Difference, from -0.4 to 0.4; movable line at 0.21)

Example 2: An Experimental Investigation of Pulse Rates

On another health-related issue, a student decided to answer the question of whether simply standing for a few minutes increases people’s pulses (heart rates) by an appreciable amount. Subjects available for the study were the 15 students in a particular class. The “sit” treatment was randomly assigned to eight of the students; the remaining seven were assigned the “stand” treatment. The measurement recorded was a pulse count for 30 seconds, which was then doubled to approximate a one-minute count. The data, arranged by treatment, are in Table 24. From these data, it is possible to either test the hypothesis that standing does not increase pulse rate, on the average, or to estimate the difference in mean pulse between those who stand and those who sit. The random assignment to treatments is intended to balance out the unmeasured and uncontrolled variables that could affect the results, such as gender and health conditions. This is called a completely randomized design.

However, randomly assigning 15 students to two groups may not be the best way to balance background information that could affect results. It may be better to block on a variable related to pulse. Since people have different resting pulse rates, the students in the experiment were blocked by resting pulse rate by pairing the two students with the lowest resting pulse rates, then the two next lowest, and so on. One person in each pair was randomly assigned to sit and the other to stand. The matched pairs data are in Table 25. As in the completely randomized design, the mean difference between sitting and standing pulse rate can be estimated. The main advantage of the blocking is that the variation in the differences (which now form the basis of the analysis) is much less than the variation among the pulse measurements that form the basis of analysis for the completely randomized design.

Table 24: Pulse Data

        Pulse    Group    Category
1       62       1        sit
2       60       1        sit
3       72       1        sit
4       56       1        sit
5       80       1        sit
6       58       1        sit
7       60       1        sit
8       54       1        sit
9       58       2        stand
10      61       2        stand
11      60       2        stand
12      73       2        stand
13      62       2        stand
14      72       2        stand
15      82       2        stand

Table 25: Pulse Data in Matched Pairs

        MPSit    MPStand    Difference
1       68       74         6
2       56       55         -1
3       60       72         12
4       62       64         2
5       56       64         8
6       60       59         -1
7       58       68         10

In the first pulse rate experiment (Table 24), the treatments of “sit” or “stand” were randomly assigned to students. If there is no real difference in pulse rates for these two treatments, then the observed difference in means (4.1 beats per minute) is due to the randomization process itself. To check this out, the data resulting from the experiment can be re-randomized (reassigned to sit or stand after the fact) and a new difference in means recorded. Doing the re-randomization many times will generate a distribution of differences in sample means due to chance alone. Using this distribution, one can assess the likelihood of the original observed difference. Figure 46 shows the results of 200 such re-randomizations. The observed difference of 4.1 was matched or exceeded 48 times, which gives an estimated p-value of 0.24 of seeing a result of 4.1 or greater by chance alone. Because this is a fairly large p-value, it can be concluded that there is little evidence of any real difference in mean pulse rates between the sitting and the standing positions based on the observed data.
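A minimal Python sketch of this re-randomization, using the pulse counts from Table 24, is shown below. The seed, the number of repetitions (2,000 rather than 200), and the helper name `mean` are assumptions made for the illustration, so the estimated p-value will be close to, but not identical with, the 0.24 reported in the text.

```python
import random

# Pulse counts (beats per minute) from Table 24.
sit = [62, 60, 72, 56, 80, 58, 60, 54]
stand = [58, 61, 60, 73, 62, 72, 82]

def mean(values):
    """Arithmetic mean of a list of numbers."""
    return sum(values) / len(values)

observed_diff = mean(stand) - mean(sit)      # about 4.1

rng = random.Random(3)   # arbitrary seed for repeatability
reps = 2000              # assumption; the text used 200 re-randomizations
pooled = sit + stand
hits = 0
for _ in range(reps):
    shuffled = pooled[:]
    rng.shuffle(shuffled)                    # reassign sit/stand after the fact
    new_sit = shuffled[:len(sit)]
    new_stand = shuffled[len(sit):]
    if mean(new_stand) - mean(new_sit) >= observed_diff - 1e-9:
        hits += 1
p_estimate = hits / reps

print(round(observed_diff, 1), p_estimate)
```

An estimate in the neighborhood of 0.2 leads to the same conclusion as in the text: a difference this large arises fairly often from the random assignment alone.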

In the matched pairs design, the randomization occurs within each pair: one person is randomly assigned to sit while the other stands. To assess whether the observed difference could be due to chance alone and not due to treatment differences, the re-randomization must occur within the pairs. This implies that the re-randomization is merely a matter of randomly assigning a plus or minus sign to the numerical values of the observed differences. Figure 47 shows the distribution of the mean differences for 200 such re-randomizations; the observed mean difference of 5.14 was matched or exceeded eight times. Thus, the estimated probability of getting a mean difference of 5.1 or larger by chance alone is 0.04. This very small probability provides evidence that the mean difference can be attributed to something other than chance (induced by the initial randomization process) alone. A better explanation is that standing increases pulse rate, on average, over the sitting rate. The mean difference shows up as significant here, while it did not for the completely randomized design, because the matching reduced the variability. The differences in the matched pairs design have less variability than the individual measurements in the completely randomized design, making it easier to detect a difference in mean pulse for the two treatments.
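With only seven pairs there are just 2^7 = 128 ways to assign plus and minus signs, so the sign re-randomization can also be carried out by listing every assignment instead of sampling 200 of them. The Python sketch below takes that route. Complete enumeration is a substitution made for this illustration, not the procedure used to produce Figure 47, and the variable names are assumptions.

```python
from itertools import product

# Stand-minus-sit differences from Table 25.
differences = [6, -1, 12, 2, 8, -1, 10]
observed_sum = sum(differences)                    # 36
observed_mean = observed_sum / len(differences)    # about 5.14

magnitudes = [abs(d) for d in differences]
hits = 0
total = 0
for signs in product((1, -1), repeat=len(magnitudes)):
    total += 1
    signed_sum = sum(s * m for s, m in zip(signs, magnitudes))
    # Comparing sums is the same as comparing means (same divisor, 7).
    if signed_sum >= observed_sum:
        hits += 1
p_exact = hits / total

print(hits, total, round(p_exact, 3))   # prints: 5 128 0.039
```

The exact proportion, 5/128, is close to the 0.04 estimated from the 200 re-randomizations.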

Figure 46: Dotplot of randomized differences in means (title: Randomized differences in means; pulse data; horizontal axis: Mean Difference, from -12 to 12; movable line at 4.1)

Vital statistics are a good example of observational data that are used every day by people in various walks of life. Most of these statistics are reported as rates, so an understanding of rates is a critical skill for high-school graduates. Table 26 shows the U.S. population (in 1,000s) from 1990–2001. Table 27 shows the death rates for sections of the U.S. population over a period of 12 years. Such data recorded over time often are referred to as time series data.

Students’ understanding of the rates in Table 27 can be established by posing problems such as:→ Carefully explain the meaning of the number 1,029.1 in the lower left-hand data cell.

→ Give at least two reasons why the White Male and Black Male entries do not add up to the All Races male entry. → Can you tell how many people died in 2001 based on Table 27 alone?

Hopefully, students will quickly realize that they can-not change from rates of death to frequencies of death without knowledge of the population sizes. Table 26 provides the population sizes overall, as well as for the male and female categories.

Noting that the population figures are in thousands but the rates are per 100,000, it takes a little thinking

Figure 47: Dotplot of randomized pair difference means (randomized paired difference means, pulse data; horizontal axis: Mean Difference, from -6 to 6; movable line at 5.1)

Table 26: U.S. Population (in 1,000s)

Year   Total Persons   Male      Female
1990   249,623         121,714   127,909
1991   252,981         123,416   129,565
1992   256,514         125,247   131,267
1993   259,919         126,971   132,948
1994   263,126         128,597   134,528
1995   266,278         130,215   136,063
1996   269,394         131,807   137,587
1997   272,647         133,474   139,173
1998   275,854         135,130   140,724
1999   279,040         136,803   142,237
2000   282,224         138,470   143,755
2001   285,318         140,076   145,242

Example 3: Observational Study—Rates over Time


Table 27: U.S. Death Rates (Deaths per 100,000 of Population)

       All Races          White              Black
Year   Male     Female    Male     Female    Male     Female
1990   1202.8   750.9     1165.9   728.8     1644.5   975.1
1991   1180.5   738.2     1143.1   716.1     1626.1   963.3
1992   1158.3   725.5     1122.4   704.1     1587.8   942.5
1993   1177.3   745.9     1138.9   724.1     1632.2   969.5
1994   1155.5   738.6     1118.7   717.5     1592.8   954.6
1995   1143.9   739.4     1107.5   718.7     1585.7   955.9
1996   1115.7   733.0     1082.9   713.6     1524.2   940.3
1997   1088.1   725.6     1059.1   707.8     1458.8   922.1
1998   1069.4   724.7     1042.0   707.3     1430.5   921.6
1999   1067.0   734.0     1040.0   716.6     1432.6   933.6
2000   1053.8   731.4     1029.4   715.3     1403.5   927.6
2001   1029.1   721.8     1006.1   706.7     1375.0   912.5

Figure 48: Scatterplot of death rates (Deaths in U.S.; Female Rate versus Year, 1990 to 2002; fitted line: Female Rate = -1.6545 Year + 4036, r^2 = 0.44)

Figure 49: Scatterplot of actual deaths (Deaths in U.S.; Female Deaths versus Year, 1990 to 2002; fitted line: Female Deaths = 9284 Year - 17523000, r^2 = 0.93)


on a student’s part to go from rates to counts by making the computation shown in the formula:

\[ \text{Female Deaths} = \text{Female Death Rate} \cdot \left( \frac{\text{Female Population}}{100} \right) \]

Some time series questions can now be explored. For example, how does the pattern of female death rates over time compare to the pattern of actual female deaths? The plots of Figures 48 and 49 provide a visual impression. The death rates are trending downward over time, with considerable variation, but the actual deaths are going up.

Students will discover that the picture for males is quite different, which can lead to interesting discussions.
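The step from rates to counts, and the contrast between the two trends, can be checked with a short computation. The sketch below copies the female columns of Tables 26 and 27; the helper names `deaths_from_rate` and `least_squares_slope` are hypothetical and introduced only for illustration.

```python
# Data copied from Table 26 (female population, in 1,000s) and
# Table 27 (female death rate, all races, per 100,000 of population).
years = list(range(1990, 2002))
female_population_thousands = [
    127909, 129565, 131267, 132948, 134528, 136063,
    137587, 139173, 140724, 142237, 143755, 145242,
]
female_rate_per_100k = [
    750.9, 738.2, 725.5, 745.9, 738.6, 739.4,
    733.0, 725.6, 724.7, 734.0, 731.4, 721.8,
]


def deaths_from_rate(rate_per_100k, population_thousands):
    """Convert a rate per 100,000 and a population in 1,000s to a count."""
    # (rate / 100,000) * (population_thousands * 1,000)
    #   = rate * population_thousands / 100
    return rate_per_100k * population_thousands / 100


def least_squares_slope(x, y):
    """Slope of the least-squares line of y on x."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


female_deaths = [
    deaths_from_rate(rate, pop)
    for rate, pop in zip(female_rate_per_100k, female_population_thousands)
]
rate_slope = least_squares_slope(years, female_rate_per_100k)
deaths_slope = least_squares_slope(years, female_deaths)

for year, deaths in zip(years, female_deaths):
    print(year, round(deaths))
print("change in rate per year:", round(rate_slope, 4))
print("change in deaths per year:", round(deaths_slope))
```

The two slopes should have opposite signs, in line with the fitted lines reported with Figures 48 and 49: the rate falls while the count rises, because the population grows faster than the rate declines. Replacing the female columns with the male columns gives the comparison for males.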

Study the graph pictured in Figure 50. Do you see any weaknesses in this graphic presentation? If so, describe them and explain how they could be corrected.

Here are some plausible plots to correct errors of interpretation, and to raise other questions. Better presentations begin with a data table, such as Table 28, and then proceed to more standard graphical displays of such data.

The plot in Figure 51 shows total and African-American enrollments on the same scale. When viewed this way, one can see that the latter is a small part of the former, with little change, by comparison, over the years.

By viewing African-American enrollments by themselves, one can see that the marked decrease between 1996 and 2002 may be turning around or leveling off.

However, the ratio of African American to total enrollment is still on the decrease!
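The ratio behind that last observation is easy to compute from the enrollment counts in Table 28. The sketch below is only an illustration; the dictionary name `enrollment` and the variable `ratios` are hypothetical.

```python
# Enrollment counts copied from Table 28:
# year -> (total students, African-American students)
enrollment = {
    1996: (29404, 2003),
    1997: (29693, 1906),
    1998: (30009, 1871),
    1999: (30912, 1815),
    2000: (31288, 1856),
    2001: (32317, 1832),
    2002: (32941, 1825),
    2003: (33878, 1897),
    2004: (33405, 1845),
}

# Ratio of African-American enrollment to total enrollment for each year.
ratios = {year: aa / total for year, (total, aa) in enrollment.items()}

for year in sorted(ratios):
    print(year, f"{ratios[year]:.4f}")
```

Printing the ratios year by year shows an overall decline from 1996 to 2004, although the year-to-year changes are not all in the same direction. Counts alone do not reveal this, because total enrollment is growing over most of the period.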

Figure 50: Distorted graph [source: Athens Banner-Herald]

Table 28: Enrollment Data

Year   Total Students   African Americans
1996   29404            2003
1997   29693            1906
1998   30009            1871
1999   30912            1815
2000   31288            1856
2001   32317            1832
2002   32941            1825
2003   33878            1897
2004   33405            1845

Example 4: Graphs: Distortions of Reality?


Figure 51: Plot of African-American vs. total enrollments (series: Total, African Am.; vertical axis from 0 to 40000; horizontal axis: Year, 1995 to 2005)

Figure 52: Plot of African-American enrollments only (vertical axis: African Am., from 1800 to 2050; horizontal axis: Year, 1994 to 2006)

Figure 53: Ratio of African-American to total enrollments (vertical axis: Ratio of AA to Total, from 0 to 0.08; horizontal axis: Year, 1995 to 2005)


Cobb, G. and Moore, D. (2000). “Statistics and Mathematics: Tension and Cooperation,” American Mathematical Monthly, pp. 615-630.

College Board (2006). College Board Standards for College Success™: Mathematics and Statistics.

College Entrance Examination Board (2004). Course Description: Statistics. New York: College Board.

Conference Board of the Mathematical Sciences (2001). The Mathematical Education of Teachers. Providence, RI, and Washington, DC: American Mathematical Society and Mathematical Association of America.

Conover, W. J. (1999). Practical Nonparametric Statistics. John Wiley and Sons, Page 235 (Equation 17).

Consumer Reports (June 1993). Hot dogs. 51(6), 364-367.

Data-Driven Mathematics Series (1998), New York: Pearson Learning (Dale Seymour Publications).

Gnanadesikan, Mrudulla, Richard L. Scheaffer, James M. Landwehr, Ann E. Watkins, Peter Barbella, James Kepner, Claire M. Newman, Thomas E. Obremski, and Jim Swift (1995). Quantitative Literacy Series, New York: Pearson Learning (Dale Seymour Publications).

Hollander, Miles and Proschan, Frank (1984). The Statistical Exorcist: Dispelling Statistics Anxiety. Marcel Dekker, Pages 83–88 and 121–130.

Holmes, Peter (2001). “Correlation: From Picture to Formula,” Teaching Statistics, 23(3):67–70.

Kader, Gary (1999). “Means and MADS,” Mathematics Teaching in the Middle School, 4(6):398–403.

Kader, Gary and Perry, Mike (1984). “Learning Statistics with Technology,” Mathematics Teaching in the Middle School, 1(2):130–136.

Moore, D. and Cobb, G. (1997). “Mathematics, Statistics, and Teaching,” American Mathematical Monthly, 104, 801–823.

National Assessment Governing Board (2004). Mathematics Framework for 2005 National Assessment of Educational Progress. Available: www.amstat.org/education/gaise/4.

National Council of Teachers of Mathematics (1989). Curriculum and Evaluation Standards for School Mathematics. Reston, VA: The Council.

National Council of Teachers of Mathematics (2002–2004). Navigating through Data Analysis and Probability Series. Reston, VA: The Council.

National Council of Teachers of Mathematics (2000). Principles and Standards for School Mathematics. Reston, VA: The Council.

Steen, Lynn, ed. (2001). Mathematics and Democracy: The Case for Quantitative Literacy. National Council on Education and the Disciplines. Princeton: Woodrow Wilson Foundation.

U.S. Census Bureau. (2005). Statistical Abstract of the United States 2004–2005, Table No. 70. Live Births, Deaths, Marriages, and Divorces: 1950 to 2002.

Utts, Jessica A. (1999). Seeing Through Statistics. Pacific Grove, CA: Duxbury, 2nd ed.

References


STATISTICAL EDUCATION OF TEACHERS (SET)

Christine A. Franklin (Chair), University of Georgia
Anna E. Bargagliotti, Loyola Marymount University
Catherine A. Case, University of Florida
Gary D. Kader, Appalachian State University
Richard L. Scheaffer, University of Florida
Denise A. Spangler, University of Georgia


Contents

Preface

Chapter 1: Background and Motivation for SET Report

Chapter 2: Recommendations

Chapter 3: Mathematical Practices Through a Statistical Lens

Chapter 4: Preparing Elementary School Teachers to Teach Statistics

Chapter 5: Preparing Middle-School Teachers to Teach Statistics

Chapter 6: Preparing High-School Teachers to Teach Statistics

Chapter 7: Assessment

Chapter 8: Overview of Research on the Teaching and Learning of Statistics in Schools

Chapter 9: Statistics in the School Curriculum: A Brief History

Appendix 1

Appendix 2


Preface

The Mathematical Education of Teachers (MET) (Conference Board of the Mathematical Sciences [CBMS], 2001) made recommendations regarding the mathematics PreK–12 teachers should know and how they should come to know it. In 2012, CBMS released MET II to update these recommendations in light of changes to the educational climate in the intervening decade, particularly the release of the Common Core State Standards for Mathematics (CCSSM) (NGACBP and CCSSO, 2010). Because of the emphasis on statistics in the Common Core and many states’ guidelines, MET II includes numerous recommendations regarding the preparation of teachers to teach statistics.

This report, The Statistical Education of Teachers (SET), was commissioned by the American Statistical Association (ASA) to clarify MET II’s recommendations, emphasizing features of teachers’ statistical preparation that are distinct from their mathematical preparation. SET calls for collaboration among mathematicians, statisticians, mathematics educators, and statistics educators to prepare teachers to teach the intellectually demanding statistics in the PreK–12 curriculum, and it serves as a resource to aid those efforts.

This report (SET) aims to do the following:

• Clarify MET II’s recommendations for the statistical preparation of teachers at all grade levels: elementary, middle, and high school

• Address the professional development of teachers of statistics

• Highlight differences between statistics and mathematics that have important implications for teaching and learning

• Illustrate the statistical problem-solving process across levels of development

• Make pedagogical recommendations of particular relevance to statistics, including the use of technology and the role of assessment

Chapter 1 describes the motivation for SET in detail, highlighting ways preparing teachers of statistics is different from preparing teachers of mathematics.

Chapter 2 presents six recommendations regarding what statistics teachers need to know and the shared responsibility for the statistical education of teachers. This chapter is directed to those in leadership positions in school districts, colleges and universities, and government agencies whose policies affect the statistical education of teachers.

Chapter 3 describes CCSSM as viewed through a statistical lens.

Chapters 4, 5, and 6 give recommendations for the statistical preparation and professional development of elementary-, middle-, and high-school teachers, respectively. These chapters are intended as a resource for those engaged in teacher preparation or professional development.

Chapter 7 describes various strategies for assessing teachers’ statistical content knowledge.

Chapter 8 provides a brief review of the research literature supporting the recommendations in this report.

Chapter 9 presents an overview of the history of statistics education at the PreK–12 level.

Appendix 1 includes a series of short examples and accompanying discussion that address particular difficulties that may occur while teaching statistics to teachers.

Appendix 2 includes a sample activity handout for the illustrative examples presented in Chapters 4–6 that could be used in professional development courses or a classroom.

Web Resources

The ASA provides a variety of outstanding and timely resources for teachers, including recorded web-based seminars, the Statistics Teacher Network newsletter, and peer-reviewed lesson plans (STEW). These and other resources are available at www.amstat.org/education.

The National Council of Teachers of Mathematics (NCTM) offers exceptional classroom resources, including lesson plans and interactive web activities. NCTM has created a searchable classroom resources site that can be accessed at www.nctm.org/Classroom-Resources/Browse-All/#.

Audience

This report is intended as a resource for all involved in the statistical education of teachers, both the initial preparation of prospective teachers and the professional development of practicing teachers. Thus, the three main audiences are:

• Mathematicians and statisticians. Faculty members of mathematics and statistics departments at two- and four-year collegiate institutions who teach courses taken by prospective and practicing teachers. They and their departmental colleagues set policies regarding the statistical preparation of teachers.

• Mathematics educators and statistics educators. Mathematics education and statistics education faculty members, whether within colleges of education, mathematics departments, statistics departments, or other academic units, are also an important audience for this report. Typically, they are responsible for the pedagogical education of mathematics and statistics teachers (e.g., methods courses, field experiences for prospective teachers). Outside of academe, a variety of people are engaged in professional development for teachers of statistics, including state, regional, and school-district mathematics specialists. The term “mathematics educators” or “statistics educators” includes this audience.

• Policy makers. This report is intended to inform educational administrators and policy makers at the national, state, school district, and collegiate levels as they work to provide PreK–12 students with a strong statistics education for an increasingly data-driven world. Teachers’ preparation to teach statistics is central to this effort and is supported, or hindered, by institutional policies. These include national accreditation requirements, state certification requirements, and the ways in which these requirements are reflected in teacher preparation programs. State and district supervisors make choices in the provision and funding of professional development. At the school level, scheduling and policy affect the type of learning experiences available to teachers. Thus, policy makers play important roles in the statistical education of teachers.


Terminology

To avoid confusion, the report uses the following terminology:

• Student refers to a child or adolescent in a PreK–12 classroom.

• Teacher refers to an instructor in a PreK–12 classroom, but also may refer to prospective PreK–12 teachers in a college mathematics course (“prospective teacher” or “pre-service teacher” also is used in the latter case).

• Instructor refers to an instructor of prospective or practicing teachers. This term may refer to a mathematician, statistician, mathematics educator, statistics educator, or professional developer. The term statistics teacher educators is used to refer to this diverse group of instructors collectively.

Acknowledgments

We thank our colleagues Steven J. Foti, Tim Jacobbe, and Douglas L. Whitaker from the University of Florida for their contributions to SET. They provided insight and expertise that guided the evolution of the document from initial to final draft.

We also thank our colleagues Hollylynne S. Lee, Stephen J. Miller, Roxy Peck, Jamis Perritt, Susan A. Peters, Maxine Pfannkuch, Angela L.E. Walmsley, and Ann E. Watkins for their willingness to serve as reviewers. Their thoughtful comments improved the final document.

Finally, we acknowledge the support of the American Statistical Association Board of Directors and the ASA-NCTM joint committee, as well as the assistance of ASA staff members Ron Wasserstein, Rebecca Nichols, Valerie Nirala, and Megan Ruyle.

References

Conference Board of the Mathematical Sciences. (2001). The Mathematical Education of Teachers. Providence, RI, and Washington, DC: American Mathematical Society and Mathematical Association of America.

Conference Board of the Mathematical Sciences. (2012). The Mathematical Education of Teachers II. Providence, RI, and Washington, DC: American Mathematical Society and Mathematical Association of America.

National Governors Association Center for Best Practices and Council of Chief State School Officers. (2010). Common Core State Standards for Mathematics. Washington, DC: Authors.

Lesson Plans Available on Statistics Education Web for K–12 Teachers

Statistics Education Web (STEW) is an online resource for peer-reviewed lesson plans for K–12 teachers. The lesson plans identify both the statistical concepts being developed and the age range appropriate for their use. The statistical concepts follow the recommendations of the Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report: A Pre-K-12 Curriculum Framework, Common Core State Standards for Mathematics, and NCTM Principles and Standards for School Mathematics. The website resource is organized around the four elements in the GAISE framework: formulate a statistical question, design and implement a plan to collect data, analyze the data by measures and graphs, and interpret the data in the context of the original question. Teachers can navigate the site by grade level and statistical topic. Lessons follow Common Core standards, GAISE recommendations, and NCTM Principles and Standards for School Mathematics.

Lesson Plans Wanted for Statistics Education Web

The editor of STEW is accepting submissions of lesson plans for an online bank of peer-reviewed lesson plans for K–12 teachers of mathematics and science. Lessons showcase the use of statistical methods and ideas in science and mathematics based on the framework and levels in the Guidelines for Assessment and Instruction in Statistics Education (GAISE) and Common Core State Standards. Consider submitting several of your favorite lesson plans according to the STEW template to [email protected].

For more information, visit www.amstat.org/education/stew.


CHAPTER 1

Background and Motivation for SET Report

In an increasingly data-driven world, statistical literacy is becoming an essential competency, not only for researchers conducting formal statistical analyses, but for informed citizens making everyday decisions based on data. Whether following media coverage of current events, making financial decisions, or assessing health risks, the ability to process statistical information is critical for navigating modern society.

Statistical reasoning skills are also advantageous in the job market, as employment of statisticians is projected to grow 27 percent from 2012 to 2022 (Bureau of Labor Statistics, 2014) and business experts predict a shortage of people with deep analytical skills (Manyika et al., 2011).

In keeping with the objectives of preparing students for college, career, and life, the Common Core State Standards for Mathematics (CCSSM) (NGACBP and CCSSO, 2010) and other state standards place heavy emphasis on statistics and probability, particularly in grades 6–12. However, effective implementation of more rigorous standards depends to a large extent on the teachers who will bring them to life in the classroom. This report offers recommendations for the statistical preparation and professional development of those teachers.

The Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report (Franklin et al., 2007) outlines a framework for statistics education at the PreK–12 level. The GAISE report identifies three developmental levels: Levels A, B, and C, which ideally match with the three grade-level bands: elementary, middle, and high school. However, the report emphasizes that the levels are based on development in statistical thinking, rather than age.

The GAISE report also breaks down the statistical problem-solving process into four components: formulate questions (clarify the problem at hand and formulate questions that can be answered with data), collect data (design and employ a plan to collect appropriate data), analyze data (select and use appropriate graphical and numerical methods to analyze data), and interpret results (interpret the analysis, relating the interpretation to the original questions).

Likewise, the CCSSM and other standards recognize statistics as a coherent body of concepts connected across grade levels and as an investigative process. To effectively teach statistics as envisioned by the GAISE framework and current state standards, it is important that teachers understand how statistical concepts are interconnected and their connections to other areas of mathematics.

Teachers also should recognize the features of statistics that set it apart as a discipline distinct from mathematics, particularly the focus on variability and the role of context. Across all levels and stages of the investigative process, statistics anticipates and accounts for variability in data. Whereas mathematics answers deterministic questions, statistics provides a coherent set of tools for dealing with “the omnipresence of variability” (Cobb and Moore, 1997): natural variability in populations, induced variability in experiments, and sampling variability in a statistic, to name a few. The focus on variability distinguishes statistical content from mathematical content. For example, designing studies that control for variability, making use of distributions to describe variability, and drawing inferences about a population based on a sample in light of sampling variability all require content knowledge distinct from mathematics.

In addition to these differences in content, statistical reasoning is distinct from mathematical reasoning, as the former is inextricably linked to context. Reasoning in mathematics leads to discovery of mathematical patterns underlying the context, whereas statistical reasoning is necessarily dependent on data and context and requires integration of concrete and abstract ideas (delMas, 2005).

This dependence on context has important implications for teaching. For example, rote calculation of a correlation coefficient for two lists of numbers does little to develop statistical thinking. In contrast, using the concept of association to explore the link between, for example, unemployment rates and obesity rates integrates data analysis and contextual reasoning to identify a meaningful pattern amid variability.

Because statistics is often taught in mathematics classes at the pre-college level, it is particularly important that teachers be aware of the differences between the two disciplines.

One noteworthy intersection between statistics and mathematics is probability, which plays a critical role in statistical reasoning, but is also worthy of study in its own right as a subfield of mathematics. While teacher preparation should include characterizations of probability as both a tool for statistics and as a component of mathematical modeling, this report focuses on probability primarily in the service of statistics. For example, a single instance of random sampling or random assignment is unpredictable, but probability provides ways to describe patterns in outcomes that emerge in the long run.
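The contrast between an unpredictable single outcome and a stable long-run pattern can be illustrated with a small simulation. The sketch below assumes a fair coin is used for random assignment; the function name `running_proportion_heads` and the seed are hypothetical choices made only for illustration.

```python
import random


def running_proportion_heads(n_flips, seed=2016):
    """Return the proportion of heads after each of n_flips fair coin flips."""
    rng = random.Random(seed)
    heads = 0
    proportions = []
    for flip in range(1, n_flips + 1):
        # A single flip is unpredictable: heads with probability 1/2.
        if rng.random() < 0.5:
            heads += 1
        proportions.append(heads / flip)
    return proportions


proportions = running_proportion_heads(100_000)
for n in (10, 100, 1_000, 10_000, 100_000):
    print(f"after {n:>7} flips: proportion of heads = {proportions[n - 1]:.4f}")
```

Early proportions can wander well away from 0.5, while the proportion after many flips settles close to it, which is the long-run pattern that probability describes.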

For teachers to understand statistical procedures like confidence intervals and significance tests, they must understand foundational probabilistic concepts that provide ways to quantify uncertainty. Thus, the SET report describes development of probabilistic concepts through simulation or the use of theoretical distributions, such as the Normal distribution. On the other hand, topics further removed from statistical practice, such as specialized distributions and axiomatic approaches to probability, are not detailed in this report.

It should be noted that current research is examining the effects of integrating more probability modeling into the school mathematics curriculum beginning at the middle grades. Through the use of dynamic statistical software, the research is investigating the development of students’ understanding of connections between data and chance (Konold and Kazak, 2008). This report strongly recommends that teacher preparation programs include probability modeling as a component of their mathematics education.

Because of the emphasis on statistical content in the CCSSM and other state standards, teachers of mathematics face high expectations for teaching statistics. Thus, the statistical education of teachers is critical and should be considered a priority for mathematicians and statisticians, mathematics and statistics educators, and those in leadership positions whose policies affect the preparation of teachers. The dramatic increase in statistical content at the pre-college level demands a coordinated effort to improve the preparation of pre-service teachers and to provide professional development for teachers trained before the implementation of the new standards.

The SET report reiterates MET II’s recommendation that statistics courses for teachers should be different from the theoretically oriented courses aimed toward science, technology, engineering, and mathematics majors and from the noncalculus-based introductory statistics courses taught at many universities. Whereas those courses often focus on mathematical proofs or a large number of specific statistical techniques, the courses SET recommends emphasize statistical thinking and the statistical content knowledge and pedagogical content knowledge necessary to teach statistics as outlined in the GAISE report and various state standards.

Effective teacher preparation must provide teachers not only with the statistical and mathematical knowledge sufficient for the content they are expected to teach, but also an understanding of foundational topics that come before and advanced topics that will follow. For example, grade 8 teachers are better equipped to guide students investigating patterns of association in bivariate data¹ if they also understand the random selection process intended to produce a representative sample (taught in grade 7)² and the types of inferences that can be drawn from an observational study (taught in high school)³. Note that although the linear equations often used to model an association in bivariate data would be familiar to anyone with a mathematics background, the process of statistical investigation requires content knowledge separate from mathematics content.

In addition to statistical content knowledge, teachers need opportunities to develop pedagogical content knowledge (Shulman, 1986). For example, effective teaching of statistics requires knowledge about common student conceptions and thinking patterns, content-specific teaching strategies, and appropriate use of curricula. Teachers should have the pedagogical knowledge necessary to assess students’ levels of understanding and plan next steps in the development of their statistical thinking.

The SET report also highlights pedagogical recommendations of particular relevance to statistics, such as those related to technology and assessment. These recommendations apply to courses for pre-service teachers and professional development for practicing teachers, as well as to the elementary-, middle-, and high-school courses they teach. Ideally, the statistical education of teachers should model effective pedagogy by emphasizing statistical thinking and conceptual understanding, relying on active learning and exploration of real data, and making effective use of technology and assessment.

1 Refer to CCSS 8.SP.1 – 8.SP.4

2 Refer to CCSS 7.SP.1

3 Refer to CCSS S-IC.1 and S-IC.3

SET echoes the recommendation in the GAISE College Report (ASA, 2005) that technology should be used for developing concepts and analyzing data. An abstract concept such as the Central Limit Theorem can be developed (and visualized) through computer simulations instead of through mathematical proof. Calculations of p-values can be automated to allow more time to interpret the p-value and carefully consider the inferences that can be drawn based on its value. The two goals of using technology for developing concepts and analyzing data may be achieved with a single software package or with a number of complementary tools (e.g., applets, graphing calculators, statistical packages, etc.). SET does not endorse any particular technological tools, but instead prescribes what teachers should be able to do with those tools.
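As one possible illustration of developing the Central Limit Theorem through simulation, the sketch below draws repeated samples from a skewed population and summarizes the sample means. It is not tied to any tool discussed in the report: the choice of Python, the exponential population with mean 1, the function name `simulate_sample_means`, and the seed are all assumptions made for this example.

```python
import random
import statistics


def simulate_sample_means(sample_size, n_samples=2000, seed=7):
    """Draw n_samples samples of the given size from a skewed
    (exponential, mean 1) population and return their sample means."""
    rng = random.Random(seed)
    return [
        statistics.fmean([rng.expovariate(1.0) for _ in range(sample_size)])
        for _ in range(n_samples)
    ]


for size in (1, 5, 30):
    means = simulate_sample_means(size)
    center = statistics.fmean(means)
    spread = statistics.stdev(means)
    # For this population the standard deviation is 1, so theory
    # predicts a spread of about 1 / sqrt(size) for the sample means.
    print(f"n = {size:>2}: mean of sample means = {center:.3f}, "
          f"SD of sample means = {spread:.3f}, "
          f"theoretical SD = {1 / size ** 0.5:.3f}")
```

Plotting the simulated means as a dotplot or histogram for each sample size would show the distribution becoming more symmetric and less variable as the sample size grows, without any appeal to a formal proof.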

Many aspects of the statistical education of teachers directly or indirectly hinge on assessment. Assessment not only measures teachers’ understanding of key concepts, but also directs their focus and efforts. For example, SET recommends emphasis on conceptual understanding, but if tests only assess calculations, teachers will naturally emphasize the mechanics instead of the underlying concepts. Thus, it is critical that teachers be assessed and, in turn, assess their students on conceptual and not merely procedural understanding. Further, assessment should emphasize the statistical problem-solving process, requiring teachers to clearly communicate statistical ideas and consider the role of variability and context at each stage of the process. The assessments of statistical understanding used by teacher educators are particularly important, as they are likely to influence how teachers assess their own students.

At every grade level (elementary, middle, and high school), the statistical education of teachers presents a different set of challenges and opportunities. Ideally, development of statistical literacy in students should begin at the elementary-school level (Franklin and Mewborn, 2006), with teachers prepared beyond the level of statistical knowledge expected of their students. In particular, elementary teachers should understand how foundational statistical concepts connect to content developed in later grades and other subjects across the curriculum. Elementary teachers should receive statistics instruction in a manner that models effective pedagogy and emphasizes the statistical problem-solving process.

[Photo caption: Effective teacher preparation must provide teachers with an understanding of foundational topics. Credit: Rebecca Nichols/ASA]

Both MET I and MET II indicate middle-grade teachers should not receive the same type of mathematical preparation as elementary generalists. Students are expected to begin thinking statistically at grade 6, and topics introduced in the middle grades include data collection design, exploration of data, informal inference, and association. Given the plethora of statistical topics at the middle-school level under the CCSSM and other state standards, middle-school teachers should take courses that explore the statistical concepts in the middle-school curriculum at a greater depth, develop the pedagogical content knowledge necessary to teach those concepts, and expose teachers to statistical applications beyond those required of their students.

High-school mathematics teachers typically major in mathematics, but the theoretical statistics courses often taken by mathematics majors do not sufficiently prepare them for the statistics topics they will teach. In many universities, teachers only take a proof-driven mathematical statistics course, while courses in data analysis may not count toward their major. High-school teachers should take courses that develop data-driven statistical reasoning and include experiences with statistical modeling in addition to those that develop knowledge of statistical theory.

The recommendations included in this report concern not only the quantity of preparation needed by teachers of statistics, but also the content and quality of that preparation. It is the responsibility of mathematicians, statisticians, mathematics educators, statistics educators, professional developers, and administrators to provide teachers with courses and professional development that cultivate their statistical understanding, as well as the pedagogical knowledge to develop statistical literacy in the next generation of learners.

References

American Statistical Association. (2005). Guidelines for Assessment and Instruction in Statistics Education: College Report. Alexandria, VA: Author.

Bureau of Labor Statistics, U.S. Department of Labor. (2014). Occupational Outlook Handbook, 2014–15 Edition. Retrieved from www.bls.gov.

Cobb, G., and Moore, D. (1997). Mathematics, statistics, and teaching. The American Mathematical Monthly, 104(9):801–823.

delMas, R. (2005). A comparison of mathematical and statistical reasoning. In Dani Ben-Zvi and Joan Garfield (Eds.), The Challenge of Developing Statistical Literacy, Reasoning, and Thinking (pp. 79–95). New York, NY: Kluwer Academic Publishers.

Franklin, C., Kader, G., Mewborn, D., Moreno, J., Peck, R., Perry, M., and Scheaffer, R. (2007). Guidelines for Assessment and Instruction in Statistics Education (GAISE) Report: A PreK–12 Curriculum Framework. Alexandria, VA: American Statistical Association. Retrieved from www.amstat.org/education/gaise.

Franklin, C., and Mewborn, D. (2006). The statistical education of PreK–12 teachers: A shared responsibility. In NCTM 2006 Yearbook: Thinking and Reasoning with Data and Chance (pp. 335–344).

Konold, C., and Kazak, S. (2008). Reconnect data and chance. Technology Innovations in Statistics Education, 2(1). Retrieved from https://escholarship.org/uc/item/38p7c94v.

Manyika, J., Chui, M., Brown, B., Bughin, J., Dobbs, R., Roxburgh, C., and Hung Byers, A. (2011, May). Big Data: The Next Frontier for Innovation, Competition, and Productivity. Retrieved from www.mckinsey.com.

National Governors Association Center for Best Practices and Council of Chief State School Officers. (2010). Common Core State Standards for Mathematics. Washington, DC: Authors.

Shulman, L.S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15(2):4–14.


CHAPTER 2

Recommendations

This chapter offers six broad recommendations for the preparation of teachers of statistics. These recommendations are intended to provide educational leaders with support to initiate any needed changes in teacher education or professional development programs to support teachers in learning to teach statistics effectively. The recommendations speak to the content teachers need to know, the ways in which they should learn it, and who should be assisting them in developing this knowledge. In particular, Recommendations 5 and 6 elaborate on the shared responsibility for the preparation of teachers of statistics. For elementary-school and middle-school teachers, statistics is often embedded in mathematics courses; thus, statisticians, mathematicians, and mathematics educators share responsibility for ensuring that all teachers are prepared to teach high-quality statistics content with appropriate instructional methods to the next generation of students.

The recommendations for teacher preparation in this document are intended to apply to teachers prepared via any pathway for teacher preparation and credentialing (including undergraduate, post-baccalaureate, graduate, traditional, and alternative), whether university-based or not. As used here, the term “teacher of statistics” includes any teacher involved in the statistical education of PreK–12 students, including early childhood and elementary school generalist teachers; middle-school teachers; high-school teachers; and teachers of special needs students, English Language Learners, and other special groups, when those teachers have responsibility for supporting students’ learning of statistics.

These recommendations apply only to the statistics content teachers need to know, but the recommendations assume teachers, both pre-service and in-service, will have the opportunity to learn about pedagogy as it relates to teaching statistics in other courses or venues.

While we advocate that those who teach teachers should model the type of pedagogy we want them to use with students, simply modeling pedagogy is not sufficient for teachers to develop the skills and commitments needed to teach in ways that help students learn statistical content with meaning and understanding. Thus, it is important that the content recommendations made in this document be paired with appropriate pedagogical learning.

General Recommendations

The following recommendations draw heavily on those provided in Mathematical Education of Teachers II (CBMS, 2012). This report includes six recommendations for the statistical preparation of PreK–12 teachers, presented as follows:

• Recommendations 1, 2, 3, and 4 deal with the ways PreK–12 teachers should learn

• Recommendation 5 addresses the shared responsibility of statistics teacher educators in preparing statistically proficient teachers

• Recommendation 6 provides details about the statistics content preparation needed by teachers at elementary-school, middle-school, and high-school levels.

Statistics for Teachers

Recommendation 1. Prospective teachers need to learn statistics in ways that enable them to develop a deep conceptual understanding of the statistics they will teach. The statistical content knowledge needed by teachers at all levels is substantial, yet quite different from that typically addressed in most college-level introductory statistics courses. Prospective teachers need to understand the statistical investigative process and particular statistical techniques/methods so they can help diverse groups of students understand this process as a coherent, reasoned activity. Teachers of statistics must also be able to communicate an appreciation of the usefulness and power of statistical thinking. Thus, coursework for prospective teachers should allow them to examine the statistics they will teach in depth and from a teacher’s perspective.

Recommendation 2. Prospective teachers should engage in the statistical problem-solving process (formulate statistical questions, collect data, analyze data, and interpret results) regularly in their courses. They should be engaged in reasoning, explaining, and making sense of statistical studies that model this process. Although the quality of statistical preparation is more important than the quantity, Recommendations 3, 4, and 5 discuss the content teachers are expected to teach. Detailed recommendations for the amount and nature of their coursework for the various grade bands are discussed in Chapters 4, 5, and 6 of this report.

Recommendation 3. Because many currently practicing teachers did not have an opportunity to learn statistics during their pre-service preparation programs, robust professional development opportunities need to be developed for advancing in-service teachers’ understanding of statistics. In-service professional development programs should be built on the same principles as those noted in Recommendations 1 and 2 for pre-service programs, with teachers actively engaged in the statistical problem-solving process. Regardless of the format of the professional development (university-based, district-based), it is important that statisticians with an interest in K–16 statistical education be involved in designing and, where possible, delivering the professional development.

Recommendation 4. All courses and professional development experiences for statistics teachers should allow them to develop the habits of mind of a statistical thinker and problem-solver, such as reasoning, explaining, modeling, seeing structure, and generalizing. The instructional style for these courses should be interactive, responsive to student thinking, and problem-centered. Teachers should develop not only knowledge of statistics content, but also the ability to work in ways characteristic of the discipline. Chapter 3 elaborates on the Standards for Mathematical Practice as they apply to statistics.

Roles for Teacher Educators in Statistics

Recommendation 5. At institutions that prepare teachers or offer professional development, statistics teacher education must be recognized as an important part of a department’s mission and should be undertaken in collaboration with faculty from statistics education, mathematics education, statistics, and mathematics. Departments need to encourage and reward faculty for participating in the preparation and professional development of teachers and becoming involved with PreK–12 mathematics education. Departments also need to devote commensurate resources to designing and staffing courses for prospective and practicing teachers. Statistics courses for teachers must be a department priority. Instructors for such courses should be carefully selected for their statistical expertise as well as their pedagogical expertise, and they should have opportunities to participate in regional and national professional development opportunities for statistics educators as needed. 4

Recommendation 6. Statisticians should recognize the need for improving statistics teaching at all levels. Mathematics education, including the statistical education of teachers, can be greatly strengthened by the growth of a statistics education community that includes statisticians as one of many constituencies committed to working together to improve statistics instruction at all levels and to raise professional standards in teaching. It is important to encourage partnerships between statistics faculty, statistics education faculty, mathematics education faculty, and mathematics faculty; between faculty in two- and four-year institutions; and between statistics faculty and school mathematics teachers, as well as state, regional, and school-district leaders.

In particular, as part of the mathematics education community, statistics teacher educators should support the professionalism of teachers of statistics by doing the following:

• Endeavoring to ensure that K–12 teachers of statistics have sufficient knowledge and skills for teaching statistics at the level of certification upon receiving initial certification

• Encouraging all who teach statistics to strive for continual improvement in their teaching

• Joining with teachers at different levels to learn with and from each other

There are many initiatives, communities, and professional organizations focused on aspects of building professionalism in the teaching of mathematics and statistics. More explicit efforts are needed to bridge current communities in ways that build upon mutual respect and the recognition that these initiatives provide opportunities for professional growth for higher education faculty in mathematics, statistics, and education, as well as for the mathematics teachers, coaches, and supervisors in the PreK–12 community. Becoming part of a community that connects all levels of mathematics education will offer statisticians more opportunities to participate in setting standards for accreditation of teacher preparation programs and teacher certification via standard and alternative pathways.

4 Statisticians work in departments of various configurations, ranging from stand-alone statistics departments to departments of mathematical sciences that include mathematics, statistics, and computer science. For ease of language, we use the term departments generically here to mean any department in which statisticians reside.

Specific Recommendations

The paragraphs that follow provide an overview of the specific recommendations for the statistical education of teachers at various levels. These recommendations are elaborated in Chapters 4 (elementary), 5 (middle school), and 6 (high school).

Elementary School

Prospective elementary school teachers should be provided with coursework on fundamental ideas of elementary statistics, their early childhood precursors, and middle school successors. The coursework could take three formats:

1. A special section of an introductory statistics course geared specifically to the content and instructional strategies noted above. This course can be designed to include all levels of teacher preparation students.

2. An entire course in statistical content for elementary-school teachers.

3. More time and attention given to statistics in existing mathematics content courses. Most likely, one course would be reconfigured to place substantial emphasis on statistics, but this would also likely result in reconfiguring the content of all courses in the sequence to make the time for the statistics content.

There is a great deal of mathematics and statistics content that is important for elementary-school teachers to know, so decisions about what to cut to make more room for statistics will be difficult. Thus, MET II advocates increasing the number of credit hours of instruction for elementary-school teachers to 12 credit hours. Note that these hours are all content-focused; pedagogy courses are in addition to these 12 hours.

Middle School

Prospective middle-school teachers of statistics should complete two courses:

1. An introductory course that emphasizes a modern data-analytic approach to statistical thinking, a simulation-based introduction to inference using appropriate technologies, and an introduction to formal inference (confidence intervals and tests of significance). This first course develops teachers’ statistical content knowledge in an experiential, active learning environment that focuses on the problem-solving process and makes clear connections between statistical reasoning and notions of probability.

2. A second course that focuses on strengthening teachers’ conceptual understandings of the big ideas from Essential Understandings and the statistical content of the middle-school curriculum. This course is also intended to develop teachers’ pedagogical content knowledge by providing strategies for teaching statistical concepts, integrating appropriate technology into their instruction, making connections across the curriculum, and assessing statistical understanding in middle-school students.

[Photo caption: Courses should use the GAISE framework model and engage students in the statistical problem-solving process. Credit: Thinkstock]

High School

Prospective high-school teachers of mathematics should complete three courses:

1. An introductory course that emphasizes a modern data-analytic approach to statistical thinking, a simulation-based introduction to inference using appropriate technologies, and an introduction to formal inference (confidence intervals and tests of significance)

2. A second course in statistical methods that builds on the first course and includes both randomization and classical procedures for comparing two parameters based on both independent and dependent samples (small and large), the basic principles of the design and analysis of sample surveys and experiments, inference in the simple linear regression model, and tests of independence/homogeneity for categorical data

3. A statistical modeling course based on multiple regression techniques, including both categorical and numerical explanatory variables, exponential and power models (through data transformations), models for analyzing designed experiments, and logistic regression models

Each course should include use of statistical software, provide multiple experiences for analyzing real data, and emphasize the communication of statistical results.

These courses should use the GAISE framework model and engage teachers in the statistical problem-solving process including study design. These courses are different from the more theoretically oriented probability and statistics courses typically taken by science, technology, engineering, and mathematics (STEM) majors. Note that while some aspects of probability are fundamental to statistics, a classical probability course, while useful, does not satisfy the recommendations offered here. As discussed in Chapter 1, we recommend the fundamental notions of probability be developed as needed in the service of acquiring statistical reasoning skills.
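To give a sense of what a simulation-based (randomization) procedure for comparing two groups might look like in such a course, here is a minimal Python sketch. The outcome data are made up for illustration, and the function name `randomization_test` and the one-sided comparison are assumptions, not something this report specifies.

```python
import random

def randomization_test(group_a, group_b, num_shuffles=5_000, seed=0):
    """Estimate a one-sided p-value for the difference in group means
    (group_a minus group_b) by repeatedly re-randomizing group labels."""
    rng = random.Random(seed)
    n_a, n_b = len(group_a), len(group_b)
    observed = sum(group_a) / n_a - sum(group_b) / n_b
    pooled = list(group_a) + list(group_b)
    at_least_as_large = 0
    for _ in range(num_shuffles):
        rng.shuffle(pooled)  # re-assign outcomes to groups at random
        diff = sum(pooled[:n_a]) / n_a - sum(pooled[n_a:]) / n_b
        if diff >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / num_shuffles

# Made-up outcomes (1 = success, 0 = failure) under two treatments.
control = [1] * 16 + [0] * 4      # 16 of 20 successes
treatment = [1] * 12 + [0] * 8    # 12 of 20 successes

observed_diff, p_value = randomization_test(control, treatment)
print(f"observed difference in proportions: {observed_diff:.2f}")
print(f"approximate one-sided p-value: {p_value:.3f}")
```

The p-value here is the share of re-randomizations that produce a difference at least as large as the one observed, which is the reasoning a classical two-proportion test approximates with a formula.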

Reference

Conference Board of the Mathematical Sciences. (2012). The Mathematical Education of Teachers II. Providence, RI: American Mathematical Society.


CHAPTER 3

Mathematical Practices Through a Statistical Lens

The upcoming chapters in this report provide recommendations for the statistics that elementary-, middle-, and high-school teachers should know and how they should come to know it. However, the report also recognizes that knowledge of statistical content is supported by the processes and practices through which teachers and their students acquire and apply statistical knowledge.

The importance of processes and proficiencies that complement content knowledge is well recognized in mathematics education. In Principles and Standards for School Mathematics (PSSM) (2000), the National Council of Teachers of Mathematics (NCTM) presents five process standards that highlight ways of acquiring and using content knowledge: problem-solving, reasoning and proof, communication, connections, and representations. In Adding It Up (2001), the National Research Council (NRC) breaks down mathematical proficiency into five interrelated strands: conceptual understanding, procedural fluency, strategic competence, adaptive reasoning, and productive disposition. The Common Core State Standards for Mathematics (CCSSM) (2010) builds on the processes and proficiencies outlined by NCTM and NRC in its eight Standards for Mathematical Practice. CCSSM describes the connection of practice standards to mathematical content as follows:

The Standards for Mathematical Practice describe ways in which developing student practitioners of the discipline of mathematics increasingly ought to engage with the subject matter as they grow in mathematical maturity and expertise throughout the elementary-, middle-, and high-school years. Designers of curricula, assessments, and professional development should all attend to the need to connect the mathematical practices to mathematical content in mathematics instruction.

The Standards for Mathematical Content are a balanced combination of procedure and conceptual understanding. Expectations that begin with the word “understand” are often especially good opportunities for connecting the practices to the content. Students who lack understanding of a topic may rely on procedures too heavily. Without a flexible base from which to work, they may be less likely to consider analogous problems, represent problems coherently, justify conclusions, apply the mathematics to practical situations, use technology mindfully to work with the mathematics, explain the mathematics accurately to other students, step back for an overview, or deviate from a known procedure to find a shortcut. In short, a lack of understanding effectively prevents a student from engaging in the mathematical practices (NGACBP and CCSSO, 2010, p. 8).

The statistical education of teachers should be informed by the Standards for Mathematical Practice as seen through a statistical lens. This chapter interprets the eight practice standards presented in the CCSSM in terms of the practices necessary to acquire and apply statistics knowledge. The perspective of a “statistical lens” is established through several sources, including the following:

• The PreK–12 GAISE Curriculum Framework (Franklin et al., 2007)

• Developing Essential Understanding of Statistics Grades 6–8 (Kader and Jacobbe, 2013)

• Developing Essential Understanding of Statistics Grades 9–12 (Peck, Gould, and Miller, 2013)

• The Challenge of Developing Statistical Literacy, Reasoning, and Thinking (Ben-Zvi and Garfield, 2004)

• Statistical Thinking in Empirical Enquiry (Wild and Pfannkuch, 1999)


The mathematicians, statisticians, and educators involved in the statistical preparation of teachers should strive to connect the mathematical practices through a statistical lens to statistical content in the instruction of teachers so teachers may, in turn, foster these practices in their students. In the descriptions that follow, we use the term “students” to parallel the Standards for Mathematical Practice in the Common Core State Standards; but, as with mathematics, the statistical practices also apply to teachers when they are learning the content.

1. Make sense of problems and persevere in solving them. Statistically proficient students understand how to carry out the four steps of the statistical problem-solving process: formulating a statistical question, designing a plan for collecting data and carrying out that plan, analyzing the data, and interpreting the results. In practice, the components of this process are interrelated, so students must continually ask themselves how each component relates to the others and the research topic under study:

• Can the question be answered with data? Will answering the statistical question provide insight into the research topic under investigation?

• Will the data collection plan measure a variable(s) that provides appropriate data to address the statistical question? Does the plan provide data that allow for generalization of results to a population or to establish a cause and effect conclusion?

• Do the analyses provide useful information for addressing the statistical question? Are they appropriate for the data that have been collected?

• Is the interpretation sound, given how the data were collected? Does the interpretation provide an adequate answer to the statistical question?

Students must persevere through the entire statistical problem-solving process, adapting and adjusting each component as needed to arrive at a solution that adequately connects the interpretation of results to the statistical question posed and the research topic under study. Additionally, students must be able to critique and evaluate alternative approaches (data collection plans and analyses) and recognize appropriate and inappropriate conclusions based on the study design.

2. Reason abstractly and quantitatively. Statistically proficient students understand the difference between mathematical thinking and statistical thinking. Students engaged in mathematical thinking ask, “Where’s the proof?” They use operations, generalizations, and abstractions to prove deterministic claims and understand mathematical patterns free of context. Students engaged in statistical thinking ask, “Where’s the data?” They reason in the presence of variability and anticipate, acknowledge, account for, and allow for variability in data as it relates to a particular context.

Although statistical thinking is grounded in a concrete context, it still requires reasoning with abstract concepts. For example, deciding how to measure an attribute in answering a statistical question, selecting a reasonable summary statistic (such as the sample mean, which may be a value that does not exist in the data set, as a measure of center), interpreting a graphical representation of data, and understanding the role of sampling variability for drawing inferences all require reasoning with abstractions.

[Photo caption: Teachers work through the statistical problem-solving process, adapting and adjusting each component as needed to arrive at a solution that connects the interpretation of results to the statistical question posed and the topic under study. Credit: Rebecca Nichols/ASA]

3. Construct viable arguments and critique the reasoning of others. Statistically proficient students use appropriate data and statistical methods to draw conclusions about a statistical question. They follow the logical progression of the statistical problem-solving process to investigate answers to a statistical question and provide insights into the research topic. They reason inductively about data, making inferences that take into account the context from which the data arose. They justify their conclusions, communicate them to others (orally and in writing), and critique the conclusions of others.

Statistically proficient students also are able to compare the plausibility of alternative conclusions and distinguish correct statistical reasoning from that which is flawed. This is an especially important skill given the massive amount of statistical information in the media and elsewhere. Are appropriate graphs being used to represent the data, or are the graphs misleading? Are appropriate inferences being made based on the data-collection design and analysis? Statistically proficient students are ‘healthy skeptics’ of statistical information.

4. Model with mathematics. Statistically proficient students can apply mathematics to help answer statistical questions arising in everyday life, society, and the workplace. Mathematical models generally use equations or geometric representations to describe structure. Statistical models build on mathematical models by including descriptions of the variability present in the data; that is, data = structure + variability.

For example, middle school students may use the mean to represent the center of a distribution of univariate data and the mean absolute deviation to model the variability of the distribution. High-school students may use the normal distribution (as defined by a mathematical function) to model a unimodal, symmetric distribution of quantitative data or to model a sampling distribution of sample means or sample proportions. For bivariate data, students may use a straight line to model the relationship between two quantitative variables. With consideration of the correlation coefficient and residuals, the statistical interpretation of this linear model takes into account the variability of the data about the line. The statistically proficient student understands that statistical models are judged by whether they are useful and reasonably describe the data.
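The idea that data = structure + variability can be made concrete with a short sketch. The Python code below uses made-up values and hypothetical helper names (`mean_absolute_deviation`, `fit_line`): the mean and a least-squares line play the role of structure, while the mean absolute deviation and the residuals describe variability. It is one possible illustration, not a method taken from this report.

```python
import statistics

def mean_absolute_deviation(values):
    """Average distance of the values from their mean."""
    center = statistics.fmean(values)
    return statistics.fmean([abs(v - center) for v in values])

def fit_line(xs, ys):
    """Least-squares slope and intercept for a straight-line model."""
    x_bar, y_bar = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

# Made-up univariate data: the mean models center, the MAD models variability.
scores = [2, 4, 6, 8]
print("mean:", statistics.fmean(scores), "MAD:", mean_absolute_deviation(scores))

# Made-up bivariate data: the line is the structure, the residuals the variability.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
fitted = [intercept + slope * x for x in xs]
residuals = [y - f for y, f in zip(ys, fitted)]
print(f"line: y = {intercept:.2f} + {slope:.2f}x")
print("residuals:", [round(r, 2) for r in residuals])
```

Each observed value equals its fitted value plus its residual, which is the decomposition the text describes.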

5. Use appropriate tools strategically. Statistically proficient students consider the available tools when solving a statistical problem. These tools might include a calculator, a spreadsheet, applets, a statistical package, or tools such as two-way tables and graphs to organize and represent data. A tool might be a survey used to collect and measure the variable (attribute) of interest. Tools exist to facilitate the practice of statistics. They can help us analyze data more efficiently so more time can be spent on understanding and communicating the story the data tell us.

For example, statistically proficient middle-school students may use technology to create boxplots to compare and analyze the distributions of two quantitative variables. High-school students may use an applet to simulate repeated sampling from a certain population to develop a margin of error for quantifying sampling variability.

When developing statistical models, students know technology can enable them to visualize the results of varying assumptions, explore patterns in the data, and compare predictions with data. Statistically proficient students at various grade levels are able to use technological tools to carry out simulations for exploring and deepening their understanding of statistical and probabilistic concepts. Students also may take advantage of chance devices such as coins, spinners, and dice for simulating random processes.
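The repeated-sampling activity mentioned above could be sketched in Python as follows. The population proportion (0.5), the sample sizes, and the function names are assumptions for illustration, and the margin of error is taken here as half the width of the interval containing the middle 95% of simulated sample proportions, which is one of several reasonable classroom definitions.

```python
import random

def simulate_sample_proportions(true_p, sample_size, num_samples, seed=0):
    """Simulate repeated random samples and return each sample's proportion of successes."""
    rng = random.Random(seed)
    return [
        sum(rng.random() < true_p for _ in range(sample_size)) / sample_size
        for _ in range(num_samples)
    ]

def simulated_margin_of_error(proportions, coverage=0.95):
    """Half-width of the central interval holding about `coverage` of the simulated proportions."""
    ordered = sorted(proportions)
    tail = (1 - coverage) / 2
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1 - tail) * len(ordered)) - 1]
    return (high - low) / 2

for n in (100, 400):
    props = simulate_sample_proportions(true_p=0.5, sample_size=n, num_samples=1_000)
    print(f"n={n}: simulated margin of error is about {simulated_margin_of_error(props):.3f}")
```

The same simulation can be run by hand on a small scale with coins or spinners before moving to software.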

6. Attend to precision. Statistically proficient students understand that precision in statistics is not just computational precision. In statistics, one must be precise about ambiguity and variability. Students understand the statistical problem-solving process begins with the precise formulation of a statistical question that anticipates variability in the data col-lected that will be used to answer the question. Pre-cision is also necessary in designing a data-collection plan that acknowledges variability. Precision about the attributes being measured is essential.

After the data have been collected, students are precise about choosing the appropriate analyses and representations that account for the variability in the data. They display carefully constructed graphs with clear labeling and avoid misleading graphs, such as three-dimensional pie charts, that misrepresent the data. As students interpret the analysis of the data, they are precise with their terminology and statistical language. For example, they recognize that ‘correlation’ is a specific measure of the linear relationship between two quantitative variables and not simply another word for ‘association.’ They recognize that ‘skew’ refers to the shape of a distribution and is not another word for ‘bias.’

Students can transition from exploratory statistics to inferential statistics by using a margin of error to quantify sampling variability around a point estimate. Students recognize the precision of this estimate depends partially upon the sample size: the larger the sample size, the smaller the margin of error.
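The relationship between sample size and margin of error can be explored by simulation. The sketch below is only an illustration, not a method taken from the report: it assumes a hypothetical population proportion of 0.5, uses two standard deviations of the simulated sample proportions as the margin of error, and the function name and parameters are invented for this example.

```python
import random
import statistics


def simulated_margin_of_error(p, n, trials=2000, seed=1):
    """Approximate a margin of error for a sample proportion by simulation.

    p, n, trials, and seed are illustrative parameters, not values from the report.
    """
    rng = random.Random(seed)
    proportions = []
    for _ in range(trials):
        # Draw one random sample of size n and record its sample proportion.
        successes = sum(1 for _ in range(n) if rng.random() < p)
        proportions.append(successes / n)
    # Assumption: use two standard deviations of the simulated proportions.
    return 2 * statistics.pstdev(proportions)


moe_small_sample = simulated_margin_of_error(p=0.5, n=25)
moe_large_sample = simulated_margin_of_error(p=0.5, n=400)
print(f"n = 25:  margin of error is about {moe_small_sample:.3f}")
print(f"n = 400: margin of error is about {moe_large_sample:.3f}")
```

Under these assumptions the larger sample should give a noticeably smaller margin of error, which matches the statement above.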

As students interpret statistical results, they connect the results back to the original statistical question and provide an answer that takes the variability in the data into account. Statistically proficient students recognize that clear communication and precision with statistical language are essential to the practice of statistics.

7. Look for and make use of structure. Statistically proficient students look closely to discover a structure or pattern in a set of data as they attempt to answer a statistical question. For univariate data, the mean or median of a distribution describes the center of the distribution, an underlying structure around which the data vary. Similarly, the equation of a straight line describes the relationship between two quantitative variables, a linear structure around which the data vary. Students use structure to separate the ‘signal’ from the ‘noise’ in a set of data: the ‘signal’ is the structure, and the ‘noise’ is the variability. They look for patterns in the variability around the structure and recognize these patterns can often be quantified.

For example, if there is a positive, linear trend in a set of bivariate quantitative data, then students can quantify this pattern with a correlation coefficient to measure strength of the linear association and use a regression line to predict the value of a response variable from the value of an explanatory variable. Statistically proficient students use statistical modeling to describe the variability associated with the identified structure.
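As a rough illustration of quantifying a linear trend, the sketch below computes a correlation coefficient and a least-squares regression line from first principles. The data (hours studied and test scores) are hypothetical and the function name is invented for this example; neither comes from the report.

```python
import math


def correlation_and_line(xs, ys):
    """Return (r, slope, intercept) for paired quantitative data."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of squared deviations and cross-products about the means.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    r = sxy / math.sqrt(sxx * syy)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return r, slope, intercept


# Hypothetical data: explanatory variable (hours) and response variable (score).
hours = [1, 2, 3, 4, 5, 6]
scores = [52, 55, 61, 64, 70, 71]

r, slope, intercept = correlation_and_line(hours, scores)
predicted_at_4_hours = intercept + slope * 4
print(f"r = {r:.3f}, predicted score at 4 hours = {predicted_at_4_hours:.1f}")
```

The correlation coefficient summarizes the strength of the linear association, while the fitted line is what is used for prediction.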

8. Look for and express regularity in repeated reasoning. Statistically proficient students maintain oversight of the process, attend to the details, and continually evaluate the reasonableness of their results as they are carrying out the statistical problem-solving process. Students recognize that probability provides the foundation for identifying patterns in long-run variability, thereby allowing students to quantify uncertainty. Randomization produces probabilistic structure and patterns that are repeatable and can be quantified in the long run.

For example, in a statistical experiment with enough subjects, randomly assigning subjects to treatment groups will balance the groups with respect to potentially confounding variables so any statistically significant differences can be attributed to the treatments. In sampling from a defined population, selection of a random sample is a repeatable process and probability supports construction of a sampling distribution of the statistic of interest. Statistically proficient students understand the different roles randomization plays in data collection and recognize it is the foundation of statistical inference methods used in practice.
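The balancing effect of random assignment can be seen in a small simulation. This is a sketch under stated assumptions: the confounding variable is a hypothetical score drawn from a normal distribution with mean 100 and standard deviation 15, and the function name, group sizes, and number of trials are all invented for illustration.

```python
import random
import statistics


def mean_imbalance(n_subjects, trials=1000, seed=2):
    """Average absolute difference in a confounder between two randomly assigned groups."""
    rng = random.Random(seed)
    # Hypothetical confounding variable measured on each subject.
    values = [rng.gauss(100, 15) for _ in range(n_subjects)]
    half = n_subjects // 2
    differences = []
    for _ in range(trials):
        shuffled = values[:]
        rng.shuffle(shuffled)  # random assignment to two groups
        group_a, group_b = shuffled[:half], shuffled[half:]
        differences.append(abs(statistics.mean(group_a) - statistics.mean(group_b)))
    return statistics.mean(differences)


imbalance_small = mean_imbalance(10)
imbalance_large = mean_imbalance(200)
print(f"10 subjects:  typical imbalance is about {imbalance_small:.2f}")
print(f"200 subjects: typical imbalance is about {imbalance_large:.2f}")
```

With more subjects, the two groups tend to differ less on the confounder, which is why the statement above is qualified by "with enough subjects."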

References

Ben-Zvi, B., and Garfield, J. (Eds.). (2004). The Challenge of Developing Statistical Literacy, Reasoning, and Thinking. Dordrecht, The Netherlands: Kluwer Academic Publishers.

Franklin, C., et al. (2007). Guidelines and Assessment for Instruction in Statistics Education (GAISE) Report: A PreK-12 Curriculum Framework. Alexandria, VA: American Statistical Association.

Kader, G., and Jacobbe, T. (2013). Developing Essential Understanding of Statistics for Teaching Mathematics in Grades 6-8. Reston, VA: NCTM.

Kilpatrick, J., Swafford, J., and Findell, B. (Eds.). (2001). Adding It Up: Helping Children Learn Mathematics. National Research Council, Center for Education. Washington, DC: National Academy Press.

National Council of Teachers of Mathematics (NCTM). (2000). Principles and Standards for School Mathematics. Reston, VA: NCTM.

Peck, R., Gould, R., and Miller, S. (2013). Developing Essential Understanding of Statistics for Teaching Mathematics in Grades 9-12. Reston, VA: NCTM.

Wild, C., and Pfannkuch, M. (1999). Statistical thinking in empirical enquiry. International Statistical Review, 67(3):223–265.


Expectations for Elementary School Students

“Every high-school graduate should be able to use sound statistical reasoning to intelligently cope with the requirements of citizenship, employment, and family and to be prepared for a healthy, happy, and productive life.” (Franklin et al., 2007, p. 1)

The foundations of statistical literacy must begin in the elementary grades PreK–5, where young students begin to develop data sense: an understanding that data are not simply numbers, categories, sounds, or pictures, but entities that have a context, vary, and may be useful for answering questions about the world that surrounds them.

Recommendations for developing statistical thinking from key national reports such as the ASA’s PreK–12 GAISE Framework, NCTM’s Principles and Standards for School Mathematics, and CCSSM for students in grade levels PreK–5 include the following:

• Understand what comprises a statistical question

• Know how to investigate statistical questions posed by teachers in a context of interest to young students

• Conduct a census of the classroom to collect data and design simple experiments to compare two treatments

• Distinguish between categorical and numerical data

• Sort, classify, and organize data

• Understand that data vary

• Understand the concept of a distribution of data and how to describe key features of this distribution

• Understand how to represent distributions with tables, pictures, graphs, and numerical summaries

• Understand how to compare two distributions

• Use data to recognize when there is an association between two variables

• Understand how to generalize conclusions from an analysis to the classroom from which the data were produced, and the limitations of this scope of inference if we want to infer beyond this classroom

Students should learn these elementary-grade topics using the statistical problem-solving perspective as described in the GAISE framework (Franklin et al., 2007):

• Know how to formulate a statistical question (anticipate variability in the data that will be collected) and understand how a statistical question differs from a mathematical question

• Design a strategy for collecting data to address the question posed (acknowledge variability)

• Analyze the data (account for variability)

• Make conclusions from the analysis (taking variability into account) and connect back to the statistical question

The GAISE framework recommends students learn statistics in an activity-based learning environment in which they collect, explore, and interpret data to address a statistical question. Students’ exploration and analysis of data should be aided by appropriate technologies capable of creating graphical displays of data and computing numerical summaries of data. Using the results from their analyses, students must have experiences communicating a statistical solution to the question posed, taking into account the variability in the data and considering the scope of their conclusions based on the manner in which the data were collected.

CHAPTER 4: Preparing Elementary School Teachers to Teach Statistics


The elementary grades provide an ideal environment for developing students’ appreciation of the role statistics plays in our daily lives and the world surrounding us. Not only does statistics reinforce important elementary-level mathematical concepts (such as measurement, counting, classifying, operations, fair share), it provides connections to other curricular areas such as science and social studies, which also integrate statistical thinking. Elementary-school teachers have the opportunity to help young children begin to appreciate the importance of understanding the stories data tell across the school curriculum, not just the mathematical sciences. For instance, science fair projects can be a vehicle for encouraging students to develop the beginning tools for making sense of data and using the statistical investigative process.

Essentials of Teacher Preparation

To implement an elementary-grades curriculum in statistics such as that envisioned in the GAISE framework and other national recommendations, elementary-grades teachers must develop the ability to implement and appreciate the statistical problem-solving process at a level that goes beyond what is expected of elementary-school students. Teachers must be equipped and confident in guiding students to develop the statistical knowledge and connections recommended at the elementary level. Although the Common Core State Standards do not include a great deal of statistics in grades K–5, we provide guidance on the content teachers need to know to meet content standards outlined by both the ASA and NCTM. Standards documents will change from time to time, so we are recommending a robust preparation for elementary-school teachers.

The primary goals of the statistical preparation of elementary-school teachers are three-fold:

1. Develop the necessary content knowledge and statistical reasoning skills to implement the recommended statistics topics for elementary-grade students along with the content knowledge associated with the middle school–level statistics content (see CCSSM or Chapter 5 of this document). Statistical topics should be developed through meaningful experiences with the statistical problem-solving process.

2. Develop an understanding of how statistical concepts in middle grades build on content developed in elementary-grade levels and an understanding of how statistical content in elementary grades is connected to other subject areas in elementary grades.

3. Develop pedagogical content knowledge necessary for effective teaching of statistics. Pre-service and practicing teachers should be familiar with common student conceptions, content-specific teaching strategies, strategies for assessing statistical knowledge, and appropriate integration of technology for developing statistical concepts.

In designing courses and experiences to meet these goals, teacher preparation programs must recognize that the PreK–12 statistics curriculum is conceptually based and not the typical formula-driven curriculum of simply drawing graphs by hand and calculating results from formulas. Similarly, the statistics curriculum for teachers should be structured around the statistical problem-solving process (as described under student expectations).

Elementary-school teacher preparation should include, at a minimum, the following topics:

Formulate Questions

• Understand a statistical question is asked within a context that anticipates variability in data

• Understand measuring the same variable (or characteristic) on several entities results in data that vary

• Understand that answers to statistical questions should take variability into account

Collect Data

• Understand data are classified as either categorical or numerical

º Recognize data are categorical if the possible values for the response fall into categories such as yes/no or favorite color of shoes

º Recognize data are numerical (quantitative) if the possible values take on numerical values that represent different quantities of the variable such as ages, heights, or time to complete homework


º Recognize quantitative data are discrete if the possible values are countable, such as the number of books in a student’s backpack

º Recognize quantitative data are continuous if the possible values are not countable and can be recorded even more precisely to smaller units such as weight and time

• Understand a sample is used to predict (or estimate) characteristics of the population from which it was taken

º Recognize the distinction among a population, census, and sample

º Understand the difference between random sampling (a ‘fair’ way to select a sample) and non-random sampling

º Understand the scope of inference to a population is based on the method used to select the sample

• Understand experiments are conducted to compare and measure the effectiveness of treatments. Random allocation is a fair way to assign treatments to experimental units.

Analyze Data

• Understand distributions describe key features of data such as variability

º Recognize and use appropriate graphs (picture graph, bar graph, pie graph) and tables with counts and percentages to describe the distribution of categorical data

º Understand the modal category is a useful summary to describe the distribution of a categorical variable

º Recognize and use appropriate graphs (dotplots, stem and leaf plots, histograms, and boxplots) and tables to describe the distribution of quantitative data

º Recognize and use appropriate numerical summaries to describe characteristics of the distribution for quantitative data (mean or median to describe center; range, interquartile range, or mean absolute deviation to describe variability)

º Understand the shape of the distribution for a quantitative variable influences the numerical summary for center and variability chosen to describe the distribution

º Recognize the median and interquartile range are resistant summaries not affected by outliers in the distribution of a quantitative variable

• Understand distributions can be used to compare two groups of data

º Understand distributions for quantitative data are compared with respect to similarities and differences in center, variability, and shape, and this comparison is related back to the context of the original statistical question(s)

º Understand that the amount of overlap and separation of two distributions for quantitative data is related to the center and variability of the distributions

º Understand that distributions for categorical data are compared using two-way tables for cross-classification of the categorical data and the proportions of data in each category, and this comparison is related back to the context of the original statistical question(s)

• Explore patterns of association by using values of one variable to predict values of another variable

º Understand how to explore, describe, and quantify the strength and trend of the association between two quantitative variables using scatterplots, a correlation coefficient (such as quadrant count ratio), and fitting a line (such as fitting a line by eye)

º Understand how to explore and describe the association between two categorical variables by comparing conditional proportions within two-way tables and using bar graphs
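The quadrant count ratio named in the list above can be sketched in a few lines. The version below follows the common description of the QCR: split the scatterplot into four quadrants at the mean of each variable, then compare the number of points in the upper-right and lower-left quadrants with the number in the other two. The data are hypothetical, the function name is invented, and excluding points that fall exactly on a mean line from the count is an assumption made for this sketch.

```python
def quadrant_count_ratio(xs, ys):
    """Quadrant count ratio for paired quantitative data."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Points above both means or below both means suggest a positive trend.
    agree = sum(1 for x, y in zip(xs, ys) if (x - mean_x) * (y - mean_y) > 0)
    # Points above one mean and below the other suggest a negative trend.
    disagree = sum(1 for x, y in zip(xs, ys) if (x - mean_x) * (y - mean_y) < 0)
    # Assumption: points lying exactly on a mean line are left out of the count.
    return (agree - disagree) / (agree + disagree)


# Hypothetical paired measurements, not data from the report.
xs = [1, 2, 3, 4, 5, 6]
ys = [2, 1, 4, 3, 6, 5]
qcr = quadrant_count_ratio(xs, ys)
print(f"QCR = {qcr:.2f}")
```

A value near 1 indicates a strong positive trend, a value near -1 a strong negative trend, and a value near 0 little or no trend.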

Interpret Results

• Recognize the difference between a parameter (numerical summary from the population) and a statistic (numerical summary from a sample)

• Recognize that a simple random sample is a ‘fair’ or unbiased way to select a sample for describing the population and is the basis for inference from a sample to a population

• Recognize the limitations of scope of inference to a population depending on how samples are obtained

• Recognize sample statistics will vary from one sample to the next for samples drawn from a population

• Understand that probability provides a way to describe the ‘long-run’ random behavior of an outcome occurring and recognize how to use simulation to approximate probabilities and distributions
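To illustrate the last item in the list above, the sketch below uses simulation to approximate a probability. The chance process (rolling two fair dice and asking how often the sum is 7) is a hypothetical example chosen for illustration, and the function name and number of trials are assumptions.

```python
import random


def estimate_probability_of_seven(trials, seed=3):
    """Estimate P(sum of two fair dice equals 7) from repeated simulated rolls."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        # Simulate one roll of two fair six-sided dice.
        total = rng.randint(1, 6) + rng.randint(1, 6)
        if total == 7:
            hits += 1
    # The long-run relative frequency approximates the probability.
    return hits / trials


estimate = estimate_probability_of_seven(20000)
print(f"Estimated probability: {estimate:.3f} (exact value is 1/6)")
```

Running the simulation with more trials should bring the relative frequency closer to the exact value, which is the 'long-run' behavior the list item describes.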

Experiences for teachers should include attention to common misunderstandings students may have regarding statistical and probabilistic concepts and developing strategies to address these conceptions. Some of these common misunderstandings are related to making sense of graphical displays and how to appropriately analyze and interpret categorical data. The research related to these common misunderstandings is discussed in Chapter 8. Examples related to the common misunderstandings are included in Appendix 1.

Developing teachers’ communication skills is critical for teaching the statistical topics and concepts outlined above. The role of manipulatives (such as cubes to represent individual data points) and technology in learning statistics also must be an important aspect of elementary-school teacher preparation. Teachers must be proficient in using manipulatives to aid in the collection, exploration and analysis, and interpretation of data. Teachers also are encouraged to become comfortable using statistical software (that supports dynamic visualization of data) and calculators for these purposes. The Mathematical Practice Standards as seen through a statistical lens are vital (see Chapter 3) for helping students and teachers develop the tools and skills to reason and communicate statistically.

Program Recommendations for Prospective and Practicing Elementary Teachers

All teachers (pre-service and in-service) need to learn statistics in the ways advocated for PreK–12 students to learn statistics in GAISE. In other words, they need to engage in all four parts of the statistical problem-solving process with various types of data (categorical, discrete numerical, continuous numerical).

At present, few institutions offer a statistics course specially designed for pre-service or in-service elementary-school teachers. These teachers generally gain their statistics education through either an introductory statistics course aimed at a more general audience or in a portion of a mathematics content course designed for elementary-school teachers. Often, the standard introductory statistics course does not address the content identified above at the level of depth needed by teachers, nor does it typically engage teachers in all aspects of the statistical process. Thus, while such a course might be an appropriate way for a future teacher to meet a university’s quantitative reasoning requirements, it is not an acceptable substitute for the experiences described in this document.

Institutions typically offer from one to three mathematics content courses for future teachers. In many cases, future teachers take these courses at two-year institutions prior to entering their teacher education programs. Generally, a portion of one of these courses is devoted to statistics content. Most of these courses are taught in mathematics departments and by a wide range of individuals, including mathematicians, mathematics educators, graduate students, and adjunct faculty members. While there is growing appreciation in the field for the importance of quality instruction in these courses, few are taught by individuals with expertise in statistics or statistics education. The job title of the person teaching these courses is far less important than the individual’s preparation for teaching the statistics component of the courses. As noted above, the individual must possess a deep understanding of statistics content beyond that being taught in the course and understand how to foster the investigation of this content by engaging teachers in the statistical problem-solving process.


Course Recommendations for Prospective Teachers

There are multiple ways the statistics content noted above could be delivered and configured, depending on the possibilities and limitations at each institution. What is clear is that most elementary-teacher preparation programs need to devote far more time in the curriculum to statistics than is currently done. At best, statistics is half of a course for future elementary teachers. At worst, it is a few days of instruction or skipped entirely. In many instances, the unit taught is “probability and statistics” and includes substantial attention to traditional mathematical probability. We advocate a minimum of six weeks of instruction be devoted to the exploration of the statistical ideas noted above.

Among the possible options for providing appropriate statistical education for elementary school teachers are the following:

• A special section of an introductory statistics course geared to the content and instructional strategies noted above. This course can be designed to include all levels of teacher preparation students.

• An entire course in statistical content for elementary-school teachers.

• More time and attention given to statistics in existing mathematics content courses. Most likely, one course would be reconfigured to place substantial emphasis on statistics, but this would also likely result in reconfiguring the content of all courses in the sequence to make time for the statistics content. There is a great deal of mathematics and statistics content that is important for elementary-school teachers to know, so decisions about what to cut to make more room for statistics will be difficult. Thus, MET II advocates increasing the number of credit hours of instruction for elementary-school teachers to 12. Note that these hours are all content-focused; pedagogy courses are in addition to these 12 hours.

The recommendations above imply that those who teach statistics to future teachers need to be well versed in the statistical process and possess strong understanding of statistics content beyond what they are teaching to teachers. In addition, they should be able to articulate the ways in which statistics is different from mathematics. The PreK–12 GAISE framework emphasizes it is the focus on variability in data and the importance of context that sets statistics apart from mathematics. Peters (2010) also discusses the distinction in the article, “Engaging with the Art and Science of Statistics.” Many classically trained mathematicians have not had opportunities to explore and become comfortable with the statistical content topics and concepts outlined for elementary teachers. Thus, preparing to teach such courses will require collaboration among teacher educators of statistics.

Such courses must be taught with an emphasis on active engagement with the ideas through collecting data, designing experiments, representing data, and making inferences. Lecture is not appropriate as a primary mode of instruction in such courses. Such courses also need to be taught using manipulatives and technological tools and software that are available in schools, as well as more sophisticated technological tools and software. Assessment in these courses should focus on assessing reasoning and understanding of the big ideas of statistics, not just the mechanics of computing a particular statistic. Chapter 7 provides a detailed discussion of assessment.

Professional Development Recommendations for Practicing Teachers

Current elementary-school teachers are in need of professional development. It is critical that practicing teachers have opportunities for meaningful professional development. The content and pedagogy of the professional development should be similar to that previously described for pre-service teachers.

Illustrative Example

To implement an elementary-school curriculum in statistics like that envisioned in the GAISE framework, elementary-grades teachers must develop an appreciation of the statistical problem-solving process at a level that goes beyond what is expected of elementary school students. The following examples illustrate expectations for elementary-grades teachers across the four components of the statistical problem-solving process.


Scenario

As the time for annual testing draws near, students at an elementary school and their parents begin to receive messages about the importance of eating breakfast on test days to ensure optimal test performance. A student asks his teacher if eating breakfast really influences how well you do on a test. The teacher decides to pursue this with the students because she is curious as well. Thus, she decides she will have the students design and carry out a statistical study to help them determine whether skipping breakfast before an exam could affect an individual’s score.

Formulate Questions

First, the teacher helps the students write a specific statistical question to investigate such as:

How do the scores on the exam compare between the two groups of students (those who ate breakfast versus those who did not)?

Collect Data

Then, the teacher facilitates a discussion about how they could go about collecting useful data for investigating this statistical question. Determining how data should be collected to address the statistical question requires careful thinking. Designing an appropriate and feasible data-collection plan requires planning, and teachers should be given ample opportunity to do so in statistical courses. Would the best design for this study be a sample survey, an experiment, or an observational study?

The teacher should realize that the best study design for this investigation would be a statistical experiment. Ideally, the teacher would “randomly assign” the students in her class into one of the two groups because random assignment tends to produce similar groups that are balanced with regard to potential confounding variables such as intelligence or statistical ability. While it would be ideal for the teacher to create the two groups in this manner, it may not be practical. In this case, it is not practical or ethical to randomly assign the students to either eat or not eat breakfast before a test. Thus, the most feasible and practical design for this study is observational.

The students decide they will conduct this study using a math test scheduled for the next week. The first question on the test will be “Did you eat breakfast this morning before coming to school?” The class will then use the data from this question to classify the students into one of two groups (breakfast or no breakfast) and use the students’ scores to investigate the statistical question posed: How do the scores on the exam compare between the two groups of students (those who ate breakfast versus those who did not)?

Before exploring and analyzing the data, a classroom teacher should encourage students to think about what they expect to see in the analysis. In this case, students have likely already heard claims about causal relationships between eating breakfast and scoring well on tests. Thus, the teacher could encourage students to research the topic of eating breakfast and its relationship to test performance and have students predict what they expect to observe about the two distributions of exam scores such as shape, median or mean, and range.

Suppose 40 students completed the test, which consists of 30 multiple-choice questions. Following are the scores (number correct out of 30 questions) for the students in each group (see footnote 5):

Breakfast: 26 21 29 17 24 24 23 19 24 25 20 25 22 29 28 18 30 23

No Breakfast: 20 20 19 15 20 25 17 20 22 18 28 21 22 23 26 17 21 16 14 19 28 11

We observe that the group sizes are different. Eighteen students were in the Breakfast group, while 22 students were in the No Breakfast group.

Note: The following Analyze Data and Interpret Results sections are presented sequentially for different types of representations, rather than presenting a complete analysis and then interpreting the results.

Analyze Data (Using Dotplots)

The goal of exploring, analyzing, and summarizing data is not simply to construct a graphical display or compute numerical summaries. The teacher should use the graphical display and/or numerical summaries to help students identify patterns present in the variability so they can address the question under study.

For example, the following comparative dotplots are useful for displaying and comparing the scores between the two groups.

Footnote 5: These data are not the results of an actual study, but are randomly generated scores.

FIGURE 1: Breakfast Dotplot. Comparative dotplots of Score (horizontal axis from 12 to 30) for the Breakfast and No Breakfast groups.


Looking at the dotplots, there is a tendency for students in the Breakfast group to score higher than students in the No Breakfast group. The center of the scores for the Breakfast group is around 24 correct, while the center of the scores for the No Breakfast group is around 20 correct. The scores for each group appear to be reasonably symmetric about their respective centers. As the range for the scores in the No Breakfast group is 17 compared to a range of 13 for the Breakfast group, there appears to be more variability in the scores in the No Breakfast group than in the Breakfast group.
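The centers and ranges quoted above can be checked directly from the listed scores. The sketch below is one way to do so; using the median as the measure of center is an assumption here (the text only says "around"), and the function name is invented for illustration.

```python
import statistics

# Scores (number correct out of 30) as listed for each group.
breakfast = [26, 21, 29, 17, 24, 24, 23, 19, 24, 25,
             20, 25, 22, 29, 28, 18, 30, 23]
no_breakfast = [20, 20, 19, 15, 20, 25, 17, 20, 22, 18, 28,
                21, 22, 23, 26, 17, 21, 16, 14, 19, 28, 11]


def center_and_range(scores):
    """Return (median, range) for a list of scores."""
    return statistics.median(scores), max(scores) - min(scores)


breakfast_summary = center_and_range(breakfast)
no_breakfast_summary = center_and_range(no_breakfast)
print("Breakfast (median, range):   ", breakfast_summary)
print("No Breakfast (median, range):", no_breakfast_summary)
```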

Interpret Results (Using Dotplots)

While there is some overlap in scores between the two groups (scores between 17 and 28), there is also some separation. Specifically, four students in the No Breakfast group scored lower than anyone in the Breakfast group. On the other hand, three students in the Breakfast group scored higher than anyone in the No Breakfast group. Thus, although the dotplots show some overlap in scores between the two groups, there is a tendency for students in the Breakfast group to score higher than students in the No Breakfast group.
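The overlap and separation described above can be counted from the raw scores. This is a minimal sketch; the variable names are chosen for illustration only.

```python
# Scores (number correct out of 30) as listed for each group.
breakfast = [26, 21, 29, 17, 24, 24, 23, 19, 24, 25,
             20, 25, 22, 29, 28, 18, 30, 23]
no_breakfast = [20, 20, 19, 15, 20, 25, 17, 20, 22, 18, 28,
                21, 22, 23, 26, 17, 21, 16, 14, 19, 28, 11]

# Interval of scores that both groups share.
overlap_low = max(min(breakfast), min(no_breakfast))
overlap_high = min(max(breakfast), max(no_breakfast))

# No Breakfast scores below every Breakfast score.
below_all_breakfast = [s for s in no_breakfast if s < min(breakfast)]
# Breakfast scores above every No Breakfast score.
above_all_no_breakfast = [s for s in breakfast if s > max(no_breakfast)]

print("Overlap interval:", (overlap_low, overlap_high))
print("No Breakfast scores below all Breakfast scores:", sorted(below_all_breakfast))
print("Breakfast scores above all No Breakfast scores:", sorted(above_all_no_breakfast))
```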

Analyze Data (Using Boxplots)

When sample sizes are different, comparing displays based on counts can sometimes be deceptive. Thus, a teacher should know it is useful to provide graphical displays that do not depend on sample size. One such graph is the boxplot. A boxplot displays the intervals of each quarter of the data based on the Five-Number Summary (Minimum value, First Quartile, Median, Third Quartile, and Maximum value). Because the median is indicated in a boxplot, it provides more specific information about the center of the data than a dotplot. Additionally, the interquartile range (IQR), a more informative measure of variability than the range, is easily observed from a boxplot.
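As a rough illustration of the Five-Number Summary and the IQR, the sketch below computes both for a small set of scores. The scores are hypothetical (the data behind the figures are not listed in the text), the function name `five_number_summary` is invented for this example, and the quartile convention (`method="inclusive"`) is an assumption; textbooks and software packages use several slightly different quartile rules.

```python
import statistics

def five_number_summary(scores):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers."""
    ordered = sorted(scores)
    # Assumed convention: method="inclusive". Other quartile rules can
    # give slightly different values for Q1 and Q3.
    q1, median, q3 = statistics.quantiles(ordered, n=4, method="inclusive")
    return ordered[0], q1, median, q3, ordered[-1]

# Hypothetical scores, invented for illustration only.
example_scores = [14, 17, 19, 20, 21, 22, 24, 26, 29]

minimum, q1, median, q3, maximum = five_number_summary(example_scores)
iqr = q3 - q1                    # spread of the middle 50% of the scores
data_range = maximum - minimum   # spread of all the scores
print(minimum, q1, median, q3, maximum, iqr, data_range)
```

The range depends only on the two most extreme scores, while the IQR ignores the lowest and highest quarters of the data, which is one reason the IQR is often the more informative of the two.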

Boxplots are especially useful for comparing two groups of quantitative data because the overlap/separation can be expressed in terms of percentages. The comparative boxplots below summarize the data on the exam scores for the two groups.

Based on the boxplots, the scores for the Breakfast group tend to be higher than the scores for the No Breakfast group. The median score for the Breakfast group is 24 correct, while the median score for the No Breakfast group is 20 correct. In fact, all five summary measures for the Breakfast group are higher than the corresponding measures for the No Breakfast group. The scores for each group appear to be reasonably symmetric about their respective medians. Although the range is greater for the No Breakfast group than the Breakfast group, the IQR for each group is 5, indicating similar amounts of variability in the middle 50% of scores.

Interpret Results (Using Boxplots)

Teachers should be able to make and help students make a variety of observations about the data represented by the boxplots. For instance, while there is some overlap in scores between the groups (scores between 17 and 28), there is also some separation. More specifically, approximately 25% of the students in the No Breakfast group scored lower than anyone in the Breakfast group. On the other hand, the first quartile for the Breakfast group is 21, indicating that approximately 75% of students in the Breakfast group scored 21 or higher. In the No Breakfast group, fewer than half the students scored 21 or higher. Thus, although the boxplots show some overlap in scores between the two groups, there is a tendency for students who ate breakfast to score higher than students who did not eat breakfast.

Analyze Data (Using Numerical Summaries)

In the practice of statistics, technology is used to obtain numerical summaries for data. Although it is useful to have students calculate numerical summaries by hand at least once, the emphasis in the PreK–12 statistics curriculum is placed on the interpretation of the statistics, not the hand calculation of summary statistics using the formulas.

The mean is a commonly reported numerical summary of quantitative data. The means for the scores in the two groups are reported below:

        Breakfast Group    No Breakfast Group
Mean    23.7               20.1

Like the median, the mean provides information about the center of the data.

Two numerical summaries of the amount of variability in quantitative data are the mean absolute deviation (MAD) and the standard deviation (SD). Although we would not expect K–5 students to be able to reason about the MAD and SD, teachers of K–5 students should be able to engage in this kind of reasoning so they have a higher level of understanding of this scenario than is expected of their students.

FIGURE 2: Breakfast Boxplot. Comparative boxplots of Score (axis from 10 to 30) for the Breakfast and No Breakfast groups.

The MAD and SD are reported below for each class.

       Breakfast Group    No Breakfast Group
MAD    2.98               3.20
SD     3.82               4.31

Each summary (MAD or SD) provides a measure of the typical difference between an observed score and the mean score. For example, the MAD indicates the 18 scores in the Breakfast group vary from 23.7 correct by 2.98 points on average and the 22 scores in the No Breakfast group vary from 20.1 correct by 3.20 points on average. Because the MAD and SD for the No Breakfast group are a little larger than the MAD and SD for the Breakfast group, there is a little more variability in the scores for the No Breakfast group. However, the MAD and SD for both groups are fairly similar, indicating the variability in the scores is not very different for the two groups.
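To make the two measures concrete, here is a minimal sketch of how the MAD and SD might be computed with technology. The data values and the helper name `mean_absolute_deviation` are hypothetical, and the sketch assumes the sample standard deviation (divisor n − 1); the text does not say which form of the SD was used for the reported values.

```python
import statistics

def mean_absolute_deviation(values):
    """Average distance between each value and the mean of the values."""
    center = statistics.mean(values)
    return statistics.mean([abs(v - center) for v in values])

# Hypothetical data, invented for illustration only.
example_values = [2, 4, 4, 4, 5, 5, 7, 9]

mad = mean_absolute_deviation(example_values)
# Assumption: sample SD (divisor n - 1). statistics.pstdev would give
# the population form (divisor n) instead.
sd = statistics.stdev(example_values)
print(mad, round(sd, 3))
```

Both statistics describe a typical distance from the mean. The SD squares the deviations before averaging, so it gives extra weight to values far from the mean and is never smaller than the MAD.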

Interpret Results (of the Numerical Summaries)

Teachers should note that the means capture the tendency of students in the Breakfast group to have higher scores, as displayed in the comparative dotplots. Specifically, the mean of the Breakfast group (23.7) is greater than the mean of the No Breakfast group (20.1). This tendency can be captured in a single statistic by reporting the difference between the two sample means (23.7 − 20.1). Thus, students in the Breakfast group scored, on average, 3.6 points higher than those in the No Breakfast group. A comparison of the two groups focuses on this difference by asking, “Is a difference between means of 3.6 points a meaningful difference?” The answer to this question depends on two features of the data: the sample sizes and the amounts of variability in the scores within the two groups.

One way to think about the magnitude of the difference between the two means (3.6) is to express this difference relative to a measure of variability such as the MAD or SD.

Because the MADs are different for the two classes, we will use the larger MAD. We use the larger MAD to be on the cautious side, as we know mathematically having a larger denominator makes the ratio smaller, leaving us less likely to exaggerate the relationship. The difference between the means relative to the amount of variability is 3.6/3.2 = 1.125. Thus, the two means are 1.125 MADs apart.
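The arithmetic can be checked with a few lines of code. The sketch below uses only the summary values reported in the text; the variable names are invented for this example. It also shows, for comparison, the ratio that the smaller MAD would give, which is not a figure from the text.

```python
# Summary values reported in the text for the two groups.
mean_breakfast, mean_no_breakfast = 23.7, 20.1
mad_breakfast, mad_no_breakfast = 2.98, 3.20

difference = mean_breakfast - mean_no_breakfast

# Dividing by the larger MAD gives the smaller, more cautious ratio.
cautious_ratio = difference / max(mad_breakfast, mad_no_breakfast)
# Dividing by the smaller MAD would make the gap look larger.
generous_ratio = difference / min(mad_breakfast, mad_no_breakfast)

print(round(difference, 1), round(cautious_ratio, 3), round(generous_ratio, 3))
```

With the larger MAD the means are about 1.125 MADs apart; with the smaller MAD the same difference would be reported as roughly 1.2 MADs, which is why the larger denominator is the more conservative choice.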

Is this a meaningful difference? Although this quantity does not take into account the sample sizes, the ratio does provide a way to judge the difference in means with respect to the amount of variability within each distribution. Specifically, this quantity gives some indication that the difference between the means (3.6) is meaningful: the difference is large relative to the variation within the data. Thus, it is reasonable to conclude that those students who eat breakfast tend to score higher on the test than those students who do not. There is evidence based on this sample of 40 students that eating breakfast is beneficial to higher performance on an assessment instrument. Developing Essential Understanding of Statistics for Teaching Mathematics in Grades 6–8 (Kader and Jacobbe, 2013) offers a detailed discussion of how to compare two distributions for quantitative variables.

Based on the graphical and numerical analysis, it is tempting to say that eating breakfast was the cause for the higher mean score; however, teachers should understand we must be careful to not make a cause-and-effect conclusion because this was an observational study, not a randomized experiment. It is important that teachers be pushed to think statistically beyond the computation of a measure of center. Although it might be tempting to compute the mean or median of the test results and directly draw conclusions about the Breakfast group being better, teachers should be pushed to think about meaningfulness of the difference in the manner outlined in this example. Through such an investigation teachers will be exposed to many of the content recommendations above and partake in the statistics investigative process.

References

Common Core State Standards Initiative (2010). Common Core State Standards for Mathematics. Common Core State Standards (College- and Career-Readiness Standards and K–12 Standards in English Language Arts and Math). Washington, DC: National Governors Association Center for Best Practices and the Council of Chief State School Officers. Retrieved from www.corestandards.org.

Franklin, C., et al. (2007). Guidelines and Assessment for Instruction in Statistics Education (GAISE) Report: A PreK–12 Curriculum Framework. Alexandria, VA: American Statistical Association.

Kader, G., and Jacobbe, T. (2013). Developing Essential Understanding of Statistics for Teaching Mathematics in Grades 6–8. Reston, VA: NCTM.

National Council of Teachers of Mathematics (NCTM). (2000). Principles and Standards for School Mathematics. Reston, VA: NCTM.

Peters, S. (2010). Engaging in the arts and science of statistics. Mathematics Teacher, 103(7)

CHAPTER 5: Preparing Middle-School Teachers to Teach Statistics

Expectations for Middle-School Students

Since its inclusion in the National Council of Teachers of Mathematics’ (NCTM) Curriculum and Evaluation Standards for School Mathematics (1989), statistical content has gradually been expanded within the middle-school mathematics curriculum. The Common Core State Standards for Mathematics (CCSSM) (2010) and other state standards emphasize the importance of statistics and probability at the middle-school level. Thus, middle-school teachers are increasingly expected to teach units on statistical content.

Recommendations from reports such as CCSSM and NCTM’s Principles and Standards for School Mathematics (PSSM) (2000) and Curriculum Focal Points (2006) describe subject matter in statistics and probability within each of the three middle-school grade levels. (In this chapter, the term “middle grades” or “middle school” refers to grade levels 6, 7, and 8.) Expectations of statistical understanding for middle-school students generally include the following:

• Understand the role of variability in statistical problem solving

• Explore, summarize, and describe patterns in variability in univariate data using numerical summaries and graphical representations, including:

º Frequencies, relative frequencies, and the mode for categorical data

º Measures of center and measures of variability for quantitative data

º Bar graphs for categorical data

º Dotplots, histograms, and boxplots for quantitative data

• Explore, summarize, and describe patterns of association in bivariate data based on:

º Two-way tables for bivariate categorical data

º Scatterplots for bivariate quantitative data

• Investigate random processes and understand probability as a measure of the long-run relative frequency of an outcome, understand basic rules of probability, and approximate probabilities through simulation

• Understand connections between probability, random sampling, and inference about a population

• Compare two data distributions and make informal inferences about differences between two populations

These middle-grade topics should be developed from the statistical problem-solving perspective as described in the Guidelines and Assessment for Instruction in Statistics Education (GAISE) Report: A PreK–12 Curriculum Framework (Franklin et al., 2007). The GAISE problem-solving approach is built around four components: formulating a statistical question (anticipating variability in the data), designing a plan for producing data (acknowledging variability) and collecting the data, exploring and analyzing the data (accounting for variability), and interpreting the results (taking variability into account). The GAISE framework emphasizes the omnipresence of variability in data and recognizes the role of variability within each component.

To gain a sound understanding of the statistical topics in the middle-school curriculum, students should learn statistics in an activity-based learning environment in which they collect, explore, and interpret data to address statistical questions. Further, students’ exploration and analysis of data should be aided by appropriate technologies, which, at a minimum, are capable of creating graphical displays of data and computing numerical summaries of data. Using the results from their analyses, students should consider the scope of their conclusions based on the manner in which the data were collected and communicate an answer to the statistical question posed.

Essentials of Teacher Preparation

The primary goals of the statistical preparation of middle-school teachers are threefold:

1. Develop the necessary content knowledge and statistical reasoning skills to implement the recommended statistical topics for middle-grade students. Thus, teachers should achieve statistical content knowledge beyond that required of their students. Statistical topics should be developed through meaningful experiences with the statistical problem-solving process.

2. Develop an understanding of how statistical concepts in middle grades build on the content developed in elementary grades, provide a foundation for the content in high school, and are connected to other subject areas, including mathematics, in middle grades.

3. Develop pedagogical content knowledge necessary for effective teaching of statistics. Pre-service and practicing teachers should be familiar with common student conceptions, content-specific teaching strategies, strategies for assessing statistical knowledge, and appropriate integration of technology for developing statistical concepts.

In addition to those topics covered in elementary-teacher preparation, middle-school teacher preparation should include at a minimum the following topics:

Formulate Questions

• Distinguish between questions that require a statistical investigation and those that do not

• Translate a “research” question into a question that can be answered with data and addressed through a statistical investigation (e.g., see Scenario 1 of Appendix 1)

Collect Data

• Identify appropriate variables for addressing a statistical question

• Distinguish between categorical and quantitative variables

• Recognize quantitative data may be either discrete (for example, counts, such as the number of pets a student has) or continuous (measurements, such as the height or weight of a student)

• Design a plan for collecting data

º Distinguish between observational studies and comparative experiments

º Use random selection in the design of a sampling plan

º Use random assignment in the design of a comparative experiment

º Recognize the connections between study design and interpretation of results; consider issues such as bias, confounding, and scope of inference

Analyze Data

• Understand that a data distribution describes the variability present in data

º Use appropriate tabular and graphical representations and summaries (frequencies, relative frequencies, and the mode) of the distribution for categorical data

º Use appropriate graphical representations and numerical summaries of the distribution for quantitative data; summarize by describing patterns in the variability (shape, center, and spread) and identifying values not fitting the overall pattern (outliers)

º Recognize when a normal distribution might be an appropriate model for a data distribution

º Recognize when a skewed distribution might be an appropriate model of a data distribution and understand the effects of skewness on measures of center and spread


• Use distributional reasoning strategies to compare two or more groups based on categorical data

º Compare modal categories

º Compare proportions within each category

• Use distributional reasoning strategies to compare two groups based on quantitative data

º Compare shapes, centers, and variability

º Identify areas of overlap and separation between the two distributions

º Understand how variability within groups affects comparisons between groups

• Explore and analyze patterns of association between two variables

º Distinguish between explanatory and response variables

º Summarize and interpret data on two categorical variables in a two-way table

º Summarize and interpret data on two quantitative variables in a scatterplot

º Use linear functions to model the association between two quantitative variables when appropriate

º Use a linear model to make predictions

º Use correlation to measure the strength of a linear association between two quantitative variables

º Identify nonlinear relationships (e.g., power or exponential) between two quantitative variables

Interpret Results

• Understand that one goal of statistical inference is to generalize results from a sample to some larger population

• Distinguish between population parameters and sample statistics

• Draw conclusions that are appropriate for the manner in which the data are collected

º Recognize that generalization from a sample requires random selection

º Recognize that statements about causation require random assignment

• Understand that random sampling from a population or random assignment in an experiment links the mathematical areas of statistics and probability

• Understand probability from a relative frequency perspective

º Use simulation models to explore the long-run relative frequency of outcomes

º Use the addition rule to calculate the probability of the union of disjoint events and the multiplication rule to calculate the probability of the intersection of independent events

• Use simulation to explore, describe, and summarize the sample-to-sample variability (the sampling distribution) of a statistic

• Understand inferential reasoning through randomization and simulation to determine whether observed results are statistically significant

• Use simulation to develop a margin of error and explore the relationship between sample size and margin of error

• Use the normal distribution as appropriate to model distributions of sample statistics

As middle-school teachers develop statistical content knowledge, it is critical that they recognize the vertical connections of statistical topics across grade levels, the horizontal connections across the mathematics curriculum, and connections to other subject areas. Statistics in the middle grades builds on the foundational experiences students have in the elementary grades and must strengthen and expand this foundation in preparation for students’ experiences with statistics in high school. Many statistical concepts developed in middle school are useful for reinforcing other areas of mathematical content within the middle-grades curriculum, including probability, measurement, number and operations, algebraic concepts, and linear functions. Additionally, most applications of statistics are in areas other than mathematics (e.g., the sciences and social sciences), which provides students with opportunities to see connections between the mathematical sciences and other areas of study. A thorough discussion of vertical and horizontal statistical connections, along with connections across the middle-grades curriculum, is provided in Developing Essential Understandings of Statistics, Grades 6–8 (Kader and Jacobbe, 2013, pages 81–90).

In addition to content knowledge, the preparation of middle-grade teachers should develop the pedagogical knowledge necessary for effective teaching of statistics. Teachers should be introduced to common misunderstandings students have regarding statistical and probabilistic concepts and learn to use appropriate content-specific teaching strategies to address them. Some of the contexts in which common misunderstandings occur include the following:

• The interpretation of graphical displays and tabular summaries of data

• The importance of random selection for obtaining a representative sample

• The notion of a sampling distribution

The research related to some of these misunderstandings is discussed in Chapter 8. Examples related to common misunderstandings are contained in Appendix 1.

The role of technology in learning statistics also must be an important aspect of middle-school teacher preparation. Teachers need to be comfortable using technology to aid in the collection, exploration, analysis, and interpretation of data, as well as to develop concepts. While we do not recommend a specific technology, the technology/technologies chosen should have the capability to create dynamic graphical displays, produce numerical summaries of data, and perform simulations easily.

As students’ understanding of statistical concepts evolves, it is important that teachers learn the value of formative assessment. As noted in Chapter 7, writing statistical assessments is particularly difficult because it requires disentangling mathematical ideas and rote computational exercises from statistical thinking. Statistical assessments should emphasize conceptual understanding and interpretation over the application of formulas or algorithmic thinking.

Recommendations for Prospective and Practicing Teachers

Many of the topics described for the preparation of middle-school teachers are included in the traditional introductory college-level statistics course. However, this course alone is not adequate for preparing middle-school teachers to teach the statistical content for middle-school students proposed by reports such as GAISE, CCSSM, and PSSM. Often, introductory statistics courses pay little attention to formulating statistical questions and give perfunctory attention to exploring and analyzing data. These courses frequently provide an axiomatic approach to probability, stressing the rules of probability instead of developing the concept of probability as a long-run relative frequency through simulation. Connections between statistics and probability are often ambiguous, and instead of focusing on statistical reasoning, inference is approached as a collection of rote procedures.

NCTM’s Developing Essential Understandings of Statistics for Teaching Mathematics in Grades 6–8 (2013) provides a set of recommendations for preparing middle-school teachers. This document describes four big ideas as a foundation for providing teachers with a deep understanding of the statistical content required to teach statistics in middle school:

Big Idea 1: Distributions describe variability in data.

Big Idea 2: Statistics can be used to compare two or more groups of data.

Big Idea 3: Bivariate distributions describe patterns or trends in the covariability in data on two variables.

Big Idea 4: Inferential statistics uses data in a sample selected from a population to describe features of the population.


The approach to developing statistical concepts in Essential Understandings is based on the notion that middle-grade teachers should experience the learning of statistical concepts in ways similar to those of their students.

Course Recommendations for Prospective Teachers

MET II (2012) recommends middle-school teachers take a course in statistics and probability beyond a modern technology-based introductory statistics course that includes topics on designing statistical studies, data analysis, and inferential reasoning.

In summary, this report recommends prospective middle-school statistics teachers acquire their statistical knowledge base through the following courses:

A first course in statistics that develops teachers’ statistical content knowledge in an experiential, active learning environment that focuses on the problem-solving process and makes clear connections between statistical reasoning and notions of probability.

A second course that focuses on strengthening teachers’ conceptual understandings of the big ideas from Essential Understandings and the statistical content of the middle-school curriculum. This course also is intended to develop teachers’ pedagogical content knowledge by providing strategies for teaching statistical concepts, integrating appropriate technology into their instruction, making connections across the curriculum, and assessing statistical understanding in middle-school students.

Both courses should give teachers opportunities to explore real problems that require them to do the following:

• Formulate statistical questions; design strategies for data collection and collect the data; explore, analyze, and summarize the data; and draw conclusions from the data

• Use dynamic statistical software or other modern technologies to aid in the collection, analysis, and interpretation of data and enhance their learning and understanding of both statistical and probabilistic concepts

Professional Development Recommendations for Practicing Teachers

Because of the new emphasis on statistics in the middle-grades mathematics curriculum, practicing middle-school teachers are in need of professional development. Consequently, it is critical that practicing teachers have opportunities for meaningful professional development. The content and pedagogy of the professional development should be similar to that previously described for pre-service teachers.

Illustrative Example

The following example illustrates the complete statistical problem-solving process at the level expected of a middle-school teacher. Additional examples are provided in Appendix 1.

Formulate Questions

Statistical investigations undertaken in elementary school are typically based on questions posed by the teacher that can be addressed using data collected within the classroom. In middle school, the focus expands beyond the classroom, and students begin to formulate their own questions. Because many investigations will be motivated by students’ interests, middle-school teachers must be skilled at constructing and refining statistical questions that can be addressed with data.

For example, suppose a student is planning a project for the school’s statistics poster competition. The student recently read that consumption of bottled water is on the rise and wondered whether people actually prefer bottled water to tap or if they could even tell the difference between the two. When asked for advice about how to conduct a study, the teacher suggested having individuals drink two cups of water: one cup with tap water and one cup with bottled water. For each trial, the bottled water would be the same brand and the tap water would be from the same source. Not knowing which cup contained which type of water, each participant would identify the cup he/she believed to be the bottled water. Thus, a statistical question that could be investigated would be:

Are people more likely than not to correctly identify the cup with bottled water?

Collect Data

Teachers must think carefully about how to collect data to address the above statistical question and how to record the data on participants. As the statistical question requires data on the categorical variable “whether or not the individual correctly identified the cup with bottled water,” each participant should be asked to identify the cup he/she believes to be the bottled water, and, based on the response, the student would record a value of “Correct” (C) or “Incorrect” (I).

In this illustration, the student asked 20 classmates from her school to participate in the study. Each participant was presented with two identical cups, each containing 2 ounces of water. Each participant drank the water from the cup on the right first and then drank the water from the cup on the left. Unknown to the participants, the cup on the right contained tap water for half the participants, and the cup on the right contained bottled water for the other half. Each participant identified which cup of water he/she considered to be the bottled water. Following are the resulting data: C, I, I, C, I, I, C, I, C, C, I, C, I, C, C, I, C, C, C, C.

Analyze Data

Data on a single categorical variable are often summarized in a frequency table and bar graph indicating the number of responses in each category. The frequency table and bar graph for the above data are displayed below:

Note that 12 of the 20 participants (60%) correctly identified the bottled water, which is more than half. This provides some evidence that people are more likely than not to distinguish bottled water from tap water. Note that the quantities “12” and “60%” are called statistics because they are computed from sample data.
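The tally behind these statistics can be reproduced directly from the 20 recorded responses. The following is a small sketch using the data listed above; the variable names are chosen for this example only.

```python
from collections import Counter

# Responses listed in the text: C = correct, I = incorrect.
responses = "C I I C I I C I C C I C I C C I C C C C".split()

counts = Counter(responses)
n = len(responses)
proportion_correct = counts["C"] / n

# Print a simple frequency table with relative frequencies.
for label in ("C", "I"):
    print(label, counts[label], f"{counts[label] / n:.0%}")
```

The count of correct responses (12) and the proportion (0.6) are the sample statistics that the rest of the example compares against what guessing alone would produce.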

Interpret Results

Although more than half the participants in the study correctly identified bottled water, it is still possible that participants could not tell the difference and were simply guessing. If the participants were randomly guessing, the probability of a participant selecting bottled water would be 0.5, and we would expect about 10 of the 20 participants to correctly identify bottled water. However, this doesn’t guarantee that exactly 10 people will be correct, because there would be random variation in the number correct from one group of 20 participants to another.

This is similar to the idea of flipping a fair coin 20 times. Although we expect to get 10 heads, we are not surprised if we get 9 or 11 heads. That is, there is random variation in the number of heads we get if a fair coin is tossed 20 times. Thus, to decide whether people can tell the difference between tap water and bottled water, we must determine whether the observed statistic, “12 out of 20” correctly identifying bottled water, is a likely outcome when students are guessing and their selections are completely random. This is an important question, often asked as part of this component of the statistical problem-solving process: “Is the observed statistic a likely (or unlikely) outcome from random variation if everyone is simply guessing?”

The answer to this question is at the heart of statistical reasoning. If the observed statistic is a likely outcome, then random variation provides a plausible (believable) explanation for the observed value of the statistic and we conclude people may be guessing. If the observed statistic is an unlikely outcome, then this suggests the observed value of the statistic is due to something other than just random variation. In this case, the difference between the observed and expected values of the statistic is said to be statistically significant and we would conclude that people are not guessing. That is, people are more likely than not to correctly identify the cup with bottled water.

FIGURE 1: Bottled Water Guesses. Bar graph of the frequency of selection (Correct, Incorrect), with the accompanying frequency table:

Selection    Frequency (Count)
Correct      12
Incorrect    8
Total        20

One way to address this question is to develop a model (a simulation model or a theoretical probability model) for exploring the long-run behavior of the statistic. For example, a simulation model for exploring the random variation in the statistic “the number that correctly select bottled water when participants are randomly guessing” would be to toss a fair coin 20 times. A coin-toss that results in a “head” corresponds to correctly identifying the bottled water. For each trial (20 tosses of the coin), record the number of heads. The dotplot summarizes the results for the “number of heads” from 100 trials of tossing a coin 20 times.

Based on the dotplot, getting 12 or more heads occurred in 19 of the 100 trials. So, if the coin is fair, the probability of getting 12 or more heads would be estimated at 0.19 based on this simulation. Thus, if participants cannot tell the difference and are randomly guessing, then 12 out of 20 people correctly identifying bottled water would not be a surprising outcome.
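The coin-toss model described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the applet or procedure used in the report: the function names `count_heads` and `estimate_tail_probability` and the seed value are assumptions introduced here. Because the tosses are random, a run of 100 trials will not necessarily reproduce the 19 out of 100 reported above.

```python
import random

def count_heads(n_tosses, rng):
    """Toss a fair coin n_tosses times and return the number of heads."""
    return sum(rng.random() < 0.5 for _ in range(n_tosses))

def estimate_tail_probability(observed, n_tosses=20, n_trials=100, seed=None):
    """Estimate P(number of heads >= observed) from n_trials simulated trials.

    Each trial is one set of n_tosses tosses, standing in for one group of
    participants who are all guessing.
    """
    rng = random.Random(seed)
    results = [count_heads(n_tosses, rng) for _ in range(n_trials)]
    at_least = sum(1 for heads in results if heads >= observed)
    return at_least / n_trials, results

if __name__ == "__main__":
    # Hypothetical seed, chosen only to make the run repeatable.
    p_hat, results = estimate_tail_probability(observed=12, n_trials=100, seed=1)
    print(f"Estimated P(12 or more correct | guessing) = {p_hat:.2f}")
```

Raising `n_trials` (for example to 10,000) makes the estimate more stable from run to run, which is the reason for the larger applet simulation discussed next.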

Applets for performing a simulation such as this are widely available (e.g., www.rossmanchance.com/applets). Using an applet, 10,000 repetitions of tossing a fair coin 20 times yielded the following dotplot (Figure 3) for the number of heads from each repetition. In the simulation, only about 4% of the 10,000 trials resulted in a number of heads that differed from the expected value by more than 4 heads. So, when flipping a fair coin 20 times, the probability of getting between 6 and 14 heads (inclusive) would be estimated to be 0.96. Thus, we can be fairly confident that the statistic (observed number of heads) will be within 4 of the expected number of heads (10). This value (±4), called the margin of error, tells us how much the statistic is likely to differ from the parameter due to random variation.

FIGURE 3: Result of 10,000 Simulations (dotplot of number of heads, horizontal axis from 2 to 18, with outcomes of 5 or fewer and 15 or more heads marked)

As with the previous simulation, obtaining 12 heads (12 participants correctly identifying bottled water) is not a surprising outcome. Thus, because 12 out of 20 appears to be a likely outcome when the selection is random, the evidence against guessing is not very strong. Therefore, it is plausible that participants could not tell the difference between bottled water and tap water and were guessing which cup contained the bottled water.

Note that the statistical preparation of middle-school teachers may include a more structured approach to solving this problem. This approach would consist of translating the statistical question into statements of the null and alternative hypotheses, estimating the p-value from the simulation, and using the p-value to describe the strength of the evidence against the hypothesis that students are guessing. Also, this example could be expanded easily to one appropriate for preparing a high-school teacher. This expansion would include using the binomial probability distribution as a mathematical model for describing the random variation in the number of heads out of 20 tosses and determining the exact p-value associated with the observed statistic.
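As a rough illustration of the high-school extension mentioned above, the exact probabilities can be computed from the binomial distribution with 20 trials and success probability 0.5. The helper names `binomial_pmf` and `prob_between` are assumptions made for this sketch; only the standard-library function `math.comb` is used.

```python
from math import comb

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)

def prob_between(lo, hi, n, p):
    """P(lo <= X <= hi), endpoints included."""
    return sum(binomial_pmf(k, n, p) for k in range(lo, hi + 1))

n, p = 20, 0.5                         # 20 participants, each guessing
p_value = prob_between(12, n, n, p)    # exact P(12 or more correct)
within_4 = prob_between(6, 14, n, p)   # exact P(within 4 of the expected 10)

print(f"Exact P(X >= 12)      = {p_value:.4f}")
print(f"Exact P(6 <= X <= 14) = {within_4:.4f}")
```

The exact values are about 0.25 and 0.96. The second agrees with the estimate from the 10,000 simulated repetitions; the first is somewhat larger than the 0.19 estimated from only 100 trials, a reminder that small simulations give rough estimates.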

References

Franklin, C., et al. (2007). Guidelines and Assessment for Instruction in Statistics Education (GAISE) Report: A PreK–12 Curriculum Framework. Alexandria, VA: American Statistical Association.

Kader, G., and Jacobbe, T. (2013). Developing Essential Understanding of Statistics for Teaching Mathematics in Grades 6–8. Reston, VA: NCTM.

National Council of Teachers of Mathematics (NCTM) (1989). Curriculum and Evaluation Standards for School Mathematics. Reston, VA: NCTM.

Common Core State Standards Initiative (2010). Common Core State Standards for Mathematics. Common Core State Standards (College- and Career-Readiness Standards and K–12 Standards in English Language Arts and Math). Washington, DC: National Governors Association Center for Best Practices and the Council of Chief State School Officers. Retrieved from www.corestandards.org.

National Council of Teachers of Mathematics (NCTM) (2000). Principles and Standards for School Mathematics. Reston, VA: NCTM.

National Council of Teachers of Mathematics (NCTM) (2006). Curriculum Focal Points. Reston, VA: NCTM.

FIGURE 2: Result of 100 Simulations (dotplot of number of heads, horizontal axis from 2 to 16)

CHAPTER 6: Preparing High-School Teachers to Teach Statistics

Expectations for High-School Students

Statistical concepts in high school tend to be scattered throughout the curriculum, although it is increasingly common to find high schools offering a stand-alone statistics course in addition to an Advanced Placement (AP) statistics course. Statistical concepts appear in other mathematics courses, as well (e.g., regression is often discussed in algebra and geometry courses while discussing equations of lines).

State standards and nationally distributed standards documents have increasingly emphasized statistics content at the high-school level over at least the past 40 years. Secondary teachers are thus required to teach a substantial amount of statistics by integrating it into mathematics courses and/or teaching designated stand-alone courses. Because of this, high-school teacher preparation needs to not only prepare teachers to teach statistics content, but also illustrate to teachers how the concepts are related and interwoven with mathematics.

One of the main areas in which this interweaving of statistics and mathematics is essential and explicit is modeling, which is becoming an important feature of the high-school curriculum. Modeling generally involves finding equations or mathematical systems that represent possible relationships among variables. If the variables produce data, then the modeling process must account for variation in the data and, thus, becomes statistical in nature.

High-school teachers should have experience modeling real-world situations, many of which begin with messy data sets that have to be “cleaned” (for example, by dealing with missing data and inaccurately recorded data) before any modeling is appropriate. Such key features of data analysis must be conveyed to students so they see the uses and misuses of statistical models, especially important in this age of Big Data. Teachers and students alike should come to appreciate the wisdom of statistician George Box (1987) in his famous dictum, “All models are wrong; some models are useful.”

Recommended standards in statistics and probability for high-school students from GAISE, the CCSSM, NCTM, the College Board, and many state guidelines generally cover the following topics:

• Explore, summarize, and interpret univariate data, categorical and quantitative, including the normal model for data distributions

• Explore, summarize, and interpret bivariate categorical data based on two-way tables of frequencies and relative frequencies

• Explore bivariate quantitative data by way of scatterplots

• Construct and interpret simple linear models for bivariate quantitative data

• Understand the role of randomization in designing studies and as the basis for statistical inference

• Understand the rules of probability, with emphasis on conditional probability, and using these rules in practical decision-making (e.g., knowing how to interpret risk)

• Model relationships among variables

Building on the spirit of statistics teaching and learning in the middle grades, these topics should be introduced from a data analytic perspective with real-world data and simulation of random processes being prime instructional vehicles.

Essentials of Teacher Preparation

High-school teachers should develop an understanding of statistical reasoning from a data and simulation perspective and an appreciation for the effectiveness of such an approach in teaching and learning the basic tenets of statistics. As in the middle and elementary grades, this approach to statistics is built around four components of the problem-solving process of formulating questions, collecting data, analyzing data, and interpreting results, with emphasis on the omnipresence of variability and the quantification of uncertainty as a necessary component of making valid conclusions.

The primary goals of statistical preparation for high-school teachers are three-fold:

1. Develop the necessary statistical reasoning skills along with the content knowledge in statistics beyond the typical introductory college course. Statistical topics should be developed through meaningful experiences with the statistical problem-solving process.

2. Develop an understanding of how statistical concepts develop throughout PreK–8 and how they connect to high-school statistics content, as well as develop an understanding of how statistical concepts are related, or not related, to mathematical topics.

3. Develop pedagogical content knowledge necessary for effective teaching of statistics. Pre-service and practicing teachers should be familiar with common student conceptions, content-specific teaching strategies, strategies for assessing statistical knowledge, and appropriate integration of technology for developing statistical concepts.

In meeting these goals, preparation programs should pay particular attention to common misconceptions that students may have and discuss strategies and examples to address these misconceptions. Additional emphasis on technology use for high-school teachers is also an important aspect of teacher preparation. Teachers not only need to be well versed in using dynamic statistical software to solve and understand problems, but they also need to feel comfortable teaching a statistical concept using technology as a tool.

Topics in data-driven statistical reasoning for high-school teachers should include at least the following in addition to those covered in elementary- and middle-school teacher preparation.

Formulate Questions

• Recognize questions that require a statistical investigation versus those that do not

• Develop statistical questions that help to focus a real issue (research question) on components that can be measured

Collect Data

• Recognize appropriate data for answering the posed statistical question

º Distinguish between categorical and quantitative variables

º Recognize that quantitative data may be either discrete (counts, such as the number of females in a class) or continuous (measurements, such as time or weight)

• Understand the role of random selection in sample surveys and the effect of sample size on the variability of estimates

• Understand the role of random assignment in experiments and its implications for cause-and-effect interpretations

• Understand the issues of bias and confounding in nonrandomized observational studies and their implications for interpretation

Analyze Data

• Explore univariate data, both categorical and quantitative

º Recognize situations for which the normal distribution might be an appropriate model for quantitative data distributions

º Recognize situations in which the data distributions tend to be skewed, and how the skewness affects measures of center and spread

º Compare multiple univariate data sets, numerically and graphically

• Explore bivariate data, both categorical and quantitative

º Describe patterns of association as seen in two-way tables

º Describe patterns of association as seen in scatterplots


º Describe patterns of association between a categorical and a quantitative variable

º Construct and describe simple linear regression models and explain correlation

• Model rich real-world problems

º Describe patterns of association as seen in multiple pairwise scatterplots

º Fit and interpret multiple regression models including both numerical and categorical explanatory variables

º Fit and interpret exponential and power models

º Fit and interpret logistic regression models

Interpret Results

• Understand basic probability from a relative frequency perspective

º Understand additive and multiplicative rules

º Understand conditional probability and independence

º See the explicit connection between conditional probability and independence in two-way tables

º See the explicit connection of probability to statistical inference and p-values

• Understand inferential reasoning through randomization and simulation

º Conduct tests of significance and approximate p-values

º Estimate population parameters and approximate margins of error

• Infer from small samples based on the binomial and hypergeometric distributions, calculating exact probabilities of possible outcomes

• Infer from large samples (using both confidence intervals and significance tests, as appropriate) for means and proportions based on the normal distribution of sample means and sample proportions

• Infer using the chi-square statistic for bivariate categorical data

Some of these topics in question formulation, data exploration, and informal inferential reasoning begin in middle-school curricula; however, they are further developed in the high-school setting with a view toward extending their use to new and deeper concepts such as study design, the normal distribution, standard deviation, correlation, and formal inference procedures based on sampling distributions. It is important to note that the probability topics listed here are the ones that are critical for understanding the statistical reasoning process, and are not intended to provide a full complement of probability topics that might be taught in a mathematics course on that subject. In fact, as will be expanded on below, many so-called statistics courses suffer from too much emphasis on probability.

Recommendations for Prospective and Practicing Teachers

Program Recommendations for Prospective High-School Teachers

For this data-analytic and randomization approach to teaching statistics, a traditional formula-oriented introductory statistics course is not appropriate for prospective teachers, because it emphasizes learning a set list of procedures over understanding statistical reasoning. Neither is the standard calculus-based introductory statistics and probability course designed to serve engineering and science majors in many institutions appropriate, because such courses tend to overemphasize probability theory and present a more theoretical development of statistical methods. The GAISE College Report (www.amstat.org/education/gaise) provides an excellent set of recommendations for an introductory statistics course (or courses) aimed toward statistical reasoning (GAISE Report Executive Summary, p. 2):

1. Emphasize statistical literacy and develop statistical thinking.


2. Use real data.

3. Stress conceptual understanding, rather than mere knowledge of procedures.

4. Foster active learning in the classroom.

5. Use technology for developing conceptual understanding and analyzing data.

6. Use assessments to improve and evaluate student learning.

The report also provides informed advice about how these recommendations can be realized, along with outcomes that are essential for statistical literacy.

Such outcomes include believing and understanding that:

• Statistics begins with a question to investigate

• Data beat anecdotes

• Variability is natural, predictable, and quantifiable

• Association is not causation

• Statistical significance does not necessarily imply practical importance, especially for studies with large sample sizes

In summary, this report recommends that prospective high-school teachers of statistics acquire their knowledge base through the following courses:

1. An introductory course that emphasizes a modern data-analytic approach to statistical thinking, a simulation-based introduction to inference using appropriate technologies, and an introduction to formal inference (confidence intervals and tests of significance)

2. A second course in statistical methods that builds on the first and includes both randomization and classical procedures for comparing two parameters based on both independent and dependent samples (small and large), the basic principles of the design and analysis of sample surveys and experiments, inference in the simple linear regression model, and tests of independence/homogeneity for categorical data

3. A statistical modeling course based on multiple regression techniques, including both categorical and numerical explanatory variables, exponential and power models (through data transformations), models for analyzing designed experiments, and logistic regression models

Each of the above courses should include the use of statistical software, provide multiple experiences with analyzing real data, and emphasize the communication of statistical results both orally and in writing.

Ideally, each of the first two courses should be taught with a pedagogical component for future teachers demonstrating effective methodologies for developing the subtle reasoning of statistics in students.

While a modern theory-based mathematical statistics course is appropriate for high-school teachers of the subject, especially for prospective teachers of AP Statistics, it is strongly recommended that it not be the only course exposing teachers to statistics in their curriculum. A theoretical course of this type should be taken after teachers develop an understanding of and appreciation for basic statistical reasoning experienced from an empirical perspective and have some experience with statistical modeling.

Professional Development Programs for High-School Teachers

Because of the new emphasis on statistics in the curriculum, high-school teachers currently teaching are in need of professional development opportunities that highlight the content and approach outlined above. In general, any professional development in statistics should have teachers do the following:

• Use real data in an active learning environment

• Use dynamic statistical software or other modern appropriate technology

• Learn basic statistical concepts using randomization and simulation approaches

• Discuss potential student misunderstandings around each topic

• Understand how to use formative assessment effectively

Illustrative Example

The following example illustrates the complete statistical problem-solving process at the level expected of a high-school teacher. Additional examples are provided in Appendix 1.

Scenario

A student interested in the texting phenomenon among high-school students wants to study how many texts students in her school receive and send in a typical day. Encouraged to think a little deeper, though, the student decides she really wants to know more than, say, the average number of texts received and sent because she believes students tend to send fewer texts than they receive. Upon hearing of this study, a friend adds a second idea: “I’ll bet texting time cuts into homework time for students.”

Formulate Questions

By high school, students should be able to describe and develop their own statistical investigations, refining a general investigative idea into one or more clear statistical questions that can be answered through appropriately collected and analyzed data. High-school teachers must facilitate discussion of this key process so students see why and how a sound statistical analysis depends on good questions.

After some discussion of the first ideas with her teacher, the student decides on the following question:

What is the relationship between number of texts received and number of texts sent for students in my high school?

Refining the second idea a bit, they come up with a second statistical question to investigate:

What is the relationship between hours spent on homework per week and hours spent on texting per week for students in our high school?

Collect Data

As always, good answers to these questions depend on getting good data from students on the number of texts received and sent on a typical day. Teachers must be prepared to help students design an appropriate data-collection procedure and carry out the study. In carrying out this study, teachers must guide students to think carefully about how to pose the survey questions to the participants and record the data collected in a manner that will facilitate the analyses.

Because of the large size and complexity of the student body and the limited time frame for the study, it was not feasible to ask each student in the school about their texting habits. So, the students, perhaps guided by discussion with the teacher, decided to design the study as a sample survey, taking a random sample of students from the school roster. They determined that time constraints would allow them to locate and interview about 40 students on the day set aside for data collection. They designed the survey to ask:

How many texts did you receive yesterday?

How many texts did you send yesterday?

How many hours do you spend texting in a typical week?

How many hours do you spend on homework in a typical week?


Analyze Data

Suppose the data generated by the student survey were the following:

TABLE 1

Gender | Text Messages Sent Yesterday | Text Messages Received Yesterday | Homework Hours (week) | Messaging Hours (week)
Female | 500 | 432 | 7 | 30
Female | 120 | 42 | 18 | 3
Male | 300 | 284 | 8 | 45
Female | 30 | 78 | 3 | 8
Female | 45 | 137 | 12 | 80
Male | 0 | 93 | 5 | 0
Male | 52 | 75 | 15 | 6
Male | 200 | 293 | 14 | 10
Male | 100 | 145 | 10 | 2
Female | 300 | 262 | 3 | 83
Male | 29 | 82 | 7 | 4
Male | 0 | 80 | 2 | 3
Male | 30 | 99 | 15 | 5
Male | 0 | 74 | 3 | 0.5
Male | 0 | 17 | 28 | 0
Male | 10 | 107 | 10 | 6
Female | 10 | 101 | 9 | 3
Female | 150 | 117 | 6 | 100
Female | 25 | 124 | 4 | 4
Male | 1 | 101 | 7 | 1
Female | 34 | 102 | 25 | 5
Male | 23 | 83 | 7 | 10
Male | 20 | 118 | 1 | 1
Male | 319 | 296 | 12 | 12
Female | 0 | 87 | 5 | 1
Female | 30 | 100 | 3 | 70
Female | 30 | 107 | 20 | 30
Female | 0 | 8 | 9 | 0.2
Female | 100 | 160 | 1 | 60
Male | 20 | 111 | 1 | 2
Female | 200 | 129 | 3 | 30
Male | 25 | 101 | 18 | 6
Female | 50 | 56 | 1 | 2
Female | 30 | 117 | 15 | 2
Male | 50 | 76 | 7 | 23
Male | 40 | 60 | 6 | 10
Female | 160 | 249 | 5 | 10
Male | 6 | 96 | 8 | 2
Male | 150 | 163 | 20 | 25
Female | 200 | 270 | 10 | 30

Source: www.amstat.org/censusatschool


Teacher preparation should include rich discussions about survey design and allow teachers time to carry out surveys, covering all four steps of the statistical reasoning process.

In looking for possible association between two numerical variables, it is best to begin with a scatterplot. Teachers should understand why scatterplots are the best choice to display the possible association between two numerical variables and be exposed to examining scatterplots that show strong association between variables as well as those that display weak associations between variables. In this context, teachers can discuss the difference between models that account for variation and models that are deterministic, bridging the concepts of finding the equation of a line and fitting a line to data with variation.

The plot (Figure 1) shows a strongly positive linear trend with a least-squares regression line having slope close to 1 and a negative y-intercept, with an outlying point. The points show fairly even and relatively small variability around the line, with a correlation coefficient of about 0.9 (as calculated using statistical software). The plot also shows some curvature in the data, suggesting the relationship could be investigated with a more detailed model. Departures from a linear pattern can be seen more dramatically by studying the vertical differences (residuals) between the y-values predicted from the line and the actual data values. When the residuals are plotted against the original x-values, the shape of the plot shows some curvature, indicating a curved line (perhaps quadratic or exponential) would fit the data somewhat better than the straight line.

FIGURE 1: Text Messages Sent vs. Received (scatterplot of number of text messages sent, vertical axis from 0 to 500, against number of text messages received, horizontal axis from 0 to 400)

The least-squares regression equation for predicting Sent from Received is the dark green line: Sent = -65 + 1.14 Received

The dashed line is the equation Sent = Received.
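The least-squares fit and the residuals discussed above can be reproduced from the Table 1 data with a short script. This is a sketch using the textbook formulas rather than the statistical software the report refers to; the names `DATA` and `least_squares` are assumptions introduced here, and the pairs are (received, sent) in Table 1 row order.

```python
# (received, sent) pairs, one per student, in Table 1 row order
DATA = [
    (432, 500), (42, 120), (284, 300), (78, 30), (137, 45), (93, 0),
    (75, 52), (293, 200), (145, 100), (262, 300), (82, 29), (80, 0),
    (99, 30), (74, 0), (17, 0), (107, 10), (101, 10), (117, 150),
    (124, 25), (101, 1), (102, 34), (83, 23), (118, 20), (296, 319),
    (87, 0), (100, 30), (107, 30), (8, 0), (160, 100), (111, 20),
    (129, 200), (101, 25), (56, 50), (117, 30), (76, 50), (60, 40),
    (249, 160), (96, 6), (163, 150), (270, 200),
]

def least_squares(pairs):
    """Return (slope, intercept, correlation) for predicting y from x."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / (sxx * syy) ** 0.5
    return slope, intercept, r

slope, intercept, r = least_squares(DATA)

# Residual = actual Sent minus Sent predicted from the line
residuals = [y - (intercept + slope * x) for x, y in DATA]

print(f"Sent = {intercept:.0f} + {slope:.2f} Received, r = {r:.2f}")
```

Plotting `residuals` against the received counts gives the residual plot described in the text; the same function can be applied to the homework and messaging columns for the second question.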


The scatterplot of text messaging hours versus homework hours shows a very weak negative trend influenced by a few large values in text messaging hours, with uneven variation around the line.

FIGURE 2: Residuals vs. Text Messages Received (residuals, vertical axis from -100 to 150, plotted against number of text messages received, horizontal axis from 0 to 400)

FIGURE 3: Hours Doing Homework vs. Hours Text Messaging (scatterplot with regression line; hours spent doing homework, vertical axis from 0 to 25, against hours spent text messaging, horizontal axis from 0 to 100)

Interpret Results

As to the relationship between texts received and sent, there is a strong, positive linear association, as shown by evenly spread and relatively small residuals and a high correlation coefficient.

The plot shows a large cluster of points below the line between 60 and 120 texts received. The effect of this cluster is to pull the line toward them and thus increase the slope of the line. The effect of the extreme value at the upper right is to pull that end of the line upward, thus further increasing the slope.

If students sent and received the same number of text messages, the data would lie perfectly on a line through the origin with slope 1 (the dashed line on the scatterplot). The regression line has slope close to 1, but its y-intercept is -65, so the line lies slightly below the Sent = Received line. Thus, if the regression line were used as a prediction equation, the predicted sent messages would be less than the received messages for the range of data seen here. This fact, plus the preponderance of points below either line, provides some evidence in support of the belief that students tend to send fewer texts than they receive. The evidence based on the regression line would be even stronger if the point on the right was discovered to be in error and removed from the data set. Teachers should be pushed to think about the effects of different points on the estimated equation, ensuring they go beyond merely interpreting the slope and y-intercept and begin to think about the effects of the data on the model.

The least-squares regression equation for predicting Homework from Messaging is: Homework = 9.8 - 0.04 Messaging

A more basic question of statistical inference is, “Could the observed positive slope of this regression line have occurred simply by chance?” If, in fact, there is no relationship between the two variables, then the observed pairings of the messages received and sent can be regarded simply as random occurrences. The question, then, is whether random pairings of these data could have produced the observed slope as a reasonably likely outcome. This question can be answered by simulating a distribution of slopes from random pairings of the observed data. Such a distribution of slopes from 500 randomizations (Figure 4) does not produce a single slope near the observed 1.14 (the largest is close to 0.8), which allows us to conclude that the observed positive slope cannot be explained by chance alone.
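The randomization idea can be sketched as follows: keep the received counts fixed, shuffle the sent counts so the pairings are random, and recompute the slope each time. This is an illustrative sketch, not the software used for the report; the names `RECEIVED`, `SENT`, `slope`, and `permutation_slopes` and the seed are assumptions, and a different set of 500 shuffles will give a somewhat different histogram.

```python
import random

# Table 1 columns, in row order
RECEIVED = [432, 42, 284, 78, 137, 93, 75, 293, 145, 262,
            82, 80, 99, 74, 17, 107, 101, 117, 124, 101,
            102, 83, 118, 296, 87, 100, 107, 8, 160, 111,
            129, 101, 56, 117, 76, 60, 249, 96, 163, 270]
SENT = [500, 120, 300, 30, 45, 0, 52, 200, 100, 300,
        29, 0, 30, 0, 0, 10, 10, 150, 25, 1,
        34, 23, 20, 319, 0, 30, 30, 0, 100, 20,
        200, 25, 50, 30, 50, 40, 160, 6, 150, 200]

def slope(xs, ys):
    """Least-squares slope for predicting ys from xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx

def permutation_slopes(xs, ys, n_shuffles=500, seed=None):
    """Slopes obtained after randomly re-pairing ys with xs."""
    rng = random.Random(seed)
    shuffled = list(ys)
    slopes = []
    for _ in range(n_shuffles):
        rng.shuffle(shuffled)          # break any real pairing
        slopes.append(slope(xs, shuffled))
    return slopes

observed = slope(RECEIVED, SENT)
slopes = permutation_slopes(RECEIVED, SENT, n_shuffles=500, seed=2016)
n_as_large = sum(1 for s in slopes if s >= observed)

print(f"Observed slope: {observed:.2f}")
print(f"Largest shuffled slope: {max(slopes):.2f}")
print(f"Shuffled slopes at least as large as observed: {n_as_large} of {len(slopes)}")
```

The proportion `n_as_large / len(slopes)` serves as an approximate p-value for the hypothesis of no association.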

More formally, one can test the hypothesis that the true slope of the regression line is zero (no positive linear association) by conducting a t-test. The t-value is 12.2, indicating that if the true slope is 0, the sample slope of 1.14 is 12.2 standard errors above 0. The chance of having a slope of 1.14 or more given the true slope is 0 (the p-value) is about 0.0001 (very small!). Thus, there is strong evidence to reject the hypothesis that the true slope of the regression line is zero and conclude there is a statistically significant positive relationship between texts sent and texts received. High-school teachers should be able to connect the simulation outcomes to the hypothesis test outcomes and show sound understanding of the information that each is giving about the significance of the relationship between the x variable and y variable in the model.
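For reference, the t-statistic quoted above has the standard form below. The notation ($b_1$ for the sample slope, $e_i$ for the residuals, $s$ for the residual standard deviation) is assumed here and does not appear in the report.

```latex
\[
t = \frac{b_1 - 0}{SE(b_1)}, \qquad
SE(b_1) = \frac{s}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2}}, \qquad
s = \sqrt{\frac{\sum_{i=1}^{n} e_i^2}{n - 2}}
\]
```

With $b_1 = 1.14$ and $t = 12.2$, the implied standard error of the slope is about $1.14 / 12.2 \approx 0.093$, and the statistic is compared to a t distribution with $n - 2 = 38$ degrees of freedom.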

As to the question about the relationship between text messaging hours and homework hours, the texting time appears to have little, if any, association with homework time, in large measure because so many of the homework hours are low, regardless of the texting time. The five data points with texting hours per week at 60 or above may raise suspicions of inaccuracy in the reported data, providing opportunity for the teacher to emphasize the importance of checking data for accuracy (part of “cleaning” the data). If these points were found to be in error and removed, the regression line would have a slope even closer to zero. In short, these sample data provide little evidence of association between texting hours and homework hours. In fact, using a t-test, the corresponding p-value for observing a sample slope of -0.04 or one more extreme if the true slope is equal to zero is 0.31 (relatively large!). It is plausible the true slope is zero, indicating no significant relationship between texting hours and homework hours.

Further study of these data could shed light on whether the patterns seen above persist for males and females separately.

FIGURE 4: Histogram of Slopes (frequency, vertical axis from 0 to 60, of slopes from the randomizations, horizontal axis from -0.6 to 0.8)


References

Box, G. E. P., and Draper, N. R. (1987). Empirical Model Building and Response Surfaces. New York, NY: John Wiley and Sons.

Franklin, C., et al. (2007). Guidelines and Assessment for Instruction in Statistics Education (GAISE) Report: A PreK–12 Curriculum Framework. Alexandria, VA: American Statistical Association.

Garfield, J., et al. (2007). Guidelines and Assessment for Instruction in Statistics Education (GAISE) College Report. Alexandria, VA: American Statistical Association.

Common Core State Standards Initiative (2010). Common Core State Standards for Mathematics. Common Core State Standards (College- and Career-Readiness Standards and K–12 Standards in English Language Arts and Math). Washington, DC: National Governors Association Center for Best Practices and the Council of Chief State School Officers. Retrieved from www.corestandards.org.

College Board (2006). College Board Standards for College Success: Mathematics and Statistics. New York, NY: College Board.

National Council of Teachers of Mathematics (NCTM) (1989). Curriculum and Evaluation Standards for School Mathematics. Reston, VA: NCTM.

National Council of Teachers of Mathematics (NCTM) (2000). Principles and Standards for School Mathematics. Reston, VA: NCTM.

National Council of Teachers of Mathematics (NCTM) (2006). Curriculum Focal Points. Reston, VA: NCTM.


By high school, students should be able to describe and develop their own statistical investigations, refining a general investigative idea into one or more clear statistical questions.

CHAPTER 7

Assessment

To promote development of teachers’ statistical knowledge and evaluate the effectiveness of instruction, it is important that teachers be assessed in a manner that is appropriately aligned with the objectives detailed in this report. Although there are different ways of viewing assessment (formative, summative, etc.), this chapter focuses only on presenting examples of assessment items intended to measure conceptual understanding at all stages of the statistical problem-solving process. These items take into consideration the distinction between mathematical and statistical thinking, noting that statistical thinking is inextricably linked to context and that variability plays a role in each component of the statistical problem-solving process.

Although this report focuses on the statistical education of teachers, many of the same issues apply when assessing either students’ or teachers’ understanding of statistics. Regardless of age, learners introduced to statistics encounter the same principal concepts. Thus, discussions of assessment in statistics are not specific to a particular age group (Gal and Garfield, 1997). Further, because teachers assess students in their own classrooms and are evaluated based on students’ performance on large-scale assessments, issues related to assessment of students are particularly relevant to teachers.

Despite calls from the statistics education community for greater emphasis on concepts (e.g., ASA, 2005; Cobb, 1992), large-scale assessment still predominantly assesses procedural competency. Many items classified as statistics items on current large-scale standardized assessments focus on rote computations and fail to assess statistical reasoning (Gal and Garfield, 1997). However, efforts are being made to develop and promote both large- and small-scale assessments that align with objectives central to the discipline of statistics.

For example, the stated goal of the Assessment Resource Tools for Improving Statistical Thinking (ARTIST) project [6] is “to help teachers assess statistical literacy, statistical reasoning, and statistical thinking of students in first courses of statistics.” The Levels of Conceptual Understanding in Statistics (LOCUS) project [7] has developed items and instruments that assess statistical understanding as articulated in the GAISE report. One goal of the project Broadening the Impact and Evaluating the Effectiveness of Randomization-Based Curricula for Introductory Statistics [8] is to facilitate assessment of introductory statistics courses to better understand student learning in “traditional” and randomization-based courses. The team has developed a pre-test and post-test composed of multiple-choice questions that assess conceptual understanding and student attitudes toward statistics; additionally, they have shared conceptual questions to be used over the course of the semester.

Examples of Assessment in Statistics

In this section, items from traditional large-scale assessments, which tend to assess procedural competency, will be contrasted with items from projects that model sound assessment in statistics. The discussion highlights items that emphasize conceptual understanding, statistical thinking, and the statistical problem-solving process.

Assessing Procedural Competency

The California Department of Education released several examples that illustrate the way statistics is assessed on the California Standards Test, including the Grade 7 Mathematics item [9] shown in Figure 1.

FIGURE 1

Jared scored the following number of points in his last 7 basketball games: 8, 21, 7, 15, 9, 15, and 2. What is the median number of points scored by Jared in these 7 games?

(A) 9

(B) 11

(C) 15

(D) 19

[6] Visit https://apps3.cehd.umn.edu/artist for more information about this project. This project is funded by the National Science Foundation (NSF CCLI-ASA-0206571).

[7] Visit locus.statisticseducation.org for additional sample items and more information about this project. This project is funded by the National Science Foundation (DRL-1118168).

[8] This project is funded by the National Science Foundation (DUE-1323210).

[9] Retrieved from www.cde.ca.gov/ta/tg/sr/css05rtq.asp on October 10, 2014.

Of the six sample items released that illustrate assessment of statistics, data analysis, and probability at grade 7, five items ask students to find a median. California is not alone in over-representing the median at the expense of other statistical topics. For example, the statistical items provided as examples on the Grade 8 Florida Comprehensive Achievement Test version 2.0 all involve students finding the median [10]. Furthermore, these items typically do not require conceptual understanding of the median and its role in the statistical problem-solving process, but instead emphasize computation. The item shown in Figure 1 does not require any understanding of why a median would be chosen as a measure of center or how the median might be useful in analyzing the basketball player’s performance.
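The purely computational nature of the Figure 1 item can be seen by working it directly. The short sketch below uses the seven point totals from the item; the observation that the remaining answer choices coincide with the mean, mode, and range of the same data comes from this calculation and is not stated in the item itself.

```python
import statistics

# Points scored in the 7 games listed in the Figure 1 item.
points = [8, 21, 7, 15, 9, 15, 2]

median_points = statistics.median(points)   # middle value of the sorted list
mean_points = statistics.mean(points)       # total points divided by 7
mode_points = statistics.mode(points)       # most frequent value
range_points = max(points) - min(points)    # largest minus smallest

print("sorted:", sorted(points))
print("median:", median_points)
print("mean:", mean_points, "mode:", mode_points, "range:", range_points)
```

Every one of these summaries is a mechanical calculation; none of them asks why the median, rather than another summary, would be chosen to describe the player's performance.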

Assessing Conceptual Understanding

Contrast the item in Figure 1 with an item from the LOCUS project, shown in Figure 2.

FIGURE 2

Carlton found data on the percent of area that is covered by water for each of the 50 states in the U.S. He made the dotplots below to compare the distributions for states that border an ocean and states that do not border an ocean.

[Dotplots: “States That Border the Ocean” and “States That Do Not Border the Ocean”; horizontal axis for each: Percent of Area Covered by Water, 0 to 50]

Which of the following is the best statistical reason for using the median and interquartile range (IQR), rather than the mean and standard deviation, to compare the centers and spreads of these distributions?

(A) The mean and standard deviation are more strongly influenced by outliers than the median and IQR.

(B) The median and IQR are easier to calculate than the mean and standard deviation.

(C) The two groups contain different numbers of states, so the standard deviation is not appropriate.

(D) The two distributions have the same shape.

The item in Figure 2 assesses understanding of numerical summary statistics; however, test-takers are not required to make any calculations. Instead, the item assesses the ability to identify the more appropriate numerical summaries for the data based on properties of the distributions being compared. Considering the shapes of the distributions displayed, test-takers must recognize that the mean and standard deviation are more strongly influenced by outliers. Thus, the median and IQR are the more appropriate numerical summaries for these data. Identifying the most appropriate summaries of data based on properties of the distributions requires deeper conceptual understanding of statistics than rote calculation.

Assessment of conceptual understanding is also important for more advanced statistical concepts. Figure 3 illustrates an item created by Tintle et al. as part of the NSF-funded project Broadening the Impact and Evaluating the Effectiveness of Randomization-Based Curricula for Introductory Statistics.

FIGURE 3

With movie viewing at home made so convenient by services such as Netflix, pay-per-view, and video-on-demand, do a majority of city residents now prefer watching movies at home rather than going to the theater? To investigate, a local high-school student, Lori, decides to conduct a poll of adult residents in her city. She selects a random sample of 100 adult residents from the city and gives each participant the choice between watching a movie at home or the same movie at the theater. She records how many choose to watch the movie at home.

After analyzing her data, Lori finds that significantly more than half of the sample (p-value 0.012) preferred to watch the movie at home. Which of the following is the most valid interpretation of Lori’s p-value of 0.012? (Circle only one.)

(A) A sample proportion as large as or larger than hers would rarely occur.

(B) A sample proportion as large as or larger than hers would rarely occur if the study had been conducted properly.

(C) A sample proportion as large as or larger than hers would rarely occur if 50% of adults in the population prefer to watch the movie at home.

(D) A sample proportion as large as or larger than hers would rarely occur if more than 50% of adults in the population prefer to watch the movie at home.

Instead of simply assessing the ability to calculate a p-value, this item requires conceptual understanding of statistical significance, the notion that results found to be statistically significant are unlikely to have occurred by chance alone. More specifically, the item in Figure 3 requires the test-taker to recognize that p-values are calculated under the assumption that the null hypothesis is true (in this case, under the assumption that 50% of adults in the population prefer to watch the movie at home).

Assessing Statistical Thinking

As discussed elsewhere in this report (Chapters 1 and 3), there are substantive differences between statistical thinking and mathematical thinking. In particular, statistical thinking recognizes the need for data, the importance of data production, and the omnipresence of variability (Wild and Pfannkuch, 1999). However, many assessment items that involve exploring data and data displays primarily measure mathematical thinking, not statistical thinking.

For example, the item [11] in Figure 4 is from the Praxis Series Middle School Mathematics Assessment, which is required for teacher licensure in more than 40 states and U.S. territories.

FIGURE 4

[Circle graph of the contents of a county’s trash, by weight: Paper 40%, Other 36%, Metals 9%, Plastics 8%, Glass 7%]

The graph above shows the distribution of the contents, by weight, of a county’s trash. If approximately 60 tons of trash consists of paper, approximately how many tons of trash consist of plastics?

(A) 24

(B) 20

(C) 15

(D) 12

[10] See http://fcat.fldoe.org/fcat2/fcatitem.asp.

[11] Retrieved from www.ets.org/praxis/prepare/materials/5169 on October 10, 2014.
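Returning to the item in Figure 3, the idea that a p-value is computed under the null hypothesis can be made concrete with a short calculation. The sketch below is only an illustration under stated assumptions: the item reports the sample size (100) and the p-value but not the observed count, so `observed_at_home = 62` is a hypothetical value, and `upper_tail_probability` is a helper written for this example.

```python
from math import comb

def upper_tail_probability(n, p, k):
    """P(X >= k) when X follows a Binomial(n, p) distribution."""
    return sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k, n + 1))

n = 100                 # sample size stated in the item
observed_at_home = 62   # hypothetical count; the item does not report it

# Probability of a result at least this large if exactly 50% prefer home (the null).
p_if_half = upper_tail_probability(n, 0.5, observed_at_home)

# The same tail probability if 60% of the population preferred home.
p_if_sixty_percent = upper_tail_probability(n, 0.6, observed_at_home)

print(f"P(X >= {observed_at_home} | p = 0.5) = {p_if_half:.4f}")
print(f"P(X >= {observed_at_home} | p = 0.6) = {p_if_sixty_percent:.4f}")
```

Under these assumptions the tail probability is small only when the population proportion is taken to be 50%; when the proportion is above 50%, a sample result this large is not rare. That contrast is what separates choice (C) from choice (D).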


Although data for Figure 4 are displayed in a circle graph, the item requires only mathematical thinking (use of percents or ratios to answer a deterministic question). The correct answer (D) can be calculated using the fact that the ratio of plastics to paper in the trash is 8% to 40% or 1 to 5, which is equivalent to a ratio of 12 tons to 60 tons. This solution does not require any consideration of why the data are of interest, how the data were produced, or how these sample percentages might compare with population percentages. Contrast the item shown in Figure 4 with the LOCUS item in Figure 5.

The item in Figure 5 requires statistical thinking. Test-takers are expected to make connections between the statistical question, how the data were collected, and how the results should be interpreted. In particular, the item requires test-takers to identify statistical questions that can and cannot be answered using data from a sample survey involving randomly selected participants. The data are appropriate for answering all questions presented except choice (B). Cause-and-effect conclusions are only appropriate based on data from experiments with random assignment.

Another important aspect of statistical thinking is a questioning attitude toward statistical claims, such as those presented in the media (Watson, 1997). Watson (1997) presented the following open-ended formative assessment item, which allows students and teachers to “demonstrate statistical understanding and questioning ability which would not be possible in a multiple-choice format” (p. 5). Figure 6 illustrates the relationship between a sample (poll conducted in Chicago) and a population (inference made about all U.S. high-school students), although these statistical terms are not explicitly mentioned.

Watson (1997) reports that some test-takers responded to the first question with criticism of the implications of the article’s claims, while others recognized the sample might not be representative of the population. The geographical cues in the second part of the question provide another opportunity to recognize the sampling issue. In general, assessment items using the media provide a means to assess statistical thinking as it occurs outside the classroom.

FIGURE 5

A 13-year study of 1,328 adults randomly selected from a population carefully monitored the personal habits and health conditions of participants. Personal habits included tobacco use and coffee consumption. Health conditions included incidence of stroke. Which of the following questions about this population CANNOT be answered using data from this study?

(A) Are coffee drinkers more likely to smoke than adults who do not drink coffee?

(B) Does coffee consumption cause a reduction in the incidence of stroke?

(C) Do coffee drinkers have fewer strokes than adults who do not drink coffee?

(D) What percentage of the population are coffee drinkers?

FIGURE 6

“About 6 in 10 United States high-school students say they could get a handgun if they wanted one, a third of them within an hour, a survey shows. The poll of 2,508 junior and senior high-school students in Chicago also found 15 percent had actually carried a handgun within the past 30 days, with 4 percent taking one to school.”

(A) Would you make any criticisms of the claims in this article?

(B) If you were a high-school teacher, would this report make you refuse a job offer somewhere else in the United States, say Colorado or Arizona? Why or why not?

Assessing the Statistical Problem-Solving Process

The items previously presented emphasize various components of the statistical problem-solving process: formulating questions, collecting data, analyzing data, and interpreting results. In practice, the components of the process are inter-related, so it is often expedient to use items that address more than one component. Free-response items can be useful for assessing statistical thinking across the statistical problem-solving process. For example, consider the LOCUS item [12] shown in Figure 7.

Parts (A) and (B) of the item shown in Figure 7 require test-takers to analyze data by comparing average mile times and variability in mile times for runners in two races. Specifically, test-takers should explain that the data presented in the histograms do not support Jaron and Sierra’s predictions: the mile times of runners in the 5K are actually more variable than the mile times of runners in the half-marathon, and on average, the mile times of runners in the half-marathon are shorter than the mile times of runners in the 5K. To receive full credit, responses must include explanations based on the graphical displays. Part (C) asks for an interpretation of results that requires consideration of how the data were collected and what statistical questions can be answered based on the data. Test-takers should recognize that the way people were “assigned” to one race or the other has implications for the conclusions that can be drawn. Because people chose which race to run (and that choice was likely based on running ability), we should not conclude that an individual person’s mile time would be less when that person runs a half-marathon than when he or she runs a 5K.

FIGURE 7

The city of Gainesville hosted two races last year on New Year’s Day. Individual runners chose to run either a 5K (3.1 miles) or a half-marathon (13.1 miles). One hundred thirty-four people ran in the 5K, and 224 people ran the half-marathon. The mile time, which is the average amount of time it takes a runner to run a mile, was calculated for each runner by dividing the time it took the runner to finish the race by the length of the race. The histograms below show the distributions of mile times (in minutes per mile) for the runners in the two races.

[Histograms: “Mile Times for 5K Runners” and “Mile Times for Half-Marathon Runners”; horizontal axis for each: Mile Times (minutes per mile), 4 to 22; vertical axis for each: Relative Frequency, 0.00 to 0.25]

(A) Jaron predicted that the mile times of runners in the 5K race would be more consistent than the mile times of runners in the half-marathon. Do these data support Jaron’s statement? Explain why or why not.

(B) Sierra predicted that, on average, the mile time for runners of the half-marathon would be greater than the mile time for runners of the 5K race. Do these data support Sierra’s statement? Explain why or why not.

(C) Recall that individual runners chose to run only one of the two races. Based on these data, is it reasonable to conclude that the mile time of a person would be less when that person runs a half-marathon than when he or she runs a 5K? Explain why or why not.

Even well-written free-response items are limited as assessments of the statistical problem-solving process because the research topic and/or structure of the analysis are provided in the item. An alternative way to emphasize the statistical problem-solving process in assessment is through projects that allow teachers to carry out the process from beginning to end. After choosing a research topic, teachers formulate a statistical question that anticipates variability, collect data appropriate for answering the question posed, analyze the data using graphical displays and other statistical methods, and interpret results in a manner appropriate for the data collected. Projects are an especially appropriate means of assessment for teachers, as they will be responsible for leading their students through the entire statistical problem-solving process.

[12] An article published in Statistics Teacher Network discusses test-taker responses to this item: www.amstat.org/education/stn/pdfs/stn83.pdf.

Implications for Statistical Education of Teachers

Assessment plays an important role in the statistical education of teachers. If assessments are well designed, they direct teachers’ focus to the central aspects of the course such as understanding key concepts, statistical thinking in light of variability, and carrying out the statistical problem-solving process. Further, as courses aimed at preparing teachers of statistics are being created or revamped in response to new standards, valid assessment tools are needed to evaluate the impact of instruction.

The assessments used by teacher educators are particularly important as they are likely to affect what teachers will value and how they will assess their own students. Because quality assessment of statistical content is comparable for teachers and students, instructors of pre-service and in-service teachers have the opportunity and responsibility to model effective assessment.

Finally, because teacher evaluation systems in many states are based on student performance on standardized assessments, assessments naturally influence classroom instruction. Thus, it is critical that large-scale assessments evaluate statistics in a manner aligned with the values of the discipline and the objectives articulated in K–12 standards. Teacher-educators and policy makers should advocate for instruments that provide valid and reliable measures of statistical understanding. These standardized assessments have the potential to reinforce or undermine the efforts of programs that prepare teachers of statistics.


CHAPTER 8

Overview of Research on the Teaching and Learning of Statistics in Schools

The prior chapters of this report articulate and outline specific recommendations about teacher preparation in statistics at the different grade levels. The discussion of the research on the teaching and learning of statistics presented in this chapter provides a starting point to inform those implementing the recommendations on how specific topics within the recommendations can be approached and taught.

Despite significant attention given to teacher education in mathematics (Ball, 1991; Ball and Bass, 2000; Franke et al., 2009; Hill and Ball, 2004), few research-based guidelines are in place concerning what teachers need to know to teach statistics effectively. Although the field is still emerging, some research does exist on student learning of statistics, teacher understanding of statistics, and teacher preparation in statistics. Furthermore, several expository pieces have been published highlighting ideas and advances in the field. The goal of this chapter is to present a brief overview of what is known and not known from the research on the teaching and learning of statistics in PreK–12. An example of a more complete review of the literature on research on statistics learning and reasoning can be found in Shaughnessy (2007).

Research on Differences Between Mathematical and Statistical Thinking and Reasoning

Research in statistics education has prompted growing recognition of the differences between mathematical thinking and statistical thinking (Groth, 2007; Hannigan, Gill, and Leavy, 2013). For example, statistical thinking involves recognition of the need for data, the importance of data production, and the omnipresence of variability. Various models of student learning in statistics have been constructed in the literature emphasizing the need to reason in the presence of variability (Ben-Zvi and Friedlander, 1997; Jones et al., 2004; Hoerl and Snee, 2001; Wild and Pfannkuch, 1999). The development of statistical thinking must begin with a problem one seeks to answer through the use of data. This process of sifting through data to answer a problem is analytical by nature and involves constant evaluation in relation to the question being answered (Wild and Pfannkuch, 1999).

Hannigan, Gill, and Leavy (2013) conducted a study of prospective mathematics teachers using the Comprehensive Assessment of Outcomes in a First Statistics (CAOS) course test and found that, despite the prospective teachers having strong mathematics abilities, their results were not significantly better than those of students from nonquantitative disciplines. Based on these results, the authors suggest “statistical thinking is different from mathematical thinking and that a strong background in mathematics does not necessarily translate to statistical thinking” (p. 446). They note that this finding has implications for teacher preparation, as it should not be assumed teachers can transfer their knowledge of mathematics to statistics in ways that will allow them to meet the increased expectations for teaching statistics.

Research on Student Statistical Learning

Several studies and expository articles have focused on the nature of students’ statistical thinking (Saldanha and Thomson, 2002; Mokros and Russell, 1995; Cobb, McClain, and Gravemeijer, 2003; delMas, 2004; Jones, Langrall, and Mooney, 2007). Such papers have identified topics and concepts that are difficult for students to learn and have suggested potential pedagogical approaches that may help facilitate the teaching of specific concepts (Garfield and Ben-Zvi, 2008; Bakker, 2004; Gil and Ben-Zvi, 2011; Lehrer, Kim, and Schauble, 2007; Dierdorp, Bakker, Eijkelhof, and Maanen, 2011). Often, students learning statistical concepts rely on computational methods to solve problems without understanding the statistical ideas being discussed (Chervany et al., 1977; Stroup, 1984).

Several studies document the difficulties that arise with introductory concepts such as interpreting graphs and finding descriptive statistics such as measures of center and spread (e.g., Mokros and Russell, 1995; Cai, 2000; Capraro, Kulm, and Capraro, 2005; Friel, Curcio, and Bright, 2001; Konold and Pollatsek, 2002; Watson and Moritz, 2000; Well and Gagnon, 1997) and with more complex concepts such as sampling methods, study design, and sampling distributions (e.g., Saldanha and Thomson, 2002; Cobb and Moore, 1997; Groth, 2003; Shaughnessy, 2007; Shaughnessy, Ciancetta, and Canada, 2004; Watson and Moritz, 2000). For example, Lehrer, Kim, and Schauble (2007) worked with 5th- and 6th-grade students to “invent” and revise data displays and measures of center and variability, and to investigate models of chance to account for variability. They found that when students developed their own measures of center and precision, they better understood how such measures related to a distribution.

A number of studies have found that students often have difficulty dealing with and accepting variability, despite the fundamental importance of this concept in statistics (Ben-Zvi and Garfield, 2004; Utts, 2003; Cobb, McClain, and Gravemeijer, 2003; delMas et al., 2007; Shaughnessy et al., 2004; Watson, Kelly, Callingham, and Shaughnessy, 2003). The November 2004 issue of the Statistics Education Research Journal (SERJ) was dedicated to research discussing student conceptions of variability. Pfannkuch (2004) identified three themes that emerged from the studies contained in the special issue. First, thinking tools such as tables, graphs, and data visualization software (dynamic statistical software) used by students are linked to reasoning about the variation observed. Second, reasoning about variability is integral to all stages of the statistical problem-solving process. Third, reasoning about variability is essential for both exploratory data analysis and classical inference.

Other studies also explore student understanding of variability. For example, Bakker (2004) used a design-research approach to test two instructional activities with 30 8th-grade students to see how students with little statistical background reason about sampling variability and data. The results showed that using activities geared toward eliciting diagrammatic reasoning (such as making a diagram, experimenting with the diagram, and reflecting on the results) provided opportunities to promote student reasoning in meaningful ways.

Student learning related to other important topics in the PreK–12 statistics curriculum, such as association among variables, both quantitative and categorical, also has been documented in research. Batanero, Estepa, Godino, and Green (1996) examined students’ conceptions of association in 2x2, 2x3, and 3x3 contingency tables. The authors characterized three incorrect conceptions of association: a determinist conception (students only consider variables dependent when there are no exceptions), a unidirectional conception (students only consider variables dependent when they are positively associated), and a localist conception (students use only part of the data in the table).
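One common way to judge association in a 2x2 table, comparing conditional proportions across rows, can be sketched briefly. The counts below are hypothetical and the names `table` and `proportion_yes` are invented for this illustration; the sketch is not taken from Batanero et al.

```python
# Hypothetical 2x2 table of counts (rows: group; columns: outcome yes / no).
table = {
    "group A": {"yes": 30, "no": 20},
    "group B": {"yes": 15, "no": 35},
}

def proportion_yes(counts):
    """Conditional proportion of 'yes' within one row, using both cells of that row."""
    return counts["yes"] / (counts["yes"] + counts["no"])

prop_a = proportion_yes(table["group A"])   # 30 of 50
prop_b = proportion_yes(table["group B"])   # 15 of 50
difference = prop_a - prop_b

print(f"group A: {prop_a:.2f}, group B: {prop_b:.2f}, difference: {difference:.2f}")
```

In this hypothetical table the variables are associated even though there are exceptions in every row (contrary to the determinist conception), a negative difference would indicate association just as a positive one does (contrary to the unidirectional conception), and the comparison draws on all four cells rather than a single cell or row (contrary to the localist conception).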

With respect to quantitative variables, research shows students have difficulty understanding that plotting points on a Cartesian graph can show the relationship between two variables (Bell, Brekke, and Swan, 1987). Secondary students have difficulty seeing overall patterns in the data when asked to read a scatterplot. This difficulty might stem from students tending to perceive data as a series of individual cases (case-oriented view), rather than as a whole with “characteristics that are not visible in any of the individual cases” (aggregate view) (Bakker, 2004, p. 64). Estepa and Batanero (1996) documented the prevalence of the case-oriented view by observing students reading scatterplots to judge associations between quantitative variables. Further, they noted students only detect a linear relationship when the correlation is strong.

Students also exhibit difficulty when using a line of fit to model data and when making predictions. Some conceptions cited in the literature are that the line should go through the maximum number of points possible, the line must go through the origin, there must be the same number of points above and below the line, the line must pass through the left-most and right-most points or the highest and lowest points on the scatterplot, or the line must be placed visually close to a majority or cluster of the points (Sorto, White, and Lesser, 2011). In his study of student understanding of covariation, Moritz (2004) found students often focused on a few points or a single variable, rather than bivariate data, and based judgments on prior beliefs instead of data.
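Several of the line-of-fit conceptions listed above can be checked against an actual least-squares line. The sketch below uses five hypothetical points and a hand-written `least_squares_line` helper (both are assumptions made for illustration, and least squares is only one possible criterion for a line of fit).

```python
def least_squares_line(xs, ys):
    """Return (slope, intercept) of the least-squares line of ys on xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical scatterplot data.
xs = [1, 2, 3, 4, 5]
ys = [2.0, 4.5, 4.0, 7.5, 12.0]

slope, intercept = least_squares_line(xs, ys)
residuals = [y - (intercept + slope * x) for x, y in zip(xs, ys)]

on_line = sum(abs(r) < 1e-9 for r in residuals)   # points the line passes through
above = sum(r > 1e-9 for r in residuals)          # points above the line
below = sum(r < -1e-9 for r in residuals)         # points below the line

print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
print(f"points on line: {on_line}, above: {above}, below: {below}")
```

For these hypothetical data the least-squares line passes through none of the points (including the left-most and right-most), has a nonzero intercept and so misses the origin, and leaves unequal numbers of points above and below it.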

Dierdorp, Bakker, Eijkelhof, and Maanen (2011) presented results from a teaching experiment with 12 11th-grade Dutch students. They found that implementing a teaching and learning strategy that focused on tasks inspired by authentic problems supported students’ learning about correlation and regression. Tasks that required students to collect and model their own data increased their need and desire for finding their own solutions and extending their knowledge.

To help address student difficulties with statistical learning, researchers have suggested particular instructional approaches that lead students “to become aware of and confront” their misunderstandings (Garfield, 1995, p. 31). It is thus important to design activities and lessons that address and bring to the surface potential issues. For example, students could be asked to make guesses or predictions about data and random events and then compare their predictions to their findings (delMas, Garfield, and Chance, 1999; Garfield, 1995).

Shaughnessy (2007) noted that letting students engage with exploratory, open-ended tasks that ask students “what do you notice?” and “what do you wonder about?” prompts them to think more deeply about variability in data. To better understand variability, McClain, McGatha, and Hodge (2000) commented that students need opportunities to explain their reasoning and methods when dealing with variation. To improve conceptual understanding of descriptive statistics, Watson and Moritz (2000) noted that instead of asking students to merely apply algorithms to summarize data, students must be presented with learning experiences that necessitate the representation of data with a single summary value. To counter difficulties in the analysis of association between variables, Moritz (2004) recommended that instruction build on students’ existing reasoning by graphing and verbalizing covariation in familiar contexts even before introducing graphing conventions.

Other studies also have documented the importance of working through informal reasoning prior to introducing more formal statistical concepts for the development of statistical understanding of difficult topics by students. For example, Gil and Ben-Zvi (2011) studied the role of explanation in developing informal inferential reasoning (IIR) through a case study of two small groups of 6th-grade students. They identified four types of explanations in students’ development of IIR: descriptive, abductive, reasonableness, and conflict resolution. These modes of explanation enabled students to make sense of sample data, made students aware of context surrounding the statistical investigations they were carrying out, and offered a way to resolve conflicts between what the students expected to see in the data and what they actually saw. They concluded teaching approaches that encourage explanation can support the development of IIR.

Cobb, McClain, and Gravemeijer (2003) designed an experiment in an 8th-grade classroom lasting 14 weeks (41 sessions) to study a student learning trajectory for covariation. The learning trajectory was developed and tested through a series of mini-cycles in which the research team would conjecture about student learning, design the teaching, and debrief after a class session to help sequence the next session. An important result from the work was highlighting the importance of exploratory data analysis (EDA) prior to engaging students in more formal statistical inference.

McClain and Cobb (2001) report on findings from two teaching experiments conducted with 7th- and 8th-grade students. The goal of this study was to explore ways to support students in developing a view of data sets as a distribution. They found that guiding students through discussions about the data-generating process and instilling classroom norms that required students to explain and justify their thoughts enabled them to focus on ways to organize data to develop arguments.

Several studies, many already mentioned, have pointed out the benefits of using dynamic statistical software in instruction (Watson and Donne, 2009; Konold, 2007). For example, Ben-Zvi, Aridor, Makar, and Bakker (2012) studied 5th-grade students in an inquiry-based classroom working on a growing samples activity (Konold and Pollatsek, 2002; Bakker, 2004; Ben-Zvi, 2006) with the use of statistical software Tinkerplots. While students initially tended to make statements that either expressed extreme confidence about results or that nothing could be concluded, students later entered a second phase in which they were able to make middle-ground probabilistic statements more easily. The authors attribute these advances to the design of the activity and the use of Tinkerplots by students.

Lehrer, Kim, and Schauble (2007) also used Tinkerplots with their students and noted the students, through the employment of the statistical software, were quickly able to explore their “invented” measures and understand whether they provided insightful information about the data. The software also gave students a tool to quickly investigate different scenarios of chance.

Ben-Zvi (2000) discusses how the use of powerful technological tools can shift activities to higher cognitive levels, change the objectives of an activity, provide access to graphics and visuals, and focus activities on transforming and analyzing representations for students. He discusses several statistical software packages and the advantages each provides.

In a book chapter, Biehler, Ben-Zvi, Bakker, and Makar (2013) discuss how technology can enhance student learning. They examine how features of dynamic software such as Tinkerplots and Fathom can facilitate student understanding. Software can aid students in exploratory data analysis and help develop students’ aggregate view of data.

As these examples illustrate, the research on student learning of statistics has implications for teaching, which has implications for the statistical education of teachers. Overall, the research has shown the importance of data exploration in informal ways, the importance of technology in furthering student understanding, and the importance of designing activities that foster students’ statistical reasoning. The recommendations put forth for teacher preparation in this report align with this research base as they recommend courses for teachers focused on data exploration aided by the use of technology.

Research on Teacher Statistical Learning

Historically, teacher preparation programs have not adequately developed the statistical content knowledge necessary for effective teaching of statistics in PreK–12. Rubin and Hammerman (2006) found that teachers’ statistical thinking is not substantially different from students’ statistical thinking. Like their students, teachers have gaps in their understanding of several basic concepts in statistics (Callingham, 1997; Greer and Ritson, 1994) as well as more complex concepts such as covariation and regression (Engel and Sedlmeier, 2011; Casey and Wasserman, 2015).

For example, the teachers studied by Jacobbe and Horton (2010) were successful at reading data from graphical displays, but unsuccessful with questions that assessed higher levels of graphical comprehension. While most pre-service and in-service teachers can compute measures of center, many lack a conceptual understanding of what the measures of center represent (Groth and Bergner, 2006; Jacobbe, 2012; Leavy and O’Loughlin, 2006).

Teachers also experience difficulty with the concept of variability. For example, Hammerman and Rubin (2004) noted that secondary teachers involved in professional development discussed the variation in distributions using only segments and slices of the distributions and not the entire picture. Confrey and Makar (2002) discussed similar results with middle-school teachers, who examined variation in distributions by focusing on single points instead of the distribution as a whole.

Similarly, Makar and Confrey (2004) gave in-service secondary teachers student performance data and asked them to compare performance between different types of students. Only a few teachers were able to make comparisons by discussing the variation in the distributions; most teachers focused on a single summary such as the mean or made a very general statement about passing rates. In addition, Hannigan et al. (2013) indicated that the prospective teachers in their sample had particular difficulties with sampling variability. Bargagliotti et al. (2014) also noted several misunderstandings in-service teachers had about sampling variability, such as believing repeated samples were necessary to make inferential statements using sampling distributions.
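
As a rough illustration of sampling variability, the sketch below simulates many samples from a made-up normal population and compares the spread of the sample means with the usual standard error formula, sigma divided by the square root of n. The population parameters, sample size, seed, and function name are all assumptions chosen for illustration and do not come from the studies cited. The last lines estimate the same standard error from a single sample, which is the point at issue in the misunderstanding described above.

```python
import random
import statistics

random.seed(2016)  # arbitrary seed so the run is repeatable

# Hypothetical population and design values, chosen only for illustration.
POP_MEAN, POP_SD = 50.0, 10.0
SAMPLE_SIZE = 25
NUM_SAMPLES = 2000


def draw_sample(n):
    """Draw one simple random sample of size n from the normal population."""
    return [random.gauss(POP_MEAN, POP_SD) for _ in range(n)]


# Repeated sampling: the standard deviation of the sample means
# approximates the standard error of the mean.
sample_means = [statistics.mean(draw_sample(SAMPLE_SIZE)) for _ in range(NUM_SAMPLES)]
simulated_se = statistics.stdev(sample_means)

# Theory: standard error = population SD / sqrt(n).
theoretical_se = POP_SD / SAMPLE_SIZE ** 0.5

# One sample alone also gives an estimate: sample SD / sqrt(n).
single_sample = draw_sample(SAMPLE_SIZE)
single_sample_se = statistics.stdev(single_sample) / SAMPLE_SIZE ** 0.5

print(f"simulated SE from repeated samples: {simulated_se:.2f}")
print(f"theoretical SE:                     {theoretical_se:.2f}")
print(f"SE estimated from one sample:       {single_sample_se:.2f}")
```

The single-sample estimate is noisier than the simulated value, but it targets the same quantity, so repeated sampling is a way to understand the sampling distribution rather than a requirement for inference.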

In a 2013 study, Peters offered insights into factors that may lead teachers to understand variation. Peters examined the learning of five AP Statistics teachers and explored how reflection and discourse, data circumstances that trigger dilemmas, retrospective methods, and teacher education play a role in teachers’ development of understanding statistical variation. Peters noted how teachers have a strong desire for an overarching content framework, such as the statistical process noted in this report and in GAISE. Additionally, she highlighted how reflection, triggers, and retrospective methods allow teachers to obtain deeper understanding of variation.

Other studies also offered insights into increasing teacher understanding. For example, while studying 56 high-school teachers working on activities centered on comparing distributions and randomization testing, Madden (2011) found that statistically provocative, technologically provocative, and contextually provocative tasks might increase teacher engagement in informal inferential reasoning (IIR).

Leavy, Hannigan, and Fitzmaurice (2013) interviewed nine teachers at length to explore the factors influencing teachers’ attitudes toward statistics. They found mathematics teachers perceive statistics as difficult to learn for reasons that include the uniqueness of statistical thinking and reasoning and the role of context and language in statistics.

Watson (2002) discussed an instrument developed to profile teacher understanding and teaching needs for probability and statistics. In administering this instrument to 43 primary and secondary teachers in Australia, she found that primary teachers taught many activities related to data and chance to their students; however, the lessons did not lead to a coherent overall program. While the secondary teachers exhibited a more coherent curriculum in their lessons, the lessons remained theoretical and teachers did not introduce activities, such as simulations, that would help students visualize the theory.

Teacher educators must consider these issues as they carefully plan to implement the recommendations put forth in this report.

Research on Teacher Preparation in Statistics

The emphasis on statistics at the pre-college level demands a targeted effort to improve the preparation of pre-service teachers and provide quality professional development for in-service teachers. Pfannkuch and Ben-Zvi (2011) stated that statistical courses for teachers should be developed around five major themes: (1) developing understanding of key statistical concepts, (2) developing the ability to explore and learn from data, (3) developing statistical argumentation, (4) using formative assessment, and (5) learning to understand students’ reasoning. The goals of such a course should be to offer good statistical content training for teachers; discuss student reasoning and how to build and scaffold students’ conceptions; and understand curricula, technology, and sequences of instructional activities that build students’ conceptions across grade levels.

By analyzing online discourse among teachers, Groth (2008) concluded that teachers’ perceptions, interpretations, and understanding of the GAISE report guidelines might influence their classroom delivery of the content. Based on these findings, Groth indicated teacher understanding and choices can influence the statistics education experience of students in the classroom.

To affect teacher practice and ensure that practice is effective, a curriculum for teachers must incorporate aspects of both statistical content knowledge and specialized teaching knowledge (Shulman, 1986). Building on the Mathematical Knowledge for Teaching (MKT) framework of Ball, Hill, and Bass (2005), Groth (2013) separated Statistical Knowledge for Teaching (SKT) into Subject Matter Knowledge and Pedagogical Content Knowledge. Each is then subdivided further into three categories. For example, Subject Matter Knowledge includes content knowledge specialized for teaching, while Pedagogical Content Knowledge includes knowledge about curriculum. Furthermore, Groth connected ideas of Pedagogically Powerful Ideas (Silverman and Thompson, 2008) and Key Developmental Understandings (Simon, 2006) to SKT. Within his SKT framework, Groth highlighted several examples in which statistical and mathematical reasoning differ.

Currently, few research-based courses for prospective teachers or professional development workshops are documented and offered specifically to prepare individuals to teach statistics (Bargagliotti et al., 2014; Garfield and Everson, 2009; Gould and Peck, 2004). Due to the meager offerings of specialized courses or programs, teachers are not to blame for their lack of preparation to teach statistical concepts effectively. Instead, research points to the vital need for teacher preparation programs that adequately address the statistical preparation of teachers.

The work of Heaton and Mickelson (2002, 2004) examines the collaborative efforts of a mathematics educator and statistician to help prospective elementary teachers develop statistical knowledge by incorporating statistical investigation into existing elementary curricula. The collaboration offers insight into pre-service teachers’ statistical and pedagogical content knowledge based on their application of the process of statistical investigation themselves and with children.

Groth (2007) called for new kinds of statistics courses geared toward expanding statistical knowledge for teaching. Such knowledge should include not only statistical content knowledge (Cobb and Moore, 1997), but also discussions of best practices in teaching statistics and common student difficulties in learning statistics.

Shaughnessy (2007) described the critical need for professional development, saying, “Our teaching force is undernourished in statistical experience, as statistics has not often been a part of many teachers’ own school mathematics programs” (p. 959). Franklin and Kader (2010) noted it is important for teachers to not only be familiar with the statistical content they teach, but also have a sound understanding of how their grade-level content fits with the statistics concepts taught in the grade levels below and above theirs.

Furthermore, the 2013 ASA and NCTM joint position statement advises, “The need is critical for high-quality pre-service and in-service preparation and professional development that supports PreK–12 teachers of mathematics, new and experienced, in developing their own statistical proficiency” (ASA-NCTM, 2013, p. 1).

Conclusions

As the field of statistics education research develops, it is helpful to document ways to develop student and teacher statistical thinking. As noted above, several research studies, conference proceedings papers, and expository pieces have focused on the types of issues that emerge while teaching and learning statistics in PreK–12 and in teacher preparation. In general, teacher preparation programs need to provide courses that align with research and give pre-service teachers opportunities to engage in the statistical investigative process as suggested throughout this report. Professional development should consider the research base outlined in this chapter to guide the development of statistical topics. Furthermore, as curricula, lesson plans, and strategies are developed (for example, see lesson plans at www.amstat.org/education/stew and strategies such as learning trajectories outlined, for example, by Bargagliotti et al., 2014; Makar and Confrey, 2007; Makar, 2008; Cobb, McClain, and Gravemeijer, 2003), subsequent robust statistical studies are needed to test the effects on student and teacher statistical understanding. Studies, both large and small scale, linking teacher understanding to student understanding are also necessary.

References

Ball, D. (1991). Research on teaching mathematics: Making subject matter knowledge part of the equation. In J. Brophy (Ed.) Advances in research on teaching: Teachers’ subject matter knowledge and classroom instruction (2:1–48). Greenwich, CT: JAI Press.

Ball, D. L., and Bass, H. (2000). Interweaving content and pedagogy in teaching and learning to teach: Knowing and using mathematics. In J. Boaler (Ed.) Multiple perspectives on the teaching and learning of mathematics (pp. 83–104). Westport: Ablex.

Ball, D. L., Hill, H. H., and Bass, H. (2005). Knowing mathematics for teaching: Who knows mathematics well enough to teach third grade, and how can we decide? American Mathematical Educator, Fall, 14–46.

Bargagliotti, A. E., Anderson, C., Casey, S., Everson, M., Franklin, C., Gould, R., Groth, R., Haddock, J., and Watkins, A. (2014). Project-SET materials for the teaching and learning of sampling variability and regression. International Conference of Teaching Statistics (ICOTS) conference proceedings, invited paper.

Bakker, A. (2004). Reasoning about shape as a pattern in variability. Statistics Education Research Journal, 3(2):64–83.

Batanero, C., Estepa, A., Godino, J., and Green, D. (1996). Intuitive strategies and preconceptions about association in contingency tables. Journal for Research in Mathematics Education, 27(2):151–169.

Biehler, B., Ben-Zvi, D., Bakker, A., and Makar, K. (2013). In Clements et al. (Eds.), Third international handbook of mathematics education, springer international handbook of education 27, DOI 10.1007/978-1-4614-4684-2_21. New York, NY: Springer Science + Business Media.

Bell, A., Brekke, G., and Swan, M. (1987). Misconceptions, conflict, and discussion in the teaching of graphical interpretation. In J. D. Novak (Ed.), Proceedings of the second international seminar: Misconceptions and educational strategies in science and mathematics (1:46–58). Ithaca, NY: Cornell University.

Ben-Zvi, D., Aridor, K., Makar, K., and Bakker, A. (2012). Students’ emergent articulations of uncertainty while making informal statistical inferences. International Journal of Mathematics Education.

Ben-Zvi, D. (2006). Scaffolding students’ informal inference and argumentation. In A. Rossman and B. Chance (Eds.) Proceedings of the seventh International Conference on Teaching of Statistics, Salvador, Bahia, Brazil, July 2–7, 2006. Voorburg, The Netherlands: International Statistical Institute. www.stat.auckland.ac.nz/iase/publications/17/2D1_BENZ.pdf.

Ben-Zvi, D. (2000). Toward understanding the role of technological tools in statistical learning. Mathematical Thinking and Learning, 2:1–2, 127–155.

Ben-Zvi, D., and Garfield, J. (Eds.) The challenge of developing statistical literacy, reasoning, and thinking. Dordrecht, The Netherlands: Kluwer Academic.

Ben-Zvi, D., and Friedlander, A. (1997b). Statistical thinking in a technological environment. In J. Garfield and G. Burrill (Eds.) Research on the role of technology in teaching and learning statistics (pp. 45–55). Voorburg, The Netherlands: International Statistical Institute.

Callingham, R. (1997). Teachers’ multimodal functioning in relation to the concept of average. Mathematics Education Research Journal, 9:205–224.

Cai, J. (2000). Understanding and representing the arithmetic averaging algorithm: An analysis and comparison of U.S. and Chinese students’ responses. International Journal of Mathematical Education in Science and Technology, 31:839–855.

Capraro, M. M., Kulm, G., and Capraro, R. M. (2005). Middle grades: Misconceptions in statistical thinking. School Science and Mathematics, 105:165–174. DOI: 10.1111/j.1949-8594.2005.tb18156.x

Casey, S., and Wasserman, N. (2015). Teachers’ knowledge about informal line of best fit. Statistics Education Research Journal. In Press.

Chervany, N. L., Collier, R. D., Fienberg, S., and Johnson, P. (1977). A framework for the development of measurement instruments for evaluating the introductory statistics course. The American Statistician, 31(1):17–23.

Cobb, G. W., and Moore, D. S. (1997). Mathematics, statistics, and teaching. American Mathematical Monthly, 104:801–823.

Cobb, P., McClain, K., and Gravemeijer, K. (2003). Learning about statistical covariation. Cognition and Instruction, 21(1):1–78.

Confrey, J., and Makar, K. (2002). Developing secondary teachers’ statistical inquiry through immersion in high-stakes accountability data. Paper presented at the Twenty-fourth Annual Meeting of the North American Chapter of the International Group for the Psychology of Mathematics Education (PME-NA), Athens, GA.

delMas, R. C. (2004). A comparison of mathematical and statistical reasoning. In J. Garfield and D. Ben-Zvi (Eds.) The challenge of developing statistical literacy, reasoning, and thinking (pp. 79–95). Dordrecht, The Netherlands: Kluwer.

delMas, R., Garfield, J., Ooms, A., and Chance, B. (2007). Assessing students’ conceptual understanding after a first course in statistics. Statistics Education Research Journal, 6(2):28–58.

delMas, R., Garfield, J., and Chance, B. (1999). A model of classroom research in action: Developing simulation activities to improve students’ statistical reasoning. Journal of Statistics Education, 7(3).

Dierdorp, A., Bakker, A., Eijkelhof, H., and Maanen, J. (2011). Authentic practices as contexts for learning to draw inferences beyond correlated data. Mathematical Thinking and Learning, 13:1–2, 132–151.

Engel, J., and Sedlmeier, P. (2011). Correlation and regression in the training of teachers. In C. Batanero, G. Burrill, and C. Reading (Eds.) Teaching statistics in school mathematics—challenges for teaching and teacher education: A joint ICMI/IASE study (pp. 247–258). New York, NY: Springer.

Estepa, A., and Batanero, C. (1996). Judgments of correlation in scatterplots: An empirical study of students’ intuitive strategies and preconceptions. Hiroshima Journal of Mathematics Education, 4:25–41.

Franke, M., Webb, N., Chan, A., Ing, M., Freund, D., and Battey, D. (2009). Teacher questioning to elicit students’ mathematical thinking in elementary school. Journal of Teacher Education, 60(4):380–392.

Franklin, C., and Kader, G. (2010). Models of teacher preparation designed around the GAISE framework. In C. Reading (Ed.) Data and context in statistics education: Towards an evidence-based society. Proceedings of the eighth International Conference on Teaching Statistics (ICOTS8, July 2010), Ljubljana, Slovenia. Voorburg, The Netherlands: International Statistical Institute. www.stat.auckland.ac.nz/~iase/publications.php

Friel, Curcio, and Bright (2001). Making sense of graphs: Critical factors influencing comprehension and instructional implications. Journal for Research in Mathematics Education, 124–158.

Garfield, J. B., and Ben-Zvi, D. (2008). Developing students’ statistical reasoning: Connecting research and teaching practice. New York, NY: Springer Science and Business Media.

Garfield, J. (1995). How students learn statistics. International Statistical Review, 63:25–34.

Garfield, J., and Everson, M. (2009). Preparing teachers of statistics: A graduate course for future teachers. Journal of Statistics Education, 17(2):223–237.

Greer, B., and Ritson, R. (1994). Readiness of teachers in Northern Ireland to teach data handling. Proceedings of the fourth International Conference on Teaching Statistics, Vol. 1. Marrakech, Morocco: National Organizing Committee of the Fourth International Conference on Teaching Statistics, pp. 49–56.

Gil, E., and Ben-Zvi, D. (2011). Explanations and context in the emergence of students’ informal inferential reasoning. Mathematical Thinking and Learning, 13:1, 87–108.

Groth, R. (2003). High school students’ levels of thinking in regard to statistical study design. Mathematics Education Research Journal, 15(3):252–269.

Groth, R. E. (2008). Assessing teachers’ discourse about the PreK–12 Guidelines for Assessment and Instruction in Statistics Education (GAISE). Statistics Education Research Journal, 7(1):16–39.

Groth, R. E. (2007). Toward a conceptualization of statistical knowledge for teaching. Journal for Research in Mathematics Education, 38:427–437.

Groth, R. E. (2013). Characterizing key developmental understandings and pedagogically powerful ideas within a statistical knowledge for teaching framework. Mathematical Thinking and Learning, 15:121–145.

Groth, R. E., and Bergner, J. A. (2006). Preservice elementary teachers’ conceptual and procedural knowledge of mean, median, and mode. Mathematical Thinking and Learning, 8:37–63.

Gould, R., and Peck, R. (2004). Preparing secondary mathematics educators to teach statistics. Curricular Development in Statistics Education: International Association for Statistical Education, 278–283.

Hammerman, J. K., and Rubin, A. (2004). Strategies for managing statistical complexity with new software tools. Statistics Education Research Journal, 3(2):17–41.

Hannigan, A., Gill, O., and Leavy, A. (2013). An investigation of prospective secondary mathematics teachers’ conceptual knowledge of and attitudes toward statistics. Journal of Mathematics Teacher Education, 16:427–449.

Heaton, R., and Mickelson, M. (2002). The learning and teaching of statistical investigation in teaching and teacher education. Journal of Mathematics Teacher Education, 5:35–39.

Mickelson, W. T., and Heaton, R. M. (2004). Primary teachers’ statistical reasoning about data. In The challenge of developing statistical literacy, reasoning, and thinking (pp. 327–352). Springer Netherlands.

NCTM. (2013). Preparing PreK–12 teachers of statistics: A joint position statement of the American Statistical Association (ASA) and NCTM. Reston, VA: NCTM.

Hill, H. C., and Ball, D. L. (2004). Learning mathematics for teaching: Results from California’s Mathematics Professional Development Institutes. Journal of Research in Mathematics Education, 35:330–351.

Hoerl, R.W., and Snee, R. D. (2001). Statistical thinking: Improving business performance. Pacific Grove, CA: Duxbury.

Jacobbe, T., and Horton, B. (2010). Elementary school teachers’ comprehension of data displays. Statistics Education Research Journal, 9:27–45.

Jacobbe, T. (2012). Elementary school teachers’ understanding of the mean and median. International Journal of Science and Mathematics Education, 10(5):1143–1161.

Jones, G. A., Langrall, C. W., Mooney, E. S., and Thornton, C. A. (2004). Models of development in statistical reasoning. In D. Ben-Zvi and J. Garfield (Eds.) The challenge of developing statistical literacy, reasoning, and thinking (pp. 97–117). Dordrecht: Kluwer Academic.

Jones, G. A., Langrall, C. W., and Mooney, E. S. (2007). Research in probability: Responding to classroom realities. In F. K. Lester Jr. (Ed.) Second handbook of research on mathematics teaching and learning (pp. 909–955). Charlotte, NC: Information Age Publishing.

Konold, C. (2007). Designing a data tool for learners. In M. Lovett and P. Shah (Eds.) Thinking with data (pp. 267–291). New York, NY: Lawrence Erlbaum Associates.

Konold, C., and Pollatsek, A. (2002). Data analysis as the search for signals in noisy processes. Journal for Research in Mathematics Education, 33:259–289.

Konold, C., Pollatsek, A., Well, A., and Gagnon, A. (1997). Students analyzing data: Research of critical barriers. Research on the role of technology in teaching and learning statistics, 151.

Leavy, A., and O’Loughlin, N. (2006). Preservice teachers’ understanding of the mean: Moving beyond the arithmetic average. Journal of Mathematics Teacher Education, 9:53–90.

Leavy, A., Fitzmaurice, O., and Hannigan, A. (2013). If you’re doubting yourself then, what’s the fun in that? An exploration of why prospective secondary mathematics teachers perceive statistics as difficult. Journal of Statistics Education, 21.

Lehrer, R., Kim, M., and Schauble, L. (2007). Supporting the development of conceptions of statistics by engaging students in measuring and modeling variability. International Journal of Computers for Mathematics Learning, 12:195–216.

Madden, S. (2011). Statistically, technologically, and contextually provocative tasks: Supporting teachers’ informal inferential reasoning. Mathematical Thinking and Learning, 13:1–2, 109–131.

Makar, K. (2008). A model of learning to teach statistical inquiry. In C. Batanero, G. Burrill, C. Reading, and A. Rossman (Eds.) Joint ICMI/IASE study: Teaching statistics in school mathematics. Challenges for teaching and teacher education. Proceedings of the ICMI Study and 2008 IASE Round Table Conference.

Makar, K., and Confrey, J. (2007). Moving the context of modeling to the forefront: Preservice teachers’ investigations of equity in testing. In W. Blum, P. Galbraith, H-W. Henn, and M. Niss (Eds.) Modelling and applications in mathematics education: The 14th ICMI study (pp. 485–490). New York, NY: Springer.

Makar, K., and Confrey, J. (2004). Secondary teachers’ statistical reasoning in comparing two groups. In D. Ben-Zvi and J. Garfield (Eds.) The challenge of developing statistical literacy, reasoning, and thinking (pp. 353–374). Boston, MA: Kluwer Academic Publishers.

McClain, K., and Cobb, P. (2001). Supporting students’ ability to reason about data. Educational Studies in Mathematics, 45:103–129.

McClain, K., McGatha, M., and Hodge, L. L. (2000). Improving data analysis through discourse. Mathematics Teaching in the Middle School, 5:548–553.

Mokros, J., and Russell, S. J. (1995). Children’s concepts of average and representativeness. Journal for Research in Mathematics Education, 26(1):20–39.

Moritz, J. (2004). Reasoning about covariation. In The challenge of developing statistical literacy, reasoning, and thinking (pp. 227–255). Springer Netherlands.

Peters, S. (2014). Developing understanding of statistical variation: Secondary statistics teachers’ perceptions and recollections of learning factors. Journal of Mathematics Teacher Education, 17(6):539–582.

Pfannkuch, M. (2005). Thinking tools and variation. SERJ Editorial Board, 83.

Pfannkuch, M., and Ben-Zvi, D. (2011). Developing teachers’ statistical thinking. In Teaching statistics in school mathematics—challenges for teaching and teacher education (pp. 323–333). Springer Netherlands.

Rubin, A., and Hammerman, J. (2006). Understanding data through new software representations. In G. Burrill (Ed.) Thinking and reasoning with data and chance: 2006 NCTM yearbook (pp. 241–256). Reston, VA: National Council of Teachers of Mathematics.

Saldanha, L., and Thompson, P. (2002). Conceptions of sample and their relationship to statistical inference. Educational Studies in Mathematics, 51:257–270.

Silverman, J., and Thompson, P. W. (2008). Toward a framework for the development of mathematical knowledge for teaching. Journal of Mathematics Teacher Education, 11(6):499–511.

Simon, M. A. (2006). Key developmental understandings in mathematics: A direction for investigating and establishing learning goals. Mathematical Thinking and Learning, 8(4):359–371.

Shaughnessy, J. M. (2007). Research on statistics learning and reasoning. In F. K. Lester (Ed.) Second handbook of research on mathematics teaching and learning (pp. 957–1009). Charlotte, NC: Information Age Publishing.

Shaughnessy, J. M., Ciancetta, M., and Canada, D. (2004). Types of students reasoning on sampling tasks. In Proceedings of the 28th Conference of the International Group for the Psychology of Mathematics Education, 4:177–184.

Shaughnessy, J. M. (2007). Research on statistics learning. Second handbook of research on mathematics teaching and learning, 957–1009.

Shulman, L. S. (1986). Those who understand: Knowledge growth in teaching. Educational Researcher, 15:4–14.

Stroup. (1984). The statistician and the pedagogical monster: Characteristics of effective instructors of large statistics classes. In Proceedings of the Section on Statistical Education. Washington, DC: American Statistical Association.

Sorto, M. A., White, A., and Lesser, L. (2011). Understanding student attempts to find a line of fit. Teaching Statistics, 11(2):49–52.

Wild, C., and Pfannkuch, M. (1999). Statistical thinking in empirical enquiry. International Statistical Review, 223–248.

Watson, J. (2005). The probabilistic reasoning of middle school students. In Exploring probability in school (pp. 145–169). New York, NY: Springer.

Watson, J. M., Kelly, B. A., Callingham, R. A., and Shaughnessy, J. M. (2003). The measurement of school students’ understanding of statistical variation. International Journal of Mathematical Education in Science and Technology, 34(1):1–29.

Watson, J. (2001). Profiling teachers’ competence and confidence to teach particular mathematics topics: The case of chance and data. Journal of Mathematics Teacher Education, 4:305–337.

Watson, J., and Donne, J. (2009). TinkerPlots as a research tool to explore student understanding. Technology Innovations in Statistics Education, 3(1).

Watson, J. M., and Moritz, J. B. (2000). Development of understanding of sampling for statistical literacy. Journal of Mathematical Behavior, 19:109–136.

Watson, J. M., and Moritz, J. B. (2000). The longitudinal development of understanding of average. Mathematical Thinking and Learning, 2(1–2):11–50.

Utts, J. (2003). What educated citizens should know about statistics and probability. The American Statistician, 57(2):74–79.


CHAPTER 9
Statistics in the School Curriculum: A Brief History

Increasing importance has been placed on data analysis in the United States over the past decade. Data-driven decision making and statistical studies have drawn interest from the general population and policymakers, as well as businesses and schools. Influenced by this new emphasis, data analysis has become a key component of PreK–12 mathematics curricula across the country. For example, the number of students taking AP Statistics increased from 7,500 in 1997 to 169,508 in 2013 (College Board, 2013), and statistics content is appearing in most state curriculum guidelines. As statistics receives ever-increasing prominence in the PreK–12 curriculum, it is of paramount importance that it also gain prominence in teacher education programs.

As sound teacher education should include an appreciation of history, this chapter presents a review of the history of statistics education in PreK–12, with material adapted from Scheaffer and Jacobbe (2014).

The Early Years: 1920s–1950s

The notion of introducing statistics and statistical thinking into the school mathematics curriculum has a long and varied history of nearly a century. In the 1920s, as the United States was becoming ever more rapidly an industrialized urban nation (even introducing statistical quality control in manufacturing), proposed changes in the school mathematics curriculum were often cast in the framework of making mathematics more utilitarian and thus broadening the scope of its appeal. Among the recommendations found in The Reorganization of Mathematics in Secondary Education, a 1923 report by the relatively new Mathematical Association of America (MAA) (National Committee on Mathematical Requirements, 1923), were that statistics be included in the junior-high school curriculum (grades 7, 8, and 9), more from a computational than an algebraic point of view, and that a course in elementary statistics be included in the high-school curriculum.

Those advocating for change in mathematics education were mathematicians and mathematics educators, and their proposals for statistics were heavily mathematical and probabilistic. Among this group, however, were some statisticians. One of the first statisticians to enter the discussions on school curriculum changes was Helen Walker, who taught statistics at Columbia University Teachers College from 1925 to 1957 and who served as president of the American Statistical Association (ASA) in 1944 and president of the American Educational Research Association (AERA) in 1949–1950. Viewing statistics as a service to the welfare of society, she argued for its inclusion in the high-school curriculum as an essential public need.

Any one vitally concerned with the teaching of high-school pupils and observant of the rapidly growing public need for some knowledge of quantitative method in social problems must be asking what portions of statistical method can be brought within the comprehension of high-school boys and girls, and in what way these can best be presented to them. (Walker, 1931, p. 125)

The years of WWII and its aftermath were a boon for statistics, in both research and education. In “Personnel and Training Problems Created by the Growth of Applied Statistics in the United States,” the National Research Council’s (NRC) Committee on Applied Mathematical Statistics stated that “definite advantages would result if certain aspects of elementary statistics were effectively taught in the secondary schools” (NRC, 1947, p. 17). The committee further explained that progress in teaching statistics (both high school and college) was hindered by a shortage of adequately prepared teachers. This problem remains to this day and is the primary reason for this report.

Although these early efforts at building statistics into the school curriculum had limited successes along the way, the cumulative effect began to turn the tide in noticeable ways in the 1950s. In 1955, the College Entrance Examination Board (CEEB) appointed a commission on mathematics with the goal of “improving the program of college preparatory mathematics in the secondary schools” (p. 1). Members included Frederick Mosteller, a Harvard statistics professor; Robert Rourke, a high-school mathematics teacher; and George Thomas, a college mathematics professor, all vitally interested in improving and expanding the teaching of statistics. The commission reported the following:

Statistical thinking is part of daily activities, and an introduction to statistical thinking in high school will enhance deductive thinking. Numerical data, frequency distribution tables, averages, medians, means, range, quartiles were to be introduced in 9th grade. A more formal examination of probability concepts should be introduced later (grade 12). (CEEB, 1959, p. 5)

Mosteller, Rourke, and Thomas wrote a book for a high-school statistics course, Introductory Probability with Statistical Applications: An Experimental Course (1957), that quickly became a best seller for the CEEB. Notice the emphasis on probability, however, as compared to the emphasis on data analysis, which was to come into its own in the next two decades.


The Data Revolution: 1960s

Prompted by a space race and computing power, the 1960s saw a data revolution that changed the interest in and practice of statistics. In that environment, Mosteller, the ASA president in 1968, reached out to the National Council of Teachers of Mathematics (NCTM) to establish the ASA and NCTM Joint Committee on Curriculum in Probability and Statistics. This committee developed materials for the schools that changed the tone of high-school statistics from an emphasis on probability to an emphasis on data.

Statistics: A Guide to the Unknown, one of the early publications of the joint committee, is a collection of essays, intended for the lay public, teachers, and students, that describes important real-life applications of statistics and probability. Statistics by Example, a series of four booklets, provided real examples with real data for students to analyze from data exploration and description through model building.

During this time, John Tukey, a professor of statistics at Princeton and a friend of Mosteller’s, was steering much of the emphasis in statistics away from mathematical theory and toward data analysis. He stated the following:

All in all, I have come to feel that my central interest is in data analysis, which I take to include, among other things: procedures for analyzing data, techniques for interpreting the results of such procedures, ways of planning the gathering of data to make its analysis easier, more precise or more accurate, and all the machinery and results of (mathematical) statistics which apply to analyzing data. (Tukey, 1962, p. 6)

Tukey invented many of the data analytic procedures in common use today. The ASA/NCTM Joint Committee, under Mosteller’s influence, embraced the Tukey approach to data analysis and worked on adapting this approach to materials suitable for use at the school level. The combination of Mosteller, Tukey, and the advent of inexpensive computing drove the successes of statistics in the schools that came about over the next 40 years.

Progress, Not Perfection: 1970s–1990s

However, the activities of the next 40 years were not unmitigated successes, and we still have not reached the intended level of “statistical reasoning for all.” In the 1970s, for example, the Conference Board of the Mathematical Sciences formed the National Advisory Committee on Mathematical Education (NACOME) to look into current trends. Statistics education was summarized in one key statement: “While probability instruction seems to have made some progress, statistics instruction has yet to get off the ground” (NACOME, 1975, p. 45). The report stated that statistics should be given more attention because of its importance in the life of every citizen.

Even though numerical information is encountered everywhere, in newspapers and in magazines, on radio and on television, few people have the training to accept such information critically and to use it effectively. (NACOME, 1975, p. 45)

Their recommendations on teaching statistics included “use statistical topics to illustrate and motivate mathematics, emphasize statistics as an interdisciplinary subject, and develop several separate courses dealing with statistics to meet varied local conditions” (NACOME, 1975, p. 47). This advice is still worth heeding today.

Photo caption (ASA photo): Frederick Mosteller reached out to the National Council of Teachers of Mathematics (NCTM) to establish the ASA and NCTM Joint Committee on Curriculum in Probability and Statistics.

NCTM’s An Agenda for Action: Recommendations for School Mathematics of the 1980s included numerous references to statistical topics that should play an increasing role in the mathematics curriculum (often without using the term “statistics”). For example, the section on problem solving recommended more emphasis on methods of gathering, organizing, and interpreting information; drawing and testing inferences from data; and communicating results. The section on basic skills stated, “There should be increased emphasis on such activities as locating and processing quantitative information, collecting data, organizing and presenting data, interpreting data, drawing inferences and predicting from data” (NCTM, 1980, p. 4).

Building on these expanding interests in statistics and its earlier successes, the Joint Committee obtained a grant from the National Science Foundation (NSF) to begin the ASA-NCTM Quantitative Literacy Project (QLP). The QLP originally consisted of four booklets (Exploring Data, Exploring Probability, The Art and Technique of Simulation, and Exploring Surveys and Information from Samples) and a plan for carrying out many workshops across the country. (See Scheaffer, 1989 and 1991, for details.) The QLP did not foment a revolution, but the materials were well received and the workshops were successful in influencing a number of teachers and mathematics educators, especially some of those who would develop NSF-funded teaching materials for elementary and middle-school mathematics in the ensuing years (e.g., Connected Mathematics Project; Investigations in Number, Data, and Space).

Fortunately, the NCTM Board of Directors took note of the QLP as it was developing its 1989 Curriculum and Evaluation Standards for School Mathematics. This document called for statistics to be an integral part of the mathematics curriculum by giving it status as one of the five content strands to be taught throughout the school years.

Throughout the 1980s and 1990s, many other reports and activities came to the support of statistics education. In the early 1980s, the National Commission on Excellence in Education was appointed to study mathematics education in the country. Their report, A Nation at Risk, was highly supportive of statistics and probability, both directly and indirectly.

The teaching of mathematics in high school should equip graduates to:

• Understand geometric and algebraic concepts;

• Understand elementary probability and statistics;

• Apply mathematics in everyday situations;

• Estimate, approximate, measure, and test the accuracy of their calculations (National Commission, 1983, p. 25)

By the 1990s, the National Research Council’s Mathematical Sciences Education Board (MSEB) was strongly aligned with the movement toward more statistics in the mathematics curriculum of the schools.

If students are to be better prepared mathematically for vocations as well as for everyday life, the elementary-school mathematics must include substantial subject matter other than arithmetic:

… Data analysis, including collection, organization, representation, and interpretation of data; construction of statistical tables and diagrams; and the use of data for analytic and predictive purposes

… Probability, introduced with simple experiments and data-gathering (MSEB, 1990, p. 42)

Secondary-school mathematics should introduce the entire spectrum of mathematical sciences: ... data analysis, probability and sampling distributions, and inferential reasoning. (MSEB, 1990, p. 46)

Indeed, the 1990s were a period of rapid development of state curriculum standards on data analysis, NSF support of teacher enhancement and materials development projects on statistics, and AP Statistics. As to the latter, it was the AP Calculus committee that led the development of AP Statistics as a second Advanced Placement course in the mathematical sciences. (See Roberts, Scheaffer, and Watkins, 1999, for details about the development of AP Statistics.)


One of the key questions delaying the approval of this course was the availability of teachers who could teach the subject in high school. Fortunately for the success of the AP course, teachers who had become leaders in the QLP volunteered to be the first AP Statistics teachers and to lead workshops to educate others. But the ever-increasing popularity of the course (and the retirement of those founding teachers) requires many more teachers with qualifications to teach the course effectively.

GAISE, Signature Event of the 2000s

The emphasis on statistics education for all through the quantitative literacy programs of the 1980s set the stage for introducing AP Statistics in the 1990s. The success of the latter, in turn, reflected focus back on statistics education in the grades so as to prepare students for better access to and success in the AP program. This revisiting of PreK–12 statistics education, sharpened somewhat by the MET I report of 2001, led to the Guidelines for Assessment and Instruction in Statistics Education: A PreK–12 Curriculum Framework (GAISE) (www.amstat.org/education/gaise) (Franklin et al., 2007). This report has been well received by statistics and mathematics educators and has served as the basis for revised curricula in statistics within many state guidelines and professional development programs.

The main goal of GAISE was to provide fairly detailed guidelines about how to produce statistically literate high-school graduates by the end of their PreK–12 education. The report aimed to accomplish two goals: (1) articulate differences between mathematics and statistics and (2) outline a two-dimensional framework for statistical learning.

One important feature of the framework is that, unlike the NCTM standards or any state standards outlined by grade, a student’s progression is based solely on student experience. In addition, the framework is not defined as a list of topics a student must complete. Instead, the report decomposes statistical thinking into four main process components (formulate questions, collect data, analyze data, and interpret results), within which a student’s level of knowledge (level A, B, or C) progresses. One of the primary concerns that motivated the creation of the GAISE document was that “statistics … is a relatively new subject for many teachers who have not had an opportunity to develop sound understanding of the principles and concepts underlying the practices of data analysis that they are now called upon to teach” (Franklin et al., 2007, p. 5).

In 2010, the Common Core State Standards for Mathematics (CCSSM) were adopted by numerous states. The GAISE framework served as a foundation for the statistics standards in the CCSSM.

Why Mathematics? Why Schools?

After nearly 100 years of attempts at getting a coherent, informative, useful statistics curriculum in the schools, one might ask, “Why in mathematics?” and “Why in the schools?” Taking the first question, many have suggested statistics should be part of the social sciences or sciences, where it is most used. Over the years, attempts have been made in that direction, with the general result being that a few specialized techniques may gain a foothold in the curriculum (such as the chi-square test in biology or fitting regression models in physics) while a coherent curriculum in statistical thinking will not. In recent years, many mathematicians and mathematics educators have accepted statistics as an important part of the mathematical sciences because of its emphasis on inductive reasoning and applying mathematics to important real-world problems. The following are two examples of this thinking, both in terms of reasoning and practice, coming from highly respected mathematicians David Mumford (a Fields Medalist) and George Polya (a renowned mathematician and educator).

For over two millennia, Aristotle’s logic has ruled over the thinking of western intellectuals. All precise theories, all scientific models, even models of the process of thinking itself, have, in principle, conformed to the straight-jacket of logic. But from its shady beginnings devising gambling strategies and counting corpses in medieval London, probability theory and statistical inference now emerge as better foundations for scientific models, especially those of the process of thinking and as essential ingredients of theoretical mathematics, even the foundations of mathematics itself. We propose that this sea change in our perspective will affect virtually all of mathematics in the next century. (Mumford, 1999, p. 1)

We must distinguish between two types of reasoning: demonstrative and plausible.

Demonstrative reasoning is inherent in mathematics and in pure logic; in other branches of knowledge, it enters only insofar as the ideas in question seem to be raised to the logico-mathematical sphere. Demonstrative reasoning brings order and coherence to our conceptual systems and is therefore indispensable in the development of knowledge, but it cannot supply us with any new knowledge of the world around us. Such knowledge can be obtained, in science as in everyday life, only through plausible reasoning. The inferences from analogy and inductive proofs of natural scientists, the statistical arguments of economists, the documentary evidence of historians, and the circumstantial evidence of lawyers can reasonably lay claim to our confidence, and to a very high degree, under favorable circumstances. But they are not demonstrative; all such arguments are merely plausible. (Polya, 2006, p. 36; reprinted from 1959)

As to the second question, the following quote from Theodore Porter, a historian of science, neatly sums up the arguments:

Statistical methods are about logic as well as numbers. For this reason, as well as on account of their pervasiveness in modern life, statistics cannot be the business of statisticians alone, but should enter into the schooling of every educated person. To achieve this would be a worthy goal for statistics in the coming decades. (Porter, 2001, p. 64)

In this information age, statistical reasoning should be part of everyone’s education, whether or not they are college bound. For the essence of statistical reasoning to become part of an individual’s habits of mind, such education must begin early in a person’s schooling and be maintained over years of educational and practical experiences. Teachers remain a crucial ingredient to guide the process of learning statistical thinking.

References

College Entrance Examination Board (CEEB). (1959). Program for college preparatory mathematics. Iowa City, IA: Commission on Mathematics, Author.

Mathematical Sciences Education Board. (1990). Reshaping school mathematics. Washington, DC: National Research Council.

Mumford, D. (1999). The dawning of the age of stochasticity. Available at www.dam.brown.edu/people/mumford/Papers/Dawning.pdf.

National Advisory Committee on Mathematical Education (NACOME). (1975). Overview and analysis of school mathematics grades K–12. Reston, VA: NCTM.

National Commission on Excellence in Education. (1983). A nation at risk: The imperative for educational reform. Washington, DC: United States Department of Education.

National Committee on Mathematical Requirements. (1923). The reorganization of mathematics in secondary education. Washington, DC: Mathematical Association of America (MAA).

National Council of Teachers of Mathematics (NCTM). (1980). An agenda for action: Recommendations for school mathematics of the 1980s. Reston, VA: Author.

National Council of Teachers of Mathematics (NCTM). (1989). Curriculum and evaluation standards for school mathematics. Reston, VA: Author. Available at www.nctm.org/standards/content.aspx?id=26629.

National Research Council (NRC). (1947). Report by the Committee on Applied Mathematical Statistics, Reprint and Circular Series, 128.

Polya, G. (2006). Mathematics as a subject for learning plausible reasoning. Mathematics Teacher 100:5, 36.

Porter, T. (2001). Statistical futures. Amstat News, 291:61–64.

Roberts, R., Scheaffer, R., and Watkins, A. (1999). Advanced Placement Statistics—Past, present, and future. The American Statistician, 53(4).

Scheaffer, R. L. (1986). The Quantitative Literacy Project. Teaching Statistics, 8(2).

Scheaffer, R. L. (1991). The ASA-NCTM Quantitative Literacy Project: An overview. Proceedings of the 3rd International Conference on Teaching Statistics, 45–49.

Scheaffer, R. L., and Jacobbe, T. (2014). Statistics education in the K–12 schools of the United States: A brief history. Journal of Statistics Education, 22(2).

Tukey, J. (1962). The future of data analysis. Annals of Mathematical Statistics, 33:1–67.

Walker, H. (1931). Mathematics and statistics. In Sixth Yearbook, Mathematics in Modern Life (pp. 111–135), Reston, VA: NCTM.

Walker, H. (1945). The Role of the American Statistical Association. Journal of the American Statistical Association, 40:1–10.


APPENDIX 1

This appendix includes a series of short examples and accompanying discussion that address particular difficulties that may occur while teaching statistics to teachers. The examples and difficulties presented are not meant to provide an exhaustive list of potential issues teacher educators may encounter when teaching pre- or in-service teachers. Instead, they are meant to highlight common subtleties and difficulties that arise. This appendix is organized into four sections:

1. Question/Design Alignment

2. Connections Between Data Type, Numerical Summaries, and Graphical Displays

3. Proportional Reasoning in Statistics

4. The Role of Randomness in Statistics

Question/Design Alignment

Two interrelated components of the statistical process are formulating statistical questions and collecting data. Typically, a general problem or research topic is presented. To investigate the topic, one must understand what specific statistical questions should be investigated and what data-collection method should be employed to answer those questions.

A statistical question is one that anticipates variability in the data that would be collected to answer it and motivates the collection, analysis, and interpretation of data.

The formulation of questions is of particular importance for teachers, and thus for teacher preparation, because teachers must lead discussions and pose good questions in their own classrooms to motivate rich statistical investigations. It is important that teachers understand that formulating statistical questions is not an easy exercise and is one that requires great precision in language. Formulating questions and data collection are topics outside the realm of traditional mathematical reasoning and thus often are challenging for teachers, even those proficient in mathematics. Most importantly, teacher educators and teachers must ensure the questions being formulated and the accompanying data-collection plan align with the goal of the general problem or research topic being investigated.

SCENARIO 1: STRICT PARENTS

Students in a high-school mathematics class decided their term project would be a study of the strictness of the parents or guardians of students in the school. Their goal was to estimate the proportion of students in the school who thought of their parents or guardians as strict and the proportion of students in the school whose parents were strict. What would be some examples of questions that could be posed? What would be an appropriate study design for this study, given the students do not have time to interview all 1,000 students in the school?

In the strict parents scenario, teachers could formulate a question that fails to capture the subjective side of strictness. For example, a student may want to survey the class by using the question, “Is your curfew 10 p.m.?” This question touches on strictness; however, its answer does not reveal beliefs, since one student may consider a 10 p.m. curfew very strict while the next may consider it lenient. It therefore sheds no light on student beliefs about parent strictness and does not align with the goal of uncovering whether students believe their parents are strict. A better question would be to simply ask students, “Do you believe your parents are strict?”

Another important issue is potential bias in survey questions, such as, “Don’t you think it is unfair for a parent to limit the use of your cell phone?” Questions like this lead the respondent to agree with the interviewer. Appropriate questions for measuring student beliefs might be: “Do you believe your parents are strict?” or “Do you feel your parents are strict?”


To gauge whether parents are actually strict, the survey questions must provide some baseline measure of strictness. Appropriate questions might be along the following lines:

• Do you have a set number of hours you must spend on homework per day?

• Do you have a restriction on the amount of time you may spend for personal use of the web?

• Do you have a curfew on school nights?

Teachers need ample opportunity to practice designing questions and then designing data-collection methods accordingly to answer the questions.
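As one possible illustration of a data-collection method for Scenario 1, the following Python sketch draws a simple random sample from the school roster and computes a sample proportion. It is only a sketch under stated assumptions: a roster of 1,000 student IDs is assumed to be available, and the function names, the sample size of 50, the seed, and the example responses are hypothetical rather than taken from this report.

```python
import random


def draw_simple_random_sample(population_ids, sample_size, seed=None):
    """Pick sample_size distinct IDs so that every subset of that size is equally likely."""
    rng = random.Random(seed)  # a seed is used only to make the sketch reproducible
    return rng.sample(population_ids, sample_size)


def sample_proportion(responses):
    """Return the fraction of True (yes) responses."""
    return sum(responses) / len(responses)


# Hypothetical roster: IDs 1..1000 stand in for the 1,000 students in the school.
student_ids = list(range(1, 1001))
sampled_ids = draw_simple_random_sample(student_ids, sample_size=50, seed=2016)

# Made-up yes/no answers to "Do you believe your parents are strict?",
# shown for four respondents only to illustrate the calculation.
example_responses = [True, False, True, True]
print(len(sampled_ids), sample_proportion(example_responses))
```

Sampling without replacement from the full roster is one way to give every student the same chance of selection, which is what allows the sample proportion to be generalized to the school.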

SCENARIO 2: HOMEWORK

A middle-school student thinks teachers at his school are giving too much homework, and he intends to make use of his statistics project to study his conjecture. This student needs to transition from this research topic to a statistical question to investigate. Which of the following questions would not be a good statistical question to investigate and why?

Some possible questions he could investigate are the following:

1. How many hours per week do students at this school spend on homework?

2. Do you think teachers at this school are giving too much homework?

3. How does the amount of time students spend on homework per week at our school compare with the amount of time students spend on homework per week at another school?

4. Is there an association between the number of minutes spent on homework each day and the amount of sleep students get on school nights?

While question (2), “Do you think teachers at this school are giving too much homework?”, is not a statistical question to investigate, it is an appropriate survey question that could inform the problem of understanding whether students think teachers assign too much homework at the school.

The other questions listed are, in fact, statistical questions a student could investigate. For question (1), “How many hours per week do students at this school spend on homework?”, the student would need a sample of students at his school (ideally a random sample). Each student surveyed would report the number of hours he or she spends per week on homework. The analysis of the data might include providing graphical displays of the homework times (a dotplot or histogram) along with appropriate numerical summaries, such as the mean homework time and the mean absolute deviation (MAD) of the homework times for those surveyed. The interpretation of the data would include a description of the variability in the homework times based on the analysis. If the sample selected is a random sample, the student could provide a confidence interval for the mean homework time for all students at his school. However, deciding whether students are being given too much homework would require knowledge of a baseline for the amount of time middle-school students are expected to spend on homework.

If a student chose to investigate question (3), “How does the amount of time students spend on homework per week at our school compare with the amount of time students spend on homework per week at another school?”, then the data-collection method would switch to collecting samples of students from both schools (ideally random samples). Each student surveyed would report the amount of time he or she spends per week on homework. The analysis of the data could consist of providing comparative graphical displays of the homework times (e.g., boxplots) along with appropriate numerical summaries of the data, such as the median homework time and the interquartile range (IQR) of the homework times for those surveyed. The interpretation of the data would include a comparison of five-number summaries along with identification of areas of overlap and areas of separation between the two groups. If these were random samples and the difference between the median times were meaningful, then the student could generalize these results to the larger groups. However, if the median homework time for students at his school is higher than the median time at the other school, this does not explicitly mean students at his school are getting too much homework.

Question (4), “Is there an association between the number of minutes spent on homework each day and the amount of sleep students get on school nights?”, would be addressed by sampling students at the school (ideally a random sample). Each student surveyed would report the number of minutes he or she spent on homework for one day and the amount of sleep he or she got that night. The analysis of the data could consist of providing a scatterplot of the sleep time against homework time. If the scatterplot displays a linear trend in the data, the student could also report a linear equation for predicting sleep time from homework time. The interpretation of the data would include a description of the relationship between sleep time and homework time. If sleep time is generally decreasing as homework time increases, this suggests time on homework may interfere with how much sleep students at this school get; however, this does not explicitly mean students at this school are getting too much homework.

Notice the difficulty in answering the research question. The answers to each statistical question might provide insight into the research question, but they do not explicitly provide an answer to the student’s question/conjecture.

Once one articulates the statistical question(s) to be investigated for the given topic or problem, then one must determine the appropriate study design, and in turn the appropriate method of analysis, and then draw conclusions. The example scenarios do not call for an experiment. However, the examples require selection of samples of students. Random sampling is necessary to reduce the biases that might arise otherwise (like sampling only seniors or only friends of the math club for the strict parents scenario) and forms the basis of statistical inference.

Teachers must realize that this seemingly simple process of random selection will often, if not always, run into difficulties in practice, as the list of students may change nearly every day, some selected students may not cooperate, and so on. One of the best ways to generate a random sample of student names is to get a list of students, number the students from 1 to N, and then select random numbers between 1 and N to determine which students to include in the sample.
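The numbering procedure just described can be tried with a few lines of Python. This is only a sketch: the roster of 500 placeholder names, the sample size of 30, the seed, and the function name `simple_random_sample` are illustrative assumptions, not part of the original text.

```python
import random

def simple_random_sample(roster, sample_size, seed=None):
    """Number the roster 1..N and draw distinct random numbers between 1 and N."""
    rng = random.Random(seed)
    n = len(roster)
    # random.sample draws without replacement, so no student is chosen twice.
    chosen_numbers = rng.sample(range(1, n + 1), sample_size)
    # Student number k sits at list index k - 1.
    return [roster[k - 1] for k in sorted(chosen_numbers)]

# Hypothetical roster of N = 500 students (placeholder names).
roster = [f"Student {i}" for i in range(1, 501)]
sample = simple_random_sample(roster, 30, seed=2016)
print(sample[:5])
```

Fixing the seed makes the draw repeatable for a class discussion; leaving it as `None` gives a fresh sample each time.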

Teachers will suggest other sampling plans, such as systematically sampling students from the lunch line, or sampling homerooms (clusters) rather than individual students. Teachers should have some understanding of such alternative plans and recognize they may work, but not necessarily as well as a simple random sample. In addition, making valid inferences for the more complex design plans requires deeper insight into statistical methodology than is addressed in K–12 education.

Connections Between Data Type, Numerical Summaries, and Graphical Displays

An important aspect of the data analysis component is using the appropriate numerical summaries and graphical displays for the collected data and posed questions. Teachers should be particularly careful about choosing the correct summaries and displays for a data set given that they have to communicate statistical ideas to their respective classrooms. In general, teachers must be able to recognize that the type of data they have dictates what summaries and displays are appropriate to use.

SCENARIO 3: SCHOOL COLORS

A new elementary school has opened for the school year. Students are told their opinion is important in choosing a school color. To investigate the color preferences at the school, students need to develop the following statistical question that will, in turn, inform the decision of the school color:

Which color is most popular among students in the new elementary school?

After collecting survey data from a random sample of students asking each student, “What is your favorite color from the following list: red, blue, and yellow?”, the children summarized the data and found that the favorite color was red for 16 children, the favorite color was blue for 18 children, and the favorite color was yellow for 13 children.

Teachers need to summarize and display data in the appropriate ways. For example, for the school colors scenario, a meaningful conclusion is that the modal category (or mode) is the color blue, which has the highest count. However, teachers often incorrectly state that the mode is 18. That is, they report the count for the category of blue as the mode instead of the category itself.

A second common issue is to treat the counts of the categories (the summaries) as the data and calculate a mean of the counts, (16 + 18 + 13)/3. This is a meaningless summary in terms of the question, “What is the favorite color of the students in this class?” Note that the data are categorical: each observation is the category a student gave as his or her favorite color. In other words, there are three possible categories (blue, red, and yellow), and the data are the individual responses from the students (e.g., blue, blue, red, etc.).
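The distinction between the data (individual responses) and the summaries (counts) can be made concrete with a short sketch. The list of responses is rebuilt from the counts reported in the scenario; the variable names are illustrative.

```python
from collections import Counter

# The data are the individual categorical responses, not the counts.
responses = ["red"] * 16 + ["blue"] * 18 + ["yellow"] * 13

counts = Counter(responses)
# The mode is a category; its count is a separate summary.
modal_category, modal_count = counts.most_common(1)[0]
proportions = {color: k / len(responses) for color, k in counts.items()}

print("Modal category:", modal_category)        # the mode is a color
print("Count in modal category:", modal_count)  # this number is not the mode
print("Proportions:", proportions)
```

Averaging the three counts, as in (16 + 18 + 13)/3, only returns the number of responses divided by the number of categories, which says nothing about color preference.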

SCENARIO 4: AQUARIUM AND ZOO

A school is planning a field trip to the aquarium or the zoo for students in grades 6–9. To determine whether the school should go to the aquarium or zoo, the school principal investigates the following statistical question:

Which field trip is most popular among students in each grade?

There are 100 students at each grade level, and every student was asked which place he or she would prefer to visit. The bar graphs for the four grade levels are shown below.

[Figure: Vote for a trip to Aquarium/Zoo. Bar graphs by grade (Grade 6, Grade 7, Grade 8, Grade 9); vertical axis: Frequency; bars for Aquarium and Zoo.]

In which grade level were the responses least variable?

The aquarium and zoo example illustrates how to understand variability in the data through graphical displays. The data are summarized with bar graphs for each of the four grades. The grade level with the least variable (or most consistent) responses is Grade 8. We see that 80% of the students prefer the aquarium, whereas only 20% prefer the zoo. Thus, 80% of the responses are the same (there is more consensus) and have no variability among themselves; this grade has the largest portion of the students preferring one particular category. Many teachers will pick Grade 7. They interpret the fact that the two bars are even as indicating the least amount of variability. In this case, they are comparing the frequencies for each category and, because the frequencies are the same, deciding there is no variability. However, for this survey, the responses for Grade 7 are the most variable (least consistent).

SCENARIO 5: SAT AND ACT PERCENTILES

Scores on large-scale national tests tend to be mound-shaped with little skew, thus allowing the normal distribution to be a good model for their distributions. For a particular year, the ACT mathematics scores had a mean of about 20 and a standard deviation of about 6. The SAT math scores had a mean of about 520 with a standard deviation of about 120. How does an ACT score of 30 compare to an SAT score of 700? What was the 90th-percentile score for each exam?

This example illustrates several concepts particularly relevant to teacher preparation. Teachers need to be comfortable with communicating to parents about percentiles; however, confusion arises among the terms percent, percentiles, and z-scores. Teachers must understand clearly that making an assumption about the distribution of scores is the only way to get a reasonable approximate answer in this problem. Scaling the observed scores in terms of the number of standard deviations away from the mean (z-scores) maps the 30 on the ACT into 1.67 and the 700 on the SAT into 1.50. But these figures would be too confusing to report to most audiences and thus should be mapped into percentiles, which turn out to be about 95 and 93, respectively. (Of course, technology allows one to go from the raw score to the percentile without the z transformation, but z-scores are a useful concept for teachers to understand and experience.)

Using this process in reverse, the 90th percentiles for the two distributions are 28 for the ACT and 674 for the SAT. Teachers need practice with phrases such as “95 percent of the scores are below 30, the 95th percentile of the distribution that corresponds to a z-score of 1.67.” In addition, teachers should be taught to look carefully at the published percentile scores for these exams to see how closely they line up with those of a well-chosen normal distribution.
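The z-scores and percentiles in the SAT and ACT scenario can be reproduced with Python's standard library, assuming (as the scenario does) that a normal distribution is a reasonable model for each exam. The variable names are illustrative.

```python
from statistics import NormalDist

# Normal models using the approximate means and standard deviations in the scenario.
act = NormalDist(mu=20, sigma=6)
sat = NormalDist(mu=520, sigma=120)

# z-scores: number of standard deviations above the mean.
z_act = (30 - act.mean) / act.stdev
z_sat = (700 - sat.mean) / sat.stdev

# Percentiles: percent of the modeled scores below the observed score.
act_percentile = 100 * act.cdf(30)
sat_percentile = 100 * sat.cdf(700)

# Working in reverse: the score at the 90th percentile of each model.
act_p90 = act.inv_cdf(0.90)
sat_p90 = sat.inv_cdf(0.90)

print(f"z-scores: ACT {z_act:.2f}, SAT {z_sat:.2f}")
print(f"percentiles: ACT {act_percentile:.0f}, SAT {sat_percentile:.0f}")
print(f"90th percentiles: ACT {act_p90:.0f}, SAT {sat_p90:.0f}")
```

Under these models the ACT score of 30 sits slightly higher in its distribution than the SAT score of 700 does in its own.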

SCENARIO 6: BRAIN WEIGHT V. BODY WEIGHT

It is hypothesized that larger animals have larger brains. If a relationship between body weight and brain weight existed, then body weight (a relatively easy measurement to make) could be used to predict brain weight (for which measurement is rather hard on the animal). What is the relationship between the body weight of an animal and brain weight of an animal? To explore this statistical question, data were obtained on the brain weight (in grams) and body weight (in kilograms) for a sample of 30 animals of differing species (see table below). What does a plot of the data reveal about the relationship? How could the relationship between brain weight and body weight be modeled?

The brain weight and body weight scenario (the data are displayed in the table with other variables that will be referred to in this example) necessitates careful graphical displays of data at multiple stages of the process to gain information about the relationship between brain weight and body weight.

TABLE 1

Brain_wt = brain weight in grams; Body_wt = body weight in kilograms; LnBrain (LBr) = natural log of brain weight; LnBody (LBo) = natural log of body weight; Type: f = fish, m = mammal; Code: 1 = fish, 0 = mammal.

Species            Brain_wt (g)   Body_wt (kg)   LnBrain     LnBody      Type   Code
Catfish            1.84           2.894          0.609766    1.06264     f      1
Barracuda          3.83           5.978          1.34286     1.78809     f      1
Mackerel           0.64           0.765          -0.44629    -0.26788    f      1
Salmon             1.26           3.93           0.231112    1.36864     f      1
brown_trout        0.57           0.292          -0.56212    -1.231      f      1
tuna               3.09           5.21           1.12817     1.65058     f      1
northern_trout     1.23           2.5            0.207014    0.916291    f      1
grizzly_bear       233.9          142.88         5.45489     4.96201     m      0
cheetah            2.45           22.2           0.896088    3.10009     m      0
lion               106.7          28.79          4.67002     3.36003     m      0
raccoon            40             5.175          3.68888     1.64384     m      0
Skunk              10.3           1.7            2.33214     0.530628    m      0
tiger              302            209            5.71043     5.34233     m      0
Wolf               152            29.94          5.02388     3.3992      m      0
Greyhound          105.9          24.49          4.6625      3.19826     m      0
Seal               442            107.3          6.09131     4.67563     m      0
Walrus             1126           667            7.02643     6.50279     m      0
Porpoise           1735           142.43         7.45876     4.95885     m      0
blue_whale         6800           58059          8.82468     10.9692     m      0
Bat                0.94           0.028          -0.06187    -3.57555    m      0
Mole               1.16           0.04           0.14842     -3.21888    m      0
Baboon             140            7.9            4.94164     2.06686     m      0
grey_monkey        66.6           4.55           4.1987      1.51513     m      0
chimpanzee         440            56.69          6.08677     4.0376      m      0
human              1377           74             7.22766     4.30407     m      0
Mouse              0.55           0.018          -0.59783    -4.01738    m      0
Squirrel           3.97           0.183          1.37877     -1.69827    m      0
rhinoceros         655            763            6.48464     6.63726     m      0
african_elephant   5712           6654           8.65032     8.80297     m      0
horse              618            461.76         6.42649     6.13505     m      0

The following plot illustrates the relationship between body weight and brain weight of the species:

[FIGURE 1: Body Weight vs. Brain Weight. Horizontal axis: Body Weight (Thousands of Kilograms); vertical axis: Brain Weight (Thousands of Grams).]

In observing these data, many teachers will think fitting a statistical model to these data will be impossible; others will suggest deleting the two “outliers,” the blue whale and the elephant. Some, more experienced in mathematics, might suggest trying transformations of the data. A number of transformations of one or both variables might be tried, the most common ones being squares, square roots, and logarithms. After taking the natural logarithms of both variables (ln(Brain_wt) and ln(Body_wt)), the scatterplot shown in Figure 2 looks reasonable for fitting a simple linear model.

[FIGURE 2: Body Weight vs. Brain Weight (Transformed). Scatterplot with regression line; horizontal axis: ln (Body Weight); vertical axis: Ln (Brain Weight).]

It is important for teachers to note that once the data are transformed, the “outliers” do not appear to be outliers at all. Instead, the transformation allowed us to see the blue whale and elephant fit quite well with the trend in these data.

Closer examination and perseverance with the analysis, however, suggest there might be two groups of animals: one mainly on or above the single regression line and one below the line and close to zero. The list of animals does contain both fish and mammals, so perhaps that is a key. Plotting the data to show that differentiation and then allowing for different lines for the two groups results in a more informative and better-fitting model (to find the two lines, one uses the species code variable in the data table).

[FIGURE 3: Body Weight vs. Brain Weight (Transformed with Subgroups). Horizontal axis: Ln (Body Weight); vertical axis: Ln (Brain Weight); lines shown: Regression for Fish, Regression for Mammals.]

As shown in the estimated equations, there is no difference between the slopes, so the result of the modeling process becomes two parallel lines, one for fish and one for mammals.

High-school teachers should note that the multiple regression model appropriate for this analysis relates the response variable, the natural log of brain weight (denoted by ln(Br)), to the explanatory variables (1) natural log of body weight (denoted by ln(Bo)) and (2) type of species (denoted by S), including an interaction term that allows the slope of the line to change as we move from mammals to fish. More specifically, $\beta_3$ allows for different slopes, and $\beta_2$ allows for different intercepts. Thus the interaction model to estimate is:

$$\ln(Br) = \beta_0 + \beta_1 \ln(Bo) + \beta_2 S + \beta_3 \left[\ln(Bo) \times S\right] + \varepsilon$$

The least-squares analysis of the interaction model shows that the interaction term does not differ significantly from zero (p-value = 0.82), so it can be eliminated without loss of information. The model without interaction shows high significance for both the ln(Bo) and S terms; the “best-fitting” model reduces to two parallel lines, one for mammals and one for fish (note the R² for this model is 0.90). Teachers may also be encouraged to examine the residual plots to understand the goodness of fit of the model.
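The modeling steps described for the brain weight data (log-transform both variables, fit one line, then fit two parallel lines by group) can be sketched in plain Python. The data are copied from Table 1; the helper names are illustrative. The parallel-lines fit uses the standard least-squares shortcut of a common slope computed from group-centered data plus a separate intercept per group. This sketch does not test the interaction term and makes no attempt to reproduce the reported p-value.

```python
import math

# (species, brain weight in g, body weight in kg, code: 1 = fish, 0 = mammal)
ANIMALS = [
    ("catfish", 1.84, 2.894, 1), ("barracuda", 3.83, 5.978, 1),
    ("mackerel", 0.64, 0.765, 1), ("salmon", 1.26, 3.93, 1),
    ("brown_trout", 0.57, 0.292, 1), ("tuna", 3.09, 5.21, 1),
    ("northern_trout", 1.23, 2.5, 1), ("grizzly_bear", 233.9, 142.88, 0),
    ("cheetah", 2.45, 22.2, 0), ("lion", 106.7, 28.79, 0),
    ("raccoon", 40, 5.175, 0), ("skunk", 10.3, 1.7, 0),
    ("tiger", 302, 209, 0), ("wolf", 152, 29.94, 0),
    ("greyhound", 105.9, 24.49, 0), ("seal", 442, 107.3, 0),
    ("walrus", 1126, 667, 0), ("porpoise", 1735, 142.43, 0),
    ("blue_whale", 6800, 58059, 0), ("bat", 0.94, 0.028, 0),
    ("mole", 1.16, 0.04, 0), ("baboon", 140, 7.9, 0),
    ("grey_monkey", 66.6, 4.55, 0), ("chimpanzee", 440, 56.69, 0),
    ("human", 1377, 74, 0), ("mouse", 0.55, 0.018, 0),
    ("squirrel", 3.97, 0.183, 0), ("rhinoceros", 655, 763, 0),
    ("african_elephant", 5712, 6654, 0), ("horse", 618, 461.76, 0),
]

def mean(values):
    return sum(values) / len(values)

def r_squared(y, fitted):
    """Proportion of the variation in y explained by the fitted values."""
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    sst = sum((yi - mean(y)) ** 2 for yi in y)
    return 1 - sse / sst

ln_brain = [math.log(brain) for _, brain, _, _ in ANIMALS]
ln_body = [math.log(body) for _, _, body, _ in ANIMALS]
codes = [code for _, _, _, code in ANIMALS]

# Model 1: a single least-squares line for all 30 animals.
mx, my = mean(ln_body), mean(ln_brain)
b_single = (sum((x - mx) * (y - my) for x, y in zip(ln_body, ln_brain))
            / sum((x - mx) ** 2 for x in ln_body))
a_single = my - b_single * mx
r2_single = r_squared(ln_brain, [a_single + b_single * x for x in ln_body])

# Model 2: parallel lines (common slope, one intercept per group).
groups = {c: [(x, y) for x, y, ci in zip(ln_body, ln_brain, codes) if ci == c]
          for c in (0, 1)}
centers = {c: (mean([x for x, _ in pts]), mean([y for _, y in pts]))
           for c, pts in groups.items()}
sxy = sum((x - centers[c][0]) * (y - centers[c][1])
          for c, pts in groups.items() for x, y in pts)
sxx = sum((x - centers[c][0]) ** 2 for c, pts in groups.items() for x, _ in pts)
slope = sxy / sxx
intercepts = {c: centers[c][1] - slope * centers[c][0] for c in groups}
r2_parallel = r_squared(
    ln_brain, [intercepts[c] + slope * x for x, c in zip(ln_body, codes)])

print(f"single line:    ln(Br) = {a_single:.2f} + {b_single:.2f} ln(Bo), R^2 = {r2_single:.2f}")
print(f"parallel lines: slope = {slope:.2f}, mammal intercept = {intercepts[0]:.2f}, "
      f"fish intercept = {intercepts[1]:.2f}, R^2 = {r2_parallel:.2f}")
```

Comparing the two R² values shows how much the species code adds beyond body weight alone; residual plots for each fit would be the natural next check.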

Proportional Reasoning in Statistics

Proportional reasoning is important in mathematics, emphasized from the upper elementary grades through high school. This type of reasoning is key to success in statistical reasoning. Proportional reasoning in statistics is about the magnitude of a difference relative to sample size and amount of variability. In essence, it is about understanding magnitudes of differences within a context.

Often in statistics, we want to compare numerical summaries of data between two groups. A major goal of the comparison is to decide whether the observed difference between the two summaries is meaningful.

In statistics, the foundations for judging the size of a difference lie in proportional reasoning. Two factors are taken into account when comparing the size of the difference between two proportions or between two means. These are (1) the group sizes and/or (2) the amount of variability within the data.

SCENARIO 7: MEDICAL SCREENING

Understanding the results of medical screening tests is vitally important to the health of individuals and the functioning of the health care system. Such tests are not perfect, but the nature of the errors can be dangerously misleading. The figure shows what happens in a typical scenario of screening for HIV by use of the ELISA and Western Blot tests.

• For a population of heterosexual men exhibiting low-risk behavior, the rate of HIV infection is about 1 in 10,000. The true positive rate (sensitivity) of these tests is about 99.9%. The true negative rate (specificity) is 99.99%. Discuss how these two figures are used in the accompanying figure.

• The probability of a randomly selected person having a disease given that the screening says the disease is present is called the “positive predictive value.” Discuss how the positive predictive value is calculated in the HIV example. Why do you suppose the author used the smiley faces in this way?

The medical screening example illustrates that probabilities can always be interpreted as long-run relative frequencies. Generally, it is not obvious to teachers that a 0.01% infection rate can be interpreted as about one positive case in a typical group of 10,000 men, or, in other words, what can be expected in a random sample of 10,000 men. If a man is known to be infected, the expectation is that the highly accurate test will detect it.

On the other hand, of the 9,999 men not infected, the expectation is that 9,999 × 0.9999 ≈ 9,998 will be tested as not infected, leaving 1 to be falsely detected as positive. The author, a medical doctor, finds the expected frequencies are much easier for most patients to interpret than are the perplexing percentages.

Teachers should focus on seeing conditional probabilities as relative frequencies and reasoning out the conditional relative frequencies from the data, rather than memorizing formulas. Once the conditional probability is obtained, many will be surprised at how large it is, given the small infection rates and high rates of accuracy among the tests. The message: Conditioning can make a huge difference in rates and depends crucially on the overall infection rate. Now, suppose the infection rate doubles to 0.02%. How will that affect the conditional probability in question?

The conditional reasoning may be easier to see in a table, as shown, rather than a tree diagram.

          Man +    Man -    Totals
Test +    1        1        2
Test -    0        9,998    9,998
Totals    1        9,999    10,000

The condition of testing positive reduces the relevant cases to the first row of data; the chance of actually being positive given that the test was positive is 1/2. If the infection rate doubles to 0.02%, the expected number of positives among those infected goes to approximately 2, and the conditional probability of being positive given that the test is positive increases to about 2/3. Thus, this probability is highly dependent upon the infection rate.
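The expected-frequency reasoning in the medical screening example can be checked with a short sketch. It uses the infection rate, sensitivity, and specificity quoted in the scenario; the function names are illustrative, and the counts are expected values rather than whole people, so they come out as decimals close to the rounded table entries.

```python
def expected_counts(population, infection_rate, sensitivity, specificity):
    """Expected number of men in each cell of the test-by-infection table."""
    infected = population * infection_rate
    not_infected = population - infected
    true_positive = infected * sensitivity
    true_negative = not_infected * specificity
    return {
        "true_positive": true_positive,
        "false_negative": infected - true_positive,
        "false_positive": not_infected - true_negative,
        "true_negative": true_negative,
    }

def positive_predictive_value(cells):
    """Among those who test positive, the fraction actually infected."""
    return cells["true_positive"] / (cells["true_positive"] + cells["false_positive"])

base = expected_counts(10_000, 0.0001, 0.999, 0.9999)
doubled = expected_counts(10_000, 0.0002, 0.999, 0.9999)

ppv_base = positive_predictive_value(base)
ppv_doubled = positive_predictive_value(doubled)
print(f"Infection rate 0.01%: positive predictive value = {ppv_base:.3f}")
print(f"Infection rate 0.02%: positive predictive value = {ppv_doubled:.3f}")
```

Changing the infection rate while holding the test accuracy fixed is a quick way to see how strongly the conditional probability depends on the base rate.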

SCENARIO 8: FACEBOOK

A middle-school student believes girls are more likely to have a Facebook account than boys. This student needs to transition from this research topic to a statistical question to investigate. After some discussion with her teacher, she decides to investigate the following statistical question:

For students at my school, is there an association between sex and having a Facebook account?

This question would lead to the student obtaining a list of all students enrolled in her school and selecting a random sample of 200 students. With help from several friends, the 200 students selected are surveyed and asked to record their sex and whether they have a Facebook account. The data from the survey are summarized in the following contingency table:

Have a Facebook Account?    Sex: Female    Sex: Male
Yes                         75             59
No                          37             29

The student pointed out that more girls (75) had a Facebook account than boys (59). Based on these results, should she conclude there is an association between the variables sex and having a Facebook account? Is this difference meaningful?

To examine the “meaningfulness” of the difference in number of Facebook accounts between girls and boys, teachers must realize they need to adjust for the different group sizes. That is, 75 of 112 girls surveyed had a Facebook account, while 59 of the 88 boys surveyed had a Facebook account. Thus, the proportion of girls in the sample with a Facebook account is 75/112 ≈ .67, and the proportion of boys in the sample with a Facebook account is 59/88 ≈ .67. Because these two proportions are essentially the same, there does not appear to be an association between sex and having a Facebook account.
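The adjustment for group size in the Facebook example amounts to comparing conditional proportions rather than raw counts. A minimal sketch, using the counts from the contingency table (the dictionary layout and names are illustrative):

```python
# Counts from the contingency table: sex -> answer -> number of students.
table = {
    "Female": {"Yes": 75, "No": 37},
    "Male": {"Yes": 59, "No": 29},
}

group_sizes = {sex: sum(answers.values()) for sex, answers in table.items()}
# Proportion with an account within each sex (adjusts for unequal group sizes).
proportion_yes = {sex: table[sex]["Yes"] / group_sizes[sex] for sex in table}

for sex in table:
    print(f"{sex}: {table[sex]['Yes']} of {group_sizes[sex]} "
          f"= {proportion_yes[sex]:.2f}")
print("Difference in counts:", table["Female"]["Yes"] - table["Male"]["Yes"])
print(f"Difference in proportions: "
      f"{proportion_yes['Female'] - proportion_yes['Male']:.3f}")
```

The difference of 16 in the counts shrinks to almost nothing once each count is divided by its group size.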

Teachers should be encouraged to discuss what is gained from a sample of size 200 versus, for example, a sample of size 20 in this scenario. Teachers should note that the larger sample size reduces the variability in the results.

SCENARIO 9: TEXTS

A middle-school student thinks that, on average, boys send more text messages in a day than girls. He thinks this is true for both 7th- and 8th-grade students, and, based on this research topic, formulates the following statistical question:

How does the number of texts sent on a typical day compare between 7th-grade girls and boys, and between 8th-grade girls and boys?

The student obtained four lists of students—a list of all 7th-grade girls enrolled at the school, a list of all 7th-grade boys enrolled at the school, a list of all 8th-grade girls enrolled at the school, and a list of all 8th-grade boys enrolled at the school. Next, the student randomly selected 20 students from each list. Note that this guarantees the same number of students in each of the four groups selected. Each of the 80 students selected is contacted and asked:

How many text messages did you send yesterday?

Here are the data:

TABLE 2

Girls7    Boys7    Girls8    Boys8
39        37       82        56
57        56       84        62
76        58       89        68
78        59       91        79
79        60       94        90
99        75       102       95
117       77       107       101
117       97       111       104
122       102      115       104
124       104      121       109
136       105      124       110
140       120      135       113
141       121      138       126
145       125      140       130
147       127      150       130
151       139      154       137
153       170      159       138
159       182      165       143
202       189      168       147
217       197      171       158

The texts example takes this proportional reasoning a step further. The resulting data are summarized in the four comparative dotplots.

[FIGURE 4: Number of Text Messages Sent in One Day. Comparative dotplots for Boys 8, Girls 8, Girls 7, and Boys 7; horizontal axis from 50 to 200.]

Summary statistics for each group are:

Group              Group Size    Mean     StDev
7th-Grade Girls    20            125.0    44.8
7th-Grade Boys     20            110.0    47.5
8th-Grade Girls    20            125.0    29.7
8th-Grade Boys     20            110.0    29.1

Note that the difference between the means for girls and boys in the 7th grade is 15 texts. Thus, the 7th-grade girls sent, on average, 15 more texts than the 7th-grade boys. The difference is the same when comparing 8th-grade girls to 8th-grade boys: the 8th-grade girls sent, on average, 15 more texts than the 8th-grade boys. Thus, in absolute terms, the difference between the group means for girls and boys is the same in both grades. However, when evaluating the size of this difference, teachers must use proportional reasoning to examine this difference in centers in relation to the amount of variability in the data.

While the means for 7th-grade girls and boys are different, the standard deviations are fairly close (44.8 versus 47.5), indicating similar amounts of variability in number of texts sent for both 7th-grade girls and 7th-grade boys. Also, the standard deviations for 8th-grade girls and 8th-grade boys are fairly close (29.7 versus 29.1). Again, this indicates similar amounts of variability in the number of texts sent for both 8th-grade girls and 8th-grade boys. However, there is considerably less variability in the data on number of texts for the 8th-grade boys and girls than there is for the 7th-grade boys and girls. How does this affect the meaningfulness of the difference between groups?

For each grade level, teachers can judge the size of the difference between means by dividing the difference by a standard deviation (for a detailed example of this approach, see the Chapter 4 illustrative example). As long as the sample sizes are the same, this quantity provides insight into how meaningful the observed difference is for each class. For each grade, teachers can use the larger of the two standard deviations. Thus, for 7th graders, this quantity would be 15/47.5 = .32. For 8th graders, this quantity would be 15/29.7 = .51.

Because this quantity is larger for 8th graders, the difference of 15 texts is more meaningful for 8th-grade students than it is for 7th-grade students.

As previously indicated, as long as the sample sizes are the same, this quantity provides useful information about the magnitude of the difference between measures of center for two groups. When the sample sizes are different, sample size is also a factor teachers must consider when judging the magnitude of the difference.

The classical statistical procedure for comparing two population means based on independent random samples uses the t statistic, defined as:

$$t = \frac{\bar{x}_1 - \bar{x}_2}{SE}, \quad \text{where} \quad SE = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$$

Here $\bar{x}_i$, $s_i$, and $n_i$ denote the mean, standard deviation, and size of sample $i$.

For sample sizes of 20 or more, the t-test is generally robust to departures from its usual assumptions. The larger the t statistic, the stronger the evidence of a meaningful (significant) difference between the two sample means. Note that the denominator of the t statistic takes into account both the sample sizes and variability within each sample. Thus, the t statistic is measuring the magnitude of the difference between the two sample means relative to both the sample sizes and amount of variability within each group. Also note that the smaller the denominator is, the larger the value for the t statistic. In this case, the larger the sample sizes are, the smaller the denominator. Also, the smaller the standard deviations (amounts of variability) are, the smaller the denominator.

The t statistic for comparing the mean number of texts of 7th-grade girls with 7th-grade boys is (yielding a p-value of ~0.3):

$$t = \frac{125.0 - 110.0}{\sqrt{\frac{44.8^2}{20} + \frac{47.5^2}{20}}} \approx 1.03$$

The t statistic for comparing the mean number of texts of 8th-grade girls with 8th-grade boys is (yielding a p-value of ~0.1):

$$t = \frac{125.0 - 110.0}{\sqrt{\frac{29.7^2}{20} + \frac{29.1^2}{20}}} \approx 1.61$$
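Both ways of judging the 15-text difference (dividing by the larger standard deviation, and the t statistic) can be computed from the summary statistics table. This sketch assumes the unpooled standard error, the square root of s1²/n1 + s2²/n2, as the denominator of the t statistic; with equal group sizes the pooled version gives the same value. The function names are illustrative, and p-values are not computed here.

```python
import math

def difference_in_sd_units(mean1, sd1, mean2, sd2):
    """Difference between means divided by the larger standard deviation."""
    return (mean1 - mean2) / max(sd1, sd2)

def two_sample_t(mean1, sd1, n1, mean2, sd2, n2):
    """t statistic with an unpooled standard error (an assumption of this sketch)."""
    standard_error = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (mean1 - mean2) / standard_error

# Summary statistics from the table: (mean, standard deviation, group size).
girls7, boys7 = (125.0, 44.8, 20), (110.0, 47.5, 20)
girls8, boys8 = (125.0, 29.7, 20), (110.0, 29.1, 20)

ratio7 = difference_in_sd_units(girls7[0], girls7[1], boys7[0], boys7[1])
ratio8 = difference_in_sd_units(girls8[0], girls8[1], boys8[0], boys8[1])
t7 = two_sample_t(*girls7, *boys7)
t8 = two_sample_t(*girls8, *boys8)

print(f"7th grade: difference/SD = {ratio7:.2f}, t = {t7:.2f}")
print(f"8th grade: difference/SD = {ratio8:.2f}, t = {t8:.2f}")
```

Both measures are larger for the 8th graders even though the difference in means is 15 texts in each grade, because the 8th-grade data are less variable.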

Note the t statistic is larger for the 8th graders than for the 7th graders. This is because there is less variabil-ity in the data on 8th graders than 7th graders.

Therefore, while the girls in both 7th and 8th grade sent an average of 15 texts more than boys in these samples, this difference is more meaningful for the 8th graders than it is for the 7th graders. This is because there is less variability in the number of texts sent for the 8th graders than for the 7th graders.

The Role of Randomness in Statistics

Random assignment and random selection are fundamental concepts in statistics that teachers should understand. Although both are difficult to achieve in practice, they serve as gold standards for different aspects of statistics. While it is important to stress the importance of randomization in statistics, teachers should be aware that, in practice, conditions are far from ideal, so many statistical techniques are developed around what to do in light of not having ideal conditions.

Random selection is the backbone of statistical inference, as it provides a way to obtain a sample representative of the population. This notion is key to the construction and use of sampling distributions for inference. Teachers must have experience exploring and working with the notion of sampling distributions, a concept that, if not given an adequate amount of time, can be confusing and mysterious. The focus of such exploration should be on distinguishing among the population distribution, the distribution of sample data, and sampling distributions.

Random assignment in an experiment is indispensable if one would like to claim causality of some type. Teachers should understand why random assignment helps mitigate the effects of potential confounding variables in an experiment. In particular, the distributions of potential confounding variables should be similar across conditions if random assignment took place.
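One way to see why random assignment tends to balance a potential confounding variable is a small simulation. The sketch below is illustrative only: the 40 subjects, their "prior score" values, and the function name are hypothetical and do not come from the source.

```python
import random
import statistics

def mean_gap_after_random_assignment(values, rng):
    """Randomly split values into two equal groups; return the gap in group means."""
    shuffled = values[:]
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return statistics.mean(shuffled[:half]) - statistics.mean(shuffled[half:])

rng = random.Random(1)

# Hypothetical confounder: a prior score for each of 40 subjects.
prior_scores = [rng.gauss(70, 10) for _ in range(40)]

# Repeat the random assignment many times and record the gap each time.
gaps = [mean_gap_after_random_assignment(prior_scores, rng) for _ in range(2000)]

print("average gap in group means:", round(statistics.mean(gaps), 2))
print("typical size of a single gap:", round(statistics.pstdev(gaps), 2))
```

Across repeated random assignments the gaps center on zero, so neither group is systematically favored on the confounder, although any single assignment can still show some imbalance.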

SCENARIO 10: HEALTHIER MENU

School administrators are interested in providing a menu in the lunchroom that students like. The administrators at a school plan to survey students to measure satisfaction with the new healthier menu in the lunchroom. They would like to answer the research question:

Do students like the new lunch menu at the school?

They survey the students by asking them the following survey questions (note these are survey questions, not statistical questions):

• Do you purchase food in the lunchroom?

• How many days a week do you purchase food in the lunchroom?

• Are you satisfied with your purchases?

• Do you like the food in the lunchroom?

To study whether students like the new lunchroom menu, teachers are expected to know that the results of such a survey can be generalized to the entire school only if the selected sample is a random sample from the entire student body at the school. In discussing a method for obtaining a random sample, teachers may suggest other sampling methods that may not produce samples adequate for making inferential statements.


SCENARIO 11: HOMEOWNERS

A student attempts to investigate whether homeowners in the neighborhood support a proposed new tax for schools. This student thus articulates the following statistical question to investigate:

Will over 50% of the homeowners in your neighborhood agree to support a proposed new tax for schools?

The student takes a random sample of 50 homeowners in her neighborhood and asks them if they support the tax. Twenty of the sampled homeowners say they will support the proposed tax, yielding a sample proportion of 0.4. That seems like bad news for the schools, but is it plausible that the population proportion favoring the tax in this neighborhood could still be 50% or more?

The notion of random sampling can be introduced quite naturally by examining a question such as the one in the homeowners example regarding a decision about an unknown population proportion using the tool of simulation. Teachers may agree that sample proportions can differ from sample to sample, so that a second sample from this same set of households is likely to produce a different result, but they may not see how this knowledge is helpful in answering the question. What does “plausible” mean?

At this point, the probabilistic reasoning of statistical inference must be carefully explained and demonstrated, and this reasoning can be introduced via simulation (note that this scenario could also be modeled using a binomial distribution with probability of 0.5 representing the probability of supporting the tax; however, for the purpose of this example, the focus will be on using more informal methods of simulation to answer the statistical question posed).

One can suppose the population proportion favoring the tax proposal is, indeed, 50%. The teachers can create a model for such a population (perhaps random digits with even numbers equated to favoring the proposal) and take repeated random samples of size 50 from it, each time recording the sample proportion of “favors.” The plot of 200 runs of such a simulation (Figure 5) has 25 out of the 200, or 12.5%, at or below 0.40. So, the chance of seeing a favorable response of 40% or less in the sample, even if the true proportion of such responses was 50%, is not all that small, casting little doubt on 50% as a plausible population value. It is important that teachers recognize that multiple samples are not needed for inference; the simulated exercise is merely a way to understand the construction of sampling distributions and how a sampling distribution is used in inference.

[FIGURE 5. Sample proportions: sample size 50; true proportion 0.5. Horizontal axis: Sample Proportions, 0.2 to 0.8.]
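The simulation behind a plot like Figure 5 can be sketched in a few lines. This is a minimal illustration, not the procedure the authors used: the function name and seed are hypothetical, and a pseudo-random number generator stands in for the random-digit model described in the text.

```python
import random

def simulate_favor_counts(p=0.5, n=50, runs=200, seed=None):
    """For each run, draw a random sample of size n and count the 'favors'."""
    rng = random.Random(seed)
    return [sum(1 for _ in range(n) if rng.random() < p) for _ in range(runs)]

counts = simulate_favor_counts(seed=2016)

# A sample proportion at or below 0.40 means 20 or fewer "favors" out of 50.
at_or_below = sum(1 for c in counts if c <= 20)
print(f"{at_or_below} of {len(counts)} simulated samples at or below 0.40")
print("share:", at_or_below / len(counts))
```

Because each set of 200 runs is itself random, the share at or below 0.40 will vary from one simulation to the next; increasing `runs` gives a more stable estimate.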


SCENARIO 12: DOLPHINS

Swimming with dolphins can certainly be fun, but is it also therapeutic for patients suffering from clinical depression? To investigate this possibility, researchers recruited 30 subjects aged 18–65 with a clinical diagnosis of mild to moderate depression. Subjects were required to discontinue use of any antidepressant drugs or psychotherapy four weeks prior to the experiment and throughout the experiment. These 30 subjects went to an island off the coast of Honduras, where they were randomly assigned to one of two treatment groups. Both groups engaged in the same amount of swimming and snorkeling each day, but one group (the animal care program) did so in the presence of bottlenose dolphins. The other group (outdoor nature program) did not. At the end of two weeks, each subject’s level of depression was evaluated, as it had been at the beginning of the study (Antonioli and Reveley, 2005).

The following table summarizes the results of this study:

| | Showed Substantial Improvement | No Substantial Improvement | Total |
|---|---|---|---|
| Animal care program (dolphin therapy) | 10 | 5 | 15 |
| Outdoor nature program (control group) | 3 | 12 | 15 |
| Total | 13 | 17 | 30 |

For the homeowners example (Scenario 11), inferential reasoning should now move from simulation to more formal methods developed from the normal distribution. The simulated distribution of sample proportions (the sampling distribution) has a mean of 0.49 and a standard deviation of 0.07. It can be modeled quite well by a normal distribution having that mean and standard deviation. But we need not run a simulation each time we deal with a new proportion or sample size, because the underlying theory tells us that the mean of the sampling distribution of sample proportions will always equal the population proportion, $p$, and the standard deviation will be given by

$$\sqrt{\frac{p(1-p)}{n}}$$

where $n$ is the sample size. These values are, respectively, 0.50 and 0.07 for the example, nearly identical to the values from the simulation. Thus, the observed sample proportion of 0.40 is only 1.43 standard deviations below the mean, not far enough to say it is outside the range of reasonably likely outcomes.

The randomization process can be repeated for other choices of the population proportion, resulting in an interval of “plausible values” for that parameter. An easier way to accomplish this, however, is to make use of the normal distribution model. If “reasonably likely” outcomes are set to be those in the middle 95% of the distribution of the sample proportion, $\hat{p}$, then any population proportion within about two standard deviations (estimated using the sample proportion) of the observed $\hat{p}$ would have that sample result within its reasonably likely range. Thus, all proportions within two standard deviations of the observed sample proportion form an interval of plausible values for the true population proportion.
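The normal-model reasoning for the homeowners example (20 of 50 sampled homeowners in favor) can be worked through numerically. The sketch below is a minimal illustration; the function and variable names are hypothetical, and the interval uses the "about two standard deviations" rule described in the text rather than an exact 95% multiplier.

```python
import math

def sd_of_sample_proportion(p, n):
    """Standard deviation of the sampling distribution of a sample proportion."""
    return math.sqrt(p * (1 - p) / n)

n = 50
p_hat = 20 / n  # observed sample proportion, 0.40

# How far is 0.40 from a hypothesized population proportion of 0.50?
sd_null = sd_of_sample_proportion(0.50, n)
z = (p_hat - 0.50) / sd_null

# Interval of plausible values: observed proportion +/- 2 estimated SDs.
sd_est = sd_of_sample_proportion(p_hat, n)
lower, upper = p_hat - 2 * sd_est, p_hat + 2 * sd_est

print(f"SD under p = 0.50: {sd_null:.3f}")
print(f"standardized distance: {z:.2f}")
print(f"plausible values: {lower:.3f} to {upper:.3f}")
```

With the unrounded standard deviation the observed proportion sits about 1.41 standard deviations below 0.50; the 1.43 quoted in the text comes from rounding the standard deviation to 0.07 first. Either way, 0.50 falls inside the interval of plausible values.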

The dolphin study provides an example of an experiment that randomly assigns subjects to treatments. Notice that 10 of the patients in the Animal Care Program group showed substantial improvement compared to 3 of the Outdoor Nature Program group. Because the groups are the same size, we do not need to calculate proportions to compare. More of the people who swam with the dolphins improved.

It is possible, however, that this difference (10 vs. 3) could happen even if dolphin therapy was not effective, simply due to the random nature of putting subjects into groups (i.e., the luck of the draw). But if 13 of the 30 people were going to improve regardless of whether they swam with dolphins, we would have expected 6 or 7 to end up in each group; how unlikely is a 10/3 split by this random assignment process alone? If the answer is that this observed difference would be surprising if dolphin therapy were not effective, then we would have strong evidence to conclude that dolphin therapy is effective. Why? Because we would have to believe a rare event just happened to occur in this experiment otherwise.

It is possible to see whether the observed 10 improvements under the dolphin therapy treatment is unusually large, given no treatment effect, by running a simulation of the randomization process. Suppose the 13 who improved are equally likely to improve under either treatment (no treatment effect). Then, any of the 10 “improvers” under the dolphin therapy just happened to fall into that column by chance. How likely is it to get that result by chance alone?

One possible model of this process begins by choosing 13 red cards to indicate “improvers” and 17 black cards to represent “non-improvers.” Randomly select 15 cards from the 30 to place in the dolphin treatment category (the other 15 go in the control group) and count the number of “improvers” (red cards) among the 15. Repeat this randomization process many times (possibly with the aid of a computer) to generate a distribution of dolphin therapy “improvers,” and then calculate the proportion of these counts that are 10 or greater.

If this proportion turns out to be small, that evidence suggests the observed count of 10 is not likely to occur under the conditions of chance alone; the dolphin therapy seems to be having an effect. In a plot of 100 runs of the simulation outlined above, 10 was equaled or exceeded only two times, a small chance indeed.
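The card model described above translates directly into a short program. The sketch below is one possible implementation, not the authors' code; the function name and seed are hypothetical. As a cross-check, it also computes the exact chance of drawing 10 or more red cards by counting the equally likely ways of choosing 15 cards from 30.

```python
import random
from math import comb

def simulate_improver_counts(runs=1000, seed=None):
    """Deal 15 of the 30 cards to the dolphin group; count red cards each time."""
    rng = random.Random(seed)
    cards = ["red"] * 13 + ["black"] * 17  # 13 improvers, 17 non-improvers
    counts = []
    for _ in range(runs):
        dolphin_group = rng.sample(cards, 15)
        counts.append(dolphin_group.count("red"))
    return counts

counts = simulate_improver_counts(runs=1000, seed=30)
simulated_share = sum(1 for c in counts if c >= 10) / len(counts)

# Exact chance of 10 or more red cards among the 15 dealt to the dolphin group.
exact_share = sum(comb(13, k) * comb(17, 15 - k) for k in range(10, 14)) / comb(30, 15)

print("simulated share with 10 or more improvers:", simulated_share)
print("exact chance:", round(exact_share, 4))
```

The exact chance is a little over 1%, in line with the two-in-100 result reported for the simulation in the text.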


APPENDIX 2

This appendix includes a sample activity handout that could be used in a professional development course or classroom. The activities are taken from the illustrative examples provided in the school-level chapters (4, 5, and 6). The examples are: (1) Breakfast and Tests, (2) Bottled Water, and (3) Texting. The goal of this appendix is to illustrate a line of potential questions that could be used directly with teachers so they can work through the examples. For each question posed, the answer and explanation can be found in the description in chapters 4, 5, and 6.

Example Sample Teacher Task for Breakfast and Tests with Solutions

A college professor teaches a course designed to prepare elementary-school mathematics teachers to teach statistics in the schools. As part of the professor’s assessment of the teachers’ statistical understanding, the professor decides to use the LOCUS (Level of Conceptual Understanding of Statistics, www.locus.statisticseducation.org) exam. This exam is designed to assess conceptual understanding of statistics at the PreK–12 grade levels. The professor will give 30 questions focused on the elementary- and middle-school-level statistics standards.

1. The professor wants to research whether eating breakfast before a morning exam could affect an individual’s score on the exam.

2. How can the professor’s research be articulated in a statistics question?

3. How would you design a study to investigate the question? Explain.

4. How would you set up your study?

5. What type of data would be collected from your study?

6. Outline the data-collection process you need to carry out.

7. Carry out your study within your classroom or suppose the following data were collected:

Forty teachers participated in the study and completed the beginning/intermediate level (levels A and B) LOCUS exam, which consisted of 30 multiple-choice questions. Following are the scores (number correct out of 30 questions) for the teachers in each group:

Breakfast: 26 21 29 17 24 24 23 19 24 25 20 25 22 29 28 18 30 23

No Breakfast: 20 20 19 15 20 25 17 20 22 18 28 21 22 23 26 17 21 16 14 19 28 11

8. What are appropriate ways to graphically represent and summarize your data? Explain your choices.

9. What do your results indicate?

10. How do the exam scores compare between the non-breakfast group and breakfast group?
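For questions 8 through 10, numerical summaries of the two groups are a natural starting point. The sketch below uses the scores listed in item 7; the choice of summaries (mean, median, standard deviation) and the helper name are illustrative assumptions, not a prescribed solution.

```python
import statistics

# Scores (number correct out of 30) as listed in item 7.
breakfast = [26, 21, 29, 17, 24, 24, 23, 19, 24, 25, 20, 25, 22, 29, 28, 18, 30, 23]
no_breakfast = [20, 20, 19, 15, 20, 25, 17, 20, 22, 18, 28,
                21, 22, 23, 26, 17, 21, 16, 14, 19, 28, 11]

def summarize(scores):
    """Return basic summaries of center and variability for one group."""
    return {
        "n": len(scores),
        "mean": statistics.mean(scores),
        "median": statistics.median(scores),
        "sd": statistics.stdev(scores),
    }

for label, scores in (("Breakfast", breakfast), ("No Breakfast", no_breakfast)):
    s = summarize(scores)
    print(f"{label:13s} n={s['n']:2d} mean={s['mean']:.2f} "
          f"median={s['median']} sd={s['sd']:.2f}")

difference = statistics.mean(breakfast) - statistics.mean(no_breakfast)
print(f"difference in means (breakfast - no breakfast): {difference:.2f}")
```

Whether a difference of this size is meaningful still has to be judged against the variability within each group, for example with comparative dotplots or boxplots.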

Example Sample Teacher Task for Bottled Water with Solutions

A student is planning a project for a regional statistics poster competition. The student recently read about the rise in consumption of bottled water and the environmental implications of this rise. The student wondered whether people actually prefer bottled water to tap, or if they could even tell the difference.

1. What questions could be articulated and researched to investigate whether people can tell the difference between bottled water and tap?

2. How would you design a study to investigate this question? Explain.

3. How would you set up a study to answer your question?

4. What type of data would be collected from the study?


5. Outline the data-collection process you need to carry out.

6. Carry out your study or suppose the following data were collected:

Twenty participants were presented with two identical-looking cups of water with 2 ounces of water in each cup. Each participant drank the water from the cup on the right first, and then drank the water from the cup on the left. Unknown to the participants, the cup on the right contained tap water for half the participants and bottled water for the other half. Each participant identified which cup of water he/she considered to be the bottled water. Suppose the following data were collected (C = correct identification, I = incorrect): C, I, I, C, I, I, C, I, C, C, I, C, I, C, C, I, C, C, C, C.

7. What are appropriate ways to graphically represent and summarize your data? Explain your choices.

8. What do your results indicate?

9. Is it still plausible that each participant was simply guessing? Why or why not? Explain.

10. How could you simulate a person guessing?

11. How could you simulate the entire experiment you carried out within the class?

12. Create a dotplot of your simulated data. Describe the center, variability, and shape of the distribution depicted in the dotplot.

13. What are common values for the number of people who guessed correctly?

14. What does the dotplot tell us about the preference of bottled water?

15. Perform a simulation of 1,000 trials. Does it show anything different from the previous simulation? Describe the center, variability, and shape of the distribution of the number of heads for the 1,000 trials.

16. Where does your sample result fall on the dotplot? Does the sample result appear to be “typical”?

17. Based on the dotplot, do you think it is still plausible that each participant was simply guessing? Why or why not? Explain.
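Questions 10 through 17 can be explored with a computer in place of coin flips. The sketch below assumes that "C" in the data marks a correct identification and that a guessing participant is correct with probability 0.5; the function name, seed, and text-based dotplot are illustrative choices rather than part of the source activity.

```python
import random
from collections import Counter

responses = "C I I C I I C I C C I C I C C I C C C C".split()
observed_correct = responses.count("C")  # assumed: C = correct identification

def simulate_guessing(participants=20, trials=1000, seed=None):
    """Each trial: count correct answers when every participant guesses (p = 0.5)."""
    rng = random.Random(seed)
    return [sum(rng.random() < 0.5 for _ in range(participants))
            for _ in range(trials)]

results = simulate_guessing(seed=7)
tally = Counter(results)

# Rough text dotplot: one dot per 5 simulated trials.
for correct in sorted(tally):
    print(f"{correct:2d} {'.' * (tally[correct] // 5)}")

share_at_or_above = sum(1 for r in results if r >= observed_correct) / len(results)
print("observed number correct:", observed_correct)
print("share of trials at or above the observed count:", share_at_or_above)
```

If a sizable share of the guessing-only trials reach the observed count, then guessing remains a plausible explanation for the sample result.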

Example Sample Teacher Task for Texting with Solutions

Suppose a student wants to study how many texts students receive and send in a typical day. On thinking a little deeper, though, the student decides she really wants to know more than, say, the average number of texts received and sent because she believes students tend to send fewer texts than they receive.

1. What questions could be articulated and researched to investigate this topic?

2. How would you design a study to investigate this question? Explain.

3. Outline the data-collection process in detail that the students would need to carry out the study.

4. Carry out the study in your school or find appropriate existing data.

5. Suppose the survey generated the data shown in the following table (Table 2):


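| Gender | Text Messages Sent Yesterday | Text Messages Received Yesterday | Homework Hours (week) | Text Messaging Hours (week) |
|---|---|---|---|---|
| Female | 500 | 432 | 7 | 30 |
| Female | 120 | 42 | 18 | w3 |
| Male | 300 | 284 | 8 | 45 |
| Female | 30 | 78 | 3 | 8 |
| Female | 45 | 137 | 12 | 80 |
| Male | 0 | 93 | 5 | 0 |
| Male | 52 | 75 | 15 | 6 |
| Male | 200 | 293 | 14 | 10 |
| Male | 100 | 145 | 10 | 2 |
| Female | 300 | 262 | 3 | 83 |
| Male | 29 | 82 | 7 | 4 |
| Male | 0 | 80 | 2 | 3 |
| Male | 30 | 99 | 15 | 5 |
| Male | 0 | 74 | 3 | 0.5 |
| Male | 0 | 17 | 28 | 0 |
| Male | 10 | 107 | 10 | 6 |
| Female | 10 | 101 | 9 | 3 |
| Female | 150 | 117 | 6 | 100 |
| Female | 25 | 124 | 4 | 4 |
| Male | 1 | 101 | 7 | 1 |
| Female | 34 | 102 | 25 | 5 |
| Male | 23 | 83 | 7 | 10 |
| Male | 20 | 118 | 1 | 1 |
| Male | 319 | 296 | 12 | 12 |
| Female | 0 | 87 | 5 | 1 |
| Female | 30 | 100 | 3 | 70 |
| Female | 30 | 107 | 20 | 30 |
| Female | 0 | 8 | 9 | 0.2 |
| Female | 100 | 160 | 1 | 60 |
| Male | 20 | 111 | 1 | 2 |
| Female | 200 | 129 | 3 | 30 |
| Male | 25 | 101 | 18 | 6 |
| Female | 50 | 56 | 1 | 2 |
| Female | 30 | 117 | 15 | 2 |
| Male | 50 | 76 | 7 | 23 |
| Male | 40 | 60 | 6 | 10 |
| Female | 160 | 249 | 5 | 10 |
| Male | 6 | 96 | 8 | 2 |
| Male | 150 | 163 | 20 | 25 |
| Female | 200 | 270 | 10 | 30 |

TABLE 2

Source: www.amstat.org/censusatschool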


• What are appropriate ways to graphically represent and summarize your data? Explain your choices.

• What is the trend seen in your graphical representation (scatterplot)? Calculate the least-squares regression line for these data.

• Describe the association between texts sent and texts received for the large cluster of points below the least-squares line between 60 and 120 texts received. What effect does this cluster of points have on the slope of the line?

• What is the effect of the outlier at the extreme upper right? What would happen to the slope of the line if that data value was found to be in error and removed from the data set?

• Does the plot provide evidence in favor of the belief that students tend to send fewer texts than they receive? Would that evidence be strengthened if the outlying point were removed?

• What is the trend seen in the scatterplot of homework hours vs. messaging hours? Calculate the least-squares regression line for these data.
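The least-squares questions above can be explored with the sent and received columns of Table 2. The sketch below is a minimal illustration under two assumptions not stated in the source: texts received is treated as the explanatory variable and texts sent as the response, and the "outlier at the extreme upper right" is taken to be the student with 500 texts sent and 432 received. The helper name is hypothetical.

```python
import statistics

# Texts sent and received yesterday, in Table 2 row order.
sent = [500, 120, 300, 30, 45, 0, 52, 200, 100, 300,
        29, 0, 30, 0, 0, 10, 10, 150, 25, 1,
        34, 23, 20, 319, 0, 30, 30, 0, 100, 20,
        200, 25, 50, 30, 50, 40, 160, 6, 150, 200]
received = [432, 42, 284, 78, 137, 93, 75, 293, 145, 262,
            82, 80, 99, 74, 17, 107, 101, 117, 124, 101,
            102, 83, 118, 296, 87, 100, 107, 8, 160, 111,
            129, 101, 56, 117, 76, 60, 249, 96, 163, 270]

def least_squares(x, y):
    """Return (slope, intercept) of the least-squares line predicting y from x."""
    x_bar, y_bar = statistics.mean(x), statistics.mean(y)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar

slope, intercept = least_squares(received, sent)
print(f"all data:        sent = {intercept:.1f} + {slope:.2f} * received")

# Refit without the assumed outlier (500 sent, 432 received).
kept = [(r, s) for r, s in zip(received, sent) if (s, r) != (500, 432)]
slope2, intercept2 = least_squares([r for r, _ in kept], [s for _, s in kept])
print(f"outlier removed: sent = {intercept2:.1f} + {slope2:.2f} * received")

# How many students sent fewer texts than they received?
fewer_sent = sum(1 for s, r in zip(sent, received) if s < r)
print(f"{fewer_sent} of {len(sent)} students sent fewer texts than they received")
```

Comparing the two fitted lines shows how much a single extreme point influences the slope, and the final count speaks directly to the belief that students send fewer texts than they receive.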
