Seeking the Magic Metric: Using Evidence to Identify and Track School System Progress
The Wing Institute's Sixth Annual Summit on Evidence-based Education, April 21, 2011
Mary Beth Celio, Northwest Decision Resources
Seattle, Washington
Based on work funded by the Wallace Foundation and completed in cooperation with the Center on Reinventing Public Education, Daniel J. Evans School of Public Affairs, University of Washington. Expanded through research done in urban school districts for the design of early warning systems.
What is a metric?
In the business world, a metric is any type of measurement used to gauge some quantifiable component of a company's performance, such as return on investment, employee and customer churn rates, revenues, and so on; in software development, a metric is the measurement of a particular characteristic of a program's performance or efficiency.
There is a natural desire to reduce complexity, to summarize vast quantities of information into a single number that characterizes everything from a person to a nation. Examples: a student's GPA or SAT score; a borrower's credit score; a country's GNP.
The general public is familiar with metrics in common use: poverty level, unemployment rate, cost of living index, even AYP. All are summaries of complex data.
And why would we want to seek one that is “magic?”
An effective metric captures the key elements of an institution in a compact and compelling way, pointing toward a (perhaps unarticulated) goal.
A "magic metric" would be a metric that is:
- compact, understandable, and measurable;
- able to be used both with units in a system of similar units (e.g., retail stores in a chain, high schools in a district, hospitals in a network) and with the system as a whole; and
- universally accepted.
The uses of a metric or indicator system
Beyond its power to summarize complex data, a metric or indicator system should give those with responsibility for a system the ability to make decisions: to decide whether and where action is needed.
Medical researchers and biostatisticians have often been ahead of the game in searching for improved triage tools; education research is following their lead.
The current rush to develop metrics, dashboards, report cards, indicator systems, and indices (like California's API) responds to the need for accountability, and offers a substitute for the much-maligned AYP.
Basic principles of metric development:
"Top-down" vs. grassroots? Neither. There is a long history of conflict between top-down and grassroots approaches; being evidence-based matters more than either.
Less may be more, but one is not enough. Schools and school systems are awash in data, and the human mind has a limited capacity to absorb it. A single metric, though attractive, is difficult to unpack or to translate into action.
Parsimony and power must be respected. "Thin slicing" is key (cf. Malcolm Gladwell).
Basic principles of metric development, cont.:
Current-status data are necessary but not sufficient. Data out of context are ungrounded; year-to-year fluctuations confuse.
Proxies for key elements (e.g., adequacy of funding, teacher effectiveness) are inevitable. Areas for which there are no universally accepted indicators cannot be excused from assessment and reporting for that reason.
Presentation cannot be an afterthought. "Getting information from a table is like extracting sunbeams from a cucumber" (Farquhar and Farquhar, via Wainer).
Selected indicators should include:
1. A measure of the status of a school (or school system) relative to a specific, if implied, goal (e.g., goal is high school completion; status indicator is percent of Class of 2005 who graduated on time)
2. A description of the trend in this measure over five years relative to the first year (e.g., change in completion rates from 2001 through 2005)
3. A way to diagnose underlying problems and/or predict the future performance on this indicator (e.g., percent of 9th graders failing 2+ courses and earning < 1/4th required credits predicts percent non-completions—in progress based on Chicago Consortium research)
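The three indicator types above can be sketched in a few lines. This is a minimal illustration, not district code: the completion rates are taken from the Truman High example in this deck, while the 9th-grade records and all variable names are hypothetical.

```python
# Sketch of the three indicator types for a high-school-completion goal.
# Completion rates come from the Truman High example in this deck; the
# 9th-grade records and all field names are hypothetical.

completion_rate = {  # percent of each class graduating on time
    2001: 71.2, 2002: 72.6, 2003: 66.7, 2004: 73.9, 2005: 78.5,
}

# 1. Status: the most recent on-time completion rate.
years = sorted(completion_rate)
status = completion_rate[years[-1]]

# 2. Trend: change over the five years relative to the first year.
trend = completion_rate[years[-1]] - completion_rate[years[0]]

# 3. Leading indicator (after the Chicago Consortium research): percent of
#    9th graders failing 2+ courses, which predicts later non-completion.
ninth_graders = [{"fails": 3}, {"fails": 0}, {"fails": 2}, {"fails": 1}]
off_track = 100 * sum(s["fails"] >= 2 for s in ninth_graders) / len(ninth_graders)

print(f"status={status}%, trend={trend:+.1f} pts, off_track={off_track}%")
```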
[Chart: Harry S. Truman High School, 2004-05 school enrollment by grade (9th through 12th), broken out by race/ethnicity (White, African American, Hispanic); vertical axis runs from 0 to 200 students.]
[Chart: Harry S Truman High School, 2001-2005 cohort completion. Each line tracks a class's enrollment from 9th grade through graduation; vertical axis runs from 550 to 900 students. On-time completion rates by class: 2001, 71.2%; 2002, 72.6%; 2003, 66.7%; 2004, 73.9%; 2005, 78.5%.]
The birth of the particular metric/indicator system presented today
Wallace-supported research clarified the current challenges faced by school leaders at all levels. These findings were bolstered by an extensive review of the “indicator” literature and further research in urban school districts. Key conclusions:
There is extreme pressure on school leaders to know and show where they are and where they’re going.
All levels of school personnel are drowning in data and need help figuring out what the data say.
The public (everyone from the Federal government to the local parent) needs to be able to understand what the data are saying.
There is frenzied activity among consultants/researchers to provide districts and schools with tools to gather, analyze and present data—some of which confuse the matter further.
Steps in a process
1. Select goals to be measured: use the Deming (TQC) approach; survey existing goal/vision statements; consult superintendents.
2. Include different types of indicators: output (achievement, achievement gap, student completion/retention); input (student attraction, student engagement, teacher attraction/retention, funding equity); lagging (status and trend for each); leading (projections of future performance).
3. Select quantitative measures linked by research to outcomes: they must be readily available in districts/schools and intuitively appropriate.
4. Specify comparison group(s): other schools in the district/state (all, or those with a similar profile).
5. Decide on the number of indicators: 7 +/- 2.
6. Select a display mechanism: it should permit status and change in the same format, be designed to encourage rapid understanding (a familiar format, if possible), and be "do-able" by school districts.
Seven indicators suggested:
1. Student achievement: scores on standards-based math and reading tests.
2. Elimination of the achievement gap: status of, and change in, reading and math achievement for subgroups of students by race, economic status, English-language facility, etc. (where there are adequate numbers within a subgroup for comparison).
3. Student attraction: the ability of the school/district to attract students where parents/students have opportunities for choice.
4. Student engagement with school: proxy measures of school engagement, including attendance, tardiness, and involvement in school.
Seven indicators suggested (cont.):
5. Student retention/completion: retention of students during the school year(s) and completion of the requirements appropriate at each level (elementary, middle, high).
6. Teacher attraction and retention: proxy measures of teacher attraction using applications per opening and non-retirement turnover.
7. Funding equity/efficiency: a proxy measure comparing the amount of funding per student expected by policy with the amount actually received; return on investment using calculated per-student funding.
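The funding-equity proxy in indicator 7 reduces to a simple comparison. The sketch below uses hypothetical dollar figures; the variable names and the choice to report the gap as a percentage shortfall are illustrative, not part of the Wallace design.

```python
# Funding-equity proxy (indicator 7): compare the per-student funding a
# school is expected to receive under policy with what it actually receives.
# All dollar figures below are hypothetical.

expected_per_student = 9_500.0   # policy-implied allocation per student
actual_per_student = 8_740.0     # actual budget divided by enrollment

equity_ratio = actual_per_student / expected_per_student
shortfall_pct = 100 * (1 - equity_ratio)   # share of expected funding not received

print(f"equity ratio: {equity_ratio:.2f} (shortfall {shortfall_pct:.1f}%)")
```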
A Graphical Overview of the 2012 Republican Field, by Nate Silver, February 4, 2011
The siren call of the single magic metric remains. (Entrancing, but listen at your own risk.)
Return on Educational Investment: A District-by-District Evaluation of U.S. Educational Productivity, by Ulrich Boser, Center for American Progress, January 19, 2011
A final, critical step: selection of a display mechanism for a system of indicators
How a critical research finding is conveyed makes all the difference: Wainer's account of the Challenger disaster; Tufte's critique of PowerPoint ("PowerPoint makes you stupid.").
An important insight, inadequately portrayed, won't spark understanding or action. AYP. Enough said.
State and school district websites are replete with examples of inadequate, boring, or gimmicky displays that fail to make an impact, spark understanding and/or action, or extract sunshine.
Some important do’s and don’ts in indicator display
Basic principle: Overabundance of data + overabundance of display mechanisms = confusion.
If both status and trends are important for decision makers and stakeholders, then both should be displayed on the same grid.
Where an entire system is being considered, all individual elements (schools) must be able to be viewed on a single scale.
Washington State Accountability Index, 2010

Each indicator (reading, writing, math, science, and extended graduation rate) is rated on a 1-7 scale. Achievement is rated separately for non-low-income and low-income students (percent meeting standard), and achievement vs. peers uses the Learning Index.

Rating scales:
% met standard: 90-100% = 7; 80-89.9% = 6; 70-79.9% = 5; 60-69.9% = 4; 50-59.9% = 3; 40-49.9% = 2; <40% = 1
Extended graduation rate: >95% = 7; 90-95% = 6; 85-89.9% = 5; 80-84.9% = 4; 75-79.9% = 3; 70-75% = 2; <70% = 1
Difference in Learning Index: >.20 = 7; .151 to .20 = 6; .051 to .15 = 5; -.05 to .05 = 4; -.051 to -.15 = 3; -.151 to -.20 = 2; <-.20 = 1
Difference in rate: >12 = 7; 6.1 to 12 = 6; 3.1 to 6 = 5; -3 to 3 = 4; -3.1 to -6 = 3; -6.1 to -12 = 2; <-12 = 1

Example ratings:
Indicator           Reading  Writing  Math  Science  Average
Non-low inc. ach.      6        6       6      4      5.50
Low-inc. ach.          5        4       3      1      3.25
Ach. vs. peers         7        7       7      7      7.00
Improvement            7        7       6      4      6.00
Average              6.25     6.00    5.50   4.00     5.44
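The percent-met-standard scale above is just a binning function. A minimal sketch (the function name and example scores are illustrative, not an official API):

```python
# Map percent-met-standard to the 1-7 Washington-style rating using the
# bins shown on this slide. Function name is illustrative.

def met_standard_rating(pct: float) -> int:
    """Return a 1-7 rating for the percent of students meeting standard."""
    bins = [(90, 7), (80, 6), (70, 5), (60, 4), (50, 3), (40, 2)]
    for lower, rating in bins:
        if pct >= lower:
            return rating
    return 1  # below 40% met standard

# Hypothetical row of subject scores -> ratings, then the row average.
ratings = [met_standard_rating(p) for p in (92.0, 81.5, 68.0, 39.9)]
row_average = sum(ratings) / len(ratings)
print(ratings, row_average)
```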
The final display: not a magic metric, but an indicator system with two display mechanisms
Seven indicators are suggested here, each with a status and a growth measurement [Other goals/benchmarks could be substituted where necessary, but the total number shouldn’t increase without good reason.]
A summary metric can be computed for each unit (school) in the system and displayed on the S/C (status/change) Grid.
Complete indicator data for each school in the district are displayed on the Wallace Indicator Grid, with ratings for each school on each indicator relative to the selected standard.
A leading indicator ("Early Warning System") is also suggested and possible, but not shown here.
A possible summary display: S/C (status/change) Grid
Based on recent (January 2011) research and an interactive report by the Center for American Progress: Return on Educational Investment.
Combines all indicators into two metrics per school: one for status, one for change
Like the Wallace Indicators, can display whole system (or parts thereof) and individual schools on the same grid
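Collapsing the indicators into the two S/C coordinates can be sketched as simple averaging. The school names echo those in the Wallace grid example, but the ratings are hypothetical (on the 1-7 scale used elsewhere in the deck), and the quadrant cutoff of 4 is an assumption, not part of the published design.

```python
# Sketch of the S/C (status/change) summary: collapse each school's
# per-indicator ratings into one status average and one change average,
# then place the school in a grid quadrant. Ratings are hypothetical.

schools = {
    "Monmouth": {"status": [6, 5, 4, 7, 5, 6, 4], "change": [5, 6, 6, 4, 5, 5, 6]},
    "Troy":     {"status": [3, 2, 4, 3, 2, 3, 4], "change": [6, 5, 6, 5, 6, 6, 5]},
}

def sc_point(ratings):
    """Average the seven status ratings and the seven change ratings."""
    status = sum(ratings["status"]) / len(ratings["status"])
    change = sum(ratings["change"]) / len(ratings["change"])
    return status, change

for name, ratings in schools.items():
    s, c = sc_point(ratings)
    quadrant = ("high" if s >= 4 else "low") + " status, " + \
               ("improving" if c >= 4 else "declining")
    print(f"{name}: status={s:.2f}, change={c:.2f} ({quadrant})")
```

A school like "Troy" here lands in the low-status/improving quadrant, exactly the kind of unit the grid is meant to surface for attention.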
The Wallace Indicators: full picture
A familiar and compact display that tracks both status and trends for all indicators
Can display data for the entire system and for individual schools in the same grid
Allows comparisons across schools and with other schools in the district/state or with standards
[Figure: Wallace Indicator Grid for middle schools (Guy Fawkes, D.B. Cooper, Monmouth, Troy, Memorial, Edsel United, Crispus Atticus). Rows cover each indicator's status (achievement in math and reading, elimination of the achievement gap, student attraction, engagement with school, student retention/completion, teacher attraction and retention, funding equity) and its change from 2005, with the change axis running from Worse to Better. Each cell carries a symbol rating the school against its comparison group: in top 10%; in top third but below top 10%; within 15% (+/-) of the comparison group; in bottom third but above bottom 10%; in bottom 10%; or not available.]
Some final observations
Collecting and analyzing school-by-school data isn't enough. Reams of reports can paralyze rather than propel.
A compelling, well-displayed, research-based measurement system can provide accountability and motivate change.
Dashboards, report cards or other display mechanisms, whether commercial or home-grown, are inadequate unless grounded in research on goals and appropriate indicators.
Dashboards, etc., like PowerPoint, can make you stupid.