Building Rubrics For Large-scale, Campus-wide Assessment

Thomas W. Zane [email protected]
Diane L. Johnson [email protected]
Jodi Robison [email protected]
Housekeeping
Presenters
◦ Thomas W. Zane
◦ Diane L. Johnson
And you are?
Locations
Breaks
Workshop Process
◦ Workbooks
◦ Materials
◦ Symbols
Agenda
Morning – Learning and Exercises
1. Foundations of large-scale rubrics
2. Rubric design
3. Criteria
4. Scales
5. Descriptors
6. Graders notes and exemplars
7. Review methods

Afternoon Practice – 3 Methods
1. Select rows from existing rubrics (Critical Thinking)
2. Adapt from existing rubrics (Communications)
3. Select criteria (Build from Common Criteria)
Differences Between Classroom and Large-scale Rubrics
Stringency of the development rules
The types of information found within and attached to the rubric.
1.0 Foundations
1. Foundational Concepts – Understand the types of rubrics and when to use them.
Rubric uses in Higher Education
· Support student learning and success.
· Support and inform teaching practice.
· Support specific pedagogical strategies.
· Collect data for external stakeholders.
Types of Rubrics
· Holistic and analytic rubrics.
· General and task-specific rubrics.
· Scoring and educative rubrics.
· Content and trait-based rubrics.
When To Use Rubrics
· Before creation of the lesson, assignment, or exam, for planning and instruction.
· In class, before the assignment is handed out, to aid initial learning.
· At assignment time and while students are completing their assignments, to improve student performance and learning.
· After the assignments have been graded, offering better feedback and supporting stronger remediation.
Purpose and Type
1.1 & 1.2 Parts of a Rubric
◦ Score Scale
◦ Criteria
◦ Descriptors
◦ Instructions
◦ Scoring Rules
◦ Graders’ Scoring Notes & Exemplars
Criteria | 1 Well Below Expectations | 2 Below Expectations | 3 Meets Expectations | 4 Exceeds Expectations
Analyzing the Situation | No analysis is included. | Analysis is illogical. | Analysis is logical and very brief. | Analysis is well developed.
Distinguishing Between Literal and Figurative Meanings | No response is included. | Student does not differentiate between literal and figurative meanings. | Student inconsistently differentiates between literal and figurative meanings. | Student consistently differentiates between literal and figurative meanings.
Looking at Other Viewpoints | Claim is not discussed. | Student states claim without acknowledging argument(s) against it. | Student acknowledges argument(s) against claim but analysis is lacking. | Student analyzes argument(s) against claim.
Finding Solutions | Student ignores problem. | Student acknowledges problem without offering any solutions. | Student suggests some solutions but they might be illogical and/or impractical. | Student creates thoughtful, innovative plan for solving problem.
Building Connections | New information is not included. | Student summarizes new information. | Student makes some connections between information and prior knowledge. | Student aptly relates new information to prior knowledge.
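For teams that track rubric scores in software, the analytic rubric above maps naturally onto a small data structure: one row per criterion, one list of descriptors indexed by scoring level. The sketch below is purely illustrative (the workshop prescribes no implementation); the names RUBRIC, SCALE, and feedback are invented for this example.

```python
# Illustrative sketch only: an analytic rubric as a data structure.
# Two of the five criteria from the example rubric are shown;
# descriptors are indexed by scoring level (1-4).
RUBRIC = {
    "Analyzing the Situation": [
        "No analysis is included.",
        "Analysis is illogical.",
        "Analysis is logical and very brief.",
        "Analysis is well developed.",
    ],
    "Finding Solutions": [
        "Student ignores problem.",
        "Student acknowledges problem without offering any solutions.",
        "Student suggests some solutions but they might be illogical and/or impractical.",
        "Student creates thoughtful, innovative plan for solving problem.",
    ],
}

SCALE = {
    1: "Well Below Expectations",
    2: "Below Expectations",
    3: "Meets Expectations",
    4: "Exceeds Expectations",
}

def feedback(row_scores):
    """Row-by-row analytic scoring: pair each criterion's level (1-4)
    with its scale label and descriptor to build student feedback."""
    lines = []
    for criterion, level in row_scores.items():
        descriptor = RUBRIC[criterion][level - 1]
        lines.append(f"{criterion}: {level} ({SCALE[level]}) - {descriptor}")
    return lines

for line in feedback({"Analyzing the Situation": 3, "Finding Solutions": 2}):
    print(line)
```

Because each row is scored independently, the same structure supports the feedback, data-collection, and reliability benefits of analytic rubrics discussed in the next section.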
1.3 Holistic & Analytic Rubrics
Holistic
◦ One big decision about the submission as a whole
◦ Integrated (complex) scoring descriptions
Analytic
◦ Many decisions
◦ Row-by-row scoring on various criteria
1.3 Analytic vs. Holistic Rubrics
We have learned that analytic rubrics:
• Offer far better feedback to students
• Provide better data for making curricular decisions
• Improve reliability across graders
• Are easier to train and use
• Can take LESS time to score
Analytic rubrics are therefore far better suited to supporting our high-volume grading efforts.
1.4 Why Use Rubrics?
◦ Assess progress, ability, etc.
◦ Communicate
◦ Improve learning
◦ Increase reliability of scoring
◦ Link to standards & course objectives
◦ Reduce noise
◦ Save time
◦ Set and/or clarify expectations
1.5 Benefits of Rubrics
Support Student Learning and Success
Support and Inform Teaching Practice
Support Specific Pedagogical Strategies
And oh, by the way…
◦ That same data from the classroom can then be used to satisfy nearly every call for accountability measurement!
1.6 Potential Shortcomings of Rubrics
Time (To Build)
Aggravation
Effort
1.7 High to Low Inference Rubrics
High Inference – The process of coming to a conclusion or judgment based on known or assumed criteria.
Low Inference – Rubric defines more precisely what the evaluator is to detect in the performance.
(Diagram: a Low-to-High inference continuum; the green zone is optimal for analytic rubrics.)
1.8 Span of Submissions
General – Many submissions
Task Specific – One submission
(Diagram: a General-to-Task-Specific continuum.)
1.9 Content and Trait Based
Content-based Rubrics – (what the student had to say)
Trait-based Rubrics – (how well the student said it)
1.10 Educative vs. Scoring
Educative rubrics tend to be formative and written for student use.
Scoring rubrics tend to be summative and are written in great detail to inform scoring.
Why Use a Hybrid Rubric?
(Diagram of assessment levels:)
E-portfolio Assessment – A sample of all signature assignments.
Program Assessment – Classroom data aggregated here.
Classroom Assignments – Almost every assessment measure starts here.
1.11 Usability and Flow
1. A rubric should be quick and easy to use.
2. Graders should spend time with their eyes on the student submission or performance rather than on a long, complex, or difficult-to-follow rubric.
3. Good flow depends on clarity, brevity, white space, a small number of rows and columns, concrete descriptors, and good organization of criteria.
Need a Break?
2. Should I Use a Rubric?
1. Is a rubric the correct tool to use?
◦ Constructed response or performance?
◦ Require more than one correct answer?
◦ Gradations of quality?
2. What type of rubric design would be best?
◦ Who will grade?
◦ Will there be usage limitations?
2.1 Identify Overall Purpose of the Assignment and Score(s)
What are the broad goals of the program/assignment, and where does the measure fit into the program?
What is/are the purpose(s) for the scores?
◦ What decisions might be made?
◦ What actions might be taken?
◦ What consequences may result?
What does “success” look like?
◦ Academic? Real-world? Both?
2.2 Targets of Measurement
◦ Person
◦ Product
◦ Performance
◦ Process
◦ Impact
3.0 Criteria

Identify Criteria
· Use real-world standards and human judgment.
· Collaborate among your peers.
· Dig deeply into the construct of interest.
· Consider the size of the “chunks” on each criterion row.
· Search for mutually supportive rather than mutually exclusive criteria.

Create Bullet List
· Prioritize to select criteria that are: Actionable, Authentic, Valued, Important, Aligned, Chunked, Clear, Complete, Supportive.

Order the Resulting Bullet List
· Chronological order.
· Order of the performance.
· Order that is familiar to the graders.
· Order by cognitive complexity.

Define Each Criterion
3. Criteria – Define what matters most.
Criteria selection is the most important decision of the entire rubric creation process.
Criteria selection answers the question: What matters?
General Rules for Criteria Development
1. Use real-world standards and human judgment.
2. Collaborate with peers.
3. Dig deeply into the construct of interest.
4. Select hard-to-measure criteria rather than settling for what is easy to count.
5. Consider the criterion breadth on each criterion row.
6. Search for mutually supportive rather than mutually exclusive criteria.
7. If working with trait-based rubrics, try to agree on what quality means across sections of a course, multiple courses, and across the entire campus.
3.1 Types of Criteria
◦ Quality
◦ Depth or Breadth
◦ Relevance or Adequacy
◦ Impact
◦ Accuracy
◦ Physical Attributes (rarely used)
3.2 Methods for Finding Criteria (easiest to hardest)
1. Draw from existing rubrics
2. Select from common, generic, or universal criteria
3. Industry- or discipline-based standards and learning outcomes
4. Break down the constructs from scratch
3.3 Method 1: Select from Preexisting Rubrics
Search for existing rubrics that identify criteria.
Look for specific aspects and exemplars of each criterion.
Don’t adopt it just because it is in print.
Provide attribution.
3.3 Method 2: Select from Generic Criteria
Scan the Generic Criteria list.
Select 4-12 criteria that matter based on your purpose and values.
3.3 Method 3: Draw from Standards & Outcomes
Search standards and outcomes instead of rubrics.
Pull valued criteria out for measurement.
Notes About Using Standards and Outcomes
Standards are frequently written as “values” or broad “goals”.
Outcomes or Curriculum Objectives often resemble behavioral objectives.
Real-world Outcomes are wonderful, but may be listed as statements.
Search for concrete definitions of success.
3.3 Method 4: Build From Scratch
Research the constructs of interest (critical thinking, reflection, life-long learning, etc.).
Brainstorm.
Deconstruct the larger constructs into measurable parts.
Cautions When Selecting Attributes For Your Rubric
Rarely use physical appearance.
Don’t measure simple content knowledge.
Avoid progress or gain over time.
Affective measures are extraordinarily difficult.
Don’t expect written responses to measure everything.
3.5 Criteria Selection Rules
◦ Actionable
◦ Authentic
◦ Valued
◦ Important
◦ Aligned
◦ Sized
◦ Clear
◦ Complete
◦ Supportive
3.6 Order the Criteria
1. By the chronological order in which graders will encounter each criterion within the submission or performance.
2. By the order of the performance.
3. By an order that is familiar to the graders.
4. By logical grouping.
5. In order of cognitive complexity.
3.7 Define Each Criterion
Briefly define the meaning of the criterion on each row of the rubric table.
Good criteria definitions make it relatively easy to see when the descriptors across each row contain elements that are inside or outside the intended measure.
3.8 Check for Construct Irrelevant Variance (CIV)
Overrepresentation
◦ Extra criteria in the rubric.
◦ Nice to measure, but not directly related to the purpose of the rubric.
Underrepresentation
◦ Missing criteria that should have been included in the rubric.
◦ Things that were critical to measure but were missed.
Need a Break?
4.0 Scales
4. Scoring Scale – Select the number and type of scoring levels.

Based on Intended Score Interpretations
· Be interpretable in the way you intended.
· Inform the decisions you intend to make and/or actions you intend to take.

Based on the Expected Range of Student Performance
· Consistent with where the expected ability level starts or ends.
· Gradations in the scoring scale detailed enough to capture differences in ability.

Based on the Discernible Nuances in Quality
· Only as many levels of quality as the grader can perceive.
· Use as few scoring levels as you can to get the job done.

Ensure Consistency of Meaning for Each Scoring Level
· Each scoring level (column) must have a specific, defined, and consistent meaning for every row in the table.
4.1 Examples of Scales
Examples of Scoring Scales (First Row of Each Rubric)

Scale | Purpose
1 (Need to re-teach this lesson.) · 2 (Need to remediate students to cover errors in learning.) · 3 (May need to review the lesson before the final exam.) · 4 (No need to review this lesson.) | Measure student learning to inform teaching strategy.
1 (Needs Rewrite) · 2 (Needs Revision) · 3 (Passes Without Revision) · 4 (Passes with Commendation) | Measure student learning to inform student actions.
1 (Clearly does not meet standard) · 2 (Approaches standard) · 3 (Meets standard) · 4 (Exceeds standard) | Measure student learning to determine if students meet minimum national standards.
1 (Cannot use this book for this course) · 2 (Might be able to use this book if significant supplementary materials were added) · 3 (Could use this book for the course with normal additional course materials) · 4 (Could use this book without any additional materials) | Make textbook recommendations.
1 (Entering freshman) · 2 (Basic introductory understanding of concepts) · 3 (Developing undergraduate understanding of concepts) · 4 (Integrated graduate-level understanding of concepts) | Determine department-wide student academic progress.
1 (Beginning) · 2 (Developing) · 3 (Average) · 4 (Accomplished) | Skill levels.
4.2 Rules for Scoring Scales
1. Scoring scales must reflect the purpose of the rubric.
2. Define the discernible quality levels in student performances.
3. Use as few scoring levels as you can to get the job done (because as the number of columns increases, grading costs can increase proportionally).
5.0 Descriptors
5. Descriptors – Define the discernible characteristics of each scoring point.

Breaking Descriptors Down into Measurable Aspects
· Graders need to be able to see, hear, taste, smell, or otherwise perceive characteristics of student performance and then apply a score to the observation.

Differentiation between Descriptors
· Use brief, clear, and concrete descriptions of observable performance.
· Use observable conditions/indicators to help differentiate one scoring level from another.
· Do not use comparative language (e.g., good, better, best) to differentiate scoring levels.

Unidimensional vs. Multidimensional Criterion Scales
· Create unidimensional criterion scales that change only in the degree of the criterion on that row.

Descriptive vs. Comparative
· Descriptive descriptors describe discernible deeds.
· Do not use comparative language (e.g., good – excellent – superior).

Qualitative vs. Quantitative
· Measure gradations of quality rather than counting characteristics of quality.

Level of Detail
The Three Bears Principle: Just enough detail to:
· Offer students some guidance for the assignment.
· Provide definitions of success.
· Help graders score consistently.
· Provide meaningful feedback.
· Support better stakeholder score interpretation.
5.1 Make Descriptors Observable and Measurable
Graders need to be able to see, hear, taste, smell, or otherwise perceive characteristics of student performance and then apply a score to the observation.
5.2 Differentiation between Descriptors
We use three absolute rules:
1. Use brief, clear, and concrete descriptions of observable performance.
2. Use observable conditions/indicators to help differentiate one scoring level from another.
3. Do not use comparative language (e.g., good, better, best) to differentiate scoring levels.
5.3 Unidimensional vs. Multidimensional Scales
Unidimensional scales – changes across the row are in the degree of a single criterion.
Guard against breaking the row down into minutiae – thus losing the integrated whole.
Criteria | 1 Well Below Expectations | 2 Below Expectations | 3 Meets Expectations | 4 Exceeds Expectations
Analyzing the Situation | No analysis is included. | Analysis is illogical. | Analysis is logical and very brief. | Analysis is well developed.
Distinguishing Between Literal and Figurative Meanings | No response is included. | Student does not differentiate between literal and figurative meanings. | Student inconsistently differentiates between literal and figurative meanings. | Student consistently differentiates between literal and figurative meanings.
Looking at Other Viewpoints | Claim is not discussed. | Student states claim without acknowledging argument(s) against it. | Student acknowledges argument(s) against claim but analysis is lacking. | Student analyzes argument(s) against claim.
Finding Solutions | Student ignores problem. | Student acknowledges problem without offering any solutions. | Student suggests some solutions but they might be illogical and/or impractical. | Student creates thoughtful, innovative plan for solving problem.
Building Connections | New information is not included. | Student summarizes new information. | Student makes some connections between information and prior knowledge. | Student aptly relates new information to prior knowledge.
5.4 Descriptive vs. Comparative Scales
◦Descriptive descriptors describe discernible deeds!
◦Describe performance rather than make judgments about it.
Comparative/Relative | Descriptive/Concrete
Effective strategy | Uses data to identify errors
Frequently | 4 or 5 times
Interesting | Used voice inflection or hand gestures
Creative solution | Solution not in textbook
Methodology section is detailed | Methodology section is replicable
5.5 Qualitative vs. Quantitative Scales
Ensure that your rubric measures gradations of quality rather than counting characteristics of quality.
More is not usually a strong surrogate for better.
Simple Rule: Use counting 0 to 1% of the time! ;-)
5.6 Level of Detail
Just enough detail to:
◦ Offer students some guidance for the assignment.
◦ Provide definitions of success.
◦ Help graders score consistently.
◦ Provide meaningful feedback.
◦ Support better stakeholder score interpretation.
5.7 Writing The Descriptors
1. Begin with the passing descriptor.
2. Then create the bottom of the scale.
3. Now the top of the scale.
4. Finally, work on the just-under-passing descriptor.

Completed example row:
Finding Solutions | Student ignores the problem. | Student acknowledges the problem without offering any solutions. | Student suggests some solutions but they might be illogical and/or impractical. | Student creates a thoughtful, innovative plan for solving the problem.
6.0 Graders Notes & Exemplars
6. Graders Notes – Explicate rubric cells with descriptions and exemplars.

Describe Scoring Levels for All Criteria
· Write brief, easy-to-read descriptive sentences that help the graders “know it when they see it.”
· Create graphic representations when needed.

Craft Exemplars
· Create brief facsimiles of student work.
· Focus on exemplars for the two middle columns (just under passing and acceptable passing-level work).

Pilot Test & Revise
· Try out the rubric by grading a small number of student submissions or performances.
· Review rubric criteria, scale, and descriptors in light of student performance.
· Revise the rubric as needed.

Collect Additional Exemplars
· Collect examples of student work during pilot testing and early live rubric use for each scoring level and each criterion.
· Focus first on average responses, then on outliers and exceptions.

On to Live Rubric Use
6.1 Other Decisions: Operations and Training
1. Procedure for retrieving and scoring performances
2. Procedure for accessing and using graders notes
3. Rules governing anonymity, confidentiality, etc.
4. Procedures for marking and completing rubrics
5. Rules governing feedback
6. Rules for decision making (pass/fail, revision needed) or computing a score (includes weighting)
7. Procedures for reporting problems
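Rule 6 above (decision making and computing a weighted score) can be sketched in a few lines. The weights, the 4-point scale, and the 75% pass threshold below are invented for illustration only; the workshop does not prescribe any particular weighting scheme or cutoff.

```python
def weighted_percent(row_scores, weights, max_level=4):
    """Combine per-criterion levels (1..max_level) into a 0-100 score
    using per-criterion weights that sum to 1.0."""
    if abs(sum(weights.values()) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return 100.0 * sum(weights[c] * row_scores[c] for c in weights) / max_level

# Hypothetical weights and threshold (not from the presentation).
weights = {"Analyzing the Situation": 0.5, "Finding Solutions": 0.5}
scores = {"Analyzing the Situation": 3, "Finding Solutions": 4}

pct = weighted_percent(scores, weights)   # (0.5*3 + 0.5*4) / 4 * 100 = 87.5
decision = "pass" if pct >= 75.0 else "revision needed"
print(f"{pct:.1f}% -> {decision}")
```

Whatever scheme is chosen, the point of rule 6 is that it be written down before grading begins so every grader computes scores the same way.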
Rubric Quality Checklist
Exam Time!
Come Join the Conversation
www.slcc.edu/assessment
Website and toolkits are maintained as free resources.
Come help us improve the tools by offering advice, feedback, examples, etc.
Presenters’ E-mail
Thomas W. Zane [email protected]
Diane L. Johnson [email protected]
Jodi Robison [email protected]