Badges: A solution to our teacher evaluation disaster?

Upload: aspentaskforce

Post on 14-Apr-2018

  • 7/30/2019 Badges: A solution to our teacher evaluation disaster?

    Badges: A solution to our teacher evaluation disaster?

    By Valerie Strauss

    This was written by Cathy N. Davidson, a Duke University professor and author of Now You

    See It: How the Brain Science of Attention Will Transform the Way We Live, Work, and

    Learn.

    By Cathy N. Davidson

    Last spring, when Google's Project Oxygen revealed the results obtained from

    number-crunching its entire stock of personnel records (hiring, firing, merit raises, promotions), no

    one was more surprised than Google to find that the famously data-driven company had

    actually been promoting managers for their squishy, soft, Management 101 people skills.

    Google prides itself on managers who have technical chops, but technical expertise didn't

    even make the Big Eight of esteemed management qualities. Fortunately, Google used a

    flexible and open enough text-mining system to be able to see what was there, not what

    wasn't, and to make its own contradictions visible. The company is now re-examining its

    own management rules and its deepest assumption about who and what makes a good

    manager.

    But what if Google had asked the question to find out how well its employees fulfilled the

    company's stated data-driven values: "How many of our managers have the technical

    expertise to be good managers?" The outcome of its own data-crunching could have been a

    disaster. Google (the 2012 top company on the Fortune list, by the way) might have failed

    its own empirical and objective and standardized test. If Google had been a public

    school, it might have been slated for closure in 2014 because of its failure.

    Of course I'm overstating the case for effect. But the point here is that all the data in the

    world doesn't matter if you ask the wrong question of the data or if the method of testing isn't

    flexible enough to yield real, true data about success and failure.

    All the data in the world doesn't matter if you are collecting one kind of information but the

    real problem or virtue lies elsewhere. I believe this is the conundrum we are now in with the

    multiple-choice, end-of-grade form of testing that the United States (and the world) now uses

    as its gold standard. It's not only an outmoded form of testing, but teaching aimed at

    ensuring that students achieve success at bubble tests does not ensure real learning. It also

    does not ensure that students will retain what they have learned and be able to apply it to

    their next level of learning challenges, in the classroom or beyond.

    How do we measure learning innovation?

    This issue came up pointedly at last week's Harvard Innovations in Learning and Teaching

    (HILT) symposium where I was delighted to be one of the plenary speakers. The symposium

    was designed to help us all use the best research on learning to rethink the traditional

    classroom. One of the closing speakers said that they would be crunching the data to make

    sure the results of learning experiments were rigorous. "The best innovators are often not

    the best evaluators," he said.

    I thought of Project Oxygen and the dismal state of No Child Left Behind school

    evaluation methods and responded, True enough! But the best evaluators are often not the

    best innovators. We also have to be clear that our metrics are expansive enough to count

    values that may not be testable by current measures.

    Fortunately, Henry "Roddy" Roediger was also a plenary speaker. Roediger's research

    shows the limitations of item-response testing that is divorced from that which is being

    learned. His work in the Memory Lab at Washington University also shows that lecturing is

    the least effective learning method. If you want people to retain and to be able to master and

    apply what they learn, they have to be tested over and over as they are learning, and with

    feedback that helps them to learn better.

    Roediger's testing methods include a variety of challenges, including teaching others what

    you learn, working with someone who has a different answer than you to explain and correct

    your thinking, writing up your conclusions for a public audience that will challenge you, and

    other interactive forms of challenge-based testing.

    Harvard physicist Eric Mazur also demonstrated his interactive testing-learning methods at

    HILT. He posed a basic physics problem to the crowd, we clicked our answers, and then he

    had us try to convince someone else who had a different answer to change their minds.

    In my interactions, a problem occurred. Perhaps because I was a plenary speaker, the

    stranger I chose as my partner, a very smart and lovely person who knew the right answer,

    didn't prevail on me forcefully enough to change my answer. I was wavering, convince-able,

    but the learning transfer didn't happen in our exchange. (It did, however, when it turned out

    he was right and I was wrong; I will probably never forget that physics lesson, which proves

    Mazur's point, in long form.)

    But let's back up. If my partner in this physics audience had been a Web developer, and we

    were doing a Web-building project together, my wrong answer well might have been the one

    we went with and our common project would have failed.

    Because so much code is written collaboratively, with strangers, where outcomes matter to

    the success of the project, for future jobs and future collaborations, coders have developed a

    complex yet easy (and difficult to game) system of awarding one another badges for

    successful, innovative collaboration. They don't need a multiple-choice test to prove they are

    good coders. In fact, unlike doctors, accountants, beauticians, or financial advisers,

    programmers don't even have a formal certification or credentialing system.

    Millions of Web programmers worldwide have learned to innovate at a far faster pace than

    most of us and to evaluate one another rigorously through peer assessment. Really. That is

    so counter-intuitive that I'm going to repeat it: Millions of Web programmers worldwide have

    learned to innovate at a far faster pace than most of us and to evaluate one another

    rigorously through peer assessment.

    How is this possible? How can peers really evaluate one another? They can and do in Web

    world by awarding badges as peer-given contribution and reputation points. Badges are the

    visible symbol of a complex system of rigorous peer evaluation of all the complex skills (the

    kind Project Oxygen turned up at Google) as well as all the innovative programming that

    Web coders contribute to one another.

    I believe we can learn much from what they do and how they do it.

    Badges, innovation, and evaluation: The example of Stack Exchange

    To understand more about the world of badges I interviewed Jeff Atwood, cofounder

    of Stack Exchange, a question-and-answer website which also includes Stack Overflow for

    programmers, Server Fault for system administrators, and more than 70 others that range

    from photography to productivity. He also writes the popular blog Coding Horror.

    Stack Overflow serves a worldwide community of 12 to 13 million programmers, with site

    traffic in the range of 16 million views per month. Atwood likes to say that one

    of Stack Exchange's chief contributions is making platforms that make it easy for people to

    contribute their knowledge to one another. Members pose questions and other members

    answer them, and, if the answer is good, you award points to your coding colleague.

    If you are heading in a wrong direction (as I was in Eric Mazur's session at HILT), and

    someone is able to steer you in the right direction, you award points to that person for their

    teaching abilities. The points add up, and you can see the results on your own personal

    website where programmers can proudly display their badges. Click on a programmer's

    glowing gold badge and you find a detailed assessment of absolutely everything that

    contributed to the high scores, including my personal comments about why.

    I'm not talking about resume-speak. If I award points, you can read the actual details and

    reasons for why Captain Coder over in Beijing earned points for her C++ programming

    chops, or why Mr. Algorithmic in Sydney was awarded top points for being "precognitive,"

    someone who follows development of new ideas and communities during the earliest stages.

    Cruncher from Cambridge might earn points for being a self-learner, or a teacher, or for

    being tenacious, outspoken, or disciplined, different assets in the community based on

    algorithms and contributions to the site. (You can see the Stack Overflow badges and points

    here.)

    These qualities merge programming skills with teaching and learning skills (collaborative

    skills) because, to deliver code on time, you need all of those (as Project Oxygen also

    found with its personnel-record data-mining). Atwood calls them a reputational breadcrumb

    trail on the Internet. But he's being modest.

    Another part of Stack Exchange is Careers 2.0, a job posting and connecting service, a kind

    of Match.com for jobs. Reputation based on badges and points is the currency of the

    realm, and it is a leading service for employers looking to hire managers, programmers,

    and just about anyone else in the world's mobile, distributed programmer workforce. It should

    come as no surprise that many of the best tech companies, including Google, use Careers

    2.0 for their recruiting.

    But one more word about badging. It's not just about jobs. As Atwood says, the badges on

    Stack Exchange don't just record participation, they incentivize it. They also allow you to

    match a range of qualities you value with the complex range of qualities that peers have

    recognized and rewarded. You do a good job, others give you credit. And, if I, as an

    employer, want to find out why someone has earned a badge, all I have to do is click on your

    badge, find out the details, and read the comments, and then I can decide how much I do or

    don't trust the reputation. It's open, so I can see where Mr. Algorithmic is getting his points.

    That is the thing about non-standardized open content: others can comment on it, emend it,

    challenge it. And, if you want to crunch such loosey-goosey evaluation, well, we now have

    text-mining software that allows that, with remarkable complexity, as we saw from Project

    Oxygen.

    We no longer need to use the A, B, C, D, or None of the Above multiple-choice test invented

    in 1914 and patterned after the state-of-the-art mass production of its time, Henry Ford's

    assembly line. When you think about it, it's pretty hard to believe that the state-of-the-art

    evaluation system the world is currently using for evaluating something as complex

    as learning dates back to the Model T.

    Better evaluation systems exist now

    We have computers now, everyone. Imagine that! But we are still using the testing

    methods designed for the era of the Model T, a form of testing for lower order thinking that

    measures the narrow range of thinking measured by Best Available Answer testing. We

    know, from the best data-based research, this form of testing is a dis-incentive to learning,

    especially for kids who don't believe they have any chance of obtaining the goal of using

    good test scores to get into college. In other words, the tests incentivize, to use Jeff

    Atwood's word, only those aspiring to get to an end: college, a certificate, a credential. The

    tests do not incentivize contribution, participation, collaboration, and learning, which is what Stack

    Exchange strives for.

    Think about that. We have a system of tests designed for citizens of the Industrial Age,

    based on the assembly line, that are extremely costly, don't measure much of content, and

    don't motivate learning. And millions of programmers have found a way that works so well

    they don't even need formal credentials and accreditation systems. What they do works,

    and works based on peers evaluating contribution (they don't even have a system of

    failing: they reward what works, what is good, setting the bar for reputation at its highest,

    not at its lowest denominator).

    We not only can use far more interactive, complex, humane, interesting, challenging, and

    innovative forms of assessment for real learning, real teaching, real collaboration; the tech

    community is already doing that. Teachers, researchers, experimenters, and evaluators all

    need to think about these systems and learn from them. Project Oxygen revealed patterns

    even Google didn't suspect. Stack Exchange is doing that daily, with millions of people.

    The badging systems I'm interested in exploring have to be offered by non-profit learning

    organizations in order to avoid further commercialization and exploitation of our educational

    system. They have to be less, not more, expensive to administer than the current

    cumbersome system of either Human Resource (HR) evaluation or end-of-grade tests or

    teacher standards and evaluation or merit systems. They have to include peer

    components. They have to include a range of skills, content, subject matter, mastery,

    application, theory, and practice, competencies and collaborative or character

    qualities. And, most important, they have to be tied to the learning process itself

    and incentivize and motivate, not just document, real, long-term, engaged, interactive

    learning.

    Badges for lifelong learning

    Since September, the nonprofit learning network I cofounded, HASTAC ("haystack"), has

    been working with the John D. and Catherine T. MacArthur Foundation and the Mozilla

    Foundation to run competitions on Badges for Life Long Learning, as part of our

    annual Digital Media and Learning Competition.

    It turns out that many institutions join us in thinking our Model T form of testing is archaic and

    a dis-incentive to either real learning or real learning innovation, in schools, in informal

    learning settings, or in the workplace. Nearly 340 different institutions, from NASA to Intel,

    from small local schools to the Department of Education, have offered challenges. We've

    just announced winners of the first phase of a separate Teacher Mastery Competition too.

    And we're now challenging developers to apply to work with institutions to co-create badging

    systems that work for the values and learning goals of the institutions.

    In the end, we will have a rich portfolio of active projects, all developing badging and

    reputation systems online, funded for a year so that they can learn and so that we, the

    public, can learn from an open competition, an open year of co-developing, and an open

    year of evaluating, recommending, refining, improving, and creating together. That is what

    learning is about. We can all learn to do this together, in the way that the Open Web has

    developed for the 21st century but that has yet to penetrate into our institutions of formal

    learning and into many of our business institutions as well.

    You can't build the next generation of the Web with an assembly line

    At the HILT conference at Harvard, we talked a lot about how real metrics, real data, real

    experiment can serve real learning innovation. If we don't also think

    about innovative metrics, data, and experimental methods, we will replicate old standards

    and values but with some relatively insignificant new tweaks. If we want true innovation in

    learning, we must strive for true innovation in the methods we use for deciding what counts

    and how we count. I'm hopeful that we are at a tipping point. I believe we are on the verge

    of using the successful methods already being used by the developers of the Internet to find

    the best ways of learning for the Internet Age. I believe we will soon be finding new ways to

    measure contribution and to motivate learning not for the era of the Model T but for the

    21st century.

    -0-

    Follow The Answer Sheet every day by

    bookmarking http://www.washingtonpost.com/blogs/answer-sheet.