Peer-Review/Assessment
Aid to Learning & Assessment
Phil Davies
Division of Computing & Mathematical Sciences
Department of Computing
FAT
University of Glamorgan
Defining Peer-Assessment
• In describing the teacher ..
A tall b******, so he was. A tall thin, mean b******, with a baldy head like a light bulb. He’d make us mark each other’s work, then for every wrong mark we got, we’d get a thump. That way – he paused – ‘we were implicated in each other’s pain’
McCarthy’s Bar (Pete McCarthy, 2000, page 68)
AUTOMATICALLY
CREATE A MARK THAT REFLECTS THE QUALITY OF AN ESSAY/PRODUCT
VIA PEER MARKING,
AND ALSO
A MARK THAT REFLECTS THE QUALITY OF THE
PEER MARKING PROCESS i.e. A FAIR/REFLECTIVE
MARK FOR MARKING AND COMMENTING
Below are comments given to students. Place in Top FOUR Order of Importance to YOU
1. I think you’ve missed out a big area of the research
2. You’ve included a ‘big chunk’ that you haven’t cited
3. There aren’t any examples given to help me understand
4. Grammatically it is not what it should be like
5. Your spelling is atroceious
6. You haven’t explained your acronyms to me
7. You’ve directly copied my notes as your answer to the question
8. 50% of what you’ve said isn’t about the question
9. Your answer is not aimed at the correct level of audience
10. All the points you make in the essay lack any references for support
Order of Answers
• Were the results all in the ‘CORRECT’ order? Probably not.
• Why not?
• Subject specific?
• Level specific – school, FE, HE?
• Teacher/Lecturer specific?
• Peer-Assessment is no different – Objectivity through Subjectivity
Typical Assignment Process
• Students register to use system - CAP
• Create an essay in an area associated with the module
• Provide RTF template of headings
• Submit via Bboard Digital Drop-Box
• Anonymous code given to essay automatically by system
• Create comments database / categories
Each Student is using a different set of weighted comments
Comments databases sent to tutor
First Stage => Self Assess own Work
Second Stage (button on server) => Peer Assess 6 Essays
Self/Peer Assessment
• Often Self-Assessment stage used
– Set Personal Criteria
– Opportunity to identify errors
– Get used to system
• Normally peer-mark about 5/6
• Raw peer MEDIAN mark produced
• Need for student to receive Comments + Marks
• Need for communication element?
AUTOMATICALLY EMAIL THE MARKER ..
ANONYMOUS
The communications element
• Requires the owner of the file to ‘ask’ questions of the marker
• Emphasis ‘should’ be on the marker
• Marker does NOT see comments of other markers who’ve marked the essays that they have marked
• Marker does not really get to reflect on their own marking – get a reflective 2nd chance
• I’ve avoided this in past -> get it right first time
Feedback Index
• Produce an index that reflects the quality of commenting
• Produce a Weighted Feedback Index
• Compare how a marker has performed against these averages
• Judge quality of marking and commenting i.e. provide a mark for marking AUTOMATICALLY
Compensation – High and Low Markers
• Need to take this into account
• Each essay has a ‘raw’ peer generated mark - MEDIAN
• Look at each student’s marking and ascertain if ‘on average’ they are an under or over marker
• Offset mark given by this value
• Create a COMPENSATED PEER MARK
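The compensation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the CAP system's actual code: the function names and the nested-dictionary layout are hypothetical, a marker's offset is taken as the mean of (mark given minus raw median) over the essays they marked, and the compensated peer mark is assumed to be the median of the offset-corrected marks.

```python
from statistics import mean, median

def raw_peer_marks(marks):
    """marks: {essay_id: {marker_id: mark}} -> {essay_id: raw median mark}."""
    return {essay: median(by_marker.values()) for essay, by_marker in marks.items()}

def marker_offsets(marks, raw):
    """Average amount by which each marker over (+) or under (-) marks."""
    diffs = {}
    for essay, by_marker in marks.items():
        for marker, mark in by_marker.items():
            diffs.setdefault(marker, []).append(mark - raw[essay])
    return {marker: mean(values) for marker, values in diffs.items()}

def compensated_peer_marks(marks):
    """Remove each marker's offset, then take the median again (assumption)."""
    raw = raw_peer_marks(marks)
    offsets = marker_offsets(marks, raw)
    return {
        essay: median([mark - offsets[marker] for marker, mark in by_marker.items()])
        for essay, by_marker in marks.items()
    }

# Hypothetical data: marker B is a consistent over-marker.
example = {
    "E1": {"A": 60, "B": 70, "C": 62},
    "E2": {"A": 50, "B": 62, "C": 52},
}
print(compensated_peer_marks(example))
```

With the made-up data, marker B's marks are pulled down by B's average over-marking before the essay's median is recomputed, so one generous or harsh marker has less influence on the final peer mark.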
How to work out Mark (& Comment) Consistency
• Marker on average OVER marks by 10%
• Essay worth 60%
• Marker gave it 75%
• Marker is 15% over
• Actual consistency index (Difference) = 5
• This is done for all marks and comments
• Creates a consistency factor for marking and commenting
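The worked example on this slide can be checked with a short Python sketch. The function names are hypothetical, and averaging the individual differences into a single consistency factor is an assumption; the slide only says the calculation is repeated for all marks and comments.

```python
from statistics import mean

def consistency_difference(given, essay_mark, average_offset):
    """How far this marking strays from the marker's usual over/under marking."""
    return abs((given - essay_mark) - average_offset)

def consistency_factor(markings, average_offset):
    """Mean difference over (given, essay_mark) pairs; averaging is an assumption."""
    return mean([consistency_difference(g, e, average_offset) for g, e in markings])

# Slide example: marker usually over-marks by 10, essay worth 60, marker gave 75.
print(consistency_difference(75, 60, 10))  # 15 over, 10 expected, difference 5
```

A lower value means the marker behaved in line with their own habit, so a consistently generous marker is not penalised merely for being generous.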
Marks to Comments Correlation
• Jennifer Robinson – a third of comments not useful
• Liu – Holistic comments not specific
• Davies – Really good correlation between marks and comments received
-5 -4 -3 -2 -1 -0 +0 1 2 3 4 5 6 7 8 9

          29    44    41    49    46    53    64    49    53    60    62    69    68    69    82
                38    48    47    51    45    54    58    53    62    62    64    65    73
                                        49    51    50    60    57    57    67    66
                                                    51    58    53    50          59
                                                                      57          63
                                                                      59          65
                                                                      64
SD         0   4.2   5.0   1.4   3.5   4.0   6.8   4.8   3.6   3.9   4.7   2.5   3.1   2.8     0
Mean      29    41    45    48    49    49    56    52    56    58    59    67    64    71    82
Range | Frequency | Mark Difference | Mark Consistency | Mark Consistency Ranges | Feedback Difference | Feedback Consistency | Feedback Consistency Ranges | Weighted Feedback Difference | Weighted Feedback Consistency | Weighted Feedback Ranges
80> | 1 | 0.57 | 8.32 | 8.32 | 0.58 | 3.62 | 3.62 | -3.1 | 11.93 | 11.93
70> | 1 | -4.8 | 10.48 | 10.48 | -1.12 | 4.18 | 4.18 | -4.8 | 8.0 | 8.0
65-69 | 8 | -0.93 | 7.22 | 11.41-4.77 | -0.03 | 3.05 | 5.06-1.98 | -0.34 | 11.51 | 15.60-6.32
60-64 | 10 | -2.00 | 5.90 | 10.20-1.63 | -0.36 | 2.89 | 4.36-0.57 | -1.90 | 12.57 | 15.89-9.13
55-59 | 8 | 0.72 | 5.48 | 9.74-2.97 | 0.08 | 2.88 | 4.44-1.83 | -0.04 | 16.12 | 27.75-7.63
50-54 | 10 | -2.69 | 7.20 | 10.64-1.4 | -1.2 | 4.55 | 9.54-2.04 | -3.1 | 11.37 | 16.24-6.0
45-49 | 8 | 2.14 | 5.61 | 6.79-3.49 | 1.42 | 2.43 | 3.37-1.26 | 5.13 | 15.56 | 24.2-8.41
40-44 | 2 | 4.17 | 4.56 | 5.42-3.68 | 0.99 | 1.81 | 2.38-1.25 | 5.01 | 16.35 | 17.69-15.01
35-39 | 1 | -4.67 | 3.73 | 3.73 | -1.84 | 2.7 | 2.7 | -1.24 | 7.27 | 7.27
25-29 | 1 | 4.6 | 5.78 | 5.78 | 0.27 | 2.12 | 2.12 | -0.87 | 15.6 | 15.6
Automatically Generate Mark for Marking
• Linear scale 0 -100 mapped directly to consistency … the way in HE?
• Map to Essay Grade Scale achieved (better reflecting ability of group)?
• Expectation of Normalised Results within a particular cohort / subject / institution?
Current ‘Simple’ Method
• Average Marks
– Essay Mark = 57%
– Marking Consistency = 5.37
• Ranges
– Essay 79% <-> 31%
– Marking Consistency 2.12 <-> 10.77
• Range Above Avge 22% <-> 3.25 (6.76=1)
• Range Below Avge 26% <-> 5.40 (4.81=1)
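One way to read the figures above is as a two-segment linear map from a marker's consistency to a mark for marking: the average consistency (5.37) earns the average essay mark (57%), the best consistency (2.12) earns the top essay mark (79%), and the worst (10.77) earns the lowest (31%). The sketch below follows that reading. The function name and parameters are hypothetical, and treating "(6.76=1)" and "(4.81=1)" as marks per unit of consistency is an interpretation of the slide rather than something it states.

```python
def mark_for_marking(consistency,
                     avg_mark=57.0, best_mark=79.0, worst_mark=31.0,
                     avg_consistency=5.37, best_consistency=2.12,
                     worst_consistency=10.77):
    """Piecewise-linear map; a lower consistency value means better marking."""
    if consistency <= avg_consistency:
        # Above-average marker: 22 marks spread over 3.25 units (about 6.77 each).
        scale = (best_mark - avg_mark) / (avg_consistency - best_consistency)
    else:
        # Below-average marker: 26 marks spread over 5.40 units (about 4.81 each).
        scale = (avg_mark - worst_mark) / (worst_consistency - avg_consistency)
    return avg_mark + (avg_consistency - consistency) * scale

for c in (2.12, 5.37, 10.77):
    print(c, round(mark_for_marking(c), 1))
```

Using separate scales above and below the average keeps the marks for marking inside the same 31% to 79% band as the essay marks, which matches the option of mapping to the essay grade scale achieved by the group.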
Innovation Grant Proposal
• Put the emphasis on the marker to get it right
• Get the opportunity to ‘reflect’ on COMMENTS before go back to essay owner
• 2nd chance – not sure if I want the results to have a major effect – hope they get it right the 1st time – consistency
• Is there a Need to have discussion between markers at this stage? – NO as it is dynamic
• Will review stage remove need for compensation?
Used on Final Year Degree + MSc DEGREE DCS
• 36 students on module
• 192 markings
• 25 ‘replaced’ markings out of 192 (13%)
• Average time per peer marking = 37 minutes
• Range of time taken to do markings 6-116 minutes
• Average number of menu comments/marking = 9.8
• Raw average mark for essays = 61%
• Out of the 25 Markings ‘replaced’ (1 student replaced a marking twice) only 6 marks changed 6/192 (3%)
• Number of students who did replacements = 11 (out of 36)
• 1 student ‘Replaced’ ALL his/her markings
• 6 markings actually changed mark +7, -4, -9, +3, -6, +6 (Avge = -0.5)
Used on Final Year Degree + MSc MSc EL&A
• 13 students
• 76 markings
• 41 replaced markings (54%)
• Average time per marking = 42 minutes
• Range of time taken to do markings 3-72 minutes
• Average number of menu comments/marking = 15.7
• Raw average mark = 61%
• Out of 41 Markings ‘replaced’ –> 26 changed mark 26/76 (34%)
• Number of students who did replacements = 8 (out of 13)
• 2 students ‘Replaced’ ALL their markings
• 26 markings actually changed mark
• -1, +9, -2, -2, +1, -8, -3, -5, +2, +8, -2, +6, +18 (71-89), -1, -4, -6, -5, -7, +7, -6, -3, +6, -7, -7, -2, -5 (Avge -0.2)
Current Conclusions
• The results of the mapping of the compensated peer-marks to the average feedback indexes are very positive. Although the weighted development of the average feedback index only produces a slight improvement to an already very positive correlation, it addresses a concern that the subjectivity of the comments derived from the menu driven system were not totally subjective.
• The main concern of this method of automatically developing a mark for marking & commenting is the mapping of the consistency factors to an absolute grade. It should be kept in mind how difficult it currently is to explain to a student why they have been awarded 69% and their colleague has 71% within a traditional assessment.
• Review Stage -> Tangible or Non-Tangible -> MARKS OR REFLECTION
Some Points Outstanding or Outstanding Points
• What should students do if they identify plagiarism?
• What about accessibility?
• Is a computerised solution valid for all?
• At what age / level can we trust the use of peer assessment?
• How do we assess the time required to perform the marking task?
• What split of the marks between creation & marking?