Centrality in a Modified-Angoff Standard Setting
PhD Dissertation - Proposal
by
Michael Scott Sommers (張夏石)
Department of Educational Psychology,
National Taiwan Normal University
Outline of the Proposal
Statement of the Problem
Purpose & Motivation
• Introduction to Standard Setting
• Angoff Standard Setting Procedure
• Problems with the Angoff Procedure
• What is Centrality?
• Research Questions
Methods & Analysis
• Materials & Participants
• Analysis
Expected Results
Statement of the Problem
Standard setting is the most widely used method to establish cutscores for high stakes examinations.
Despite this, many questions remain about how the procedure works and what exactly the meaning of the cutscore is.
This is especially true for the Angoff and related methods of standard setting.
It is widely believed that judges in an Angoff standard setting have problems judging the most difficult and the easiest items.
It is also believed that this inability creates centrality in the estimates that judges are required to provide.
Aim of the study
Study the impact of questions of different difficulty and different rounds of the standard setting on panelist centrality in a modified-Angoff procedure
1. Question 1: Does Centrality exist in the modified-Angoff standard setting?
2. Question 2: How does Centrality change across the rounds of the modified-Angoff procedure?
3. Question 3: Is Centrality explained by differences in panelist ratings between extreme (difficult and easy) items and items of median difficulty?
Purpose & Motivation
Introduction to Standard Setting
Standard setting is the most widely used method to establish cutscores for high stakes examinations.
Despite this, many questions remain about how the procedure works and the full meaning of the cutscore.
This is especially true for the Angoff and related methods of standard setting.
It is widely believed that judges in an Angoff standard setting have problems judging the most difficult and the easiest items.
It is also believed that this inability creates centrality in the estimates that judges are required to provide.
• Standard setting is a procedure used to calculate a cutscore for a test.
• A standard is a verbal description of performance.
• Cutscores are the scores on a test needed to separate people taking a test into the different categories of a standard.
Common European Framework of Reference (CEFR)
C2 Can understand with ease virtually everything heard or read. Can express him/herself spontaneously, very fluently and precisely.
C1 Can understand a wide range of demanding, longer texts. Can express him/herself fluently and spontaneously.
B2 Can understand the main ideas of complex text. Can interact with a degree of fluency and spontaneity that makes interaction possible without strain.
B1 Can understand the main points of clear standard input on familiar matters. Can produce simple connected text on familiar topics.
A2 Can understand sentences and frequently used expressions related to areas of immediate relevance. Can communicate in simple and routine tasks.
A1 Can understand and use familiar everyday expressions and very basic phrases.
B1 Can understand the main points of clear standard input on familiar matters. Can produce simple connected text on familiar topics.
What is a “familiar matter” or “simple text”?
A standard setting procedure can help clarify these terms so that a test can be used to determine whether this level has been reached.
• The Angoff standard setting procedure is one of the most widely used methods in Taiwan and the world to determine the passing score for high stakes tests.
• It was first suggested by William Angoff, who attributed the idea to his colleague Ledyard Tucker (Cizek & Bunch, 2007).
• There are many different types of standard setting procedures. One recent review (Kaftandjieva, 2010) identified more than 60 different methods for standard setting.
Angoff Standard Setting Procedure
Judges are trained to use descriptions of performance called Performance Level Descriptors (PLDs) and to match test items with these descriptors to create a cutscore that can be used to divide test takers into different levels of performance.
Angoff, Step 1: The BPS
Judges are trained to use the PLDs and imagine a ‘barely proficient student’ (BPS).
[Figure labels: PLDs, BPS]
Step 2: Item Functioning
Judges examine each item to assess difficulty.
[Figure label: TEST ITEM]
Step 3: Quantifying Estimates
Judges quantify their expectations of the outcome as probabilities.
“I think a BPS has a 68% chance of answering correctly.”
[Figure: a probability scale from 0 to 1 with .50 and .68 marked]
Calculating the Cutscore
Judge 1 estimates:
Item 1: 68
Item 2: 43
Item 3: 72
Item 4: 80
Item 5: 76
Mean = 67.8
Judge means:
Judge 1 Mean = 67.8
Judge 2 Mean = 72.2
Judge 3 Mean = 65
Judge 4 Mean = 75
Mean across judges = 70
Final Cutscore = 70
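The arithmetic on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `angoff_cutscore` is hypothetical, and because the slide gives item-level estimates for Judge 1 only, the other three judges enter through their reported means.

```python
from statistics import mean

def angoff_cutscore(judge_means):
    """Average the per-judge mean estimates to get the panel cutscore."""
    return mean(judge_means)

# Judge 1's item estimates from the slide
# (percent chance that a barely proficient student answers correctly)
judge1_items = [68, 43, 72, 80, 76]
judge1_mean = mean(judge1_items)

# Means for Judges 2-4 are taken directly from the slide
judge_means = [judge1_mean, 72.2, 65, 75]
cutscore = angoff_cutscore(judge_means)

print(round(judge1_mean, 1), round(cutscore, 1))
```

Each judge's mean across items is that judge's implied cutscore, and the panel cutscore is the mean of those values.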
Angoff, Step 4: Discussion & Feedback
• Sharing Estimates / Discussion
• Empirical P-values
• Conditional ‘P-values’
• % Students who would pass
Typical Angoff Procedure
Round 1 → Discussion & Feedback → Round 2 → Discussion & Feedback → Round 3 → Final Cutscore
Problems with the Angoff Procedure?
It is widely reported in standard setting and testing research that even the most experienced and well-trained judge may have problems estimating the difficulty of items on a test.
Is this true for all items?
Are judges completely wrong?
Are some items easier than others to estimate accurately?
Very difficult and very easy items, i.e. extreme items, appear to be more difficult to estimate correctly.
Items of moderate difficulty can be judged more accurately.
Judges are not using the full range of the scale.
This study is a clarification of the measurement properties associated with this problem.
Do item difficulty and the rounds of the standard setting help us understand the observed centrality of the panelists?
What is the effect of item difficulty and the rounds of a standard setting on the observed centrality in an Angoff standard setting?
What is Centrality?
Centrality is a widely accepted concept in the study of rating scales and raters that describes the clustering of rater scores around the center of a rating scale.
It has not been used to describe the results of an Angoff standard setting before, but the similarity between the two situations suggests it could produce important results.
A wide range of definitions have been suggested.
These are reviewed in Saal, Downey, and Lahey (1980).
Their review is not very helpful.
By current standards, their conclusions about the measurement of Centrality are useless.
Why?
They include measures that are clearly measuring different things.
They include measures that have been used without discussion of what they’re measuring.
Saal, Downey, and Lahey (1980) is focused on classical measures of Centrality.
Wolfe (2004, pp. 39-40): “centrality...results in a concentration of assigned ratings in the middle of the rating scale…”
A number of related terms
“…restricted range exists when centrality is combined with leniency or harshness. That is, the restriction of range results in a restricted range around a non-central location on the rating scale. The converse of rater centrality occurs when raters tend to overuse the extreme rating scale categories - a rater effect called extremism.”
Other related terms
Central tendency – sometimes used synonymously with Centrality, sometimes it is used differently.
Rater effect
Rater bias
I will only be dealing with the definition in Wolfe (2004, pp. 39-40): “centrality...results in a concentration of assigned ratings in the middle of the rating scale…”
Item Centrality – ratings from different raters for the same item are clustered near the center of a rating scale.
Rater Centrality – ratings for different items from the same rater are clustered near the center of a rating scale.
Items/persons   p1     p2     p3     …   pn
I1              ip11   ip12   ip13   …   ip1n
I2              ip21   ip22   ip23   …   ip2n
I3              ip31   ip32   ip33   …   ip3n
…
In              ipn1   ipn2   ipn3   …   ipnn
Item Centrality is summarized across each row; Person Centrality is summarized down each column.
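The two margins of this matrix can be illustrated with a small Python sketch. It rests on assumptions that are not in the slides: the ratings are invented, the function names are hypothetical, and the standard deviation is used only as a stand-in for spread (a small spread shows clustering, but says nothing by itself about whether the cluster sits at the center of the scale).

```python
from statistics import pstdev

# Hypothetical ratings: rows are items, columns are judges (persons),
# each cell is a probability estimate on the 0-1 Angoff scale.
ratings = [
    [0.50, 0.55, 0.45],  # item 1
    [0.60, 0.50, 0.40],  # item 2
    [0.90, 0.50, 0.10],  # item 3
]

def item_spread(matrix):
    """SD of each row: how much the judges differ on one item."""
    return [pstdev(row) for row in matrix]

def person_spread(matrix):
    """SD of each column: how much one judge's ratings vary across items."""
    return [pstdev(col) for col in zip(*matrix)]

item_sd = item_spread(ratings)
person_sd = person_spread(ratings)

print([round(s, 3) for s in item_sd])
print([round(s, 3) for s in person_sd])
```

In this toy data, item 1 has a much smaller row spread than item 3, and the second judge has the smallest column spread, which is the kind of pattern the row and column summaries are meant to expose.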
Research Questions
Aim of the study
Study the impact of questions of different difficulty and different rounds of the standard setting on panelist Centrality in a modified-Angoff procedure
1. Question 1: Does Centrality exist in the modified-Angoff standard setting?
2. Question 2: How does Centrality change across the rounds of the modified-Angoff procedure?
3. Question 3: Is Centrality explained by differences in panelist ratings between extreme (difficult and easy) items and items of median difficulty?
Methods and Analysis
Materials & Participants
Analysis
Methods & Participants
Linked Exam - EPT
Spring Midterm - Annual English Proficiency Test (EPT) to assess gains in student proficiency.
• Reliability (Cronbach’s Alpha): ~.90 overall (> .80 for listening/reading subtests)
• Item Analysis (Rasch Fit, Point Biserials)
• Construct Validation: Factor Analysis, PCA
Angoff Yes/No: Spring 2009 (academic year 97-2) EPT
Angoff: Spring 2010 (academic year 98-2) EPT
~3000 Examinees per year level
~12,000 Examinees
Angoff Standard Setting – July, 2010
Panel I, Panel II, Panel III
18 judges, all with ESL background
13 Female, 5 Male
14 English NNS, 4 English NS
3 Administrators, 13 Teachers, 3 Recently-graduated TAs
Angoff Standard Setting - June, 2009
Materials: 40-item EPT Reading and 40-item EPT Listening, CEFR B1 Reading and Listening Descriptors (PLDs)
Training: 1 full day on CEFR, EPT familiarization, Angoff procedures (Sommers, 2012).
Procedure: 3 Rounds with discussion/feedback data.
Data Collection: 18 x 80 x 3 = 4320 estimates; discussions of items during training and between rounds recorded and transcribed.
Analysis
The measurement of Centrality has not been well-studied.
Some research was done in the 1970s and reviewed in Saal, Downey, and Lahey (1980).
This work was based on classical measurement and was not very useful in terms of understanding Centrality.
It suggests the following classical measures for Centrality.
• standard deviation
• distance from the mean
• kurtosis
• rater X ratee ANOVA
Measures similar to Centrality have been developed and used in different fields.
• Job performance evaluation
• Expertise evaluation
Cochran-Weiss-Shanteau Index
All of these have been based on the idea that Centrality is an error and its presence indicates a problem.
Thus, Centrality is really an issue of rater accuracy.
Much of this work involves finding ways to check the accuracy of raters.
This is problematic.
These measures do not measure the same thing.
For example, standard deviation and kurtosis are very different ideas and do not always correlate very well.
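A small numerical illustration may help here. The sketch below uses two invented sets of ratings (not data from this study) and a hypothetical helper, `pearson_kurtosis`, that computes the fourth central moment divided by the squared variance. The two sets are built to have the same standard deviation but very different kurtosis.

```python
from statistics import mean, pstdev

def pearson_kurtosis(scores):
    """Population (Pearson) kurtosis: fourth central moment over squared variance."""
    m = mean(scores)
    n = len(scores)
    m2 = sum((x - m) ** 2 for x in scores) / n
    m4 = sum((x - m) ** 4 for x in scores) / n
    return m4 / m2 ** 2

# Two hypothetical sets of ratings with the same standard deviation
two_clusters = [0.3, 0.3, 0.7, 0.7]   # no ratings at the center at all
peaked = [0.5] * 6 + [0.1, 0.9]       # most ratings exactly at the center

for name, scores in [("two_clusters", two_clusters), ("peaked", peaked)]:
    print(name, round(pstdev(scores), 3), round(pearson_kurtosis(scores), 3))
```

A rule based on standard deviation alone would treat these two sets as equally central, while a rule based on kurtosis would separate them sharply.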
More recently, Edward Wolfe and his students have developed and begun exploring measures of Centrality based on latent trait theory and Rasch Modeling.
Wolfe and his students (Wolfe, 2004; Wolfe, Moulder & Myford, 2001; Myford & Wolfe, 2003; 2004; Yue, 2011) have suggested a large number of such measures.
• Mean-square fit statistics
• Expected-residual correlations
• Ratee measures and their residuals derived from Multi-Faceted Rasch Measurement models (MFRM)
• Correlation of Rasch measures and measures from raters
• Rater slope (point biserial)
Yue (2011) tested these measures, as well as standard deviation, in a series of studies using simulated data.
She found the most sensitive measures were
• Standard deviation
• Correlation of the raw measure and the Rasch residual value
• Correlation of the Rasch expected value and the Rasch residual value
For my study, I have selected several of these measures to examine Centrality in the Angoff standard setting.
I have chosen
• Kurtosis
• Standard deviation
• Correlation of the raw measure and the Rasch residual value. I will call this $r_{\text{measure,res}}$ from here on.
I have chosen these because both standard deviation and kurtosis are well-understood classical statistics. They are easily computed and their interpretation is straightforward.
Yue (2011) has demonstrated that standard deviation has a great deal of utility in detecting Centrality.
She did not study kurtosis. Kurtosis is easy to perform and interpret. I’m curious to see how well kurtosis does compared with standard deviation.
Pearson Kurtosis is a measure of the peakedness of a distribution of scores.
It is the standardized 4th moment of the distribution, typically calculated from the following quantities.
∑ = the sum of the calculations
X = observed value
μ = population mean
N = number of scores in the sample
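With a sum over scores, an observed value $X$, a population mean $\mu$, and $N$ scores, one standard way of writing population (Pearson) kurtosis is the fourth central moment divided by the squared variance. This is offered as a reconstruction under that assumption; the exact form used in the original slide may differ.

```latex
\[
\text{Kurtosis} =
\frac{\dfrac{\sum (X - \mu)^{4}}{N}}
     {\left( \dfrac{\sum (X - \mu)^{2}}{N} \right)^{2}}
\]
```

Under this definition a normal distribution has a kurtosis of 3, so some authors report the excess kurtosis, which subtracts 3.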
∑ = the sum of the calculations
X = observed value
X̄ = sample mean
s = sample standard deviation
n = number of scores in the sample
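For the sample case, one common and simple estimate replaces the population quantities with the sample mean $\bar{X}$ and the sample standard deviation $s$. This is an assumed form: bias-corrected variants with additional factors in $n$ are also widely used, and the original slide may have shown one of those instead.

```latex
\[
\text{Kurtosis} = \frac{1}{n} \sum \left( \frac{X - \bar{X}}{s} \right)^{4}
\]
```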
I want to compare at least one latent trait measure.
I picked one of the latent trait measures from Yue (2011) because they have an advantage in interpretation.
There are no guidelines for the interpretation of kurtosis and standard deviation to determine if there is or is not Centrality.
The latent trait measure $r_{\text{measure,res}}$ is designed so that
• under perfect Centrality, the measure should be -1.0
• under perfect extremism, the measure should be 1.0
• with more Centrality, the measure moves from 0 to -1.0.
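The direction of this measure can be illustrated with a toy simulation. The sketch below is not Yue's (2011) procedure: the logit measures are invented, the expected values come from a simple logistic curve, and the `compression` parameter is a hypothetical device for making a rater pull ratings toward the middle of the scale (values below 1) or push them toward the ends (values above 1).

```python
import math

def pearson_r(xs, ys):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def simulate_residuals(measures, compression):
    """Residuals (observed - expected) for a toy rater who shrinks or
    stretches the model-expected values around the scale midpoint 0.5."""
    expected = [1 / (1 + math.exp(-m)) for m in measures]
    observed = [0.5 + compression * (e - 0.5) for e in expected]
    return [o - e for o, e in zip(observed, expected)]

measures = [-2.0, -1.0, 0.0, 1.0, 2.0]  # hypothetical logit measures

r_central = pearson_r(measures, simulate_residuals(measures, 0.3))
r_extreme = pearson_r(measures, simulate_residuals(measures, 1.2))

print(round(r_central, 3), round(r_extreme, 3))
```

A central rater over-rates at the low end of the measure and under-rates at the high end, so the residuals fall as the measure rises and the correlation is strongly negative. The extreme rater shows the opposite pattern.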
Rasch Model
$$\log\left(\frac{P_{ni1}}{1 - P_{ni1}}\right) = \beta_n - \delta_i$$
where $\beta_n$ is the location of person $n$ along the underlying latent trait, $\delta_i$ is the location of item $i$ along the same latent variable, and $P_{ni1}$ and $P_{ni0} = 1 - P_{ni1}$ are the probabilities of person $n$ scoring 1 and 0, respectively, on item $i$.
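Solving the Rasch equation for the probability of a correct response gives a logistic function of the difference between the person and item locations. A minimal Python sketch follows; the function names and the example values are illustrative assumptions only.

```python
import math

def rasch_probability(beta_n, delta_i):
    """Probability that person n scores 1 on item i under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(beta_n - delta_i)))

def log_odds(p):
    """Log-odds (logit) of a probability."""
    return math.log(p / (1.0 - p))

# Hypothetical person and item locations, in logits
p = rasch_probability(beta_n=1.0, delta_i=0.25)

print(round(p, 3), round(log_odds(p), 3))
```

Taking the log-odds of the returned probability recovers the difference between the person and item locations, which is the left-hand side of the model equation.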
$$\text{Residual} = X_{nr} - E_{nr}$$
where $X_{nr}$ is the observed value for rater $r$ and $E_{nr}$ is the expected value for rater $r$.
Expected Results
Question 1: Does Centrality exist in the modified-Angoff standard setting?
Hypothesis 1:
Centrality is a fundamental part of the Angoff standard setting procedure and will be detectable in every standard setting.
Research suggests that Item Centrality is built into the methods of the Angoff standard setting.
A decrease in standard deviation across rounds is a “common feature of standard settings” (Cizek, 2001a, p. 10) and is generally interpreted as an indicator of the validity of the particular standard setting being examined.
It is not clear if there will be Judge Centrality. Some research indicates there might not be, but this has not been a focus of previous work and there is little work that reflects on it.
Expected Result
I expect there will be detectable Centrality in the modified-Angoff standard setting.
It is not clear if Item Centrality can be found in Round 1, but it will be detectable before the final cutscore is derived.
I make no clear predictions about Judge Centrality. Some research indicates there might not be, but this has not been a focus of previous work and there is little work that reflects on it.
Question 2: How does Centrality change across the rounds of the modified-Angoff procedure?
Hypothesis 2:
Centrality is a fundamental part of the Angoff standard setting procedure and will be detectable in every standard setting. It is produced by discussion and feedback, which are basic procedures of the modified-Angoff. As such, there will be more Centrality in the later rounds of the procedure.
Expected Results
Item Centrality should increase across rounds (that is, the standard deviation of the ratings should decrease), as the Angoff procedure is designed to make this happen.
Once again, it is not clear if there will be Judge Centrality. Some research indicates there might not be, but this has not been a focus of previous work and there is little work that reflects on it.
![Page 83: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/83.jpg)
Question 3: Is Centrality explained by differences in panelist ratings between extreme (difficult and easy) items and median-difficulty items?
![Page 84: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/84.jpg)
This is the central question of my project.
![Page 85: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/85.jpg)
Hypothesis 3:
Centrality is caused by problems evaluating the easiest and the most difficult items. It is these items that should contribute the most to valid measures of Centrality.
![Page 86: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/86.jpg)
I propose a series of calculations.
![Page 87: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/87.jpg)
Calculation 1
Based on measures suggested by Impara & Plake (1998), using differences in raw scores rather than classical or latent trait transformations:
the absolute value of the distance from the midpoint of the logit scores to the item’s empirical p-value or logit difficulty score.
![Page 88: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/88.jpg)
Use this in correlations with classical measures of Centrality, such as standard deviation or kurtosis, and perhaps with latent trait measures.
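As a rough sketch of Calculation 1, the distance measure and its correlation with a classical Centrality measure could be computed as below. The function names, the choice of the observed-range midpoint as the default, and all numbers are assumptions made for illustration; they are not data or procedures from this study.

```python
import math
from statistics import fmean


def extremity(difficulties, midpoint=None):
    """Absolute distance of each item's difficulty from the scale midpoint.

    Assumption: when no midpoint is given, the middle of the observed
    range is used; for raw p-values one might pass midpoint=0.5 instead.
    """
    if midpoint is None:
        midpoint = (min(difficulties) + max(difficulties)) / 2
    return [abs(d - midpoint) for d in difficulties]


def pearson(xs, ys):
    """Plain Pearson correlation between two equal-length sequences."""
    mx, my = fmean(xs), fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


# Hypothetical data: logit difficulties for five items and the standard
# deviation of panelist ratings on each item (a classical Centrality measure).
item_logits = [-2.0, -1.0, 0.0, 1.0, 2.0]
rating_sd = [0.05, 0.10, 0.15, 0.09, 0.04]

r = pearson(extremity(item_logits), rating_sd)
print(f"correlation between extremity and rating SD: {r:.2f}")
```

The same `pearson` call could be repeated with kurtosis in place of the standard deviation.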
![Page 89: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/89.jpg)
Expected Results 1
I expect the modified-Angoff will show significant correlations between measures of score differences and item p-value/logit difficulty measures.
![Page 90: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/90.jpg)
Presuming that some measures capture only part of Centrality, I propose combining several measures to see if I can create a more sensitive index.
Combine Kurtosis and Standard Deviation to see if I can use these two different measures to capture some aspect of Centrality.
![Page 91: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/91.jpg)
Calculation 2
• the first round panelist estimates will be subtracted from the final value in the third round
• the value for kurtosis will be positive or negative depending on whether it, and the Centrality of raters or items, is increasing or decreasing across the standard setting
• a similar operation will be performed for the standard deviation
• combine these 2 measures to produce a 2X2 matrix
![Page 92: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/92.jpg)
| St. Dev. across 3 rounds | Kurtosis across 3 rounds: positive | Kurtosis across 3 rounds: negative |
|---|---|---|
| smaller | A (HIGHEST CENTRALITY) | B |
| larger | C | D (LOWEST CENTRALITY) |
![Page 93: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/93.jpg)
The A Quadrant shows the highest level of Centrality, containing those items and panelists with a decreasing standard deviation and increasingly positive kurtosis.
The D Quadrant shows the least Centrality, containing items with an increasing standard deviation and an increasingly negative kurtosis.
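One possible way to sketch Calculation 2 is shown below: the Round 1 value is subtracted from the Round 3 value for both kurtosis and standard deviation, and the signs of the two differences select a quadrant. The use of population excess kurtosis, the handling of zero change, and the example ratings are assumptions for illustration only.

```python
from statistics import fmean, pstdev


def excess_kurtosis(xs):
    """Population excess kurtosis (normal distribution = 0).

    Assumption: the ratings are not all identical, otherwise the
    variance is zero and the ratio is undefined.
    """
    n = len(xs)
    m = fmean(xs)
    m2 = sum((x - m) ** 2 for x in xs) / n
    m4 = sum((x - m) ** 4 for x in xs) / n
    return m4 / m2 ** 2 - 3.0


def quadrant(round1, round3):
    """Classify one item (or one judge) into quadrant A, B, C, or D.

    Rows follow the change in standard deviation (smaller or larger),
    columns the change in kurtosis (positive or negative). Assumption:
    a change of exactly zero counts as "larger" / "negative".
    """
    d_kurt = excess_kurtosis(round3) - excess_kurtosis(round1)
    d_sd = pstdev(round3) - pstdev(round1)
    if d_sd < 0:
        return "A" if d_kurt > 0 else "B"
    return "C" if d_kurt > 0 else "D"


# Hypothetical Angoff ratings from four panelists for a single item.
ratings_round1 = [0.20, 0.40, 0.60, 0.80]
ratings_round3 = [0.45, 0.50, 0.50, 0.55]

print(quadrant(ratings_round1, ratings_round3))
```

To classify judges rather than items, the same function would be given one judge's ratings across all items in each round.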
![Page 94: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/94.jpg)
I can perform this for each item and for each judge.
Then identify which items and which judges fall in each quadrant.
Are the items with the greatest Centrality also the items with the most extreme p-value / logit difficulty score (Rasch difficulty score)?
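A minimal sketch of how this question might be checked, assuming each item already has a quadrant label and a logit difficulty: average the distance from the scale midpoint within each quadrant and compare. The midpoint of 0.0 and the example values are hypothetical.

```python
from collections import defaultdict
from statistics import fmean


def mean_extremity_by_quadrant(quadrants, difficulties, midpoint=0.0):
    """Average absolute distance from the midpoint, grouped by quadrant.

    Assumption: the difficulty scale is centred on `midpoint`
    (0.0 here, which is hypothetical for this study).
    """
    groups = defaultdict(list)
    for label, difficulty in zip(quadrants, difficulties):
        groups[label].append(abs(difficulty - midpoint))
    return {label: fmean(values) for label, values in groups.items()}


# Hypothetical quadrant labels and logit difficulties for four items.
labels = ["A", "D", "A", "D"]
logits = [-2.0, 0.5, 3.0, -0.5]

print(mean_extremity_by_quadrant(labels, logits))
```

If the prediction holds, the Quadrant A average would be the largest of the four.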
![Page 95: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/95.jpg)
Expected Results 2
I predict the items with the most extreme difficulty values (easy and difficult) will fall in Quadrant A.
The pattern for judges is not theoretically important for this study.
![Page 96: Centrality in a Modified- Angoff Standard Setting PhD Dissertation - Proposal by](https://reader035.vdocuments.site/reader035/viewer/2022081512/56816612550346895dd95ba5/html5/thumbnails/96.jpg)
Questions?