redeveloping the exposure control parameters of cat items...
TRANSCRIPT
Redeveloping the exposure control parameters of CAT items when a pool is modified
Shun-Wen Chang
National Taiwan Normal University
Deborah J. Harris ACT, Inc.
Paper presented at the Annual Meeting of the American Educational Research Association, New Orleans, April 2002.
Abstract
Both pool management and exposure control method implementation have been the focus of easing test security concerns in CAT. Maintaining pools of high quality is an ongoing process and sometimes only a small modification may be needed. This study explored how the deletion of a single item and the unused items might alter the exposure control parameters of the remaining items derived by the SLC algorithm.
Simulations were carried out for this study. The findings indicated that the original exposure control parameters were no longer appropriate for a modified item pool and the derivation for the exposure control parameters ought to be repeated. Removing the unused items seemed to cause no effects on the performance of the SLC procedure in the observed exposure rates and measurement precision. The characteristics of the unused items were examined and compared with those of the high, moderate and low exposure groups, respectively. Results from this study have offered a better understanding of the effects of item interactions on the values of the exposure control parameters. Also, this study should provide psychometric researchers and test practitioners valuable insights for a better pool construction design. Keywords: computerized adaptive testing, item exposure rate, exposure control parameters, item pool
3 Test security is an issue of great concern for large scale, high-stakes tests in the continuous
testing context of computerized adaptive tests (CAT). Efforts have been made to prevent some popular
items from being overly exposed to examinees. This issue of controlling exposure rates of CAT items
has been studied extensively through the management of item pools, as well as the incorporation of
exposure control methods into the item selection process.
For pool management, the focus is on the CAT pool design, creation, rotation and maintenance
over time (Guo, Way, & Reshetar, 2000; Mills & Steffen, 2000; Stocking & Lewis, 2000; van der
Linden, 2000; Veldkamp & van der Linden, 2000; Way, Swanson, Steffen, & Stocking, 2001). Exposure
rates of items can be limited through rotation and regulation of multiple parallel pools. It was found that
through pool rotation, the impact of collusion among test-takers was reduced dramatically (Mills &
Steffen, 2000). Since adaptive tests are administered on a continuous basis, a good pool management
system is essential for secure and effective operational CAT administrations.
In addition to well-managed item pools, methods have been developed by embedding statistical
mechanisms in the item selection procedure to control the exposure rates of items to a desired maximum
value, r, that is specified in advance of testing (Davey & Parshall, 1995; Hetter & Sympson, 1997;
Stocking & Lewis, 1998, 2000; Sympson & Hetter, 1985). Most methods control the exposure rate of
an item through the exposure control parameter, which dictates the probability of administering the item,
given it is selected. Provided that an item has been selected, whether to administer this item to the
examinee depends upon the exposure control parameter of the item. For very popular items, the
exposure control parameters could be as low as the pre-specified desired exposure rate, indicating that
these items cannot be freely administered when they are selected. For items rarely appearing, the
associated exposure control parameters could be as high as 1.0, meaning that once these items are
selected, they are almost always presented. The values of these parameters are determined from a series
of iterative simulations using all items in the pool.
Investigations on the properties of the various exposure control strategies showed that the
Stocking and Lewis conditional multinomial (SLC) procedure (Stocking & Lewis, 1998, 2000) was the
most effective (Chang, 1998). The specific feature of the SLC algorithm that makes it the most
appealing is that this method develops for each item an exposure control parameter with respect to a
particular ability level. While the unconditional methods of exposure control only limit the item's
overall appearance in reference to the examinee sample drawn from a target population, the SLC
procedure directly controls the item exposure to examinees at the same or similar levels of proficiency.
This procedure controls against an item being administered to almost all examinees at one particular
ability level, even if the item's overall exposure rate is low for examinees across the entire ability
continuum.
4 Since the exposure control parameters of an item are developed based on the existence of all
other items in the pool, when there is any significant change in test structure of the pool, the exposure
control parameters must be redeveloped. Stocking and Lewis (2000) pointed out that if an item pool is
changed, even by the addition or deletion of a single item, the iterative simulations should be carried out
again to guarantee continued exposure control for the new item pool. Maintaining pools of high quality
is an ongoing process. Items are continuously under review and modifications of the pools are possible,
even likely. It is conceivable that such iterative simulations must be repeated very frequently for the
newly formed item pools. However, developing the exposure control parameters for the SLC method is
especially tedious and time-consuming. If the modification is only to withhold a single item from use in
the pool for some reason (for example, the item becomes dated for content reasons or it is found
defective due to some flaw), is it still necessary to redevelop the exposure control parameters of the
remaining items for this procedure? How much weight does one item carry in the values of the
exposure control parameters of the other items in the pool? It is worthwhile to find out whether using
the existing exposure control parameters in the modified pool could be warranted for the SLC algorithm.
By doing so, not only could time savings accrue in the preparation of the exposure control parameters,
but the interactions of these parameters could be further understood.
Research studies and practical experience have indicated that CAT administrations often lead to
numbers of items that are seldom or never used, depending on the exposure control method and item
pool employed (Chang, 1998; Chang, Ansley, & Lin, 2000; Veldkamp & van der Linden, 2000).
According to the maximum information selection rule, the unused items are those with relatively low
item information values. When compared with other items for a given ability level, they may not be
competitive enough in contributing to the estimation of examinees' abilities. The unused items seem to
make little contribution to the test. Therefore, it might be reasonable to suspect that removing the set of
unused items would not affect the existing exposure control parameters of all other items, so that
repeating the derivation process for the remaining items in such a reduced pool may not be necessary.
This study will examine the effect of removing the unused items on the exposure control parameters for
the SLC strategy.
Also, because the unused items are never administered to the examinees, will the CAT procedure
still perform as well without these items in the pool, using either the existing or the redeveloped
exposure control parameters? Is there still a sufficient number of items appropriate for selection in this
reduced pool? These questions deserve a through investigation, so that the impact of unused items can
be further understood, and values of the exposure control parameters as a result of the item interactions
can be better described. At the same time, it is important to inspect more carefully the characteristics of
the unused items, despite the fact that they possess very low information values for a given ability level.
5 Because unused items are also a consequence of utilizing a particular exposure control method for a
specific item pool, this inspection will reveal information about the nature of the SLC algorithm in
relation to the test structure of the pool. Findings of this study should be useful for constructing an even
better pool, in terms of item usage, for the SLC method.
Purpose of the Study
This study is designed to explore whether the removal of a single item or the unused items from
the pool impacts the exposure control parameters of the remaining items in a CAT; that is, whether
redeveloping the exposure control parameters is necessary when some items are eliminated from the
pool. Specifically, this study attempts to achieve the following objectives:
1. to explore whether the exposure control parameters of the items are affected by removing a single
item or the unused items from the pool;
2. to examine how the observed item exposure rates and measurement precision are affected by using
the reduced pool where the unused items are eliminated;
3. to offer more information about the performance of the exposure control method in relation to the test
structure of an item pool.
Method and Data
Simulations were employed to carry out this research. The study design and methods for data
analyses are described below.
I. Decisions for the CAT Components
The 3-PL model of IRT forms the basis for the investigations of this study. A single pool of 480
discrete items was used, in which the item parameters were calibrated from multiple forms of a large
scale standardized paper-and-pencil test using BILOG (Mislevy & Bock, 1990) on a single ability scale.
The CAT administration process was initiated by assigning each examinee a common ability
estimate of zero to simulate a situation where no a priori information was available about the individuals.
Items were selected based on the maximum item information criterion with the SLC algorithm
incorporated into the selection process. The target maximum exposure rate was specified to be .20. The
content presented to the examinees was balanced according to the Kingsbury and Zara's mechanism
(1989). The first item was administered following the algorithm of item exposure control, regardless of
the item's content attribute. The percentage of items that had been administered in each content
category was calculated and compared to the corresponding pre-specified percentage. The content area
with the largest discrepancy between the empirical and the desired percentage was then identified; the
6 next item was selected from this content area and administered based on the algorithm of the SLC
method. To estimate the examinees' abilities, Owen's Bayesian strategy (Owen, 1975) was employed for
the provisional ability estimation and the maximum likelihood estimation method (Birnbaum, 1968) was
utilized for the final ability estimation. Each examinee was administered 30 items.
II. Design of the Study
Two explorations were designed for this study.
Study I: Removing a Single Item
This study was designed to examine whether the removal of a single item caused the exposure
control parameters of the remaining items to change. There are various reasons an item may become
inappropriate and ought to be excluded from the current pool, and it may not necessarily be an item with
very low information value for a given ability level. In addition, the information value an item
possesses varies by ability level. An item with low information at one ability level may be associated
with high information values for other ability levels. For the purpose of evaluating the removal effect of
one item, this study considered three items of different information values at three selected ability levels.
Arbitrarily, items of the highest, the middle, and the lowest information were chosen for removal at each
of the theta points of -2.8, 0.0 and 2.8, respectively, with no considerations of the content area the item
belonged to.
Study II: Removing All Unused Items
For this part of the investigation, all of the unused items resulting from the operational CAT
administrations were eliminated from the pool. The reduced pool is named hereafter for this condition
to indicate the pool established based on the full pool without the unused items.
III. Procedures for Data Simulation
Three parts of the simulations were involved in generating the data for this study. First, the
exposure control parameters for each item in the pool were developed according to the SLC algorithm
and the CAT design proposed for this study. Then, the simulations continued for the operational CAT
administrations. The designated items were then removed from the pool and the exposure control
parameters of the remaining items were redeveloped and applied to the operational testing situations.
Stage I: Developing Exposure Control Parameters
The development of the SLC exposure control parameters was in reference to a particular level
of proficiency. The adaptive tests were administered to a conditional sample of 3,000 examinees at each
of the theta levels equally spaced over the interval of -3.2 and 3.2 with an increment of .40 (i.e., -3.2, -
2.8,..., 3.2), totalling 17 ability points. This could be regarded as administering the adaptive tests to a
7 sample of 51,000 examinees (with 3,000 examinees at each of the 17 ability levels) whose abilities were
uniformly distributed over the range of -3.2 and 3.2 on the theta scale.
The desirability of items for the SLC method was ordered based only on item information values,
not on the weights as specified in the original algorithm for which the Stocking/Swanson weighted
deviations model (WDM) (Stocking & Swanson, 1993; Swanson & Stocking, 1993) was employed to
select items. The process was repeated until the observed maximum exposure rates were approximately
equal to the desired level and the exposure control parameters were stabilized in the subsequent
iterations. The stabilized parameters at the final round of iterations were the exposure control
parameters to be used in operational adaptive testing. For each of the 480 items, a set of exposure
control parameters was obtained for each discrete ability level. The parameters could be visualized as
the elements in a matrix of 17 ability levels by the number of items in the pool. These were the
parameters existing in the full pool of 480 items.
Stage II: Simulating Operational CAT Administrations
During the operational CAT administrations, the adaptive test was delivered to each examinee in
a sample of 50,000 examinees drawn from a standard normal distribution, N(0,1), being representative
of the real examinee population. The exposure control parameters developed from the previous stage
were utilized here to manage the administration frequencies of the selected items. The number of times
an item was actually administered and the number of items in the pool never administered were recorded.
The adaptive tests were also administered to conditional sample sizes of 3,000 examinees over
17 equally spaced ability points from -3.2 to 3.2. This step was carried out in order to compute the
conditional maximum observed exposure rates and the error values conditional on each theta point.
Stage III: Removing the Single and Unused Items
At this stage, each of the chosen single items and the set of the unused items were, separately,
removed from the pool. The items with the highest, moderate, and lowest information values at each of
the theta points of -2.8, 0.0 and 2.8 were identified from an Info Table, which contained lists of the items,
ordered by the amount of information they provided at the various ability points. The unused items
were those never administered in Stage II, when the full pool of 480 items was applied in the operational
CAT administrations. After each deletion, the exposure control parameters of the remaining items were
redeveloped. The new set of exposure control parameters was compared with the existing parameters.
The process continued for the operational CAT administrations. For each modified pool
condition, both sets of the existing and redeveloped exposure control parameters were respectively
employed to control the administration frequencies of the selected items. The results for the various
modified pools with the existing and redeveloped exposure control parameters were evaluated.
8 IV. Methods for Data Analyses
To analyze the extent to which the removed items affect the exposure control parameters of the
remaining items, the descriptive statistics of the differences between the existing and redeveloped
exposure control parameters were computed for each ability point. Also, the Pearson product-moment
correlations of the parameters were obtained for the various ability levels. The results of applying the
existing and redeveloped exposure control parameters in the operational testing situations were
evaluated based on the maximum observed exposure rates, pool usage and the conditional standard
errors of measurement (CSEMs). The maximum exposure rates observed were investigated in reference
to the entire examinee group and also to the examinees at a particular ability point. The item pool usage
was assessed using the numbers of items in the pool administered at least once. The CSEMs were the
errors of the ability estimation as a result of introducing the SLC algorithm with the different sets of the
exposure control parameters into the item selection process. Meanwhile, the results based on the full
pool of 480 items were incorporated for evaluating the effects of removing the unused items from the
pool.
The characteristics of the unused items were inspected with respect to the descriptive statistics of
the three IRT item parameters, and the item information values and exposure control parameters for each
ability point. These values were compared with those of the used items having high, moderate and low
observed exposure rates, respectively. These used items were ordered based on the exposure rates
observed in the operational CAT administrations and were classified in such a manner that the first third
of the items were in the high exposure group, the second third were in the moderate exposure group and
the last third were classified into the low exposure group.
Results and Discussion
The exposure control parameters of all items in the initial full pool were derived in reference to
each ability level. These parameters were employed in the operational CAT administrations to control
the exposure rates of the items. Forty-eight items in the full pool were never administered to examinees.
These unused items and each of the designated single items were, separately, removed from the full pool
and the exposure control parameters of the remaining items were redeveloped. For each single item
deletion, the new parameters could be visualized as the elements in a matrix of 17 ability levels by 479
items. For the unused items deletion, the parameters could be visualized as the elements in a matrix of
17 ability levels by 432 items.
Figure 1 shows the results of the maximum exposure rates observed at each ability point during
the iteration process for both full and reduced pools. A horizontal line is presented to indicate the pre-
specified exposure rate of .20. Only the results of the full pool and the reduced pool of 432 items are
9 presented. It can be seen that the iteration results were very similar for both situations, even with the
deletion of 48 items. The iteration results for the single item deletion conditions were virtually no
different from the configurations displayed here.
Figure 1 also shows that the iteration curves did not approach the desired rate of .20. Also, the
conditional observed maximum exposure rates for the various ability points seem to converge to
different values. The closer the iterations were at the extreme ability levels, the higher the values the
conditional observed maximum exposure rates converged to. These results reflected the nature of the
item pools, in that there were fewer items appropriate for administration for the extreme ability levels
than for the middle. The frequency distribution of the item difficulty values (i.e., the b parameters) for
the full pool is displayed in Figure 2 to provide some idea of the numbers of items appropriate for the
various ability levels.
Described below are the results of the relationships of the existing and redeveloped exposure
control parameters, followed by the results of employing the existing and redeveloped exposure control
parameters for the various modified pools in the operational testing situations. Then, the characteristics
of the unused items are presented.
The Relationships of the Existing and Redeveloped Exposure Control Parameters
The Differences of the Exposure Control Parameters
Table 1 reports the results of the differences between the existing and redeveloped exposure
control parameters for each theta point for the various removal conditions. The conditions for the single
item removal are represented by the capital letters H, M, and L to indicate the highest, moderate and
lowest informative items for removal at each of the three ability points of -2.8, 0.0 and 2.8, respectively.
For the unused items removal condition, it is simply noted as UNUSED. The N indicated under each
removal condition is the number of items remaining in the pool after the deletion. The N_Unchanged
line lists the number of items for which the redeveloped exposure control parameters remained the same
as the existing parameters. The MDIFF line contains the average values of the differences between both
sets of the parameters, with the redeveloped exposure parameters being subtracted from the existing
ones (i.e., MDIFF = average of [existing parameters minus redeveloped parameters]). The MDIFFABS
line reports the average values of the absolute differences between parameters.
Under the various conditions of removing the single items, the differences in the exposure
control parameters seem insubstantial. Table 1 shows that about 66% to 75% of the exposure control
parameters remained unchanged for the various ability levels, with fewer items having parameters
unchanged at the upper middle part of the ability scale (around the theta points of -.80 to 1.6) than at
both extremes. The findings indicate that removing one single item caused more items to have their
10 exposure control parameters changed for the upper middle part of the ability scale than at both extremes.
This phenomenon might be explained by the fact that there were more items appropriate for
administration for the upper middle part of the ability scale, as can be seen in Figures 1 and 2. Items
appropriate for the upper middle part of the theta scale were thus used more often. When the pool was
modified, a greater number of items were affected over this upper middle range than the two ends of the
ability scale.
As for the comparisons among the various conditions of deleting items with the varying degrees
of information values, there was no clear indication as to whether the effects were different. Removing
items with the varying degrees of information values for the different ability levels did not seem to have
noticeably differential effects on the results.
The values of MDIFF were negative or positive for the various single item removal conditions
(see Table 1), as the redeveloped exposure control parameters could be larger or smaller than the
existing parameters. However, it is somewhat interesting to see that the lower part of the ability scale
tends to result in positive MDIFF values (i.e., on average, the redeveloped parameters were smaller than
the original ones) while the middle and upper parts seem to produce more negative MDIFF values (i.e.,
on average, the redeveloped parameters were greater than the original ones). When the parameters were
redeveloped to be smaller, the probability of administering an item once selected was reduced; in other
words, the control for item exposure became stricter. When the parameters were redeveloped to be
greater values, the probability of administering an item given selection was increased; the control for
item exposure was lessened. The deletion of a single item might provide some opportunities for items
appropriate for the middle and upper parts of the ability scale to be administered more freely.
It is expected that the average observed exposure rate of items would be increased when the pool
size is reduced. That is, items would, in general, be used more often when there are fewer items in the
pool. However, since there were more items appropriate for the middle and upper parts of the ability
scale than for the extremes, the SLC algorithm opened up the opportunities for items appropriate at this
middle and upper range so that their exposure control parameters were redeveloped to be higher to allow
for more administrations given selection. The deletion of the items might just improve on the
probability of administering items appropriate for the middle and upper parts of the theta scale.
The values of the average absolute parameter differences (see MDIFFABS in Table 1) seem to be
small for the various conditions. The largest value was observed at the theta point of .80 for the various
single item deletion conditions, except for H(0.0). On average, the exposure control parameters at the
theta point of .80 suffered the biggest impact due to the single item removal. For the results with the
H(0.0) condition, the largest MDIFFABS was observed at the theta point of -2.4, but the MDIFFABS
values were in general greater than those of the other conditions. Notice that for the H(0.0) condition,
11 the removed item was the one possessing the greatest information value at the theta point of 0.0.
Removing this particular item led to the greatest impact among the various conditions, indicating that
eliminating an item with the greatest information value for the middle ability point of 0.0 caused the
greatest change on the exposure control parameters. This outcome was not surprising since the item
being removed was extremely popular for most examinees, and was almost always selected.
Other than the phenomenon observed with the removal of the most informative item for the
middle ability point of 0.0, the values of the MDIFFABS do not seem to show a clear pattern as to what
effects might be due to the different items' removal. That is, except for removing the most informative
item for the middle ability level, the results in MDIFFABS did not show differences among the various
removal conditions. Removing any other item seems to show similar effects on the change of the
exposure control parameters. These outcomes might be attributed to the fact that items possess different
information values at the various ability levels. The removed item may have low information value for
a given ability level, but this item might be the one with high information at another ability level.
As presented in Table 1, the results of the differences in the parameters for the unused item
removal condition were fairly similar to those observed with the single item removal, except with a
smaller percentage of unchanged redeveloped exposure control parameters (around 64% to 70% for the
various ability levels). The upper middle part of the ability scale still had a smaller number of
unchanged items than the two ends. Removing all the unused items also caused more exposure control
parameters to change for the upper middle part than for the extremes, in spite of the fact that these
removed items were those never administered in the operational testing using the full pool.
Table 1 shows that the MDIFF values for the unused items deletion were negative for all ability
levels, indicating that the redeveloped exposure control parameters were on average greater than the
original parameters. It can also be noticed that the effects seem to be stronger at the middle and upper
parts of the ability scale. The interpretations might be similar to those discussed with the single item
removal conditions. With fewer items in the pool, the SLC algorithm increased the values of the
exposure control parameters in general to allow for more administrations given selection. Because there
were more items appropriate for the middle and upper parts of the ability scale than the other parts, the
SLC algorithm tended to relax the exposure control for items more appropriate for these parts to a
greater extent so that higher values of the exposure control parameters were redeveloped.
As can be seen in Table 1, the average absolute parameter differences (i.e., the MDIFFABS
values) for this unused item deletion were in general greater than those observed with the single item
deletion conditions. It was not surprising to see that removing more items from the pool caused the
parameters to change to a greater extent. The largest MDIFFABS value for this condition was still
observed at the theta point of .80. On average, the exposure control parameters for this theta point still
12 suffered the biggest impact due to the removal of the unused items.
The Pearson Product-Moment Correlation
The Pearson product-moment correlations are contained in Table 2. The values were all
significant at the alpha .0001 level, and were almost perfect. For all the removal conditions, there were
very strong correlations between the original and redeveloped exposure control parameters at all ability
levels. There seems to exist no specific patterns for these correlations.
Results of Employing the Existing and Redeveloped Exposure Control Parameters in Operational
Testing
The exposure control parameters resulting from the final round of the iterations were associated
with the corresponding items to be used in the operational testing situations. Described below are the
results of administering CATs based on the various modified pools using both existing and redeveloped
exposure control parameters in operational testing. Since the results for the single item removal
conditions were fairly parallel to each other, only those for the H(0.0) deletion condition are presented in
the table and figures for discussion. The results based on the full pool of 480 items are included for the
examination of the effect of removing all unused items.
Table 3 reports the summary statistics of the observed exposure rates as a result of applying the
modified pools with both existing and redeveloped exposure control parameters to the entire examinee
group. The N column lists the numbers of items that were administered to examinees at least once.
Based on these items, the summary statistics were obtained. It can be seen in Table 3 that the observed
maximum exposure rates resulting from using the two sets of the exposure control parameters were as
low as the desired rate of .20 for either of the H(0.0) and unused items removal conditions. The pool
utilization was also similar for using both sets of the exposure control parameters. For the reduced pool
of 432 items, 427 items were used at least once, so there were still five items left unused. Also,
compared with the results of the full pool (see the lower part of Table 3), removing all the unused items
did not seem to cause differences on the observed exposure rates of the items. Table 3 presents also the
results of the three exposure groups, which are discussed below in the examination of the characteristics
of the unused items.
The conditional maximum observed exposure rates at each ability level are displayed in Figure 3.
The values were higher than the desired rate of .20 at all ability levels, which was not unexpected given
that the maximum observed exposure rates were stabilized, but to values higher than the desired rate in
the derivation of the exposure control parameters.
For both H(0.0) and unused items deletion conditions, the results yielded by using the
redeveloped exposure control parameters were almost identical to those of the full pool. However, the
13 curves were different when the existing exposure control parameters were applied. The maximum
exposure rates conditionally observed at the lower part of the ability scale were especially higher than
those of the full pool. Recall the results with the MDIFF values for the single item removal conditions
where the redeveloped exposure control parameters tended to be smaller than the existing parameters for
the lower part of the ability scale (see Table 1). The SLC algorithm seemed to exercise stricter exposure
control for items at the lower part of the scale in the modified pools. When the existing exposure
control parameters of larger values were utilized, the results were that the conditional maximum
observed exposure rates were higher than expected. Although the differences between the existing and
redeveloped exposure control parameters seemed insubstantial, as discussed in the earlier section, the
existing exposure control parameters were in fact no longer appropriate for the modified item pools.
These findings suggest that the exposure control parameters for a modified pool need to be redeveloped
in order to ensure the continued exposure control for the items. Considering the effect of removing the
unused items on the conditional maximum observed exposure rates, the performance of the reduced pool
was competitive to that of the full pool as long as the redeveloped exposure control parameters were
utilized.
The random errors conditional on each theta point are displayed in Figure 4. The CSEM curves
yielded by using the existing and redeveloped exposure control parameters were almost identical for
either of the modified pools. The employment of the existing or the redeveloped exposure control
parameters appears not to affect the CSEMs. Applying the reduced pool also produced similar values of
CSEMs to those of the full pool. Eliminating all unused items did not seem to affect the CSEMs either.
Examination of the Characteristics of the Unused Items
The number of items that were never administered in the operational testing when the full pool
of 480 items was employed was recorded. For the design considered in this study, 48 items were never
used. The characteristics of the unused items were examined and compared with those of the used items
having high, moderate, and low observed exposure rates. Table 3 reports the summary statistics of the
observed exposure rates for these three groups of items. The average observed exposure rates of the
high, moderate and low exposure groups were .13, .06 and .01, respectively. The maximum observed
exposure rate was controlled to be as low as the desired rate of .20 in the high exposure group.
With Respect to the Item Parameters
Table 4 contains the summary statistics of the three item parameters for the various exposure
groups. For the full pool, the mean and SD of the a parameters were 1.01 and .33, the mean and SD of
the bs were .15 and 1.08, and the mean and SD of the cs were .18 and .08. As expected due to the
maximum item information selection, the average value of the a parameters was the largest in the high
14 exposure group and the values decreased for less often exposed groups. The unused items had the
lowest average of the a parameters. In terms of the b parameters, the high exposure items were those of
average item difficulty, the moderate exposure items were those slightly more difficult and the low
exposure items were those even more difficult. The unused items tended to be the easy ones. For the c
parameters, the highest exposure items were associated with the lowest average guessing value while the
unused items had the highest guessing value on average.
With Respect to the Item Information Values
Table 5 displays the descriptive statistics of the item information values at each theta level for
the three exposure groups and the unused items. It can be detected that the maximum information
values for the various ability levels appeared in different exposure groups. For the middle part of the
ability scale (the theta points between -1.2 and .80), the maximum information values were observed in
the high exposure group, where the average information values were higher than those for the other
groups as well. For the slightly lower and upper parts (the theta points of -2.0 and -1.6, and of 1.2 and
1.6, respectively), the maximum information values were seen in the moderate exposure group, for
which the average information values were the highest among the various groups. As for the very high
and low ends of the scale, the maximum information values existed in the low exposure group. For the
unused items, the item information values were low in general, although the average information values
for some ability levels at the low end were the highest among the various groups.
With Respect to the Exposure Control Parameters
The descriptive statistics of the exposure control parameters are shown in Table 6 by ability level
and exposure group. The results based on the item information and exposure control parameters were
closely related, since the exposure control parameters were derived by the algorithm of the SLC
procedure based on the maximum item information criterion for item selection. For the high exposure
group, the average exposure control parameters were lower at the middle part of the ability scale than at
both ends, indicating that items appropriate for the middle were being controlled for their exposure to a
greater extent. Thus, the high exposure group contained items more appropriate for the middle than the
two ends. Also, the average exposure control parameters were slightly higher at the upper than the
lower end. For the moderate exposure group, the exposure control parameters at the upper end were
somewhat lower than those in the high exposure group. This might be explained by the fact that items
in the high exposure group were mostly appropriate for the middle part of the ability scale, but were in
general not as appropriate for the upper end as those in the moderate exposure group. For the low
exposure group, there existed items associated with lower exposure control parameters at the very high
end than the moderate or high exposure group. The low exposure group contained some items
especially appropriate for the very high end of the ability continuum. However, the fact is that except
15 for these few ability levels, items at all ability levels were associated with relatively high exposure
control parameters. Most items in this low exposure group were not likely to be selected. Most items
were not being administered, so the average exposure rate was still low.
It can be noticed that the value of the exposure control parameters could be 1.0 for items at any
ability level or in any exposure group. Even for the high exposure group, there still existed rarely
appearing items whose exposure control parameters were developed to be 1.0. Such phenomenon was
not unexpected, since items could not be popular across all ability levels. As has been seen in Table 5,
an item might be very informative at one ability level, but not informative at all at other levels. In terms
of the minimum exposure control parameters, all values were as low as .20 for the high exposure group,
suggesting that the high exposure group contained very popular items needing to be strictly controlled
for their exposure to the desired rate. For the moderate exposure group, only for ability levels of 1.6 and
above were the exposure control parameters derived to be .20 at the minimum. Apparently, items in the
moderate exposure group were less popular than in the high exposure group. For the low exposure
group, there still existed some popular items at the very high ability levels. But for the middle part of
the ability scale, items were not as competitive for selection as those in the high or moderate exposure
group, so the exposure control parameters were derived to be 1.0 in order to be administered once
selected.
As for the unused items, the minimums of the exposure control parameters were all 1.0. They
were rarely appearing items. Once the items were selected, they were always allowed to be
administered to examinees.
Summary and Conclusions
Summary of Results
The current study explored whether removing a single item or the unused items had effects on
the exposure control parameters of the remaining items; that is, whether redeveloping the exposure
control parameters for the remaining items was necessary when some items were eliminated from the
pool. For the purpose of evaluating the removal effect of one item, this study considered three items of
different information values at three selected ability levels. Items with the highest, moderate and lowest
information values were chosen for removal at each of the theta points of -2.8, 0.0 and 2.8, respectively.
To examine the results of removing the unused items, this study employed a reduced pool by excluding
the never administered items from the operational CAT administrations when the full pool was
employed. The exposure control parameters were redeveloped for all items in the modified pools. Both
existing and redeveloped exposure control parameters for the various pools were utilized in the
operational testing situations and their performance was compared. Also, the characteristics of the
16 unused items were evaluated in detail. The results of the current study are summarized below.
The Relationships of the Existing and Redeveloped Exposure Control Parameters
Under the conditions of removing both single and unused items, there were fewer items with the
parameters remaining unchanged at the upper middle part of ability scale than at both extremes.
Exposure control parameters at the upper middle part of the ability scale seemed to be affected by the
item removal to a greater extent than the other parts of the scale. These outcomes reflected the nature of
the item pools, in that there were more items appropriate for administration for the upper middle part
than for the extremes of the ability scale. Also, the findings demonstrated no differential effects on the
results by removing items with the varying degrees of information values for the different ability points.
An inspection of the average differences of the existing and redeveloped exposure control
parameters indicated that under the single item removal conditions, the redeveloped exposure control
parameters could be larger or smaller than the existing ones. However, for the middle and upper parts of
the ability scale, the redeveloped parameters tended to be greater than the existing parameters, so the
probability of administering items given selection for these parts was somewhat increased. For the
unused item removal condition, the redeveloped parameters were on average greater than the existing
ones and the results were stronger for the middle and upper parts of the ability scale. With fewer items
in the pool, the SLC algorithm in general increased the values of the exposure control parameters to
allow for more administrations given selection. But since more items were appropriate for the middle
and upper parts of the ability scale than for the two ends, the SLC algorithm tended to loosen the
exposure control for items appropriate for these parts to a greater extent.
The average absolute parameter differences for the unused items deletion were in general greater
than those for the single item deletion. Removing a larger number of items from the pool led to a
greater change on the exposure control parameters. Among the various conditions of the single item
deletion, the findings did not suggest noticeable differences due to the removal of items with varying
degrees of information values for the various ability levels, except for H(0.0). These phenomena might
be explained by the reason that an item possesses different information values, depending on the ability
levels. Although the removed item was of low information at one ability level, this item might be of
high information at some other ability levels.
The results regarding the Pearson product-moment correlations showed the redeveloped
exposure control parameters obtained from the various removal conditions were all strongly correlated
with the original parameters for all ability points.
Results of Employing the Existing and Redeveloped Exposure Control Parameters in Operational
17 Testing
For either the single item or the unused items deletion, the observed maximum exposure rates
produced by using the existing and redeveloped exposure control parameters were as low as the desired
rate of .20. Also, each of the modified pools was utilized to a similar degree by applying the two sets of
the exposure control parameters. Both applications left similar numbers of items in the pool that were
never administered in the operational testing situations.
However, as compared based on the conditional maximum observed exposure rates, the results of
utilizing the existing exposure control parameters were different from those of the redeveloped ones.
The maximum exposure rates conditionally observed at the lower part of the ability scale were
especially higher than those of the full pool or the modified pools with the redeveloped exposure control
parameters. The findings suggested that the existing exposure control parameters were no longer
appropriate and the redevelopment of the parameters ought to be necessary when the pool was modified.
In terms of the measurement precision, employing the existing or the redeveloped exposure control
parameters did not yield differences in the CSEMs.
As to the effect of removing all the unused items, the findings indicated that as long as the
redeveloped exposure control parameters were applied, the performance of the reduced pool was
competitive to that of the full pool. Both overall and conditional maximum observed exposure rates for
the reduced pool were as low as those for the full pool. Eliminating the unused items did not seem to
cause effects on the CSEMs either.
Examination of the Characteristics of the Unused Items
The characteristics of the unused items were compared with those of the high, moderate and low
exposure groups, respectively. The results showed that the unused items had the lowest average
discriminating value, were easier than the average difficulty level, and were those associated with the
highest guessing value on average.
The item information values for the unused items were low in general, although their average
information values for some ability levels were the highest among the groups. The outcomes also
revealed that the maximum information values for the various ability levels did not always appear in the
high exposure group. For the middle part of the ability scale, the maximum information values were
seen in the high exposure group. But, for the slightly lower and upper parts, the maximum values were
observed in the moderate exposure group and for the very high and low ends, the maximums were in the
low exposure group.
All exposure control parameters of the unused items were 1.0. Obviously, these items were
rarely appearing, so they were always allowed to be administered once selected. However, the results
18 were that these items were never selected in the operational testing, so they were never administered.
For the high exposure group, the average exposure control parameters were lower in the middle than at
both ends of the ability scale, saying that this group contained items more appropriate for the middle
than for the two ends. For the moderate exposure group, the exposure control parameters at the upper
end were somewhat lower than in the high exposure group. Similar to the observations with the item
information values stated above, the items in the high exposure group were mostly appropriate for the
middle part but were in general not as popular for the upper end as those in the moderate exposure group.
Although the low exposure group contained some items especially appropriate for the very high ability
levels, items at most parts of the ability scale were not likely selected since they were associated with
relatively high exposure control parameters.
Conclusions
This study explored how the deletion of a single item and the unused items might alter the
exposure control parameters of the remaining items derived by the SLC algorithm. Although removing
a single item or the unused items did not seem to have a substantial impact on the values of the exposure
control parameters, the results of applying the existing exposure control parameters for the modified
pools caused the conditional maximum observed exposure rates to be higher at the lower end of the
ability scale. Thus, the original exposure control parameters were no longer appropriate for the
modified pools. These findings provided evidence to support the view in Stocking and Lewis (2000) in
that, in their opinion, the iterative simulations should be carried out again to guarantee continued
exposure control for any modified item pools.
Employing the reduced pool with the redeveloped exposure control parameters yielded very
similar results to the full pool in both overall and conditional maximum exposure rates, and in the
CSEMs. The removal of all unused items seemed to have no effect on the performance of the SLC
procedure in the operational CAT administrations. It might be that the items removed were those that
would never be administered in the operational testing situations. These items were not influential
enough in altering the results of the CAT procedure. Employing the full pool and the reduced pool with
the redeveloped exposure parameters might provide similar enough conditions for the SLC algorithm to
perform equally well. However, it should be noticed that this study took into account only the content
matter of items during the item selection process. Contexts having nonstatistical constraints such as the
overlap or the item set constraints were not considered in the current study. The effect of the unused
items removal remains unknown under complex but more realistic test structures.
The characteristics of the 48 unused items were inspected in detail. As expected, the information
values of these unused items were low overall, but their average information values for the very low
19 ability levels were indeed the highest. However, for the ability distribution of N(0,1) employed in the
current study, not many examinees were around this very low end, so items more appropriate for this
part of the ability scale would not have been selected often. These unused items were not as competitive
as those in the other exposure groups based on the maximum information selection rule. As a result of
the item interactions, the exposure control parameters of these unused items were all derived to be 1.0,
in spite that some of the unused items were somewhat popular for examinees at the very low end. It is
conceivable that when there are more examinees with ability levels at this low end and items appropriate
for this part of the ability scale are in greater demand, the unused items would have a better chance of
being selected. This implies that the number of the unused items is a consequence of the ability
distribution of the examinees' population, in addition to the characteristics of the items themselves.
Considerations of the examinee population should not be ignored in the creation of a well-utilized item
pool, as also emphasized in Guo et al. (2000) for constructing a CAT item pool. Further studies might
be conducted to investigate the effects of the examinees' ability distributions on the pool utilization.
It is also important to attempt to determine the pool size for optimizing pool utilization. The
results of the SLC algorithm were almost identical based on the full pool of 480 items and the reduced
pool of 432 items. When the pool of 432 items was utilized in the operational testing, there were still
five items never administered. Could the pool size be further reduced for optimal pool usage? This
issue is, of course, very complicated since there exist many factors to consider, such as the desired
maximums of the exposure rates, the characteristics of the items, the structures of the item pool, and the
ability distribution of the real examinees' populations.
Both pool management and exposure control method implementation have been the focus of
easing test security concerns in CAT. Maintaining pools of high quality is an ongoing process, and
sometimes only a small modification may be needed. This study investigated how the deletion of a
single item and the unused items might alter the exposure control parameters of the remaining items
derived by the SLC algorithm. Results from this study have offered a better understanding of the effects
of item interactions on the values of the exposure control parameters and also, on the outcomes of the
observed exposure rates and measurement precision in operational testing. Along with a detailed
examination of the characteristics of the unused items, this study has provided psychometric researchers
and test practitioners valuable insights for a better pool construction design.
20
References
Birnbaum, A. (1968). Some latent trait models and their uses in inferring an examinee’s ability. In F. M. Lord & M. R. Novick, Statistical theories of mental test scores (pp. 395-479). Reading, MA: Addison-Wesley.
Chang, S. W. (1998). A comparative study of item exposure control methods in computerized adaptive testing. Unpublished doctoral dissertation, The University of Iowa. Iowa City, IA.
Chang, S. W., Ansley, T. N., Lin, S. H. (2000, April). Performance of item exposure control methods in computerized adaptive testing: Further explorations. Paper presented at the annual meeting of the American Educational Research Association, New Orleans.
Davey, T., & Parshall, C. G. (1995, April). New algorithms for item selection and exposure control with computerized adaptive testing. Paper presented at the annual meeting of the American Educational Research Association, San Francisco.
Guo, F., Way, W. D., & Reshetar, R. (2000, April). Test security and the development of computerized tests. Paper presented at the NCME invited symposium: Maintaining test security in computerized programs--Implications for practice, New Orleans.
Hetter, R. D., & Sympson, J. B. (1997). Item exposure control in CAT-ASVAB. In W. A. Sands, B. K. Waters, & J. R. McBride (Eds.), Computerized adaptive testing: From inquiry to operation (pp. 141-144). Washington, DC: American Psychological Association.
Kingsbury, G. G., & Zara, A. R. (1989). Procedures for selecting items for computerized adaptive tests. Applied Measurement in Education, 2(4), 359-375.
Mills, C. N., & Steffen, M. (2000). The GRE computer adaptive test: Operational issues. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 75-99). The Netherlands: Kluwer Academic Publishers.
Mislevy, R. J., & Bock, R. D. (1990). Item analysis and test scoring with binary logistic models. BILOG 3. Chicago, IL: Scientific Software, Inc.
Owen, R. J. (1975). A Bayesian sequential procedure for quantal response in the context of adaptive mental testing. Journal of the American Statistical Association, 70(350), 351-356.
Stocking, M. L., & Lewis, C. (1998). Controlling item exposure conditional on ability in computerized adaptive testing. Journal of Educational and Behavioral Statistics, 23(1), 57-75.
Stocking, M. L., & Lewis, C. (2000). Methods of controlling the exposure of items in CAT. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 163-182). The Netherlands: Kluwer Academic Publishers.
Stocking, M. L., & Swanson, L. (1993). A method for severely constrained item selection in adaptive testing. Applied Psychological Measurement, 17(3), 277-292.
Swanson, L., & Stocking, M. L. (1993). A model and heuristic for solving very large item selection problems. Applied Psychological Measurement, 17(2), 151-166.
Sympson, J. B., & Hetter, R. D. (1985, October). Controlling item exposure rates in computerized adaptive testing. Paper presented at the annual meeting of the Military Testing Association. San Diego, CA: Navy Personnel Research and Development Center.
21 van der Linden, W. J. (2000). Constrained adaptive testing with shadow tests. In W. J. van der Linden
& C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 27-52). The Netherlands: Kluwer Academic Publishers.
Veldkamp, B. P., & van der Linden, W. J. (2000). Designing item pools for computerized adaptive testing. In W. J. van der Linden & C. A. W. Glas (Eds.), Computerized adaptive testing: Theory and practice (pp. 149-162). The Netherlands: Kluwer Academic Publishers.
Way, W. D., Swanson, L., Steffen, M., & Stocking, M. (2001, April). Refining a system for computerized adaptive testing pool creation. Paper presented at the annual meeting of the American Educational Research Association, Seattle.
22
Table 1 Results of Differences of the Exposure Control Parameters by Ability Level and Removal Condition
Theta
Items Removed Parameters -3.2 -2.8 -2.4 -2 -1.6 -1.2 -0.8 -0.4 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2
N_Unchanged 352 349 340 341 344 336 331 322 325 329 320 331 327 339 340 360 348
H(-2.8)
MDIFF 0.0010 0.0002 0.0002 0.0005 0.0001 0.0005 -0.0011 -0.0029 -0.0018 -0.0024 -0.0016 -0.0016 -0.0011 -0.0014 -0.0010 -0.0010 -0.0020
N=479 MDIFFABS 0.0039 0.0034 0.0047 0.0039 0.0027 0.0042 0.0037 0.0043 0.0048 0.0039 0.0072 0.0030 0.0042 0.0025 0.0040 0.0030 0.0040
N_Unchanged 352 348 335 335 337 337 326 323 325 323 324 331 324 340 340 346 350
H(0.0)
MDIFF 0.0023 0.0025 0.0011 0.0017 0.0005 0.0013 0.0005 -0.0020 -0.0010 -0.0006 -0.0016 -0.0003 0.0006 0.0010 0.0002 0.0008 -0.0006
N=479 MDIFFABS 0.0064 0.0071 0.0078 0.0066 0.0060 0.0058 0.0053 0.0048 0.0063 0.0051 0.0068 0.0060 0.0055 0.0050 0.0059 0.0053 0.0075
N_Unchanged 351 351 337 336 341 340 329 324 325 328 320 330 330 339 342 351 345
H(2.8)
MDIFF -0.0005 0.0004 0.0005 0.0000 -0.0010 0.0008 -0.0010 -0.0027 -0.0017 -0.0008 -0.0021 -0.0012 -0.0008 0.0000 0.0000 0.0000 0.0010
N=479 MDIFFABS 0.0032 0.0036 0.0046 0.0050 0.0030 0.0028 0.0036 0.0046 0.0051 0.0029 0.0070 0.0029 0.0039 0.0030 0.0050 0.0040 0.0050
N_Unchanged 353 347 337 334 336 338 328 321 320 329 319 328 327 339 342 352 350
M(-2.8)
MDIFF 0.0020 0.0030 0.0019 0.0009 0.0022 0.0010 0.0002 -0.0017 -0.0008 0.0000 -0.0006 -0.0013 -0.0013 -0.0017 -0.0010 -0.0008 -0.0013
N=479 MDIFFABS 0.0040 0.0040 0.0057 0.0059 0.0049 0.0035 0.0051 0.0061 0.0054 0.0042 0.0076 0.0036 0.0049 0.0025 0.0028 0.0026 0.0037
N_Unchanged 351 348 337 339 339 337 330 321 325 327 321 328 331 341 340 346 350
M(0.0)
MDIFF 0.0002 0.0020 0.0014 0.0004 0.0009 0.0016 -0.0011 -0.0037 -0.0019 -0.0026 -0.0012 -0.0010 -0.0011 -0.0016 -0.0010 -0.0012 -0.0022
N=479 MDIFFABS 0.0026 0.0030 0.0046 0.0042 0.0029 0.0043 0.0045 0.0047 0.0040 0.0038 0.0066 0.0020 0.0027 0.0031 0.0030 0.0024 0.0036
N_Unchanged 354 350 341 337 344 339 330 323 325 330 319 330 325 342 339 350 352
M(2.8)
MDIFF -0.0006 0.0001 -0.0009 -0.0010 -0.0010 -0.0003 -0.0014 -0.0027 -0.0021 -0.0022 -0.0022 -0.0008 -0.0026 -0.0023 -0.0020 -0.0020 -0.0018
N=479 MDIFFABS 0.0032 0.0034 0.0044 0.0040 0.0030 0.0029 0.0037 0.0043 0.0040 0.0038 0.0053 0.0030 0.0045 0.0029 0.0050 0.0030 0.0037
N_Unchanged 352 349 340 335 338 341 330 323 324 327 324 332 328 343 337 351 352
L(-2.8)
MDIFF 0.0007 -0.0005 -0.0002 0.0000 -0.0009 -0.0001 -0.0027 -0.0034 -0.0020 -0.0012 -0.0008 -0.0004 0.0008 -0.0001 -0.0008 0.0001 -0.0003
N=479 MDIFFABS 0.0033 0.0024 0.0039 0.0040 0.0021 0.0033 0.0037 0.0046 0.0050 0.0039 0.0074 0.0039 0.0056 0.0034 0.0054 0.0035 0.0044
N_Unchanged 354 352 338 334 340 339 331 321 322 331 323 333 328 341 339 346 350
L(0.0)
MDIFF -0.0005 0.0001 -0.0002 -0.0017 -0.0008 -0.0003 -0.0009 -0.0039 -0.0018 -0.0023 -0.0025 -0.0019 -0.0005 0.0000 0.0000 0.0010 0.0000
N=479 MDIFFABS 0.0029 0.0031 0.0041 0.0057 0.0036 0.0026 0.0035 0.0051 0.0052 0.0036 0.0063 0.0030 0.0040 0.0030 0.0050 0.0040 0.0050
N_Unchanged 351 348 339 337 339 337 329 323 323 329 319 328 328 341 342 347 351
L(2.8)
MDIFF 0.0007 0.0005 0.0008 0.0012 0.0008 0.0010 -0.0005 -0.0008 -0.0023 -0.0020 -0.0027 -0.0019 -0.0022 -0.0017 -0.0020 -0.0003 -0.0013
N=479 MDIFFABS 0.0035 0.0033 0.0041 0.0041 0.0036 0.0038 0.0054 0.0031 0.0038 0.0052 0.0060 0.0026 0.0035 0.0028 0.0030 0.0023 0.0036
N_Unchanged 304 302 291 291 296 290 282 276 275 282 275 283 282 295 292 303 299
UNUSED MDIFF -0.0011 -0.0011 -0.0027 -0.0029 -0.0026 -0.0018 -0.0034 -0.0055 -0.0034 -0.0034 -0.0049 -0.0033 -0.0034 -0.0037 -0.0052 -0.0030 -0.0041
N=432 MDIFFABS 0.0035 0.0038 0.0057 0.0065 0.0050 0.0047 0.0049 0.0066 0.0057 0.0044 0.0081 0.0042 0.0058 0.0040 0.0057 0.0040 0.0060
23 Note. N_Unchanged indicates the number of items with the same existing and redeveloped exposure control parameters.
MDIFF represents the average value of the differences between the existing and redeveloped exposure control parameters, with the redeveloped parameters being subtracted from the existing parameters.
MDIFFABS represents the average value of the absolute differences between the existing and redeveloped exposure control parameters.
24
Table 2 Correlations of the Exposure Control Parameters by Ability Level and Removal Condition
Theta
Items Removed -3.2 -2.8 -2.4 -2 -1.6 -1.2 -0.8 -0.4 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2
H(-2.8) 0.99859 0.99904 0.99847 0.99875 0.99906 0.99851 0.99882 0.99868 0.99817 0.99915 0.99738 0.99933 0.99868 0.99953 0.99758 0.99859 0.99821
H(0.0) 0.99674
0.99675 0.99623 0.99673 0.99739 0.99792 0.99798 0.99874 0.99705 0.99871 0.99721 0.99784 0.99772 0.99827 0.99762 0.99797 0.99593
H(2.8) 0.99892 0.99886 0.99841 0.99758 0.99813 0.99945 0.99888 0.99881 0.99792 0.99959 0.99763 0.99946 0.99883 0.99849 0.99718 0.99759 0.9972
M(-2.8) 0.99774 0.99712 0.99754 0.99687 0.99726 0.99887 0.99788 0.99794 0.99836 0.99887 0.99719 0.99932 0.9982 0.99954 0.99934 0.99945 0.99884
M(0.0) 0.99941 0.99838 0.99754 0.99836 0.99946 0.99862 0.99818 0.99876 0.999 0.99922 0.99792 0.99954 0.99963 0.99921 0.9988 0.99958 0.99903
M(2.8) 0.99917 0.99896 0.99831 0.99771 0.99869 0.99938 0.99889 0.99865 0.99898 0.99923 0.99822 0.99943 0.99877 0.99927 0.99708 0.99893 0.99911
L(-2.8) 0.99933 0.99958 0.99901 0.99804 0.99976 0.99932 0.99905 0.99881 0.99703 0.99917 0.99677 0.99916 0.99756 0.999 0.99692 0.99865 0.99814
L(0.0) 0.99905 0.99909 0.99884 0.99726 0.99818 0.99956 0.99889 0.9986 0.99795 0.99935 0.99778 0.99944 0.99881 0.99857 0.99653 0.99731 0.99746
L(2.8) 0.99922 0.99934 0.99867 0.99896 0.99908 0.9992 0.99796 0.99946 0.99912 0.9986 0.99827 0.99945 0.99946 0.99942 0.99871 0.99967 0.99914
UNUSED 0.99928 0.99928 0.99783 0.99634 0.99784 0.9988 0.99873 0.99809 0.99827 0.99917 0.99677 0.99888 0.99795 0.9989 0.99738 0.99856 0.99749
Note. N_Unchanged indicates the number of items with the same existing and redeveloped exposure control parameters.
MDIFF represents the average value of the differences between the existing and redeveloped exposure control parameters, with the redeveloped parameters being subtracted from the existing parameters.
MDIFFABS represents the average value of the absolute differences between the existing and redeveloped exposure control parameters. All the correlations were significant at .0001 level (p < .0001).
25
Table 3 Descriptive Statistics of the Observed
Exposure Rates for the Various Conditions
Pools N M SD Skew Kurt Min Max
H(0.0) Removed
Existing 430 0.06977 0.05139 0.35987 -1.00655 0.000
02
0.19950
Redeveloped 429 0.06993 0.05102 0.35122 -0.99623 0.000
02 0.19822
UNUSED Removed
Existing 427 0.07026 0.05158 0.38329 -0.96678 0.000
02
0.20048
Redeveloped 427 0.07026 0.05152 0.36220 -0.98794 0.000
02 0.19866
The Full Pool 432 0.06944 0.05094 0.36144 -0.97549 0.000
02 0.19818
High Exposure 144 0.13031 0.02373 0.60987 -0.29616 0.095
06
0.19818
Moderate Exposure 144 0.06380 0.01735 0.08472 -1.12078 0.036
64 0.09502
Low Exposure 144 0.01423 0.01113 0.22776 -1.21673 0.000
02 0.03652
26
Table 4 Descriptive Statistics of the Item Parameters
for the Various Exposure Groups Exposure Groups Item Parameters M SD Skew Kurt Min Max
a 1.00967 0.33146 0.80064 1.95247 0.19626 2.62976
The Full Pool
b 0.14984 1.07669 -0.25518 -0.32757 -3.44405 2.55902
N=480 c 0.17638 0.08034 0.98093 1.52969 0.03408 0.50000
a 1.18043 0.23482 0.48491 -0.14527 0.76140 1.86887
High Exposure
b -0.00962 0.46101 -0.33003 -0.35696 -1.08687 0.91304
N=144 c 0.14555 0.07768 1.58430 4.38879 0.04022 0.49652
a 1.07575 0.33524 1.81559 5.45658 0.57622 2.62976
Moderate Exposure
b 0.21649 1.04485 -0.48977 -1.15781 -2.01056 1.70187
N=144 c 0.17425 0.07127 0.53527 0.22022 0.03408 0.39780
a 0.87409 0.31919 1.11162 1.47129 0.40469 2.12031
Low Exposure
b 0.41493 1.41871 -0.51678 -0.84154 -3.44405 2.55902
N=144 c 0.17934 0.06248 0.91955 1.66391 0.06020 0.42246
a 0.70586 0.21605 -0.16699 -0.23427 0.19626 1.12616
UNUSED
b -0.36700 1.06907 -0.33251 -0.82387 -2.87023 1.31659
N=48 c 0.26632 0.09419 0.40291 -0.20642 0.07508 0.50000
27
Table 5 Descriptive Statistics of the Item Information values by Ability Level and Exposure Group
Theta
Exposure Groups -3.2 -2.8 -2.4 -2 -1.6 -1.2 -0.8 -0.4 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2
Mean 0.00090 0.00300 0.00960 0.02827 0.07302 0.16281 0.30732 0.48038 0.61274 0.62766 0.50244 0.31196 0.16283 0.07922 0.03799 0.01835 0.00898
High Exposure
SD 0.00170 0.00580 0.01810 0.04840 0.10396 0.17823 0.24546 0.26978 0.27589 0.31734 0.35087 0.24817 0.12478 0.05947 0.03063 0.01673 0.00932
N=144 Min 0 0 0 0 0.00002 0.00022 0.00265 0.02916 0.19418 0.16601 0.07989 0.03678 0.01530 0.00530 0.00183 0.00063 0.00022
Max 0.01107 0.03487 0.10517 0.28545 0.57502 0.81845 1.38643 1.35625 1.73163 1.68827 1.96236 1.42610 0.61601 0.30088 0.16485 0.09864 0.05662
Mean 0.00790 0.01845 0.03888 0.06954 0.10166 0.12565 0.14371 0.16893 0.21840 0.30026 0.38927 0.44628 0.40170 0.24561 0.12547 0.06197 0.03075
Moderate Exposure
SD 0.01620 0.03509 0.07241 0.12688 0.16612 0.16967 0.15344 0.13530 0.14019 0.20300 0.30313 0.42344 0.52410 0.29551 0.13374 0.06639 0.03481
N=144 Min 0 0 0 0 0 0 0 0.00001 0.00021 0.00455 0.01414 0.00584 0.00241 0.00099 0.00041 0.00017 0.00007
Max 0.09751 0.19780 0.34945 0.70791 0.87109 0.67945 0.54413 0.47524 0.60704 0.87515 1.67293 2.65143 3.11494 2.01256 0.61300 0.30247 0.16700
Mean 0.01760 0.02670 0.03771 0.04967 0.06169 0.07367 0.08670 0.10243 0.12209 0.14672 0.17995 0.22875 0.28967 0.30889 0.24407 0.15377 0.08610
Low Exposure
SD 0.03770 0.05180 0.06502 0.07616 0.08374 0.08732 0.09065 0.10032 0.11648 0.12954 0.13779 0.17147 0.28303 0.38108 0.30772 0.19274 0.10402
N=144 Min 0 0 0 0 0 0 0 0 0.00003 0.00030 0.00313 0.00566 0.00315 0.00175 0.00097 0.00054 0.00030
Max 0.28183 0.36397 0.37741 0.32186 0.30327 0.28475 0.30786 0.38866 0.51041 0.56503 0.59481 0.73155 1.31923 2.39977 1.64078 1.25514 0.54550
Mean 0.01772 0.02675 0.03895 0.05402 0.07088 0.08817 0.10569 0.12569 0.15010 0.17141 0.17284 0.14890 0.11224 0.07743 0.05090 0.03278 0.02105
UNUSED
SD 0.02483 0.03424 0.04635 0.05975 0.07153 0.07823 0.07748 0.07106 0.07790 0.11056 0.13058 0.11829 0.08878 0.05993 0.03916 0.02615 0.01818
N=48 Min 0 0.00002 0.00007 0.00026 0.00103 0.00386 0.01371 0.01645 0.01595 0.01532 0.01459 0.01378 0.01291 0.01110 0.00805 0.00471 0.00275
Max 0.09683 0.10371 0.13882 0.18575 0.22393 0.25249 0.24907 0.25260 0.32820 0.41478 0.45997 0.43191 0.36110 0.24372 0.14350 0.09774 0.07873
28
Table 6 Descriptive Statistics of the Exposure Control Parameters by Ability Level and Exposure Group
Theta
Exposure Groups -3.2 -2.8 -2.4 -2 -1.6 -1.2 -0.8 -0.4 0 0.4 0.8 1.2 1.6 2 2.4 2.8 3.2
Mean 0.74354 0.73733 0.71443 0.67313 0.63140 0.58106 0.49554 0.38664 0.33439 0.44756 0.59162 0.71387 0.79506 0.85128 0.86705 0.87610 0.88050
High Exposure
SD 0.33744 0.34095 0.34093 0.34378 0.35775 0.36317 0.33998 0.26615 0.17959 0.31046 0.35891 0.35189 0.31055 0.27845 0.26123 0.25614 0.25127
N=144 Min 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2 0.2
Max 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Mean 0.77294 0.76862 0.75854 0.74875 0.73969 0.75459 0.81934 0.90766 0.96760 0.86384 0.72202 0.63174 0.65484 0.69325 0.71544 0.72815 0.73897
Moderate Exposure
SD 0.34594 0.34851 0.35079 0.35158 0.35022 0.33592 0.29396 0.19425 0.10840 0.24890 0.32506 0.36519 0.37158 0.37072 0.36504 0.35819 0.35185
N=144 Min 0.20013 0.20007 0.20007 0.20027 0.20074 0.20168 0.21661 0.28902 0.33860 0.23566 0.20367 0.20007 0.2 0.2 0.2 0.2 0.2
Max 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Mean 0.87793 0.88458 0.89768 0.92093 0.95569 0.99130 1 1 1 1 1 0.96487 0.84873 0.77348 0.73276 0.71724 0.70617
Low Exposure
SD 0.26522 0.25751 0.23839 0.20399 0.14739 0.05670 0 0 0 0 0 0.11520 0.27253 0.33533 0.35442 0.36380 0.36756
N=144 Min 0.20704 0.20826 0.21360 0.23050 0.28694 0.48940 1 1 1 1 1 0.43828 0.23050 0.20121 0.2 0.2 0.2
Max 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Mean 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
UNUSED
SD 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
N=48 Min 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
Max 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
29
1.0
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
Maximum
Observed Exposure Rate
2018161412108642Iteration
The Full Pool of 480 Items
thet
a va
lues
-3
.2
-2.8
-2
.4
-2.0
-1
.6
-1.2
-0
.8
-0.4
0.
0
0.4
0.
8
1.2
1.
6
2.0
2.
4
2.8
3.
2
1.0
0.9
0.8
0.7
0.6
0.5
0.4
0.3
0.2
0.1
0.0
Maximum
Observed Exposure Rate
2018161412108642Iteration
The Reduced Pool of 432 Items
thet
a va
lues
-3
.2
-2.8
-2
.4
-2.0
-1
.6
-1.2
-0
.8
-0.4
0.
0
0.4
0.
8
1.2
1.
6
2.0
2.
4
2.8
3.
2
Figure 1. Iterations of the Exposure Control Parameter Development
30
Figure 2. The Frequency Distribution of Item Difficulty Values for the Full Pool
Item Difficulty Values
2.401.60
.80-.00
-.80-1.60
-2.40-3.20
Num
ber o
f Ite
ms
80
60
40
20
0
31
Figure 3. Conditional Maximum Observed Exposure Rates
for the Various Conditions
1.0
0.8
0.6
0.4
0.2
0.0
Maximum
Observed Exposure Rate
-3.0 -2.0 -1.0 0.0 1.0 2.0 3.0Theta
Full Existing Redeveloped
H(0.0)
1.0
0.8
0.6
0.4
0.2
0.0
Maximum
Observed Exposure Rate
-3.0 -2.0 -1.0 0.0 1.0 2.0 3.0Theta
Full Existing Redeveloped
UNUSED
32
Figure 4. Conditional Standard Errors for the Various Conditions
1.0
0.8
0.6
0.4
0.2
0.0
Standard Error of Measurement
-3.0 -2.0 -1.0 0.0 1.0 2.0 3.0Theta
Full Existing Redeveloped
H(0.0)
1.0
0.8
0.6
0.4
0.2
0.0
Standard Error of Measurement
-3.0 -2.0 -1.0 0.0 1.0 2.0 3.0Theta
Full Existing Redeveloped
UNUSED