A Review of Linear Programming and Its Application
Richard B. Fletcher
Massey University (Albany)
This report reviews the international research literature on linear programming as applied to
the issues of banking assessment items. It outlines the mathematical procedures needed to
obtain feasible solutions to selections made by teachers and constraints imposed by the
assessment developers. The various algorithms and heuristic procedures necessary for
feasible solutions in an item bank of only 500 items with testlets are discussed and
exemplified. The report recommends use of detailed item mapping, limiting the number of
ability levels, use of the simultaneous selection of items and sets method, use of the
maximin model, and use of the optimal rounding method in finding solutions.
2 Fletcher, R. B.
organisations. The above two issues can undermine the utility of a test if too much is known
about the item content beforehand. Ultimately, ability estimation is compromised and the
testing process is invalidated. The issue facing test developers is to create a very large item
bank to alleviate some of these problems.

Another example of the utility of item banking is optimal test design in the form of 0-1
linear programming (LP) that allows for complete tests to be assembled according to detailed
test specifications while maximizing an objective function (Adema, Boekkooi-Timminga, &
van der Linden, 1991; Baker, Cohen, & Barmish, 1988; Timminga & Adema, 1995; Theunissen,
1985). Test assembly using existing item banks and LP is applicable to both classical test
theory and IRT (Adema, 1990, 1992a, 1992b; Adema, Boekkooi-Timminga, & Gademann, 1992;
Adema, Boekkooi-Timminga, & van der Linden, 1991; Adema & van der Linden, 1989; Armstrong,
Jones, & Wang, 1994; Baker, Cohen, & Barmish, 1988; Berger & Veerkamp, 1996;
Boekkooi-Timminga, 1987, 1990a, 1990b, 1993; de Gruijter, 1990; Stocking, Swanson, &
Pearlman, 1991, 1993; Swanson & Stocking, 1993; Theunissen, 1985, 1986; Timminga & Adema,
1995, 1996; van der Linden, 1996, 2000; van der Linden & Boekkooi-Timminga, 1989).
Furthermore, LP can be used to assemble tests using dichotomous (e.g., Baker, Cohen, &
Barmish, 1988; de Gruijter, 1990; Stocking, Swanson, & Pearlman, 1991; Theunissen, 1985,
1986; van der Linden & Boekkooi-Timminga, 1989) or polytomous item formats (e.g., Berger,
1998; Berger & Mathijssen, 1997; Fletcher & Hattie, in review).

Baker, Cohen, and Barmish (1988, p. 190) suggest that “mathematical programming represents
a major addition to the tools of the test constructor because it provides an analytical rather
than an ad hoc procedure for item selection under IRT.” Such an approach to test assembly is a
major strength of the asTTle project, as tests can be assembled to teacher specifications using
items with known properties. The use of LP in this situation furnishes a teacher with an
adaptive test that is comparable to other tests constructed from the same item bank. This is a
major advance in classroom-based assessment.

The aim of this paper is to discuss linear programming (LP) in the context of the asTTle
projects using IRT calibrated items, by providing practical examples for specifying objective
functions and linear constraints. First, IRT is addressed to give an overview of this
measurement model and its applicability to test construction within the asTTle projects.
Secondly, LP is presented along with an explanation of some of the objective functions
available to the test constructor. Thirdly, practical constraints for individual items are
introduced along with a worked example to exemplify the main issues. Fourthly, set-based items
are presented along with a practical example to illustrate this method. In sum, the paper will
provide computer programmers with the relevant information and references to enable them to
set up the constraints as outlined in the Assessment Tools for Teaching and Learning (asTTle)
proposal.

Item Response Theory

For the purpose of this paper it is assumed that the one-parameter Rasch model is used as the
basis of item calibration. The Rasch model is computationally the simplest of the current IRT
models for dichotomously scored data (see Hambleton, 1989; Hambleton & Swaminathan, 1985;
and Harris, 1996 for a discussion of the various unidimensional dichotomous IRT models). A
critical assumption of the Rasch model (and all other unidimensional IRT models) is
unidimensionality for each item; that is, one underlying latent ability should account for the
item response. The Rasch model implies that the probability of a correct response to an item
(dichotomous in this case) is a function of the item difficulty and the examinee’s ability
level (θ), and is denoted by the following equation:

$$P_i(\theta) = \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}} \qquad (1)$$

where P_i(θ) is the item characteristic function and b_i is the item difficulty parameter.
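As an illustration of the Rasch item characteristic function, the minimal sketch below evaluates the standard one-parameter logistic form, P_i(θ) = exp(θ - b_i) / (1 + exp(θ - b_i)). The function name and the example difficulty and ability values are hypothetical and are not taken from the asTTle item bank.

```python
import math

def rasch_probability(theta: float, b: float) -> float:
    """Probability of a correct response under the Rasch model.

    theta is the examinee ability and b is the item difficulty.
    """
    # Algebraically identical to exp(theta - b) / (1 + exp(theta - b)).
    return 1.0 / (1.0 + math.exp(-(theta - b)))

# Hypothetical item of difficulty b = 0.5 evaluated at three ability levels.
for theta in (-1.0, 0.5, 2.0):
    print(theta, round(rasch_probability(theta, 0.5), 3))
```

When ability equals difficulty the probability is exactly 0.5, and it rises towards 1 as ability exceeds difficulty.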
Technical Report 5: Linear Programming and its Application to asTTle 3
An important feature of IRT models is the concept of item information, which is the reciprocal
of the squared standard error of measurement. Items with low standard errors have greater
information and vice versa. Item information provides the test constructor with an indication
of item measurement precision, from which items can then be selected into the test on the
basis of their information. For the Rasch model, the item information function (IIF) is

$$I_i(\theta) = P_i(\theta)\,[1 - P_i(\theta)] \qquad (2)$$

curve (TIFC). The TIFC is simply the sum of

2. Items with information curves that can fill the hard-to-fill areas under the TIF are
selected.
3. Item information function curves are added to determine their information contribution
to the TIF.
4. Continue to add items so that the TIF is approximated at the specified ability levels.

[Figure: information (vertical axis) plotted over the ability scale from -3 to 3 (horizontal axis).]
function, and will result in infeasibility. The objective function is to:

(16)

subject to:

(17)

The objective functions outlined above each have their potential uses in the construction of
achievement tests. The maximin model (van der Linden & Boekkooi-Timminga, 1989) appears to be
the most flexible of the above objective functions, and seems more appropriate for the asTTle
projects in that it only requires the TIF to be approximated.

Practical Constraints

For the asTTle projects, teachers will not only specify an objective function but will also
make choices about characteristics of the type of test they desire. These choices will then be
transferred into an objective function and a series of practical linear constraints. Teachers
will therefore be setting test specifications, albeit limited ones. In practice, however, test
specifications are often complex, and practical constraints are required within LP in order to
fully realize the test constructors’ demands.

Test specifications in LP are expressed as a series of linear constraints. Before setting
practical constraints, it is advisable to consider the qualities of the item bank and its
ability to meet them. For example, it would not make sense to ask for six deep processing items
at θ3 in the close reading curricular area, when in fact there were only three items. Such a
constraint would lead to no solution being found to the test construction model, and as
teachers will not know how to deal with such infeasibility, constraints will need to be set
within reasonable limits.

Teachers will certainly need to know something about the structure of the item bank to enable
them to understand the choices they can make. For the computer programmers, however, mapping
out the item bank will prove invaluable when writing the linear constraints and will help in
identifying potential infeasibility problems and in finding out how to deal with these (see
Tables 1, 2, and 4 for examples of hypothetical item bank structures). In general, though, the
constraints should be reasonable so as to allow for a feasible solution to be obtained, and,
therefore, knowing the qualities of the item bank will help to overcome this type of issue.

Practical Constraints for Simultaneous Selection of Items

The importance of setting accurate and reasonable practical constraints cannot be overstated,
as these are essential components in LP test assembly because they are the means by which test
specifications are fully met. Constraints should be closely checked before any attempt is made
to solve the model. Some simple examples of practical test constraints that are likely to be
used in the asTTle projects are given below.

1. Constraining the number of items allowed to be selected into the test. In other words, the
sum of the items from i = 1, 2, ..., I must be equal to N, where x_i = 1 if item i is selected
into the test and x_i = 0 otherwise:

$$\sum_{i=1}^{I} x_i = N \qquad (18)$$

2. Limiting the number of items with certain characteristics (e.g., multiple-choice,
open-ended questions, deep or surface learning). If S_d is a set of deep cognitive processing
items, then one can limit the number of these items in the test using the equation below:

(19)

3. Limiting the sum of certain item attributes. For example, the sum of the administration
times (t_i) for item i should be less than or equal to b_j. For this type of constraint, one
should not use equality signs, as these can lead to no solution being found. The inequality
sign therefore provides some flexibility in the designation of items into the test.
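The three kinds of constraint listed above (test length, a limit on items of one type, and a limit on total administration time) can be sketched in code. The example is illustrative only: the six-item bank, its attribute values, the limits, and the name `satisfies` are all hypothetical, the limit on deep items is assumed to be an upper bound, and the search enumerates every 0-1 selection by brute force instead of calling an LP solver, which is workable only for a toy bank.

```python
from itertools import product

# Hypothetical six-item bank: (is_deep_processing, administration_time_minutes).
bank = [(True, 4), (False, 2), (True, 5), (False, 3), (True, 6), (False, 2)]

N = 3          # required test length (equality constraint on the item count)
MAX_DEEP = 2   # assumed upper limit on the number of deep-processing items
MAX_TIME = 10  # upper limit on total administration time (inequality)

def satisfies(x):
    """Check one 0-1 selection vector x against the three constraints."""
    n_items = sum(x)
    n_deep = sum(xi for xi, (is_deep, _) in zip(x, bank) if is_deep)
    total_time = sum(xi * minutes for xi, (_, minutes) in zip(x, bank))
    return n_items == N and n_deep <= MAX_DEEP and total_time <= MAX_TIME

# Enumerate all 2**6 candidate tests and keep the feasible ones.
feasible = [x for x in product((0, 1), repeat=len(bank)) if satisfies(x)]
print(len(feasible), "feasible selections")
```

Changing `MAX_TIME <= ...` to an equality would shrink the feasible list sharply, which mirrors the advice to prefer inequality signs for summed attributes.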
the associated items are 2, 9, 16, 22, 29, 35, and 44, and so on. Sets are selected into the
test if z_s = 1. Alternatively, z_s = 0 indicates that the set is not selected. Item i_s is
denoted x_is = 1 when it is selected into the test, and x_is = 0 when it is not in the test.

The maximin objective function (van der Linden & Boekkooi-Timminga, 1989) is specified such
that the information for item i_s at θ_k is denoted I_is(θ_k), and the target values for the
TIF at θ_k are T(θ_k) (k = 1, 2, ..., K). Thus, the TIF at each θ_k is required to fall between
upper and lower bounds [T(θ_k) - y, T(θ_k) + y]. To control the size of the bounds, y ≥ 0 is a
real-valued variable that defines the width of the interval. The objective function is
therefore to minimize y. Further notation is needed to fully explicate the model and is based
on that suggested by van der Linden (2000), such that:

1. q_is is the value of item i_s on quantitative attribute q. For example, the quantitative
attribute for item i_s could be item difficulty.
2. r_s denotes the value of stimulus s on quantitative attribute r. For example, the
quantitative attribute may be time taken to read a passage of writing.
3. C_g is defined as the set of indices of items with the value g on categorical attribute C,
g = 1, 2, ..., G. If the categorical attribute is level of processing, then the associated
items would be those at the deep or the surface level.
4. D_h is a set of stimuli indices with value h on categorical attribute D, h = 1, 2, ..., H.
5. n defines the upper (u) and lower (l) bounds for the numbers of items in subsets from the
item bank.

The following model formally outlines the maximin objective function for item sets, and some
possible practical constraints that can be used. Equations 33 and 34 set the TIF to a relative
shape and to fall within a certain range. Equations 35 and 36 set the length of the test.
Equation 37 sets the number of sets to be selected. The number of items to be selected from an
item set is given in Equations 38 and 39. Items to be selected on the basis of their
quantitative or categorical attributes are denoted in Equations 40 to 43. Similarly,
quantitative and categorical stimulus attributes are defined in Equations 44 to 47. Equations
48 to 50 are definitions of the decision variables.

Minimize y
subject to:
(33)
(34)
(35)
(36)
(37)
(38)
(39)
(40)
(41)
(42)
(43)
(44)
(45)
(46)
(47)
(48)
(49)
(50)

To provide a worked example for selecting set-based items using some of the above constraints,
a hypothetical set of test specifications, using the item sets in Table 4, is specified such
that:

1. At θ1 = -2 and θ3 = 0, the TIF should fall between lower and upper bounds such that the
interval is minimized (Equations 51 & 52).
2. The test must contain no more than 40 items (Equations 53 & 54).
3. Ten sets must be selected (Equation 55).
4. Six sets must come from the personal and the close reading curricular areas (Equation 56).
5. There must be no more than five items per set (Equations 57 & 58).
6. The test can take no longer than 40 minutes to complete (Equations 59 & 60).
7. At least five items ≥ θ3 at the deep processing level from the personal reading area must
be selected (Equations 61 & 62).

Minimize y
subject to:
(51)
(52)
(53)
(54)
(55)
(56)
(57)
(58)
(59)
(60)
(61)
(62)
(63)
(64)
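The maximin idea for set-based items can be illustrated on a toy problem. In the sketch below every number is hypothetical (three stimuli with three Rasch items each, two ability points, invented targets and limits), and the search is plain enumeration, not the simplex or branch-and-bound methods that a real item bank would require. It chooses the sets (z_s = 1) and the items within them (x_is = 1) that minimize y, taken here as the largest absolute gap between the TIF and its target T(θ_k), while allowing items to come only from selected sets.

```python
import math
from itertools import combinations, product

def info(theta, b):
    """Rasch item information at ability theta for difficulty b."""
    p = 1.0 / (1.0 + math.exp(-(theta - b)))
    return p * (1.0 - p)

# Hypothetical bank: stimulus label -> difficulties of its associated items.
sets = {"A": [-2.0, -1.5, -1.0], "B": [-0.5, 0.0, 0.5], "C": [1.0, 1.5, 2.0]}
thetas = [-1.0, 1.0]             # ability points theta_k
targets = [0.8, 0.8]             # hypothetical target values T(theta_k)
N_SETS = 2                       # number of sets that must be selected
MIN_PER_SET, MAX_PER_SET = 2, 3  # bounds on items taken from a selected set

def deviation(items):
    """y for a candidate test: the largest |TIF(theta_k) - T(theta_k)|."""
    return max(abs(sum(info(t, b) for b in items) - target)
               for t, target in zip(thetas, targets))

def item_choices(difficulties):
    """All admissible item subsets within one selected set."""
    for n in range(MIN_PER_SET, MAX_PER_SET + 1):
        yield from combinations(difficulties, n)

best = None
for chosen in combinations(sets, N_SETS):            # sets with z_s = 1
    for picks in product(*(item_choices(sets[s]) for s in chosen)):
        items = [b for pick in picks for b in pick]  # items with x_is = 1
        y = deviation(items)
        if best is None or y < best[0]:
            best = (y, chosen, picks)

y, chosen, picks = best
print("minimal y:", round(y, 3))
print("sets:", chosen, "items:", picks)
```

Because items are drawn only from the chosen stimuli, the coupling between set and item selection holds by construction; in an LP formulation that coupling would be written as explicit linear constraints.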
problems farther along the branch can be skipped.

A full branch-and-bound search on a large LP problem requires a great deal of time to reach a
solution, and thus, to use it as the sole minimization method would be unacceptable in the
asTTle projects. To facilitate test assembly using large item banks, test developers can employ
heuristics. In general, heuristics offer varying degrees of precision depending on the process
chosen and the constraints used. (For a more complete discussion of the uses and implications
of various heuristics, see Timminga et al., 1996.) Stocking, Swanson, and Pearlman (1991)
presented the following heuristics for facilitating feasible solutions, based on the work of
van der Linden and Boekkooi-Timminga (1989).

1. Crude linear rounding. The initial solution to the relaxed problem is obtained, in which the
decision variables are allowed to take on non-integer values 0 ≤ x_i ≤ 1, and the results are
rounded to 0 or 1. A problem with this approach is that it may not reach optimality, or satisfy
all of the constraints.
2. Improved linear rounding. This approach differs from crude linear rounding in that the
decision variables are ranked in descending order, and the first n (n = the number of items)
are re-rounded to 1 and selected into the test. Again, optimality is not guaranteed, nor may
all the constraints be satisfied.
3. Optimal rounding. First the relaxed linear solution is obtained (0 ≤ x_i ≤ 1), then all
variables equal to 0 or 1 are fixed. The remaining fractional variables are then resolved by
applying the branch-and-bound method. Again, this method does not guarantee an optimal solution
to the original problem.
4. First 0-1 solution. Branch-and-bound methods are used to search for a global solution by
discarding many local solutions. After the first 0-1 integer solution is found, the method
stops. If a solution to the constraints exists, it is found, although it may not be optimal.
5. Second 0-1 solution. This solution is similar to the first 0-1 solution, except that the
method terminates after the second integer solution is found. Likewise, optimality is not
guaranteed.

The application of the simplex and the branch-and-bound methods for solving test assembly
problems will facilitate an optimal solution, if it exists. The main limitation of these
approaches in terms of the asTTle projects is the computing time needed to reach optimality.
Heuristics greatly increase the ability of the test constructor to assemble tests using LP
methods, and therefore make LP feasible for test assembly problems. Of the heuristics outlined
above, van der Linden and Boekkooi-Timminga (1989) suggest that the optimal rounding method
provides excellent results that are close to optimal in reasonable computing time. The
drawback of heuristics is that they lead to varying degrees of precision when the final
solution is obtained, and this should be borne in mind when examining results.

Infeasibility

An issue that can occur in LP is infeasibility. That is, there is a problem in finding a
feasible solution that fully meets the test constraints. Although this is not an uncommon
problem, there are methods for overcoming such issues. Swanson and Stocking (1993) note that
studies in the literature report relatively small item banks of fewer than 1,000 items, and
with as few as 50 constraints. In practice, however, item banks tend to be much larger with
many more complex constraints, and therefore the probability of finding a feasible solution
increases. The message is clear: the larger the item bank, the lower the potential for
infeasibility.

Timminga and Adema (1995, p. 422) suggest that “the basic cause of infeasibility is that the
set of test requirements is in contradiction with the characteristics of the item bank.” An
important consideration in the test development phase therefore is the identification of the
test specifications and how these relate to the structure of the item bank. Test
specifications need to be thoroughly reviewed prior to the application of LP methods in order
to decrease the probability of infeasibility. If, however, infeasibility is encountered, then
practical
solutions are required to solve the problem. The cause of infeasibility is often difficult to
detect, as computer software is unable to identify the problem in the constraints. More often,
the problem can be manually traced to impractical constraints on the item bank. For example, if
the item bank has 10 items relating to surface learning, and the test specification demands
that 12 items be surface learning, then it is impossible to obtain a solution. In this obvious
example, the test constructor can identify the problem and rectify the constraints.

Making changes to the right-hand side of a constraint is one method for increasing the solution
space to produce feasible solutions. It is important to understand that changes, for example,
in the right-hand side in Equation 15, will not increase the feasible region, as these are
unrelated to the solution space. Changing the values in Equations 35 and 36 will, however,
assist in increasing the solution space, and therefore may result in a feasible solution being
obtained. Some manipulation (i.e., increase or decrease) of n in the right-hand side is needed
to facilitate any increase in the solution space.

Another approach to increasing the solution space is to examine closely the equality and
inequality signs. In general, the use of inequality signs can result in an increase of the
solution space if the cause of infeasibility is due to that constraint. Replacing inequality
with equality signs can create infeasibility, as there may not be an exact solution to such a
constraint. Timminga and Adema (1995) suggest that equality signs should be used only where
constraints are applied to integer-valued variables. With item sets, one should attempt to
increase the number of items associated with a common stimulus to reduce potential
infeasibility problems. Thus, for set-based items, and for the asTTle projects, the implication
is to have a large item bank with many items associated with each stimulus to assist in
reducing issues of infeasibility with set-based items.

As previously stated, knowing the qualities of the item bank can help to identify sources of
infeasibility. If, for example, a constraint is too severe, then it should be either modified
or deleted in such a way as to increase the feasible region. Combining constraints will also
enlarge the solution area. For example, if no solution could be found to the model specified
above (i.e., Equations 22 to 32), then Equations 23 and 25 can be combined so that:

(66)

Combining these two constraints results in the same number of items being selected, but it
allows more items to be drawn from a curricular area that is best able to provide information
that meets the TIF. It may be that one curricular area can provide 12 items, and the other area
can provide 8 items to then meet all the test specifications. Test specificity is somewhat
compromised in this example, but not to a great extent. The same approach can also be applied
to combining constraints on item sets to avoid infeasibility. In general, the problem of
combining constraints is that test specificity may be compromised. As a consequence, if
constraints are combined, then the final solution should be examined to determine its
acceptability in terms of the original test specifications.

Problems may also be encountered if the objective function and one or more of the constraints
are in conflict. For example, if items are designated not to be in the test (assigned 0, and
left out of the problem), then modification of the objective function or the constraints is
required to take this into account.

Although the literature on infeasibility and test assembly problems is limited, Timminga and
Adema (1995) suggest the following analytical strategy to overcome infeasibility.

1. Check that the constraints are compatible with the item bank.
2. Solve the relaxed LP problem. If infeasibility occurs, then check the constraints using some
of the approaches outlined above to modify the constraints.
3. If the relaxed problem can be solved, then solve the 0-1 LP problem.
4. If the 0-1 LP problem has a solution, the test is specified.
5. If the relaxed problem cannot be solved, then check the constraints.
6. If the 0-1 LP is still infeasible, then a heuristic method should be applied to the problem.
7. If infeasibility continues with the 0-1 LP problem, then the test constructor should modify
the test specifications.

In sum, problems with infeasibility will generally be traceable to incompatibility with the
item bank or to constraints that are in conflict with one another or with the objective
function. Most infeasibility can be overcome by knowing the structure of the item bank and by
setting reasonable test specifications. Timminga and Adema (1995) point out that if the above
procedures are followed, then the test constructor should always be able to assemble a test.

Conclusions

The review presented above suggests that LP is highly applicable to the asTTle projects, as it
provides a search mechanism that will furnish teacher-specified tests with a high degree of
precision. Linear programming simultaneously satisfies both content and statistical attributes
that will be specified by the teacher through the use of an objective function and a series of
linear constraints. As the teacher choices will be limited (difficulty, curricular area, level
of processing, etc.), writing these in LP form should not be too problematic.

An issue for the computer programmers of the asTTle projects will be to set an objective
function that can conform closely to the teachers’ test specifications in terms of the range of
ability. In general, three to five ability points should be sufficient for the range of the
TIF. With set-based items, setting the objective function should not be too problematic, given
that the maximin model (van der Linden, 2000; van der Linden & Boekkooi-Timminga, 1989) allows
the TIF to be specified to fall between an upper and a lower bound. The task of the computer
programmers will be to determine the width of this interval to alleviate potential
infeasibility. Finding a set of values that will decrease the potential for infeasibility, but
still maximize information, will be essential to the success of the asTTle projects. Setting
these values will be dependent on the quality of the items and the amount of information they
have. One method for determining the upper and lower bound on the TIF would be to draw repeated
random samples of 40 items to determine the upper and lower values of the TIF, which could then
serve as the values for the TIF in all tests.

The practical constraints presented above should provide the computer programmers with the
relevant information to be able to formulate the teachers’ choices (i.e., curricular area, item
format, administration time, level of processing, and item difficulty) into linear constraints.
The set-based items approach outlined by van der Linden (2000) will allow for the constraints
to be designated between upper and lower bounds. As with the TIF, the use of upper and lower
bounds on the practical constraints will provide the computer programmers with some flexibility
when dealing with possible infeasibility issues.

Solving LP problems should not be problematic for the asTTle projects, as heuristics are
available that should provide close-to-optimal results in reasonable computing time. Van der
Linden and Boekkooi-Timminga (1989) state that as the size of the item bank increases, the
amount of computing time needed to find an exact solution increases. In the future, items are
likely to be added to the asTTle item bank, and therefore the adoption of heuristics, such as
the optimal rounding heuristic, will provide accurate tests (to be generated quickly) that meet
detailed specifications.

Recommendations

1. Have a detailed map of the item bank showing the quantitative and qualitative aspects of the
items, and how these relate to the curricular areas (see Tables 1, 2, and 4, and also Timminga,
van der Linden, & Schweizer, 1996).
2. Although there are seven ability levels to be covered across the asTTle projects, the
teachers should be constrained to choose no more than five ability levels, as increases on this
number will likely result in difficulty in approximating the TIF or result in infeasibility.
3. As the asTTle project will use mainly set-based items, the method for the simultaneous
selection of items and sets outlined by van der Linden (2000) is recommended. This method is
shown to provide a more accurate approximation of the objective function in reasonable
computing time using a large number of constraints (e.g., 24 stimuli with 5–12 items per
stimulus (498 items) with over 200 constraints across 2 tests; the LP models were solved in 1–5
minutes).
4. For set-based items, the TIF should be specified to be a relative shape (the maximin model
developed by van der Linden & Boekkooi-Timminga, 1989, and van der Linden, 2000). This approach
is the most flexible, given that the TIF can be set to fall between lower and upper bounds.
Determining the width of the interval will be a critical aspect of the computer programming.
5. For solving the binary 0-1 and integer problems, the optimal rounding method is more
flexible in its approach to finding feasible solutions in reasonable computing time. This
method will be invaluable for finding accurate and fast solutions to the teachers’ choices, as
the size of the item bank for the asTTle projects is likely to increase in the coming years.

References

Adema, J. J. (1990). The construction of customized two-stage tests. Journal of Educational
Measurement, 27(3), 241–253.
Adema, J. J. (1992a). Implementations of the branch-and-bound method for test construction
problems. Methodika, 6, 99–117.
Adema, J. J. (1992b). Methods and models for the construction of weakly parallel tests. Applied
Psychological Measurement, 16(1), 53–63.
Adema, J. J., & van der Linden, W. J. (1989). Algorithms for computerized test construction
using classical item parameters. Journal of Educational Statistics, 14(3), 279–290.
Adema, J. J., Boekkooi-Timminga, E., & Gademann, A. J. R. M. (1992). Computerized test
construction. In M. Wilson (Ed.), Objective Measurement: Theory into Practice (Vol. 1, pp.
261–273). Norwood, New Jersey: Ablex.
Adema, J. J., Boekkooi-Timminga, E., & van der Linden, W. J. (1991). Achievement test
construction using 0-1 linear programming. European Journal of Operational Research, 55,
103–111.
Armstrong, R. D., Jones, D. H., & Wang, Z. (1994). Automated test construction using classical
test theory. Journal of Educational Statistics, 19(1), 73–90.
Baker, F. B., Cohen, A. S., & Barmish, B. R. (1988). Item characteristics of tests constructed
by linear programming. Applied Psychological Measurement, 12(2), 189–199.
Berger, M. P. F. (1998). Optimal design of tests with items with dichotomous and polytomous
response formats. Applied Psychological Measurement, 22(3), 248–258.
Berger, M. P. F., & Mathijssen, E. (1997). Optimal test designs for polytomously scored items.
British Journal of Mathematical and Statistical Psychology, 50, 127–141.
Berger, M. P. F., & Veerkamp, W. J. J. (1996). A review of selection methods for optimal test
design. In G. Engelhard & M. Wilson (Eds.), Objective Measurement: Theory into Practice (Vol.
3, pp. 437–455). Norwood, New Jersey: Ablex.
Birnbaum, A. (1968). Some latent trait models. In F. M. Lord & M. R. Novick (Eds.), Statistical
Theories for Mental Test Scores. Reading, MA: Addison-Wesley.
Boekkooi-Timminga, E. (1987). Simultaneous test construction by zero-one programming.
Methodika, 1(2), 101–112.
Boekkooi-Timminga, E. (1990a). The construction of parallel tests from IRT-based item banks.
Journal of Educational Measurement, 15(2), 129–145.
Boekkooi-Timminga, E. (1990b). A cluster-based method for test construction. Applied
Psychological Measurement, 14(4), 341–354.
Appendix Table 1
Item Bank Structure for the Closed-Choice Items
Curricular Area | Items in each Area | C-C Items | C-C Items at θ1 | C-C Items at θ2 | C-C Items at θ3 | C-C Items at θ4 | C-C Items at θ5 | C-C Items at θ6 | C-C Items at θ7 | C-C Items at Deep Processing | C-C Items at Surface Processing
Personal Reading | 1–56 | 1–42 | 1–6 | 7–12 | 13–18 | 19–24 | 25–30 | 31–36 | 37–42 | 1–3, 7–9, 13–15, 19–21, 25–27, 31–33, 37–39 | 4–6, 10–12, 16–18, 21–24, 28–30, 34–36, 40–42
Close Reading | 57–112 | 57–98 | 57–62 | 63–68 | 69–74 | 75–80 | 81–86 | 87–92 | 93–98 | 57–59, 63–65, 69–71, 75–77, 81–83, 87–89, 93–95 | 60–62, 66–68, 72–74, 78–80, 84–86, 90–92, 96–98
Expressive Writing | 113–168 | 113–154 | 113–118 | 119–124 | 125–130 | 131–136 | 137–142 | 143–148 | 149–154 | 113–115, 119–121, 125–127, 131–133, 137–139, 143–145, 149–151 | 116–118, 122–124, 128–130, 134–136, 140–142, 146–148, 152–154
Poetic Writing | 169–224 | 169–210 | 169–174 | 175–180 | 181–186 | 187–192 | 193–198 | 199–204 | 205–210 | 169–171, 175–177, 181–183, 187–189, 193–195, 199–201, 205–207 | 172–174, 178–180, 184–186, 190–192, 195–197, 202–204, 207–210
Transitional Writing | 225–280 | 225–266 | 225–230 | 231–236 | 237–242 | 243–248 | 249–254 | 255–260 | 261–266 | 225–227, 231–233, 237–239, 243–245, 249–251, 255–257, 261–263 | 228–230, 234–236, 230–232, 246–248, 251–253, 258–260, 264–266
Exploring Writing | 281–336 | 281–322 | 281–286 | 287–292 | 293–298 | 299–304 | 305–310 | 311–316 | 317–322 | 281–283, 287–289, 293–295, 299–301, 305–307, 311–313, 317–319 | 284–286, 290–292, 295–297, 301–303, 308–310, 314–316, 320–322
Thinking Critically | 337–392 | 337–378 | 337–342 | 343–348 | 349–354 | 355–360 | 361–366 | 367–372 | 373–378 | 337–339, 343–345, 349–351, 355–357, 361–363, 367–369, 373–375 | 340–342, 345–347, 352–354, 358–360, 364–366, 370–372, 376–378
Processing Information | 393–448 | 393–434 | 393–398 | 399–404 | 405–410 | 411–416 | 417–422 | 423–428 | 429–434 | 393–395, 399–401, 405–407, 411–413, 417–419, 423–425, 429–431 | 396–398, 402–404, 408–410, 414–416, 420–422, 426–428, 433–434
Note: C-C denotes closed-choice items.
Appendix Table 2
Item Bank Structure for the Open-Ended Items
Curricular Area | Items in each Area | O-E Items | O-E Items at θ1 | O-E Items at θ2 | O-E Items at θ3 | O-E Items at θ4 | O-E Items at θ5 | O-E Items at θ6 | O-E Items at θ7 | O-E Items at Deep Processing | O-E Items at Surface Processing
Appendix Table 3
Hypothetical Number of Items at each θ across the Three Levels based on an Item Bank of 448 Items
θ1 = -3 | θ2 = -2 | θ3 = -1 | θ4 = 0 | θ5 = +1 | θ6 = +2 | θ7 = +3
Level 1: very easy, easy, average, hard, very hard
Level 2: very easy, easy, average, hard, very hard
Level 3: very easy, easy, average, hard, very hard
Note: θ2–θ6 share the same items due to the overlapping of the grade/ability levels.
Appendix Table 4
Item Attribute and Constraint Levels for Set-Based Items
Appendix Table 5
Hypothetical Item Bank with Set-based Items
Curricular Area | Items in each Area | C-C Items at θ1 | C-C Items at θ2 | C-C Items at θ3 | C-C Items at θ4 | C-C Items at θ5 | C-C Items at θ6 | C-C Items at θ7 | C-C Items at Deep Processing | C-C Items at Surface Processing
Personal Reading (PR) | 1–49 | 1–7 | 8–14 | 15–21 | 22–28 | 29–35 | 36–42 | 43–49 | 1–4, 8–11, 15–18, 22–25, 29–32, 36–39, 43–46 | 5–7, 12–14, 19–21, 26–28, 33–35, 40–42, 47–49
Close Reading (CR) | 50–98 | 50–56 | 57–63 | 64–70 | 71–77 | 78–84 | 85–91 | 92–98 | 50–53, 57–60, 64–67, 71–74, 78–81, 85–88, 92–95 | 54–56, 61–63, 68–70, 75–77, 82–84, 89–91, 96–98
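An item map of the kind shown in Appendix Table 5 lends itself to a simple pre-check that a request can be met before any LP model is solved. The sketch below encodes only the Personal Reading row of that hypothetical bank; the helper names (`expand`, `available`, `can_supply`) and the example requests are assumptions made for illustration.

```python
def expand(ranges):
    """Turn [(1, 4), (8, 11)] into the set {1, 2, 3, 4, 8, 9, 10, 11}."""
    return {i for lo, hi in ranges for i in range(lo, hi + 1)}

# Personal Reading row of Appendix Table 5: seven items per ability level,
# theta_1 = items 1-7, theta_2 = items 8-14, ..., theta_7 = items 43-49.
theta_levels = {k: expand([(7 * (k - 1) + 1, 7 * k)]) for k in range(1, 8)}
deep = expand([(1, 4), (8, 11), (15, 18), (22, 25), (29, 32), (36, 39), (43, 46)])

def available(theta_index, processing):
    """Items held in the bank at one ability level and processing level."""
    level = theta_levels[theta_index]
    return level & deep if processing == "deep" else level - deep

def can_supply(theta_index, processing, requested):
    """True if the bank holds at least `requested` matching items."""
    return len(available(theta_index, processing)) >= requested

print(sorted(available(3, "deep")))
print(can_supply(3, "deep", 4))
print(can_supply(3, "deep", 6))
```

A request for six deep-processing items at θ3 fails this check because the row holds only four, which is the kind of contradiction between specification and item bank that leads to infeasibility.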