Enhancing item pool utilization when designing multistage computerized adaptive tests
In recent years, the multistage adaptive test (MST) has gained popularity in educational measurement and operational testing. MST refers to a test in which pre-constructed sets of items are administered adaptively and scored as a unit (Hendrickson, 2007). As a special case of computerized adaptive testing (CAT), an MST program needs the following components: an item response theory (IRT) model or a non-IRT-based alternative; an item pool design; module assembly; ability estimation; a routing algorithm; and scoring (Yan et al., 2014). A significant amount of research has been conducted on module assembly, ability estimation, routing, and scoring, but few studies have addressed item pool design. An item pool is defined as consisting of a maximal number of combinations of items that meet all content specifications for a test and provide sufficient item information for estimation at a series of ability levels (van der Linden et al., 2006). Item pool design is important because any successful MST assembly depends on an optimal item pool that provides sufficient, high-quality items (Luecht & Nungester, 1998). Reckase (2003, 2010) developed the p-optimality method to design optimal item pools for CAT using the unidimensional Rasch model, and the method has been shown to be efficient for different item types and IRT models. The present study extended this method to the MST context to support and develop different MST panel designs under different test configurations. The study compared the performance of MSTs assembled under the panel designs most often studied in the literature: 1-2, 1-3, 1-2-2, and 1-2-3. Combinations of short, medium, and long tests with different routing test proportions were used to construct the test configurations.
Using the Rasch model, one of the most widely investigated IRT models, simulated optimal item pools were generated with and without the practical constraint of exposure control. A total of 72 optimal item pools were generated, and measurement accuracy was evaluated on both the overall sample and conditional samples using various statistical measures. The p-optimality method was also applied to an operational MST licensure test to see whether it is feasible for supporting test assembly and achieving sufficient measurement accuracy in practice. Results showed that the different MST panel designs achieved sufficient measurement accuracy when using items from the optimal item pools built with the p-optimality method. The same was true of the operational item pool. Measurement accuracy was related to test length, but much less so to the routing test proportions. Exposure control affected item pool size, but the distributions of the item parameters and the item pool characteristics for all the MST panel designs were similar under the two conditions. The item pools under exposure control were several times larger than those without exposure control, depending on the MST panel design and the routing test proportion. The results of this study provide information on how to enhance item pool utilization when designing multistage computerized adaptive tests, facilitate the MST assembly process, and improve scoring accuracy.
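To make the notions of item information and routing concrete, the following minimal Python sketch shows the Rasch response probability, the information an item provides at a given ability level, and a toy 1-3 panel. It is illustrative only: the module difficulties, the function names, and the maximum-information routing rule are assumptions chosen for the example, not the pools, routing algorithm, or p-optimality procedure used in the study.

```python
import math

def rasch_prob(theta, b):
    """Probability of a correct response under the Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))

def item_information(theta, b):
    """Fisher information of one Rasch item at ability theta: P * (1 - P)."""
    p = rasch_prob(theta, b)
    return p * (1.0 - p)

def module_information(theta, difficulties):
    """Information of a module is the sum of its items' information."""
    return sum(item_information(theta, b) for b in difficulties)

def route(theta_hat, modules):
    """Return the name of the module with the most information at theta_hat.

    Maximum-information routing is used here only as an example rule.
    """
    return max(modules, key=lambda name: module_information(theta_hat, modules[name]))

# Hypothetical second-stage modules for a 1-3 panel (difficulties are made up).
STAGE2 = {
    "easy": [-1.5, -1.0, -0.5],
    "medium": [-0.5, 0.0, 0.5],
    "hard": [0.5, 1.0, 1.5],
}

if __name__ == "__main__":
    for theta_hat in (-1.0, 0.0, 1.2):
        print(theta_hat, route(theta_hat, STAGE2))
```

A Rasch item is most informative when its difficulty equals the examinee's ability, where the information reaches its maximum of 0.25. This is why a pool must spread item difficulties across the ability levels at which modules are targeted.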
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Yang, Lihong
- Thesis Advisors: Reckase, Mark D.
- Committee Members: Reckase, Mark D.; Houang, Richard; Martineau, Joseph; Bowles, Ryan
- Date: 2016
- Subjects: Educational tests and measurements--Research; Item response theory; Computer adaptive testing; Design
- Program of Study: Measurement and Quantitative Methods - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xiv, 128 pages
- ISBN: 9781369051353; 1369051352