A fair comparison of the performance of computerized adaptive testing and multistage adaptive testing
"The comparison of item-level computerized adaptive testing (CAT) and multistage adaptive testing (MST) has been researched extensively (e.g., Kim & Plake, 1993; Luecht et al., 1996; Patsula, 1999; Jodoin, 2003; Hambleton & Xing, 2006; Keng, 2008; Zheng, 2012). Various CAT and MST designs have been investigated and compared under the same item pool. However, the characteristics of an item pool designed specifically for CAT are different from the characteristics of an item pool designed for MST. If CAT and MST are compared under the same item pool designed for either CAT or MST, the comparison might be unfair to the other test mode. To address this issue, this study focused on comparing the measurement accuracy and averaged test length of MST and CAT, when they were matched on conditional standard error of measurement, exposure rates, IRT scoring method and content specifications, under different item pools designed for MST and CAT, respectively. When designing an MST, multiple factors need to be considered. In this paper, a total of 16 conditions of MST designs (i.e., 1-2-3 and 1-3-3 panel designs; the AMI and DPI routing strategies; the test lengths of 45 and 60 items; forward and backward assembly) were employed. Each condition was compared with the result of the corresponding CAT. A simulation study was conducted to evaluate the performance of MST against the corresponding CAT. The results show similar measurement accuracy between MST and CAT, which implies that the efforts to make a fair comparison were successful. The reason is that both procedures matched similar conditional test information. This fair comparison of MST and CAT provides a reference for testing mode change from CAT to MST in terms of ability recovery and averaged test length. When considering the testing mode change from CAT to MST, the backward assembled MST is not suggested even for a classification-oriented test.
Whether to change the testing mode depends on the current averaged test length in CAT. If the current CAT has a moderate-length test, switching to a forward assembled MST with 3 stages is plausible and feasible. For a long test, staying in CAT is preferred over switching to MST."--Pages ii-iii.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Wang, Keyin
- Thesis Advisors: Reckase, Mark
- Committee Members: Maier, Kimberly; Konstantopoulos, Spyridon; Cui, Yuehua
- Date Published: 2017
- Subjects: Educational tests and measurements--Evaluation; Competency-based educational tests--Evaluation; Ability--Testing--Evaluation; Computer adaptive testing; Evaluation
- Program of Study: Measurement and Quantitative Methods - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: xiv, 86 pages
- ISBN: 9781369762037; 1369762038
- Permalink: https://doi.org/doi:10.25335/ypy5-6g68