An investigation into a Chinese placement test’s score interpretations and uses
Foreign language placement testing, an important component of university foreign language programs, has received considerable, though not copious, attention in second language (L2) testing research over the years (Norris, 2004), and that attention has mostly concentrated on L2 English. In contrast to validation research on L2 English placement testing, discussion of tests in languages other than English is limited (e.g., Mozgalina & Ryshina-Pankova, 2015). Additionally, these studies have been constrained by two main methodological limitations. First, the importance of item-level data analysis is largely overlooked. While researchers have highlighted the value of examining total test scores in validation research, defensible score interpretations and uses should not be assumed without further evidence that all test items function as intended by the test developers. Second, the validity evidence reported in these studies falls into a narrow range: it has mainly addressed the generalization (e.g., reporting test reliability), explanation (group performance comparisons), and extrapolation (correlational studies of the relationship between test scores and other criteria) inferences, and validation requires more than these (Chapelle, 2021). By contrast, documentation of empirical results supporting domain description (content representation and relevance), evaluation (examination of item quality), and utilization (stakeholders’ perceptions of score usefulness) has been limited. The primary goal of my dissertation is to provide a comprehensive examination and evaluation of the test score interpretations and uses for the listening and reading sections of an in-house, college-level Chinese placement test.
For my dissertation, I collect and evaluate quantitative (placement test scores, item responses, ACTFL proficiency test scores) and qualitative (interviews, focus group, questionnaires) validity evidence within the argument-based validation framework that was conceptualized by Kane (2006) and further expanded by Chapelle et al. (2008). The framework comprises six inferences: domain description, evaluation, generalization, explanation, extrapolation, and utilization (see Chapelle, 2021, for a review). Employing mixed methods, I aim to (1) study the functioning of test items by identifying and revising psychometrically problematic items, if any; (2) use the empirical results to inform test revisions; (3) demonstrate how the collected quantitative and qualitative results serve as strong or weak evidence, or as counterevidence, for the claims within the validity argument; and (4) provide an overall evaluation of the intended interpretation and use of the placement test scores. With this study, I hope to contribute to the larger discussion of foreign language assessment practices and argument-based test validation and, at the same time, offer insight into the ongoing development of validity research.
- In Collections: Electronic Theses & Dissertations
- Copyright Status: In Copyright
- Material Type: Theses
- Authors: Ma, Wenyue
- Thesis Advisors: Winke, Paula
- Committee Members: Reed, Dan; Van Gorp, Koen; Bowles, Ryan
- Date Published: 2023
- Subjects: Linguistics
- Program of Study: Second Language Studies - Doctor of Philosophy
- Degree Level: Doctoral
- Language: English
- Pages: 150 pages
- Permalink: https://doi.org/doi:10.25335/t1ty-hm35