EXPLORING TASK AND GENRE DEMANDS IN THE PROMPTS AND RUBRICS OF STATE WRITING ASSESSMENTS AND THE NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS (NAEP)

By

Ya Mo

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degrees of

Curriculum, Teaching and Educational Policy—Doctor of Philosophy
Measurement and Quantitative Methods—Doctor of Philosophy

2014

ABSTRACT

EXPLORING TASK AND GENRE DEMANDS IN THE PROMPTS AND RUBRICS OF STATE WRITING ASSESSMENTS AND THE NATIONAL ASSESSMENT OF EDUCATIONAL PROGRESS (NAEP)

By

Ya Mo

My dissertation research examines constructs of writing proficiency in state and national assessments through content analysis of writing prompts and rubrics; predicts students' writing performance on the National Assessment of Educational Progress (NAEP) from assessment variations using multi-level modeling; and explores genre demands in state writing assessments through syntactic analysis of writing prompts to identify ambiguity and implicit expectations, and through content analysis of rubrics and state standards to identify the genres specified.

Through content analysis of 78 prompts and 35 rubrics from 27 states' writing assessments, and three representative prompts and rubrics from the NAEP, the research presented in Chapter 1 finds that state writing assessments and the NAEP seem to align in their adoption of the writing process approach, their attention to audience and students' topical knowledge, their accommodations through procedure facilitators, and their inclusion of organization, structure, content, details, sentence fluency, and semantic aspects as well as general conventions, such as punctuation, spelling, and grammar, in their assessment criteria. However, the NAEP's writing assessment differs from many states' by having explicit directions for students to review their writing, giving students two timed writing tasks, making informative composition—which was rarely included in state assessments—one of the three genres assessed, and including genre-specific components in its writing rubrics. The fact that all of the NAEP's writing rubrics are genre-mastery rubrics with genre-specific components can be considered one of its biggest differences from most state writing assessments.

To examine the impact of the variations between state and national writing assessments, the research presented in Chapter 2 uses Hierarchical Linear Modeling to examine the relationship between students' NAEP performance and the amount of difference between state and NAEP direct writing assessments, drawing on the content analysis of the state and NAEP prompts and rubrics described above. This study finds that students' preparedness for the tasks, namely the similarity between the assessments of their home states and the NAEP, plays a role in students' performance on the NAEP. Students from states with writing assessments similar to the NAEP performed significantly better than students from states with writing assessments that differed markedly from the NAEP.

Through syntactic analysis of the same set of state prompts and content analysis of rubrics and standards, the research presented in Chapter 3 explores genre demands in state writing assessments. In total, this study found that 23% of prompts possessed one of two problematic features: 14% of prompts were ambiguous, and 9% of prompts had implicit genre expectations. Almost one third of the prompts that possessed problematic features were used with genre-mastery rubrics.
The content analysis of state writing standards also suggests that 22% of them do not cover all the genres assessed in their corresponding writing assessments. The ambiguity and implicit genre expectations in writing prompts and the limited congruence of state writing assessments with learning expectations pose potential threats to the valid interpretation and use of these writing assessments.

ACKNOWLEDGMENTS

I am deeply indebted to my advisor, Professor Gary Troia. He inspired my interest in writing assessments, guided me through writing research, made the IES-funded K-12 Writing Alignment Project data available for my dissertation research, always gave me prompt feedback, and offered me his support along every step of my doctoral study. I look up to and learn from his productivity, diligence, and vision as a scholar.

I am also indebted to my co-advisor, Professor Mark Reckase. An outstanding teacher, he introduced me to measurement theories and sparked my interest in assessments. His devotion and passion toward the field of measurement are always inspirational to me.

I am very grateful to my other dissertation committee members—Professor Susan Florio-Ruane and Professor Peter Youngs. They have always been extremely helpful, giving me all the support that I need and sharing with me insights that helped develop my dissertation.

Finally, I extend my heartfelt thanks to my family and dear friends. They gave me their unconditional love and support, which motivated me through every step of my academic pursuits.

This dissertation study uses a portion of the data collected and coded in the K-12 Writing Alignment Project, funded by grant number R305A100040 from the U.S. Department of Education, Institute of Education Sciences, to Michigan State University. Statements do not necessarily reflect the positions or policies of this agency, and no official endorsement by it should be inferred.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION

CHAPTER 1: Examining Writing Constructs in U.S. State and National Assessments
1. Introduction
2. Review of Literature
   Genre Theories in Composition
3. Research Questions
4. Mode of Inquiry
   4.1 State and NAEP Direct Writing Assessments
   4.2 Coding Taxonomy
   4.3 Procedure
5. Results
   5.1 How do the features of writing tasks and rubrics vary across a sample of states and NAEP?
      Writing Process
      Writing Context
      Writing Components
      Writing Mechanics
      Writing Knowledge
   5.2 What are the connections between these prompts and rubrics, especially in terms of their genre demands?
      Prompts
      Rubrics
      Connections between Prompts and Rubrics
   5.3 What are the similarities and differences between NAEP and state writing assessments?
   5.4 Insights from a combined use of the two approaches
      Prompts
      Rubrics
      Prompts and Rubrics Associations
6. Discussion
   6.1 Prevalent Writing Assessment Practices
   6.2 Genre Demands in Direct Writing Assessments
   6.3 State and National Alignment
7. Implications
8. Limitations

CHAPTER 2: Predicting Students' Writing Performance on NAEP from Assessment Variations
1. Introduction
2. Research Questions
3. Method
   3.1 State and NAEP Direct Writing Assessments
   3.2 Coding Taxonomy
   3.3 Coding Procedure
   3.4 Distance between State Assessments and the NAEP
   3.5 NAEP Sample
   3.6 Students' NAEP Composition Performance
   3.7 Students' Characteristics in NAEP
   3.8 Structure of the Data Set and Statistical Analyses
   3.9 Statistical Models
      Unconditional model (Model 1)
      Main effect model (Model 2)
      Main effect model (Model 3)
      Main effect model (Model 4)
4. Results
5. Discussion
6. Implications
7. Limitations

CHAPTER 3: Genre Demands in State Writing Assessments
1. Introduction
2. Research Questions
3. Method
   3.1 State Direct Writing Assessments and Standards
   3.2 Data Coding
      Genre demands in prompts
      Genres of prompts
      Genre expectations in rubrics
      Genre expectations in state standards
   3.3 Data Analyses
4. Results
   4.1a. How many state writing prompts possessed the problematic features of ambiguity or implicit genre expectations?
   4.1b. Which key words in prompts were associated with ambiguity and implicit genre expectations, and how frequently do they appear?
   4.2. What is the relationship between prompts' genre specification and rubrics' genre-mastery expectations?
   4.3. What is the relationship between genre expectations in state standards and writing prompts?
5. Discussion
   5.1 Ambiguity in prompts
   5.2 Genre Expectation in Standards, Rubrics, and Prompts
   5.3 Validity of State Writing Assessments
6. Implications
7. Limitations

CHAPTER 4: Summary and Moving Forward
1. Major Findings
   1.1 Prevalent Writing Practices
   1.2 Genre Demands in Direct Writing Assessments
   1.3 State and National Alignment
   1.4 The Relationship between the Variability between State and National Assessments and Students' NAEP Performance
   1.5 The Relationship between Students' Characteristics and their NAEP Performance
   1.6 Ambiguity in Prompts and Genre-mastery Rubrics
   1.7 Genre Expectation in Standards and Genres Assessed
2. Implication for Writing Assessment Practices
   2.1 For State Writing Assessment and NAEP
   2.2 Writing Prompt Design
3. Implication for Writing Instruction
4. Next Steps for Research
APPENDICES
   Appendix A Tables
   Appendix B Coding Taxonomies
   Appendix C State Direct Writing Assessments

BIBLIOGRAPHY

LIST OF TABLES

Table 1 Prompt-Rubric Contingencies for 81 Prompts
Table 2 States with Genre-Mastery Rubrics and/or State with Rubrics Containing Genre-Specific Components
Table 3 Genre Assessed in States with both Genre-Mastery Rubrics and Rubrics Containing Genre-Specific Components
Table 4 Sample Sizes, Achievement, and Student Demographics, 27 State Grade 8 HLM Sample
Table 5 HLM Model Results
Table 6 Frequency (F) and Percentage (P) of Key Words Usage in Genres
Table 7 Prompts with Problematic Features and Used with Genre-Mastery Rubrics
Table 8 NAEP Coding & Frequency Counts and Percentage of States
Table 9 Sample Sizes, Achievement, and Student Demographics, 27 State Grade 8 NAEP Reporting Sample
Table 10 Comparison of Sample Sizes and Student Demographics for 27 State Grade 8 NAEP Reporting Sample and HLM Sample
Table 11 Raw Unweighted Descriptive Statistics of Variables in HLM Models
Table 12 Genre Expectations in Standards and Genre Assessed
Table 13 Prompt Coding—Troia & Olinghouse's (2010) Coding Taxonomy
Table 14 Rubric Coding—Troia and Olinghouse's (2010) Coding Taxonomy
Table 15 Rubric Coding—Jeffery's (2009) Coding Taxonomy
Table 16 Seven-Genre Coding Scheme for Prompts—Adapted from Jeffery (2009) and Troia & Olinghouse (2010)
Table 17 Standards Genre Coding—Troia and Olinghouse's (2010) Coding Taxonomy Modified to Accommodate Jeffery's (2009) Genre Coding Taxonomy
Table 18 State Direct Writing Assessments

LIST OF FIGURES

Figure 1 Genre Categories for 81 Prompts
Figure 2 Criteria Categories for 38 Rubrics

INTRODUCTION

There are persistent discrepancies between state and national writing assessment results (Lee, Grigg, & Donahue, 2007; Salahu-Din, Persky, & Miller, 2008). High proficiency levels are often reported for state-mandated assessments, while low proficiency levels are reported for the National Assessment of Educational Progress (NAEP). A possible explanation for this gap is that state and national assessments vary in the ways they define the writing construct and measure proficiency (Jeffery, 2009). The No Child Left Behind Act of 2001 (NCLB) gave states the freedom to adopt vastly different standards for English language arts, and allowed states to define content area proficiency levels and flexibly design their accountability systems (U.S. Department of Education, 2004). As a result, "states' content standards, the rigor of their assessments, and the stringency of their performance standards vary greatly" (Linn, Baker, & Betebenner, 2002, p.3). However, little is known about how these tests vary.

When the content and format of state-mandated assessments are comparable to the national assessment, students are indirectly prepared for the NAEP. However, whether students actually achieve higher scores on the NAEP when their state assessments are more similar to it, and lower scores when their state assessments are less similar, is unknown. In other words, whether this variation between state and national writing assessments predicts students' performance on the NAEP remains unexamined.

Currently, the Common Core State Standards (CCSS) have been formally adopted by 45 states and the District of Columbia.
Developed by two multistate consortia, the Smarter Balanced Assessment Consortium (SBAC) and the Partnership for Assessment of Readiness for College and Careers (PARCC), K-12 assessments aligned with the CCSS will be in place starting with the 2014-2015 academic year. While this multistate effort may address the persistent discrepancies between state and national writing assessments, it cannot explain the existing gap. A study of state and national writing assessments will not only contribute to explaining the existing gap, but also inform policymakers and test designers by identifying the central characteristics of writing constructs valued in the past and advise them in the further development of new writing assessments.

The research presented in Chapter 1 examines what constitutes the writing construct in state writing assessments and the NAEP, and explores the similarities and differences between them through content analysis of state and NAEP writing prompts and rubrics. My adoption of Troia & Olinghouse's (2010) comprehensive coding taxonomy and Jeffery's (2009) genre-based coding schemes for content analysis ensures a broad presentation of recent thinking about writing development, instruction, and assessment, and allows an in-depth look into the variability of conceptions of writing constructs across states.

The research presented in Chapter 2 builds on the research presented in Chapter 1 by examining whether the differences between state and national writing assessments can explain some of the discrepancies found in the results of these assessments. This study quantifies these differences as the Euclidean distance between state and NAEP writing constructs as defined by the 90 indicators in Troia & Olinghouse's (2010) and Jeffery's (2009) coding taxonomies. The study explores the relationship between these differences and students' NAEP performance through Hierarchical Linear Modeling (HLM). The findings suggest that students' performances on the NAEP reflect both their writing abilities and how well they are prepared for the type of assessments the NAEP conducts. However, the large amount of unexplained variance in students' performances on the NAEP from state to state suggests that there are more state-level variables to be explored. This result does not suggest that state and NAEP assessments should be made more similar to each other; rather, components of these assessments such as prompts and rubrics should be examined to see whether they reflect evidence-based practices and whether they ensure the valid interpretation and use of the results of those assessments.
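As a concrete illustration of this distance measure, the sketch below (Python, not taken from the dissertation; the indicator values are invented for illustration) computes a Euclidean distance between a state's profile of presence/absence indicator codes and the NAEP's profile.

```python
# Minimal sketch, assuming each assessment is coded as a vector of 0/1
# indicators from the combined coding taxonomies (values are hypothetical,
# not actual codings from the study).
import math

def euclidean_distance(state_codes, naep_codes):
    """Euclidean distance between two equal-length 0/1 indicator vectors."""
    assert len(state_codes) == len(naep_codes)
    return math.sqrt(sum((s - n) ** 2 for s, n in zip(state_codes, naep_codes)))

naep = [1, 0, 1, 1] + [0] * 86      # illustrative 90-indicator NAEP profile
state_a = [1, 0, 1, 0] + [0] * 86   # differs on one indicator
state_b = [0, 1, 0, 0] + [1] * 86   # differs on many indicators

print(euclidean_distance(state_a, naep))  # 1.0  -> assessment similar to the NAEP
print(euclidean_distance(state_b, naep))  # ~9.5 -> assessment dissimilar to the NAEP
```

Under this framing, each state's distance from the NAEP profile can then serve as a state-level predictor in the HLM described above.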
Following the recommendations of the research presented in Chapter 2, the research presented in Chapter 3 investigates the prompts in state writing assessments in depth and identifies ambiguities and implicit genre expectations in the design of these prompts. Ambiguity is defined as the presence of two or more genre demands in a prompt, while an implicit genre expectation means a lack of verbs (e.g., argue, convince) or nouns (e.g., stories) that explicitly signal the desired genre. This is especially problematic when a prompt that is ambiguous or has implicit genre expectations is used with a rubric that emphasizes genre mastery. Therefore, the study also examines the use of genre-mastery rubrics with prompts that possess problematic features. When state writing assessment prompts are ambiguous or contain implicit expectations, a question is raised about whether the assessment is effectively and accurately evaluating students' mastery of the genre in question. State standards provide an answer by specifying what students are expected to learn. Therefore, this study also examines state standards to identify the range of genres expected of middle school students. This study highlights the connection between genre demands in writing prompts and genre-mastery expectations in rubrics and state standards.

Together, this research investigates the writing constructs underlying state and national writing assessments, explores the relationship between the variability in state and national assessments and students' NAEP performance, and examines an important component of writing assessments—prompts—in depth. The findings should raise awareness that students' performances on the NAEP do not only measure their writing abilities but also reflect how well they are prepared for the type of assessments the NAEP uses. Poorly developed assessments will provide inaccurate evaluations of students' abilities, impact curriculum in unwarranted ways, and lead to wrong decisions regarding students' promotion and retention, as well as imprecise ratings of teacher effectiveness. These findings can advise test designers about what central characteristics of the writing construct have been valued in the past, and can be used in the development of new writing assessments. Furthermore, it is hoped that these findings will direct the assessment and writing research communities' attention to validity-related issues in large-scale writing assessments and encourage more research on the components of these large-scale writing assessments.

CHAPTER 1: Examining Writing Constructs in U.S. State and National Assessments

1. Introduction

In the U.S., persistent discrepancies exist between state and national writing assessment results (Lee, Grigg, & Donahue, 2007; Salahu-Din, Persky, & Miller, 2008). The results of the National Assessment of Educational Progress (NAEP) show low proficiency levels, yet state-mandated assessments often report high proficiency levels. This inconsistency suggests that, in order to ensure that the results of state and national assessments are comparable, more uniform academic and assessment standards may be necessary. One solution to this gap, the Common Core State Standards (CCSS), has already been formally adopted in 45 states and Washington, D.C. Two multistate consortia, the Smarter Balanced Assessment Consortium (SBAC) and the Partnership for Assessment of Readiness for College and Careers (PARCC), worked together to develop K-12 assessments aligned with the CCSS. These assessments will be implemented for the 2014-2015 school year. Although these multistate efforts have attempted to address the persistent discrepancy between the results of state writing assessments and the NAEP, they do not explain the existing gap. One possible explanation of this gap is the varying ways in which state and national assessments define the writing construct, and the differences in the measures they use to determine proficiency levels (Jeffery, 2009). It is difficult to state with certainty whether these variations fully account for the inconsistent results, though, because little is known about how these assessments actually vary.
The No Child Left Behind Act of 2001 (NCLB) required states to implement statewide accountability systems that consisted of challenging state standards and annual testing for all grade 3-8 students. At the same time, these NCLB requirements were flexible enough that states were able to adopt dramatically different standards for English language arts instruction and assessment, some of which placed little emphasis on writing (Jeffery, 2009); this flexibility also let each state define its own content area proficiency levels and design appropriate accountability systems to assess those proficiency levels (U.S. Department of Education, 2004). As a result, "states' content standards, the rigor of their assessments, and the stringency of their performance standards vary greatly" (Linn, Baker, & Betebenner, 2002, p.3).

Variation in states' standards, assessments, and performance benchmarks is associated with differing conceptions of writing performance (Jeffery, 2009). On the one hand, this variability may produce the discrepancy that is consistently observed between state assessments and NAEP results and make state assessment and NAEP results difficult to reconcile. On the other hand, the variability in the underlying conceptions of writing proficiency raises the concern that teachers are emphasizing different aspects of composition in U.S. classrooms (Jeffery, 2009), because research has shown that tests impact instruction (Hillocks, 2002; Moss, 1994). Hillocks (2002) found that writing instruction in classrooms is often used to help students prepare for high-stakes assessments. In other words, whatever is valued in the assessments students will take is what tends to be taught; the state-to-state variability in the underlying conceptions of writing proficiency in assessment contexts thus leads to the variability of writing instruction found in U.S. classrooms.

What constitutes the writing construct is complex. It can be understood through and approached with multiple theoretical frameworks. A comprehensive perspective ensures a broad presentation of current thinking about writing development, instruction, and assessment; thus, such a perspective is more likely to shed light on the underlying writing construct. Troia and Olinghouse (2010) developed a coding taxonomy to examine writing standards and assessments. This taxonomy was derived from several theoretical frameworks, including Hayes' cognitive model of writing (Flower & Hayes, 1981; Hayes, 1996), socio-cultural theory (Prior, 2006), genre theories (Dean, 2008), linguistic models of writing (Faigley & Witte, 1981), and motivation theories (Troia, Shankland, & Wolbers, 2012). It consists of seven strands: (1) writing processes, (2) context, (3) purposes, (4) components, (5) conventions, (6) metacognition and knowledge, and (7) motivation. Adopting this framework allows an in-depth look into the variability of conceptions of the writing construct across states; therefore, an analysis that uses it can inform policy makers and test designers about the extant ways the writing construct is defined and proficiency is measured, to guide further development of writing assessments. Results from this type of analysis can also advise them on which core characteristics of the writing construct valued in the past can continue to be used in the future to supplement the CCSS and the common assessments in each state. Moreover, these results can help the assessment community examine the validity of those large-scale writing assessments.
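For illustration only, a document coded under a strand-based taxonomy of this kind can be thought of as a set of presence/absence judgments grouped by strand; the strand labels below follow the taxonomy, but the indicator names are invented stand-ins rather than items from the actual instrument.

```python
# Hypothetical sketch of one coded prompt-and-rubric document under the six
# strands used for assessments (the seventh strand, motivation, is dropped,
# as noted in the text); indicator names are illustrative only.
coded_document = {
    "writing_processes": {"directs_planning": 1, "directs_revising": 0},
    "context": {"audience_identified": 1, "time_limit_stated": 0},
    "purposes": {"persuasive": 1, "narrative": 0},
    "components": {"organization": 1, "details": 1},
    "conventions": {"spelling": 1, "grammar": 1},
    "metacognition_knowledge": {"evokes_topical_knowledge": 1},
}

# Flattening the strand-by-strand codes yields an indicator vector that can
# be compared across states and with the NAEP.
indicator_vector = [code for strand in coded_document.values() for code in strand.values()]
print(indicator_vector)  # [1, 0, 1, 0, 1, 0, 1, 1, 1, 1, 1]
```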
2. Review of Literature

Dean (1999) conducted content analyses on some popular secondary composition textbooks and studied sample writing tests from Texas, California, and Washington. The study showed that while the textbooks reflected both traditional and current theories of writing, the large-scale writing assessments reflected traditional rhetoric characteristics, which emphasize style, form, and the mechanical aspects of writing. Hillocks (2002) studied writing assessments in five states—New York, Illinois, Texas, Kentucky, and Oregon—and conducted 390 interviews with teachers and administrators. He found that state assessments tended to undermine state standards and encourage writing instruction that helped prepare students for high-stakes assessments. As a result, high-stakes testing does not guarantee quality classroom instruction; instead, it encourages ineffective teaching and can come with unintended consequences such as promoting a formulaic approach to writing.

Beck and Jeffery (2007) examined 20 exit-level state writing assessment prompts from Texas, New York, and California, using task analysis of the prompts and genre analysis of the corresponding high-scoring benchmark papers. They found that there was a lack of alignment between the genre demands of the prompts and the genres of the corresponding benchmark papers. The comparison of the genre demands in the prompts with the actual genres produced in the corresponding benchmark papers showed that there was much greater genre variation in the expected responses of Texas and California writing assessments than in those from New York. Only 20% of the California benchmark papers and 18% of the Texas benchmark papers were aligned with the prompts, while 42% of the New York benchmark papers were aligned. Jeffery's (2009) study of 68 prompts from 41 state and national exit-level direct writing assessments suggested that national writing assessments differed from state assessments in the degree to which they emphasized genre distinctions and provided coherent conceptualizations of writing proficiency. The genre expectations in national writing assessments were consistently associated with rubric criteria, whereas this was not true of state assessments.

Studies that have examined how conceptualizations of writing constructs vary among U.S. states have either examined small samples of states (Dean, 1999; Beck & Jeffery, 2007) and their writing assessments (Hillocks, 2002), or targeted exit-level writing assessments for high school students (Jeffery, 2009). Few studies have investigated how conceptualizations of the writing construct vary at the middle school level among U.S. states. A look into what is emphasized in middle school writing assessments, as well as the various definitions of the writing construct, will shed light on the expectations of writing competence placed on students. Once deeper understandings of these expectations and differences are developed, more resources can be allocated to help students navigate this important but challenging stage of their writing development. Middle school is an important stage for students to develop their abstract thinking and more sophisticated ways of using language (De La Paz & Graham, 2002). Students who do not learn to write well are less likely to use their writing to extend their learning, and more likely to see their grades suffer (National Commission on Writing for America's Families, Schools, and Colleges [NCWAFSC], 2003, 2004).
As an important transitional step for students between elementary and high school, middle school education lays a foundation for students' studies in high school and later college. Weak writers in middle school suffer the consequences of the growing trend of using writing proficiency as a factor in grade retention and advancement, continue to be at a great disadvantage in high school, and are thus less likely to attend college (Zabala, Minnici, McMurrer, & Briggs, 2008). The NAEP assesses students' writing at eighth grade, and seventh and eighth graders are also frequently assessed in state writing assessments. Consequently, a large sample can be derived from states' middle school writing assessments to compare with the NAEP's direct writing assessments. Direct writing assessments generally consist of writing prompts to guide the student in writing about a particular topic; for example, a student may be presented with a picture and asked to write a response to that picture. This study aims to fill gaps in the research on large-scale writing assessments with a broader comparison by using writing assessment prompts from 27 states and the NAEP 2007 writing prompts to examine the features of states' and the NAEP's direct writing assessments, and to explore the similarities and differences between state and national writing assessments at the middle school level. The NAEP 2007 data was selected because it contained state-level writing data and allowed state-level modeling, whereas the NAEP 2011 data did not.

Troia and Olinghouse's (2010) coding taxonomy is one analytical tool utilized for this research. The indicators found within the seven strands in the coding taxonomy cover (a) all stages of the writing process and specific composition strategies; (b) circumstantial influences outside the writer that can impact writing performance; (c) a variety of communicative intentions accomplished through different genres; (d) the features, forms, elements, and characteristics of different texts; (e) the mechanics of producing text; and (f) the knowledge resources and (g) personal motivational attributes within the writer that drive writing activity and writing development. In writing assessments, the writer's motivation (i.e., general motivation, goals, attitudes, beliefs, and efforts) does not apply, because states rarely administer assessment documents such as surveys alongside writing assessments to measure writers' personal attributes. Thus, the seventh strand from the original coding taxonomy was not used in this study.

Genre Theories in Composition

Among various theoretical frameworks, genre theories have been used to examine large-scale writing assessments (Beck & Jeffery, 2007; Jeffery, 2009) and thus deserve further mention. Genres thread through all elements of composition, and shape students' ways of thinking about the writing process. Different genres direct students to proceed differently through different stages of the writing process. For example, writing a persuasive composition makes planning of certain content more useful than it would be for a narrative piece (Dean, 2008). Outlines that direct students to begin their essays with thesis statements and explicit arguments and to continue with evidence that supports those arguments and refutes the counter-arguments are more appropriate for persuasive compositions than outlines that direct students to begin their paper by setting the scene and by continuing with a sequence of actions.
Thus, knowing how to effectively adopt writing process approaches for different genres in assessment contexts will help students compose their texts more efficiently.

Genres connect texts and contexts. Devitt, Reiff, and Bawarshi (2004) proposed strategies to help students deliberately use genres to make such a connection:

   [Teachers] teach students to move from observation of the writing scene and its shared goals, to the rhetorical interactions that make up the situations of this scene (the readers, writers, purposes, subjects, and settings), to the genres used to participate within the situations and scenes. (p. xviii)

In other words, students are taught to observe the context in which the desired writing is expected to fulfill the communicative intent, and then use appropriate genres to fulfill this communicative need; thus, genres bridge texts and contexts. For example, suppose a school is organizing a field trip. Students may have places that they would like to visit; thus, a persuasive letter would be an appropriate genre to fulfill their communicative need of convincing the audience of their letters—likely school teachers, administrators, and staff—to allow them to visit the places that they would like to visit.

Genres also serve writing purposes. When students study genre, they are "studying how people use language to make their way in the world" (Devitt, 1993). If students are not taught explicitly what each genre means, they will lack knowledge of genres' structures and have a difficult time coming up with appropriate writing content for different purposes. For example, when students are expected to write to persuade, without genre knowledge of the structural elements and/or information that is canonical for persuasive papers, students may be unable to use argumentation schemes—"ways of representing the relationship between what is stated in the standpoint and its supporting justificatory structure" (Ferretti, Andrews-Weckerly, & Lewis, 2007, p.277)—such as argument from consequences and argument from example. For example, one prompt asked students to write about whether they think it is a good idea for their school to have candy and soda machines. Those who were against this idea could have argued that these machines would promote unhealthy eating habits among students. This would be an argument from potential negative consequences, which could be further illustrated with examples. For instance, the fact that students purchased candy and soda more frequently and consumed more unhealthy food than before could be cited to illustrate the argument that candy and soda machines promote unhealthy eating habits.

Genres specify the writing content (i.e., features, forms, elements, or characteristics of text) to be included in a text. Donovan and Smolkin (2006) believed that "an important part of 'doing school' is mastering the most frequently appearing generic forms" (p.131). Berkenkotter and Huckin (1995) argued that "genres are essential elements of language just as words, syntactic structure, and sound patterns. In order to express one's individual thoughts, one must use available patterns for speech, that is to say, genres, in one way or another" (p. 160). There are established genres in every language; people choose them and modify them to achieve various purposes by relying on those writing components. For example, to tell a story, writers will have a story line, setting, plot, and characters, as well as dialogue and a climax to elicit an emotional response from the reader.
Genre impacts the mechanics of writing and guides formats. The content requirements specified in a genre, and writers' consideration of purpose and audience, impact their use of vocabulary and/or word choice, which potentially affects spelling (Pasquarelli, 2006). For example, there are differences between the vocabularies used in informative writing versus narrative writing; these differences play out in the spellings of the abstract technical vocabulary used in one versus the more colloquial vocabulary used in the other. The sentence structure dictated by a genre also impacts the use of punctuation, such as the often unorthodox use of punctuation in poetry. Also, genres such as poetry have their established formats.

Genre knowledge also serves as an important component of the total writing knowledge students need for successful composition. Genre knowledge is knowledge about the purposes of writing and the macrostructures of a text, including text attributes, elements, and structure common to specific types of writing. Donovan and Smolkin (2006) stated that "genre knowledge develops prior to conventional writing abilities" (p.131). Though genre knowledge does not guarantee successful performance, there is an interactive relationship between genre knowledge and performance; prior genre knowledge can prompt students' writing performances under new circumstances in both positive and negative ways and expand knowledge of a particular genre to various strategies (Bawarshi & Reiff, 2010; Devitt, 2009; Dryer, 2008; Reiff & Bawarshi, 2011).

Jeffery (2009) used genre theories to explore the writing construct underlying state writing assessments. Jeffery's (2009) study was based on Ivanic's (2004) framework of six "discourses of writing"—"skills discourse," "creativity discourse," "process discourse," "genre discourse," "social practices discourse," and "socio-political discourse" (Ivanic, 2004, p. 224). Ivanic defined "discourses of writing" as "constellations of beliefs about writing, beliefs about learning to write, ways of talking about writing, and the sorts of approaches to teaching and assessment which are likely to be associated with these beliefs" (p.224).

In Ivanic's framework, "skills discourse" describes writing as applying knowledge of sound-symbol relationships and syntactic patterns to compose a text. Thus, a big part of "learning to write" is learning sound-symbol relationships and syntactic patterns. Likewise, the "teaching of writing" involves the explicit teaching of skills such as phonics, with accuracy emphasized in the assessment criteria (Ivanic, 2004).

"Creativity discourse" views writing as the product of an author's creativity. "Learning to write" is therefore expected to be achieved by writing on topics that interest writers. The "teaching of writing" also involves implicit teaching of creative self-expression. In this case, "whole language" and "language experience" are emphasized, while interesting content and style are valued in the assessment criteria (Ivanic, 2004).

Ivanic calls the practical realization of the composing processes in the writer's mind "process discourse." In this view, "learning to write" is learning both the mental and practical processes in composing a text, and the "teaching of writing" involves explicit teaching of these processes (Ivanic, 2004).

Writing as text-types forged by social context is termed "genre discourse" by Ivanic.
In this understanding, "learning to write" is thus to learn the characteristics of different types of writing that serve different purposes in different contexts. Predictably, the "teaching of writing" involves the explicit teaching of genres. The appropriateness of the genre utilized by students is valued in assessment criteria (Ivanic, 2004).

"Social practices discourse" portrays writing as purpose-driven communication in a social context. Consequently, the point of "learning to write" is to write for real purposes in real-life contexts. Therefore, the "teaching of writing" involves explicit instruction in functional approaches and the implicit teaching of purposeful communication. Whether writing is effective for the given purpose is valued in assessment criteria in this case (Ivanic, 2004).

Finally, "socio-political discourse" explains writing as a socio-politically constructed practice open to contestation and change. "Learning to write" is therefore the process of understanding why different types of writing have their unique characteristics and choosing a position from alternatives. "Teaching to write" involves explicit teaching of critical literacy skills, including "critical language awareness." Social responsibility is highly valued in assessment criteria in this discourse (Ivanic, 2004).

Through an inductive analysis of the rubrics for the exit-level writing assessment prompts, Jeffery (2009) developed a five-criteria coding scheme for rubrics: rhetorical, genre-mastery, formal, expressive, and cognitive. These rubric types represent what different "discourses of writing" value as assessment criteria (Ivanic, 2004). Rhetorical rubrics focus on "the relationship between writer, audience, and purpose across criteria domains" (Jeffery, 2009, p.10). Genre-mastery rubrics emphasize "criteria specific to the genre students are expected to produce" (Jeffery, 2009, p.11). Formal rubrics conceptualize "proficiency in terms of text features not specific to any writing context" (Jeffery, 2009, p.11). Cognitive rubrics target "thinking processes such as reasoning and critical thinking across domains" (Jeffery, 2009, p.12). Expressive rubrics conceptualize "good writing" as "an expression of the author's uniqueness, individuality, sincerity and apparent commitment to the task" (Jeffery, 2009, p.12).

Meanwhile, through an inductive analysis of exit-level state direct writing assessments, Jeffery (2009) developed a six-genre coding scheme for prompts. The six genres of prompts are argumentative, persuasive, explanatory, informative, narrative, and analytic. Argumentative prompts differ from persuasive prompts by calling abstractly for "support" of a "position" and by not designating a target audience. An example of an argumentative prompt is "many people believe that television violence has a negative effect on society because it promotes violence. Do you agree or disagree? Use specific reasons and examples to support your response." In contrast, persuasive prompts require students to convince an identified audience to act on a specific issue. Moreover, persuasive prompts are unlike argumentative prompts because they invite students to take a one-sided perspective on an issue, while argumentative prompts often expect students to consider multiple perspectives on an issue. An example of a persuasive prompt is "you want your parent or guardian to allow you to go on a field trip with your classmates. Convince your parent or guardian to allow you to do this."
In contrast to argumentative and persuasive prompts, "which explicitly identify propositions as arguable and direct students to choose from among positions" (p.9), explanatory prompts anticipate that students will "explain how or why something is so" (p.9). An example of an explanatory prompt is "a good friend plans to visit you for the first time in the U.S. You want to help him/her get ready for the trip. Explain what you would do." With the above coding frameworks, 68 prompts and 40 rubrics were coded in Jeffery's (2009) study, and the inter-rater agreement was .87 for prompt coding and .83 for rubric coding.

Jeffery (2009) suggested that one way to illuminate the underlying construct conceptualizations in large-scale writing assessments is to analyze the relationships between genre demands and scoring criteria. Jeffery's (2009) six-genre coding taxonomy can be used to supplement Troia and Olinghouse's (2010) coding taxonomy by further differentiating the persuasive and argumentative genres. On the other hand, Jeffery's (2009) five-criteria coding scheme can be used to code rubrics to study how prompts and rubrics are associated, while Troia and Olinghouse's (2010) coding taxonomy allows an examination of the writing constructs defined by prompts and rubrics together.

3. Research Questions

This study explores how state and national assessments define and measure the writing construct by studying the features of their writing assessments. More specifically, this study aims to answer the following questions:

1. How do the features of writing prompts and rubrics vary across a sample of states and the NAEP?
2. What are the connections between these prompts and rubrics, especially in terms of their genre demands?
3. What are the similarities and differences between NAEP and state writing assessments?

4. Mode of Inquiry

4.1 State and NAEP Direct Writing Assessments

This study was built upon a prior Institute of Education Sciences (IES)-funded study—the K-12 Writing Alignment Project (Troia & Olinghouse, 2010-2014). In the K-12 Writing Alignment Project, appropriate assessment personnel were located through states' Department of Education websites. Email inquiries were sent and phone calls were made to request documents. Because the K-12 Writing Alignment Project examined the alignment between state writing standards and assessments prior to the adoption of the CCSS, the use of the NAEP 2007 assessment ensured that students' NAEP results were an effect of instruction under the state writing standards and assessments current at that time. Also, the NAEP 2007 data contained state-level writing data and allowed state-level modeling, whereas the 2011 data did not. Because the NAEP assessment with which state assessments were compared was from 2007, state direct writing assessments were gathered mainly from 2001 to 2006 to ensure the representation of the time period. Also, because the study aimed to analyze representative state writing assessments, and because some states had major revisions that changed what their representative writing assessment might be, it was important to identify the number and dates of the major revisions between 2001 and 2006. After the number and dates were identified, a representative writing prompt, its rubric, and the administrative manual for each genre in each grade being assessed were collected from each time span between major revisions.
This resulted in the selection of 78 prompts and 35 rubrics from 27 states in total (see Appendix C for details; the following states chose not to participate in the study: Colorado, Delaware, the District of Columbia, Georgia, Hawaii, Maryland, Minnesota, Mississippi, New Hampshire, New Jersey, North Dakota, South Carolina, Utah, and Wyoming). There was no NAEP data available for Alaska, Nebraska, Oregon, and South Dakota for the time period in question. There were no state writing standards or writing assessments available for Connecticut, Iowa, Pennsylvania, Montana, and New Mexico between 2001 and 2006. Ohio did not assess 7th grade and 8th grade writing during the period 2001-2006. Therefore, those states' direct writing assessments were not included in this analysis.

Next, state direct writing assessment documents were compiled to include (a) verbal directions from administration manuals for direct writing assessments; (b) actual prompts; (c) supporting materials provided (e.g., dictionary or writer's checklist); (d) sessions arranged for writing tests (e.g., planning session, drafting session, revising session); (e) time given; (f) page limits; and (g) whether (and what kind(s) of) technology was used. The number of compiled documents for each state corresponded with the number of responses expected from students each year. In other words, if students were expected to respond to one prompt with rotated genres each year, prompts from the rotated genres were all compiled into a single document to represent the scope of genres assessed. If students were expected to respond to multiple prompts each year, those prompts were compiled separately into multiple documents. These compiled documents and rubrics were coded with the coding taxonomy.

The publicly released NAEP 2007 writing prompts, scoring guide, and writing framework were collected. There were three NAEP writing prompts from eighth grade included in this analysis: a narrative prompt, an informative prompt, and a persuasive prompt. These three writing prompts were included because they were publicly available and considered representative of the genres assessed. Other writing prompts were not released due to possible future use.

4.2 Coding Taxonomy

This study used the coding taxonomy developed by Troia and Olinghouse (2010), which was modified to accommodate Jeffery's (2009) genre coding scheme for prompts, as well as her criteria coding scheme for rubrics. These two coding frameworks served to provide comprehensive coverage of the underlying writing construct, focused study of the genres emphasized in state and NAEP direct writing assessments, and attention to the relationships between prompts and rubrics. When used to code the writing prompts, Troia and Olinghouse's (2010) coding taxonomy ensured comprehensive coverage of the writing construct as measured by the 80 indicators under the six strands; thus, not only the genre demands of the writing prompts were examined, but also the writing process, the assessment context, and the required writing knowledge. Jeffery's (2009) coding taxonomy, derived from an inductive analysis of exit-level state direct writing assessments, focused on the genre demands of the writing prompts and could differentiate among similar genres such as the persuasive and argumentative genres, as well as the expository and informative genres.
As a result, a seven-category genre coding scheme (see Table 16 in Appendix B) was developed by adapting the third strand (i.e., purpose) of Troia and Olinghouse's (2010) coding taxonomy and Jeffery's (2009) genre coding scheme. These seven categories are: descriptive, persuasive, expository, argumentative, informative, narrative, and analytic.

When used to code the writing rubrics, Troia and Olinghouse's (2010) coding taxonomy ensured comprehensive coverage of the writing components and the writing conventions noted in the writing rubrics. Together with the coding from the writing prompts, they defined the writing constructs assessed. Jeffery's (2009) coding taxonomy categorized the writing rubrics based on their most prominent features—each rubric could only appear in one of the categories (i.e., rhetorical, formal, genre-mastery, cognitive, and expressive). The taxonomy identified the most dominant rubrics used for each genre of writing; thus, there would be associative patterns between genre demands in the prompts and rubric categories. In summary, Troia and Olinghouse's (2010) coding taxonomy examined the writing construct defined together by prompts and rubrics, while Jeffery's (2009) coding taxonomy focused on the genre demands and the connections between prompts and rubrics. For the current study, these two taxonomies can complement each other to reveal the writing constructs underlying large-scale writing assessments.

4.3 Procedure

In the K-12 Writing Alignment Project, three raters coded state and NAEP writing prompts with the first (writing processes), second (context), third (purposes), and sixth (metacognition and knowledge) strands from Troia and Olinghouse's (2010) coding taxonomy. The first rater, paired with either the second rater or the third rater, coded each compiled assessment document. The inter-rater reliabilities in this study were all calculated as Pearson r correlations between raters' absence and presence codes. The inter-rater reliability of rater 1 and rater 2 was .97 for prompt coding; the inter-rater reliability of rater 1 and rater 3 was .95 for prompt coding. The reason that only four strands were coded with prompts was that writing processes and writing contexts were often specified in the verbal directions for test administration, and writing purposes and writing knowledge were often specified in the writing prompts. Two separate raters coded state and NAEP writing rubrics with the fourth (components) and fifth (conventions) strands from Troia and Olinghouse's (2010) coding taxonomy. These last two strands were coded with rubrics because writing components and writing conventions were often specified in scoring rubrics. The inter-rater reliability was .95 for rubric coding. Differences were resolved through discussion.

Two raters coded state and NAEP writing prompts with the seven-category genre coding scheme adapted from the third strand (purpose) of Troia and Olinghouse's (2010) coding taxonomy and Jeffery's (2009) genre coding scheme. These raters also coded state and NAEP writing rubrics with Jeffery's (2009) criteria coding scheme. The author of this dissertation served as one of the two raters. A graduate student in Digital Rhetoric & Professional Writing served as the second rater. The two raters first practiced coding with a training set. When they reached 85% inter-rater agreement, they moved on to coding the actual prompts and rubrics. The inter-rater reliability was .93 for prompt coding and .86 for rubric coding. Differences were resolved through discussion.
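To illustrate the two reliability computations mentioned above, the sketch below (a minimal example only, with invented codes rather than the study's data) computes percent agreement and a Pearson r between two raters' presence/absence codes.

```python
# Hypothetical presence/absence codes assigned by two raters to the same
# set of indicators (1 = indicator present, 0 = absent); values are invented.
rater_1 = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
rater_2 = [1, 0, 1, 0, 0, 1, 0, 0, 1, 1]

# Percent agreement: share of indicators on which the raters gave the same code.
agreement = sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)

# Pearson r between the two sets of 0/1 codes.
def pearson_r(x, y):
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x ** 0.5 * var_y ** 0.5)

print(agreement)                   # 0.9
print(pearson_r(rater_1, rater_2)) # ~0.82
```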
5. Results

5.1 How do the features of writing tasks and rubrics vary across a sample of states and NAEP?

There were direct writing assessments from 27 states in the sample; however, because Rhode Island and Vermont had the same New England Common Assessment Program (NECAP) direct writing assessment, there were 26 distinct sets of prompts and rubrics from state writing assessments. In the sample, 15 states had 7th grade writing assessments, and 18 states, including Rhode Island and Vermont, had 8th grade writing assessments. There were six states that had both 7th grade and 8th grade assessments (see Appendix C). According to Troia & Olinghouse's (2010) coding taxonomy, the writing constructs assessed in state and national assessments were defined by prompts and rubrics together and consisted of the writing process, writing context, writing content, writing mechanics, and writing knowledge (see Table 8 in Appendix A).

Writing Process

There were four states that had general references to the writing process in their writing directions and three states that gave students a choice of prompts. Out of 27 states, all but one directed students to plan their compositions before they wrote. However, while the majority of these states gave students planning pages, they did not give students separate planning sessions. Only Kansas and Nevada gave students both pages and sessions for planning. Compared with planning and drafting, revising was a less emphasized stage of the writing process. There were twelve states that did not direct students to revise. Among the other fifteen states, only Kansas and Massachusetts gave students both time and pages for revision. Arizona, Kentucky, Missouri, and Washington gave students pages for revision (but no extra time), and only Nevada gave students 15 minutes for revision (but no extra pages). One possible explanation of why fewer states focused on revision is that some states directed students to edit rather than revise. For example, 18 states directed students to edit. However, there were still seven states that did not direct students to revise or edit their writing. Ten states emphasized the importance of publishing by directing students to write a final product.

There were ten states that offered test-taking strategies to students. The most popular test-taking strategy concerned space management—e.g., Massachusetts included the following verbal directions in their administration manual: "YOU MUST LIMIT YOUR WRITING TO THESE FOUR PAGES; BE SURE TO PLAN ACCORDINGLY" (originally capitalized) (Massachusetts, 2002, Grade 7)—with seven states advising students of that. Two states advised students about time management—e.g., Oklahoma included the following verbal directions in their administration manual: "Try to budget your time wisely so you will have time to edit and revise your composition" (Oklahoma, 2006, Grade 8). And one state offered students strategies about topic choice—i.e., Kansas' administration manual contained the instruction that "you will choose the one topic that you like the most and that you feel will allow you to do your best writing. Keep this in mind as you consider each description" (Kansas, 2004, Grade 8).

Writing Context

Seven states gave students at least two writing tasks. New York gave students four integrated writing tasks—short and long listening and responding, and short and long reading and responding.
Most states (20 out of 27) had a general mention of audience in their writing prompts. Prior to 2007, only West Virginia had online writing sessions; students in other states wrote on paper with pencils. Students in West Virginia were expected to log on to a website, where they wrote a multiple-paragraph essay equivalent to a one-to-two page handwritten essay. They did not have access to spell check or grammar check options. Their papers were read and scored by a computer that had been trained with essays written by West Virginia seventh and tenth grade students. Within a few weeks, West Virginia students would receive a detailed report of their scores. Nineteen states provided procedure facilitators for students' writing; the most popular procedure facilitators were checklists and rubrics. Eleven states allowed students to use dictionaries or thesauri during writing exams. The prompts of Arkansas and Idaho situated students' writing in other disciplines; for example, "Your social studies class …" or "As an assignment in your history class, ...." None of the writing prompts required students to consider multiple cultural perspectives on an issue. Only two states out of the 27 did not specify the response length; the typical length was two pages. Around half of the states in the sample (13/27) did not have a time limit on their writing assessments. Among the fourteen states that had a time limit, ten states had a specified amount of time, with an average of 52 minutes; the other four states gave students 45 minutes with an optional extended period of time if needed.

Writing Components

All states evaluated the general organization and content of students' compositions in their rubrics; however, there were seven states that did not emphasize the general structure of students' essays and one state (i.e., Texas) that did not emphasize details. Ten states evaluated the genre-specific information of students' essays, including organization, content, and ideas; specifically, five states evaluated narrative components, four states evaluated expository components, six states evaluated persuasive components, and three states evaluated response-to-writing components. Most states (24/27) evaluated sentence fluency, style, and semantic aspects (e.g., word choice) of students' compositions. Seven states emphasized the use of figurative language, one state (i.e., Kentucky) the use of citations and references, and no states considered the use of multimedia (which is consistent with paper-and-pencil writing tasks).

Writing Mechanics

The majority of states' writing rubrics had general reference to writing conventions (22 states), capitalization (19 states), punctuation (19 states), spelling (18 states), and grammar (24 states). Only Kentucky emphasized specific word-level capitalization and punctuation.
Only Arkansas and Kentucky had general reference to formatting; twelve states referred to specific aspects of formatting, e.g., paragraphing or using appropriate spacing between words and sentences.

Writing Knowledge

The majority of states (nineteen states) explicitly directed students to recall their topical knowledge when composing; the prompts in those states often set up situations in ways such as "think about a time …." However, none of the states used prompts to evoke students' genre knowledge, linguistic knowledge, procedure knowledge, or self-regulation.

5.2 What are the connections between these prompts and rubrics, especially in terms of their genre demands?

Prompts

Figure 1 Genre Categories for 81 Prompts

Figure 1 shows the percentages of prompts of each genre in the sample. Out of 81 writing prompts, including three NAEP prompts, there were 26 expository, 19 persuasive, 17 narrative, 6 informative, 6 literary analysis, 4 argumentative, and 3 descriptive prompts. Expository and informative prompts combined comprised a little less than 40% of the prompts in the sample. Expository prompts and informative prompts either assessed students' abilities to "explain" how something worked and why or "provide" facts about more concrete objects. Persuasive essays were the second most used type of prompt and directed students to persuade an audience to agree with their positions on an issue. Similar to persuasive prompts, the four argumentative prompts directed students to provide evidence to support a position; however, they often did not explicitly direct students to convince an identified audience. Together, persuasive and argumentative prompts were a little less than one-third of the prompts in the sample. Narratives were the third most assessed genre. They asked students to give an account of either an imaginary or actual incident. Narrative prompts often had straightforward directions such as "tell about a time when …" or "write a story…" The three descriptive prompts differed from the informative prompts by directing students to provide attributes or details about an object, while the informative prompts often asked the students to provide facts.

Rubrics

Figure 2 Criteria Categories for 38 Rubrics

Figure 2 shows the percentages of rubrics of each type in the sample. Among 38 rubrics, including three NAEP rubrics, there were 19 genre-mastery rubrics, 12 rhetorical, 4 formal, 2 expressivist, and only 1 cognitive. Genre-mastery rubrics were the most used rubrics in state and national direct writing assessments and comprised half of all the rubrics analyzed, emphasizing students' mastery of genres. Rhetorical rubrics were the second most used rubrics and comprised almost one-third of rubrics examined, emphasizing the importance of addressing the audience and achieving one's writing purposes. There were only a few formal rubrics, which emphasized the general structure and conventions of a paper. The two expressivist rubrics assessed students' creativity in composing their papers, and the single cognitive rubric emphasized students' critical thinking shown through their writing.
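The prompt-rubric associations reported in the next subsection (Table 1) amount to a cross-tabulation of the genre codes against the rubric-category codes. A minimal sketch of how such a contingency table could be produced is shown below; the per-prompt codes in the example are hypothetical placeholders, not the codes actually assigned to the 81 prompts in this study.

# Sketch: cross-tabulating prompt genre codes against rubric category codes.
# The (genre, rubric) pairs are hypothetical placeholders, not the study's data.
import pandas as pd

codes = pd.DataFrame(
    [
        ("Persuasive", "Genre-mastery"),
        ("Expository", "Rhetorical"),
        ("Narrative", "Expressivist"),
        ("Informative", "Rhetorical"),
        ("Expository", "Genre-mastery"),
    ],
    columns=["genre", "rubric"],
)

# Counts of prompts by genre and rubric category, with row and column totals.
contingency = pd.crosstab(codes["genre"], codes["rubric"],
                          margins=True, margins_name="Total")
print(contingency)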
Connections between Prompts and Rubrics

Table 1
Prompt-Rubric Contingencies for 81 Prompts (counts of prompts by genre category and rubric category)

Genre Category | Rhetorical | Genre-mastery | Formal | Cognitive | Expressivist | Total
Persuasive | 6 | 8 | 2 | 3 | 0 | 19
Expository | 15 | 8 | 3 | 0 | 0 | 26
Narrative | 7 | 8 | 0 | 0 | 2 | 17
Argumentative | 3 | 1 | 0 | 0 | 0 | 4
Descriptive | 0 | 2 | 1 | 0 | 0 | 3
Informative | 3 | 3 | 0 | 0 | 0 | 6
Analytic | 1 | 5 | 0 | 0 | 0 | 6
Total | 35 | 35 | 6 | 3 | 2 | 81

Table 1 shows the association between prompt genres and rubric types. Out of 81 prompts, there were 35 prompts assessed with rhetorical rubrics and 35 prompts assessed with genre-mastery rubrics. There were only six prompts assessed with formal rubrics, three with cognitive rubrics, and two with expressivist rubrics. For informative prompts, the number assessed with rhetorical rubrics and genre-mastery rubrics was the same. For persuasive and narrative prompts, there were slightly more prompts assessed with genre-mastery rubrics than rhetorical rubrics. The majority of analytic prompts were assessed with genre-mastery rubrics. There were only three descriptive prompts; two were assessed with genre-mastery rubrics and one with a formal rubric. For expository and argumentative prompts, the majority were assessed with rhetorical rubrics. Genre-mastery rubrics were used to evaluate all seven genres of writing—persuasive, expository, narrative, argumentative, descriptive, informative, and analytic. Rhetorical rubrics were used to evaluate all genres of writing except descriptive. Formal rubrics were used to evaluate persuasive, expository, and descriptive writing; cognitive rubrics were only used to evaluate persuasive writing; and expressivist rubrics were only used to evaluate narratives.

5.3 What are the similarities and differences between NAEP and state writing assessments?

More than 70% of states' middle school writing assessments involved: directing students to plan before drafting and to write for either a general audience or a specifically-identified audience; providing procedure facilitators such as checklists; specifying the length of the writing; and explicitly directing students to access their topical knowledge. The NAEP 8th grade writing assessments also had these characteristics. For example, NAEP directed students to plan, write, and review their writing, gave students a page for planning, and gave students a brochure of planning and reviewing strategies to facilitate students' writing. Also, NAEP did not give students separate sessions for different stages of writing or specify the length of students' writing. Over 70% of states' middle school writing assessments evaluated the quality of students' texts based on their organization, structure, content, details, sentence fluency, style, semantic aspects, and grammar. More than 60% of states' assessments evaluated students' essays on capitalization, punctuation, spelling, and sentence construction. The NAEP 8th grade writing tasks assessed all of the above aspects. The NAEP 8th grade test also directed students to write in response to two prompts and set a time limit of 25 minutes on each of these writing tasks. Only seven states required two responses from students. Around half of the states in the sample (13/27) did not have a time limit on their writing assessments. Among the other fourteen states, ten set a fixed time limit averaging 52 minutes, and four gave students 45 minutes with an optional extended period of time if needed.
While expository, persuasive, and narrative prompts were the most assessed genres in state writing assessments, informative, persuasive, and narrative writing were assessed in the NAEP 2007 direct writing assessments. Expository writing and informative writing were similar because they both required students to explain something. However, they were also different because expository writing directed students to explain more abstract concepts while informative writing often directed students to provide factual information about concrete objects, events, or phenomena. Genre-mastery rubrics were the most-used rubric type in state direct writing assessments. Similarly, all the NAEP’s rubrics were genre-mastery rubrics. 5.4 Insights from a combined use of the two approaches Troia & Olinghouse’s (2010) coding taxonomy provided a comprehensive framework to examine writing assessments, as well as details about the components of these assessments (i.e., prompts and rubrics), while Jeffery’s (2009) coding taxonomy allowed an analysis of the most dominant genre demands of prompts and the most emphasized features of rubrics. Moreover, Troia & Olinghouse’s (2010) coding taxonomy examined writing constructs defined by prompts 30 and rubrics together, while Jeffery’s (2009) coding taxonomy examined the association between prompts and rubrics. Prompts The descriptive genre was absent from Jeffery’s (2009) genre coding scheme because this genre was not assessed in exit-level writing assessments in that study. The descriptive genre was identified by state contacts during the K-12 Writing Alignment Project’s data collection; Troia and Olinghouse’s (2010) coding taxonomy included the descriptive genre as one of the purposes. In the K-12 Writing Alignment Project, the genre coding of the prompts was based on states’ identification of the prompts’ genres if given. For example, if a state identified one of its prompts as expository, then the prompt was coded as expository. As a result, though there were informative and analytical genres in Troia and Olinghouse’s (2010) coding taxonomy, few prompts were coded informative or analytical in the K-12 Writing Alignment Project study because these prompts were often identified by states as expository or writing in response to literature. In this study, the genre coding of prompts was determined based on the prompts. When there was ambiguity in prompts, states’ identification was taken into consideration. Some responses to literature could be categorized as narrative, expository, or informative, while others invited students to analyze literary elements in the provided literature and were therefore coded as analytic. In the preliminary analysis of state writing prompts, one prompt was identified by its state as summary. However, because summary only appeared once among 76 prompts and was used to provide information about an object or event, it was also categorized as informative. Rubrics Table 2 shows those states with genre-mastery rubrics and/or with rubrics containing genre-specific components. According to Jeffery’s (2009) criteria coding scheme, NAEP’s and 31 eleven states’ writing rubrics were genre-mastery rubrics. However, among these eleven states, five states’ writing rubrics were not considered to contain genre-specific components according to Troia and Olinghouse’s (2010) coding taxonomy. 
This occurred because these states' writing rubrics prioritized the assessment of genre and framed other evaluation criteria under it but did not refer to specific genre elements. Also, according to Troia and Olinghouse's (2010) coding taxonomy, ten states' writing rubrics contained genre-specific components. However, among them, four states' writing rubrics were not considered genre-mastery rubrics. Again this was reasonable because though these rubrics contained genre-specific components, the overall orientation or emphasis of these rubrics was not focused on genre mastery. For example, specific genre components might be referred to in rubrics for the purpose of emphasizing the importance of being "effective" with audience. Only NAEP's and six states' (Alabama, California, Illinois, Indiana, New York, West Virginia) writing rubrics were both genre-mastery rubrics and contained genre-specific components.

Table 2
States with Genre-mastery Rubrics and/or States with Rubrics Containing Genre-specific Components

States whose rubrics contained genre-specific components: Alabama, California, Illinois, Indiana, Kansas, Missouri, New York, Nevada, Wisconsin, West Virginia
States whose rubrics were genre-mastery rubrics: Alabama, California, Idaho, Illinois, Indiana, Kentucky, Rhode Island, Vermont, New York, Virginia, West Virginia
States whose rubrics both were genre-mastery rubrics and contained genre-specific components: Alabama, California, Illinois, Indiana, New York, West Virginia

In this way, only these six states' writing assessments placed similar levels of emphasis on genre as NAEP's writing assessments, though the genres they assessed were different from those elicited by the NAEP.

Prompts and Rubrics Associations

For these six states with rubrics that were both genre-mastery rubrics and contained genre-specific components, Table 3 below shows the genres assessed with these rubrics as well as the genres NAEP assessed.

Table 3
Genres Assessed in States with both Genre-mastery Rubrics and Rubrics Containing Genre-specific Components

State/NAEP | Genres Assessed
Alabama | Descriptive, Expository, Narrative, Persuasive
California | Narrative, Persuasive, Analytical, Informative
Illinois | Narrative, Persuasive
Indiana | Narrative, Persuasive, Analytical
New York | Analytical, Expository
West Virginia | Descriptive, Persuasive, Narrative, Expository
NAEP | Narrative, Informative, Persuasive

Only California assessed all the genres that NAEP assessed with a similar level of emphasis on the genre demands. However, California also assessed the analytical genre, which NAEP did not. In summary, a combined use of Troia & Olinghouse's (2010) coding taxonomy and Jeffery's (2009) coding scheme made it possible to examine the genres assessed particularly in middle school writing assessments, as well as to differentiate similar genres such as persuasive and argumentative, and expository and informative. Use of both also allowed a close look at levels of emphasis on genre demands in state and NAEP writing assessments.

6. Discussion

6.1 Prevalent Writing Assessment Practices

The results of this study showed that only three states gave students choices for prompts, thus illustrating it was not a popular practice at least by 2007.
Studies of choices in the writing assessment literature have either shown statistically non-significant results regarding students’ writing quality (Chiste & O’ Shea, 1988; Powers & Fowles, 1998; Jennings, Fox, Graves, & Shohamy, 1999) or mixed results (Gabrielson, Gordon, & Engelhard, 1995; Powers, Fowles, Farnum, & Gerritz, 1992). This may explain why offering a choice of prompts was not a popular practice in state writing assessments. The results of this study showed that the writing process approach had an impact on the writing assessment, because the majority of states (26 states) directed students to plan, and more than half of the states directed students to revise and edit. However, few states provided separate planning, revision, and editing sessions. Teachers are encouraged to engage students daily in cycles of planning, translating, and reviewing and teach students to move back and forth between various aspects of the writing process as their texts develop (Graham et al., 2012). Though one can argue that assessment should not mimic the entire process, but rather reflect on-the-spot performance, if writing assessments are to measure, function as, and shape writing instructions in schools, the writing procedures in assessments should emulate the process that students are being taught to follow. To date, it has been unclear exactly what students’ writing behaviors actually are under the assessment pressures and time limits: whether students start composing immediately regardless of planning directions when there is not a separate planning session, and whether students revise at the end of their compositions or move back and forth between various aspects 34 of the writing process while they develop their texts. More research is needed to study students’ writing assessment behaviors to provide a solid foundation for designing the testing procedures in direct writing assessments. Also, because assessments have a strong impact on what is taught in schools, if states adopt the writing process approach to text production during testing sessions, instructional practices in schools are more likely to reflect this approach. Hillocks (2002) found that teachers tend to use some stages of the writing process but not others, e.g., some teachers in Illinois, Texas, and New York only incorporated editing. He suggested “the success of the assessment in promoting better teaching of writing is dependent on the character of the assessment” (Hillocks, 2002, p.196). Olinghouse, Santangelo, and Wilson (2012) found that only limited information about students’ writing abilities across a range of skills can be generalized from students’ performance on single-occasion, single-genre, holistically scored writing assessments. Chen, Niemi, Wang, Wang, and Mirocha (2007) have shown that three to five writing tasks are required to make a reliable judgment about students’ writing abilities. However, the results of this study showed that only seven states gave students even two prompts. The only exception was New York, which gave students four integrated writing tasks that included responding after both listening and reading. The writing tasks from New York’s assessment have shown a potential path to increase students’ writing opportunities by integrating listening and reading assessments with writing assessments, although this practice has raised the question of how to distinguish students’ writing abilities from other abilities. 
Another possible way to increase students' writing opportunities is to use writing portfolios to supplement the direct writing assessment (Moss, 1994). Because direct writing assessments are often constrained by the time and resources available, a combined use of direct writing assessments and writing portfolios, especially when stakes are high, would allow a more accurate evaluation of students' writing abilities. However, the feasibility and cost of implementing large-scale portfolio assessments remain a challenge, and Gearhart (1998) cautioned that the quality of students' portfolios reflects not only students' competence, but also depends on a range of circumstantial variables. These include "teachers' method of instruction, the nature of their assignments, peer and other resources available in the classroom, and home support" (p.50), thus making comparability an issue.

Audience specification has been an extensively researched aspect of prompt design. However, the results of these studies have been mixed (Cohen & Riel, 1989; Chesky & Hiebert, 2001). For example, Redd-Boyd and Slater (1989) observed that audience specification had no effect on scores, but influenced students' motivation and composing strategies. Perhaps because of such potential benefits, the majority of states (20/27) specified an audience in their state writing prompts, and at least 30% of writing rubrics emphasized the importance of authors' consideration of audience in their compositions. However, these writing prompts incorporated a wide range of audiences including general "readers," pen pals, and students' classes, classmates, or teachers. Chesky and Hiebert (2001) examined high school students' writing and found that there were no significant differences in the length or quality of students' writing as a function of peers or teachers as a specified audience. Cohen and Riel (1989) compared seventh-grade students' writing on the same topic when addressed to peers in other countries and when addressed to their teachers. They found that the quality of students' texts written for their peers was higher than those intended for their teachers, and suggested that contextualization could lead to improvements in the quality of students' classroom writing. However, contextualization of students' writing in direct writing assessments has remained challenging because the audiences are often just the raters. Some states have tried to construct semi-authentic scenarios for students' writing; for example, two states situated their writing tasks within disciplinary contexts without relying heavily on disciplinary content knowledge, thus illuminating a way to construct a semi-authentic scenario in a setting with which students would be familiar.

In summary, state writing assessments have managed to incorporate extensively researched aspects, but such incorporations often remain only partial. Most state writing assessments only directed students to plan and draft, with less emphasis on revision; most states directed students' writing towards an audience, but contextualization of students' writing still remained a challenge. A few states gave students more than one prompt, but even the second-most-common option of two prompts is not enough to support a generalization about students' global writing abilities.
Possible reasons for this partial incorporation dilemma are that a) assessment programs have limited resources, b) the nature of standardized assessments restricts the contextualization of tests to ensure comparability, or c) the understanding of students’ assessment behaviors, especially in terms of their interaction with test items, is insufficient. More research is needed on students’ assessment behaviors and different methods of assessing students’ writing abilities (e.g., integrated writing tasks). An emphasis on organization, content, and details was a feature for almost all writing rubrics; word choice, sentence fluency, style, and grammar, including sentence construction, were also highly prized aspects of students’ papers. General conventions, such as capitalization, punctuation, and spelling, were also assessed by the majority of states. This shows that, regardless the rubric types, these aspects are considered necessary for demonstrating writing proficiency by most states. Only ten states included genre-specific components in their rubrics; persuasive texts’ components are most often specified compared with other genres. While 37 expository is the most assessed genre (16 states), only four states have specified expository texts’ components in their rubrics. Genre demands in state writing assessments will be discussed in the next section. By 2007, only West Virginia had online writing sessions with their state direct writing assessments. However, aligned with the CCSS, the new K-12 assessments developed by the SBAC and the PARCC will be administered via computer. Computer technology has entered most classrooms. In 2009, around 97% of teachers in U.S. public schools had computers in the classroom. The ratio of students to computers in the classroom every day was 5.3 to 1. About 40% of teachers reported that they or their students often used computers in the classroom during instructional time (U.S. Department of Education, 2010). It is possible that many students are now used to composing using computers. However, if the former state writing assessments were taken with paper and pencils, it is important that students are well prepared for the transition. 6.2 Genre Demands in Direct Writing Assessments The results of this study showed that the most popular prompt genre in middle school assessments was expository, followed by persuasive, narrative, informative, analytic, argumentative, and finally descriptive. Jeffery’s (2009) analysis of high school exit-level prompts indicated that the most popular genre was persuasive, followed by argumentative, narrative, explanatory, informative, and analytic. Persuasive and argumentative genres comprised over 60% of all the prompts (Jeffery, 2009). Therefore, the transition from middle school to high school writing assessments signifies an emphasis shift from expository compositions to persuasive and argumentative compositions. This makes sense because argumentative compositions are more abstract and place more cognitive demands on students (Crowhurst, 1988); thus, it might be most suitable for assessments of high school students. 38 Meanwhile, informative prompts have appeared infrequently both in this study and Jeffery’s (2009) study. 
Given that informative prompts often require students to provide factual information about objects or events and place less cognitive demands on students than even expository prompts (Jeffery, 2009), it might be a genre most suitable for students at grades lower than middle school, unless specified by states’ standards. These findings suggest that to ensure a continuum of students’ learning and mastery of these genres, it is important that students are provided more opportunities to practice argumentative writing in high school; given that informative and descriptive genres are less emphasized in middle school and exit-level writing assessments, it is important that students are provided more opportunities to master these genres in lower grades. The results of this study showed that half of the rubrics were genre-mastery rubrics. There were few rubrics that emphasized creativity and critical thinking, which is in accordance with what Jeffery (2009) found with the exit-level writing rubrics. Moreover, the expressivist rubrics, though appearing only two times, corresponded with narrative genres, and the cognitive rubrics corresponded with persuasive prompts, showing a consistency with Ivanic’s (2004) framework. Different from Jeffery’s (2009) finding that rhetorical rubrics were used with all genres of exit-level prompts, this study found that genre-mastery rubrics were used with all genres, while rhetorical rubrics did not correspond with descriptive prompts. The number of states that used genre-mastery rubrics was about the same as the number of states that used rhetorical rubrics. In a way, this finding confirms the assertion that “the appropriateness of language to purpose is most often prioritized in assessing writing regardless of the task type” (Jeffery, 2009, p.14). Meanwhile, the large number of genre-mastery rubrics suggests that states have started to place more genre-mastery expectations on students. However, as discussed 39 earlier, only ten states included genre-specific components in their rubrics and only four states included components of the most popular genre, expository texts; as a result, only six states had genre-mastery rubrics that contained genre-specific components. This finding suggests that the genre evaluation criteria that states place on students’ writing are either vague or not fully utilized to assess students’ genre mastery. 6.3 State and National Alignment State writing assessments and NAEP seem to align in their adoption of the writing process approach, their attention to audience and students’ topical knowledge, their accommodations through procedure facilitators, and their inclusion of organization, structure, content, details, sentence fluency, and semantic aspects as well as general conventions such as punctuation, spelling, and grammar in their assessment criteria. However, NAEP’s writing assessment differs from many states’ by having explicit directions for students to review their writing, giving students two timed writing tasks, making the informative genre—which was rarely assessed in state assessments—one of the three genres assessed, and including genre-specific components in their writing rubrics. The fact that all of NAEP’s writing rubrics are genre-mastery rubrics with genre-specific components can be considered one of the biggest differences from most of the state writing assessments. 
Thus, when state and national writing assessment results are compared, these two assessments differ in the genres they assess, the amount of time and number of tasks they give to students, and the level and specificity of genre demands they emphasize in their evaluation criteria. These differences are observed in this study. When there is a discrepancy between state and national assessment results, can these differences explain some of the discrepancy? Research 40 with variables that can quantify these differences and model the relationship between these differences and writing assessment results will help answer this question. 7. Implications More research needs to be done on the interaction between assessment procedures and students’ assessment behaviors and performances. For example, further research could examine whether it increases the validity of writing assessments by incorporating explicit directions for different stages of the writing process and providing brochures with tips about planning, drafting, revising and editing. Under the allowance of time and resources, more writing opportunities should be provided to students during writing assessments so that their writing abilities can be evaluated more accurately. When this is not possible, states should be more explicit about the interpretation of their writing assessments, so that students’ performances and results reflect the actual genre assessed and specific measures used (Olinghouse et al., 2012). When states intend to evaluate students’ genre-mastery skills, it is helpful to include genre-specific components in their rubrics so that their expectations are made explicit to students, raters, and educators. These recommendations are also applicable to the new K-12 assessments developed by the SBAC and the PARCC. Students taking the NAEP 2007 were expected to write for three purposes—narrative, informative, and persuasive. It is not clear whether informative writing encompassed expository writing or referred to expository writing in NAEP. However, this study shows that informative writing has rarely been assessed in state writing assessments, while expository writing has been widely assessed in middle school. It is recommended that NAEP clarify and elaborate the categories of persuasive, informative, and narrative in its assessments. 41 Applebee (2005) suggested that such an attempt for clarification and elaboration has already taken place with the NAEP 2011 writing assessments. For example, the NAEP 2007 writing framework generally suggested that “students should write for a variety of purposes— narrative, informative, and persuasive” (National Assessment Governing Board, 2007, p.11); while the NAEP 2011 writing framework stated that it will “assess the ability: 1. to persuade, in order to change the reader’s point of view or affect the reader’s action; 2. to explain, in order to expand the reader’s understanding; 3. to convey experience, real or imagined” (National Assessment Governing Board, 2010, p.21). Further, the framework explicitly listed how “to explain” looks like for different grade levels: On the NAEP Writing Assessment, tasks designed to assess students’ ability to write to explain at grade 4 might call for a basic explanation of personal knowledge or an explanation of a sequence of pictures and/or steps provided in the task. Grade 8 tasks may ask students to analyze a process or write a response that compares similarities and differences between two events or ideas. 
Grade 12 tasks may focus on asking students to identify the causes of a problem or define a concept. (p.37)

It is clear that "to explain" in the new framework encompasses both informative writing and expository writing. The framework places more emphasis on informative writing in grade 4, and more on expository writing in grades 8 and 12. More research is needed to investigate different methods of writing assessments, such as using integrated writing tasks, and to study students' assessment behaviors, such as their interactions with writing prompts and instructions.

8. Limitations

This study only analyzed seventh and eighth grade direct writing prompts. Grade-level expectations for writing performance change from the elementary grades to the high school grades; however, this study could not examine those changes because it did not include the elementary and high school grades. Future studies should investigate writing expectations from elementary grades to high school grades because such studies will highlight the changes and help tailor the expectations to the appropriate grade levels. Indirect writing assessment items also contribute to states' definitions of the writing construct; however, they are beyond the scope of this study. Because there was no NAEP data available for four states, thirteen states and the District of Columbia chose not to participate in the study, and six states did not have 7th grade and 8th grade writing standards and assessments available for the period 2001-2006, only 27 states' direct writing assessments were included in this analysis. Therefore, the writing constructs examined in this study and the comparison between states and NAEP assessments were limited to these 27 states. The sample of the NAEP examined was limited to publicly released data comprising three prompts and three rubrics. These prompts represent the genres assessed in the NAEP, but it is possible that they do not showcase all the genres assessed. For example, the informative prompt was coded to assess informative writing in this study; however, it is possible that there were informative prompts that actually assessed expository writing. Without examining the other writing prompts in the NAEP, it is hard to determine how different those writing prompts are from the released sample. Therefore, the writing construct assessed in the NAEP might not be completely captured by this study since the analysis was based only on the publicly released sample.

CHAPTER 2: Predicting Students' Writing Performances on the NAEP from Assessment Variations

1. Introduction

Persistent discrepancies have been identified between state and national writing assessment results (Lee, Grigg, & Donahue, 2007; Salahu-Din, Persky, & Miller, 2008). State-mandated assessments often report high proficiency levels, but the results of the National Assessment of Educational Progress (NAEP) indicate low proficiency levels. The variation between state and national assessments' definitions of the writing construct and measurements of writing proficiency is one possible explanation of this gap. However, little is known about how these assessments actually vary. Even less is known about how this variation predicts students' performance on the NAEP.
One factor contributing to the differences in students’ performances between state tests and the NAEP is the differing writing constructs that the state and NAEP tests assess; as a result, students’ performances on the NAEP does not only indicate students’ writing abilities, it also reflects how well students are prepared for the type of assessments the NAEP utilizes. Research has shown that high-stakes assessments (i.e., state-mandated assessments) have an impact on classroom instruction (Hillocks, 2002; Moss, 1994). When the content and format of state-mandated assessments are comparable to the national assessment, students are indirectly prepared for the NAEP. However, whether students actually achieve higher scores on the NAEP when their state assessments are more similar to NAEP, and lower scores when their state assessments are less similar, is unknown. In other words, whether this variation between state and national writing assessments predicts students’ performance on the NAEP remains unexamined. This study aims to fill this gap in the research. 44 To examine the impact of the variations between state and national writing assessments on students’ performances, it is important to control those variables found in existing research that tend to have an influence on those performances. Students’ demographic backgrounds, their writing attitudes and motivations, and their previous experiences with writing have a significant influence on their writing development and performances, which will be discussed next. Gabrielson, Gordon, and Englehard (1999) studied the effect on writing quality of offering students a choice of writing tasks. To do this, they examined persuasive essay writing tasks administered to 34,200 grade 11 students in the 1993 Georgia state writing assessments. These tasks were organized into packets of single tasks for groups in the assigned-task condition and packets of pairs of tasks for groups in the choice-of-task condition. They found that while the choice condition had no substantive effect, gender, race, and the specific writing tasks given had a significant impact on the writing quality in both the multivariate analysis of variance and the univariate analysis. Female students’ essays received higher scores than those of male students. White students’ essays received higher scores than those of Black students. The writing task variable had significant interaction with gender and race. Female students were more likely to perform better than male students on some writing tasks rather than others; White students were also likely to perform better than Black students on certain writing tasks. Because the purpose of the study was to investigate the effect on students’ writing quality of offering students a choice of writing tasks, and also for test security reasons, the fifteen tasks were not revealed and there was no further illustration of what different characteristics these writing tasks possessed in the study. Ball’s (1999) case study of a sample text written by an African-American high school male sophomore student revealed influence of African-American Vernacular English (AAVE) on 45 the student’s grammatical and vocabulary choices, spelling variations, and discourse style and expressions in his writing. 
Kanaris (1999) examined writings about a recent excursion by 29 girls and 25 boys in grades 3-4, and found that the girls tended to write longer and more complex texts, with a greater variety of verbs and adjectives and more description and elaboration; the boys were more likely than the girls to use the first person singular pronoun, and less likely to take themselves away from the center of the action. Research also suggests that students’ English language proficiency plays an important role in their writing performances. Research such as Silva’s (1993) has examined the nature of English as the First Language (L1) writing and English as a Second Language (ESL/L2) writing, and found that L2 writing is distinct from L1 writing by appearing to be less fluent, less accurate, and less effective with L1 readers than L1 writing. L2 writers’ texts are simpler in structure, and include a greater number of shorter T-units and more coordination, as well as a smaller number of longer clauses, less subordination, fewer noun modifications, and minimal passive sentence constructions. They also include more conjunctives and fewer lexical ties, as well as have less lexical control, variety, and sophistication overall (Silva, 1993). ESL students’ English proficiency levels greatly influence their writing abilities, so that students with different proficiency levels include a variety of lexical and syntactic features in their writing: number of words, specific lexical classes, complementation, prepositional phrases, synonymy/antonymy, nominal forms, stative forms, impersonal pronouns, passives, relative clauses, deictic reference, definite article reference, coherence features, participial phrases, negation, present tense, adverbials, and 1st/2nd person pronouns (Ferris, 1994). It is also common for students with special needs to experience substantial difficulty with writing (Graham & Harris, 2005). Gilliam and Johnson (1992) compared the story telling and 46 writing performance of 10 students with language/learning impairment (LLI) between the ages of 9 and 12 years and three groups of 30 normally-achieving children matched for chronological age, spoken language, and reading abilities using a three-dimensional language analysis system. They found that LLI students produced more grammatically unacceptable complex T-units, especially in their written narratives, than students from the three matched groups. Newcomer and Barenbaum (1991) reviewed research investigating the written composing ability of children with learning disabilities and concluded that these children struggled with most aspects of mechanics/syntax/fluency, and as a result were less skilled than other children in writing stories and expository compositions. Resta and Eliot (1994) compared the performance of 32 boys between the ages of 8 and 13 years belonging to three groups—those with attention deficits and hyperactivity (ADD+H), those with attention deficits without hyperactivity (ADDH), and those without attention deficits—on the Written Language Assessment, and found that both ADD+H and ADD-H children had poorer performance on most of the written language subtests than children without attention deficits. They therefore concluded that children with attention deficits possessed significant limitations in their writing and composition. Students’ attitudes and motivation are yet more factors that have a significant impact on their writing development (Mavrogenes & Bezrucko, 1999) and writing achievements (Graham, Berninger, & Fan, 2007). 
Moreover, students’ positive beliefs and attitudes about writing determine their motivations to write (Bruning & Horn, 2000), while difficulties created by lack of knowledge and complexity of writing tasks can adversely influence their motivation levels (Zimmerman & Risemberg, 1997). Meanwhile, motivation is not a unitary construct; rather, it is “a domain-specific and contextually situated dynamic characteristic of learners” (Troia, Shankland, & Wolbers, 2012, p.6). In other words, a student’s motivation to write is independent 47 of their motivation to read, and changes according to the performance contexts. Therefore, performance contexts affect motivation, while in turn “positive motivation is associated with strategic behavior, task persistence, and academic achievement” (Troia et al., 2012, p.6). Students’ perceptions of prompt difficulties are related to both students’ knowledge about the writing topic (Powers & Fowles, 1998) and prompts’ characteristics such as question type (e.g., compare/contrast, descriptive/narrative, argumentative) (Polio & Grew, 1996; Way, Joiner, & Seaman, 2000) and topic specificity (Chiste & O’Shea, 1988; Polio & Grew, 1996). However, previous research has failed to detect a strong relationship between students’ perception of prompt difficulty and their writing performance (Powers & Fowles, 1998). Students’ writing activities inside classrooms tend to have a positive effect on students’ writing composition. The meta-analysis conducted by Graham, Kiuhara, McKeown, and Harris (2012) suggested that “four of the five studies that examined the effects of increasing how much students in grades 2 to 6 wrote produced positive effects” (p. 42). The only study that had a negative effect involved English language learners (Gomez & Gomez, 1986). Thus, while students’ writing activities inside classrooms are related to their writing performances, their backgrounds also need to be considered. Students’ experiences with writing also play a significant role in their writing achievements. In the NAEP 2007 writing assessments, students’ experiences with writing were surveyed through questions asking about the feedback they received from teachers and the use of computers in their daily writing. Research has shown that teachers’ and peers’ feedback tend to improve students’ writing quality and productivity (Rogers & Graham, 2008), while a lack of immediate feedback can negatively impact students’ motivation (Zimmerman & Risemberg, 1997). Meanwhile, students’ use of technology is likely to increase their compositions’ length, 48 their adherence to conventions, and the frequency of revisions; it also cultivates students’ positive attitudes towards writing and improves their writing quality (Bangert-Drowns, 1993; Goldberg, Russel, & Cook, 2003). In summary, students’ writing performance on assessments is closely related to their backgrounds and prior writing experiences. Therefore, a study of the relationships between state and NAEP writing assessment variations and students’ NAEP writing performances necessitates controlling for the following variables relating to students’ individual characteristics: students’ attitudes towards writing and perceptions of prompt difficulty, their demographic backgrounds (i.e., gender, race/ethnicity, English language proficiency, social economic status, and disability status), their writing activities inside classrooms, and their experiences with writing. 2. 
Research Questions

Through multi-level modeling analysis, this study explores state and NAEP assessment data to answer the following research question: Do students from states that use writing assessments with a higher degree of similarity to NAEP writing assessment features, as measured by the Euclidean distance between the multi-dimensional writing constructs of the state and NAEP assessments, perform better on the NAEP, controlling for students' attitudes towards writing and perceptions of prompt difficulty, their demographic backgrounds, their writing activities inside classrooms, and their experiences with writing?

3. Method

3.1 State and NAEP Direct Writing Assessments

This study was conducted using data from a prior IES-funded study—the K-12 Writing Alignment Project (Troia & Olinghouse, 2010-2014). In the K-12 Writing Alignment Project, states' Department of Education websites were first used to locate appropriate assessment personnel. Documents were then requested through email inquiries and phone calls. Because the K-12 Writing Alignment Project examined the alignment between state writing standards and assessments prior to the adoption of the CCSS, and the NAEP 2007 data contained state-level writing data allowing state-level modeling, the NAEP 2007 data was used. State direct writing assessments were gathered mainly from between 2001 and 2006, to allow comparisons to be made with the NAEP 2007. The number and dates of the major revisions between 2001 and 2006 were identified for each state to ensure the collection of its representative state writing assessments. From each time span between major revisions, a representative writing prompt, its rubric, and the administrative manual for each genre in each grade being assessed were collected. In this study, 78 prompts and 35 rubrics from 27 states were analyzed (see Appendix C for details). NAEP data was not available for Alaska, Nebraska, Oregon, and South Dakota for the time period selected. State writing standards or writing assessments were not available for Connecticut, Iowa, Pennsylvania, Montana, and New Mexico between 2001 and 2006. There was no writing assessment for 7th grade and 8th grade in Ohio during the period 2001-2006. In addition, the following states and jurisdictions chose not to participate in the study: Colorado, Delaware, the District of Columbia, Georgia, Hawaii, Maryland, Minnesota, Mississippi, New Hampshire, New Jersey, North Dakota, South Carolina, Utah, and Wyoming. Consequently, these states' direct writing assessments were not analyzed in this study.

The state direct writing assessment documents were compiled to be used for coding. The compiled files included the following components: verbal directions from administration manuals for direct writing assessments, actual prompts, supporting materials provided (e.g., dictionary, writer's checklist), sessions arranged for writing tests, time given, page limits, and whether (and what kinds of) technology was used. The number of responses expected from students each year determined the number of compiled files for each state. For example, if students took only one prompt with rotated genres each year, the prompts from the rotated genres were all compiled into a single document to represent the scope of genres assessed and the number of prompts (i.e., one prompt in this case) assessed in a test administration. This study included three publicly released NAEP 2007 writing prompts from eighth grade (i.e., a narrative prompt, an informative prompt, and a persuasive prompt), the scoring guide, and the writing framework.
These three writing prompts were released to represent the genres the NAEP assessed; other writing prompts were not released due to test security and possible future use. 3.2 Coding taxonomy This study used Troia and Olinghouse’s (2010) seven-stranded coding taxonomy. The coding taxonomy was derived from several theoretical frameworks—Hayes’ cognitive model of writing (Flower & Hayes, 1981; Hayes, 1996), socio-cultural theory (Prior, 2006), genre theories (Dean, 2008), linguistic models of writing (Faigley & Witte, 1981), and motivation theories of writing (Troia, Shankland, & Wolbers, 2012)—to assure a broad representation of current thinking about writing development, instruction, and assessment. The coding taxonomy consisted of seven strands: (1) writing processes, (2) context, (3) purposes, (4) components, (5) conventions, (6) metacognition and knowledge, and (7) motivation. In writing assessments, the indicators in the seventh strand—motivation, which refers to personal attributes within the writer such as general motivation, goals, attitudes, beliefs and efforts—did not apply, because states rarely administered assessment documents such as surveys alongside the writing assessments to measure these personal attributes. The indicators found within those six strands in the coding taxonomy covered: all stages of the writing process; specific composition strategies; circumstantial influences outside the writer; a variety of communicative intentions accomplished through different genres; features, forms, elements, and characteristics of text; the mechanics of 51 producing text; and knowledge resources within the writer that drive writing activity and writing development. Meanwhile, Jeffery’s (2009) genre and criteria coding schemes, derived from high school exit writing prompts, were used to supplement Troia and Olinghouse’s (2010) coding framework. A preliminary frequency analysis of state writing prompts’ genres coded with Troia & Olinghouse’s (2010) coding taxonomy indicated that only a few genres were assessed in state writing assessments—expository, descriptive, persuasive, response-to-literature, descriptive, narrative, and summary. As a result, the third strand of Troia and Olinghouse’s (2010) coding taxonomy was replaced by a seven-category genre coding scheme—descriptive, persuasive, expository, argumentative, informative, narrative, and analytic. In this coding scheme, persuasive prompts and argumentative prompts were differentiated to represent common and subtle differences between these two genres. Argumentative prompts differ from persuasive prompts by calling abstractly for “support” of a “position,” and by not designating a target audience. In contrast, persuasive prompts require students to convince an identified audience to act on a specific issue. Moreover, persuasive prompts are unlike argumentative prompts because they invite students to take a one-sided perspective on an issue, while argumentative prompts often expect students to consider multiple perspectives on an issue. A new strand evaluating rubrics’ most dominant features was created by using Jeffery’s (2009) criteria coding scheme. Rubrics in the sample were categorized into one of the five criteria coding schemes: rhetorical, genre-mastery, formal, cognitive, and expressivist. The result of this was a coding taxonomy containing seven strands and 90 indicators. For each compiled document, all the indicators could only be coded 0 or 1 (absent or present). 
The exception was that indicators for planning, drafting, and revising in the first strand could have up to three points each to accommodate information about whether students were directed to plan, draft, and revise, as well as the time and pages or writing space given for each step. For example, Kansas directed eighth grade students to plan, draft, and revise and gave students the time and space to do each step; thus, it received the maximum score of nine across these three indicators: plan, draft, and revise. Louisiana directed eighth grade students to draft and gave students the time and space to do so, but did not direct students to plan and revise, nor did it give students time or space for these activities; thus, it received a score of three, all from the single indicator of drafting. When there were multiple compiled assessment documents for either seventh grade or eighth grade, a sum score of these coded compiled assessment documents was used for each indicator for a state. When a state had both 7th and 8th grade writing assessments, an average score of the 7th grade and 8th grade coded compiled assessment documents was used for each indicator for the state.

3.3 Coding Procedure

In the K-12 Writing Alignment Project, the first (writing processes), second (context), third (purposes), and sixth (metacognition and knowledge) strands from Troia and Olinghouse's (2010) coding taxonomy were used by three raters to code state and NAEP writing prompts, because writing processes and writing contexts were often specified in the verbal directions of test administrations, and writing purposes and writing knowledge were often specified in writing prompts. The first rater was paired with either the second rater or the third rater to code each compiled assessment document. The first rater and the second rater reached an inter-rater reliability of .97; the first rater and the third rater reached an inter-rater reliability of .95. Because writing components and writing conventions were often specified in the scoring rubrics, the fourth (components) and fifth (conventions) strands from Troia and Olinghouse's (2010) coding taxonomy were used by two separate raters to code state and NAEP writing rubrics. They reached an inter-rater reliability of .95 and resolved differences through discussion.

In this study, two raters coded state and NAEP writing prompts with the seven-category genre coding scheme adapted from the third strand (purpose) of Troia and Olinghouse's (2010) coding taxonomy and Jeffery's (2009) genre coding scheme. These raters also coded state and NAEP writing rubrics with Jeffery's (2009) criteria coding scheme. The inter-rater reliability was .93 for prompt coding and .86 for rubric coding. Differences were resolved through discussion. Once the coding of prompts and rubrics was finished, each state's writing assessments were characterized by the 90 indicators under the seven strands, including Jeffery's (2009) criteria coding scheme and the six strands from Troia and Olinghouse's (2010) coding taxonomy. These indicators were used to calculate the distance between state assessments and the NAEP in the next step.

3.4 Distance between State Assessments and the NAEP

Because state and NAEP direct writing assessments were coded with the above taxonomy, the writing constructs in these assessments were examined in multiple dimensions. As a pure mathematical concept, Euclidean distance measures the distance between two objects in Euclidean n-space.
More specifically, state X's writing construct could be defined by the 90 indicators in the coding taxonomy as (x1, x2, …, x90), and NAEP Y's writing construct could be defined by the same 90 indicators as (y1, y2, …, y90). The Euclidean distance can then be calculated as

d(X,Y) = \sqrt{(x_1 - y_1)^2 + (x_2 - y_2)^2 + \cdots + (x_{90} - y_{90})^2} = \sqrt{\sum_{i=1}^{90} (x_i - y_i)^2}

where d(X, Y) indicates the amount of difference between state and NAEP direct writing assessments. A small d(X, Y) means that state and NAEP direct writing assessments are similar; a large d(X, Y) means that they are different. The Euclidean distance is unstandardized because most of the indicators are coded 0 or 1; thus, it is less likely that some indicators carry much more weight than other indicators and dominate the distance. Because the number of compiled documents equals the number of prompts students were expected to respond to in a state's writing assessment, states with more compiled documents received more codes, as each compiled document was coded with the taxonomy once. NAEP gave students two writing prompts; thus, states that gave students two prompts had writing assessments more similar to NAEP's on the Euclidean distance. The value of d(X, Y) for each state can be found in the last column of Table 4 below. The values range from 7.48 to 15.2, with a mean of 9.97 and a standard deviation of 1.53.

3.5 NAEP Sample

A total of 139,910 eighth grade students participated in the NAEP 2007 writing assessments. From this total, 85,437 students from the 27 states where direct assessments were gathered were selected. When weighted, this represented a population of 2,415,129 (see Table 9 in Appendix A for descriptive statistics). Because some data were missing for this sample, the sample used in the Hierarchical Linear Modeling (HLM) analysis was reduced. There were 73,754 eighth grade students in the HLM sample (see Table 4 below for descriptive statistics). The demographics of the HLM sample and the 27-state NAEP sample were very similar (see Table 10 in Appendix A for comparisons between the 27-state NAEP sample and the HLM sample).
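As a minimal, hypothetical illustration of the distance computation defined in section 3.4 (the indicator values shown are arbitrary 0/1 placeholders, not the study's actual codes), the calculation could be sketched as follows:

# Sketch of the Euclidean distance d(X, Y) between a state's 90-indicator
# vector and NAEP's. The vectors below are arbitrary placeholders, not the
# coded values from this study.
import math
import random

random.seed(0)
state_x = [random.randint(0, 1) for _ in range(90)]  # hypothetical state codes
naep_y = [random.randint(0, 1) for _ in range(90)]   # hypothetical NAEP codes

def euclidean_distance(x, y):
    # d(X, Y) = sqrt(sum over i of (x_i - y_i)^2)
    return math.sqrt(sum((xi - yi) ** 2 for xi, yi in zip(x, y)))

print(round(euclidean_distance(state_x, naep_y), 3))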
3.6 Students' NAEP Composition Performance

Table 4
Sample Sizes, Achievement, and Student Demographics, 27-State Grade 8 HLM Sample

State | N | Weighted N | Mean Student Achievement | SE(Mean) | % Black | % Hispanics | % Asian | % American Indian | % Female | % LEP | % With IEPs | % Free/reduced-price lunch | Distance between state and NAEP
Alabama | 2360 | 48406 | 150.877 | 1.335 | 33.2% | 1.9% | 0.8% | 0.4% | 51.1% | 1.1% | 9.2% | 47.6% | 12.845
Arizona | 2199 | 57486 | 150.436 | 1.426 | 5.4% | 37.9% | 2.6% | 6.5% | 49.8% | 8.7% | 7.1% | 42.6% | 10.428
Arkansas | 2081 | 29300 | 152.304 | 1.21 | 22.3% | 7.1% | 1.0% | 0.3% | 48.5% | 3.7% | 11.2% | 51.6% | 9.138
California | 6361 | 366387 | 151.844 | 0.997 | 6.4% | 46.4% | 12.9% | 1.3% | 50.3% | 18.4% | 6.6% | 47.4% | 11.314
Florida | 3302 | 157639 | 160.332 | 1.4 | 21.7% | 23.2% | 2.4% | 0.3% | 49.9% | 4.7% | 11.5% | 41.9% | 8.124
Idaho | 2460 | 17890 | 155.447 | 1.079 | 1.0% | 12.8% | 1.5% | 1.6% | 48.9% | 5.1% | 8.0% | 38.8% | 9.327
Illinois | 3337 | 128181 | 162.029 | 1.508 | 17.7% | 17.7% | 4.6% | 0.1% | 49.4% | 2.5% | 11.8% | 38.7% | 10.954
Indiana | 2309 | 67987 | 156.499 | 1.247 | 11.5% | 5.9% | 1.2% | 0.2% | 50.4% | 2.3% | 10.6% | 33.5% | 7.483
Kansas | 2380 | 28803 | 157.12 | 1.385 | 7.7% | 11.7% | 1.9% | 1.5% | 50.1% | 3.7% | 10.3% | 35.7% | 9.274
Kentucky | 2251 | 38972 | 152.067 | 1.376 | 9.9% | 1.6% | 1.0% | 0.0% | 50.9% | 0.9% | 8.0% | 46.5% | 9.314
Louisiana | 2059 | 41170 | 148.265 | 1.24 | 41.7% | 2.2% | 1.2% | 1.0% | 49.1% | 0.8% | 11.2% | 59.1% | 9.925
Maine | 2243 | 12942 | 162.335 | 1.106 | 1.5% | 0.7% | 1.4% | 0.2% | 49.9% | 1.5% | 14.1% | 33.0% | 9.274
Massachusetts | 2944 | 57051 | 168.863 | 1.524 | 8.4% | 9.7% | 5.4% | 0.2% | 48.9% | 3.1% | 13.5% | 25.4% | 10.770
Michigan | 2195 | 100740 | 153.185 | 1.286 | 17.1% | 2.6% | 2.4% | 0.9% | 50.3% | 1.5% | 10.8% | 31.3% | 9.925
Missouri | 2495 | 62339 | 154.508 | 1.126 | 17.6% | 2.7% | 1.6% | 0.1% | 50.2% | 1.6% | 10.6% | 36.1% | 9.381
Nevada | 2136 | 22842 | 146.746 | 1.063 | 9.4% | 33.3% | 8.8% | 1.6% | 51.0% | 8.4% | 9.2% | 36.7% | 8.944
New York | 3050 | 170662 | 157.207 | 1.273 | 16.9% | 17.3% | 6.8% | 0.3% | 50.9% | 3.6% | 13.3% | 46.1% | 15.199
North Carolina | 3452 | 86993 | 154.978 | 1.266 | 28.0% | 6.9% | 2.4% | 1.3% | 50.2% | 3.7% | 13.7% | 42.5% | 9.220
Oklahoma | 2233 | 36291 | 153.877 | 1.161 | 8.9% | 8.2% | 2.2% | 20.0% | 50.0% | 3.2% | 12.6% | 47.5% | 8.832
Rhode Island | 2248 | 10034 | 156.225 | 0.832 | 7.6% | 16.6% | 3.0% | 0.5% | 50.4% | 2.2% | 16.0% | 30.5% | 10.050
Tennessee | 2436 | 64043 | 157.487 | 1.398 | 23.9% | 4.7% | 1.5% | 0.0% | 50.7% | 1.7% | 8.2% | 43.9% | 10.440
Texas | 5951 | 246259 | 153.128 | 1.16 | 15.3% | 43.3% | 3.1% | 0.2% | 49.9% | 5.7% | 6.6% | 49.3% | 9.899
Vermont | 1744 | 5956 | 162.968 | 1.174 | 1.6% | 1.0% | 1.6% | 0.4% | 47.9% | 2.3% | 16.2% | 26.7% | 10.050
Virginia | 2301 | 74430 | 157.838 | 1.257 | 27.3% | 5.6% | 4.6% | 0.2% | 49.7% | 2.9% | 9.7% | 26.7% | 8.944
Washington | 2418 | 62506 | 160.472 | 1.453 | 5.3% | 12.7% | 9.4% | 2.3% | 48.9% | 4.4% | 8.0% | 33.4% | 9.000
West Virginia | 2537 | 19100 | 147.663 | 1.082 | 4.8% | 0.9% | 0.7% | 0.2% | 50.8% | 0.7% | 13.7% | 46.7% | 8.307
Wisconsin | 2272 | 52385 | 159.204 | 1.435 | 8.4% | 6.4% | 3.3% | 1.2% | 49.3% | 3.4% | 11.7% | 28.9% | 9.539
Total | 73754 | 2066794 | | | | | | | | | | |

Note. The means and percentages reported are for the samples weighted to represent U.S. students.

Eighth grade students' writing performances on 20 NAEP writing prompts were used for this analysis. In the NAEP database, each student wrote in response to two prompts; five plausible values were generated from students' conditional distributions. These five plausible values were used as the outcome variable—students' NAEP performance. The NAEP 2007 writing assessment was designed with six overarching objectives.
Students were expected to write (a) for three purposes (i.e., narrative, informative, and persuasive); (b) on a variety of tasks and for diverse audiences; (c) from a variety of stimulus materials and within various time constraints; (d) with a process of generating, revising, and editing; (e) with effective organization, details for elaborating their ideas, and appropriate conventions of written English; and (f) to communicate (National Assessment Governing Board, 2007). All students' writing products were first evaluated by NAEP for legibility, staying on task, and ratability. If they passed the above evaluation, they were then scored based on a six-point rubric, where 1 was Inappropriate, 2 was Insufficient, 3 was Uneven, 4 was Sufficient, 5 was Skillful, and 6 was Excellent. If they did not pass the initial evaluation and thus did not receive a score, they were not included in this study.

3.7 Students' Characteristics in NAEP
The dataset used for analysis was from the NAEP 2007 eighth grade student database. Student characteristics data were gathered through student and teacher surveys. There were 34 student characteristic variables. They were categorized into six groups for the convenience of reporting results. These six groups did not suggest six factors, nor should those variables be considered indicators of such factors. Because the main purpose of this study is to investigate the effect of state-level variables on students' writing performances while controlling for a comprehensive set of student characteristics, all related student characteristics variables were included and scale reduction was not considered necessary. The six groups were employed to allow reporting of variables similar in meaning to the NAEP survey descriptions. First, there was students' demographic background, which consisted of students' ELL status, free/reduced-price lunch eligibility status, whether or not they had Individualized Education Plans (IEPs), gender, race or ethnicity, as well as their home states. Second, students' attitudes towards writing were measured by whether they considered writing stories or letters a favorite activity and whether they found writing helpful in sharing ideas. Third were students' perceptions of the difficulty of the NAEP writing tests. Fourth, students' levels of motivation for taking the NAEP writing assessments were evaluated by measuring their perceptions of their efforts on the NAEP writing tests and the importance of success on the tests. Fifth, students' writing activities inside classrooms included (a) the frequency and types of writing they did in school, including writing used to express their thoughts or observations, a simple summary of what they read, a report based on what they studied, an essay analyzing something they read, a letter or essay, a personal or imagined story, or business writing; (b) the aspects of writing they had worked on in school, including how often they brainstormed, organized papers, made changes, or worked with other students; and (c) their writing in content areas, including how often they wrote one paragraph in their English, science, social studies, history, and math classes.
Sixth, students' experiences with writing consisted of (a) their computer use, i.e., whether they had used a computer from the beginning, for changes, or for the internet when writing papers for school; and (b) their teachers' expectations and feedback, such as how often teachers talked to students about their writing or asked them to write more than one draft, and whether teachers graded students more heavily for spelling, punctuation, or grammar; paper organization; quality and creativity; and length of paper.

3.8 Structure of the Data Set and Statistical Analyses
The NAEP 2007 writing assessments used stratified multi-stage cluster sampling. Schools in the nation were grouped into strata based on their locations, sizes, percentages of minority students, student achievement levels, and area incomes. Schools were then selected randomly within each stratum, and students were selected randomly within schools. Selected schools and students were assigned weights to represent a national sample. To reduce NAEP testing time, the NAEP used "matrix sampling"—students took only a portion of the full NAEP battery of potential items. This sampling method ensured an accurate estimate of the population's performance but resulted in large intervals for individual estimates of ability. Instead of a single score indicating a student's writing ability, five plausible values were drawn from the conditional distribution of student writing ability estimates based on the student's background characteristics and the patterns of responses to the items administered to the student. Therefore, an analysis of NAEP achievement data required that statistical analyses be conducted for each of the five plausible values and the results synthesized (Rubin, 1987). This study used appropriate weights and statistical procedures to address the special characteristics of the NAEP data set. Data management was mostly done using SPSS. AM statistical software is designed with procedures to handle the weighting and jackknifing needs of complex data sets such as the NAEP's. This study used AM to calculate achievement means and standard errors as well as to generate descriptive statistics for the 27-state NAEP reporting sample and the HLM sample. Given the hierarchical organization of the NAEP data set, in which students were nested within states, a multi-level analysis was most suitable because it ensured more precise parameter estimation and allowed more accurate interpretation (Goldstein, 1987). HLM software is designed with features to use weights at level 1, level 2, or both levels to produce correct HLM estimates, as well as features to run analyses with each of the five plausible values and synthesize the results of these analyses by averaging values and correcting standard errors (Raudenbush & Bryk, 2002). This study used HLM 7.0 to create a sequence of two-level models—state level and student level—to examine the research question. The overall weight was used at level 1—the student level—because it adjusted for the unequal selection probabilities of both the student and the school in which the student was enrolled. No weight was used at level 2—the state level. All binary variables, such as the demographic variables, were uncentered; all continuous variables, such as students' writing-experience variables, were grand mean centered; and the state-level variable—the distance between a state's and the NAEP's writing assessments (i.e., d(X,Y))—was uncentered. The uncentering of the binary variables allowed interpretations to be made about differences in performance between students in separate categories for each binary variable, such as female and male. The grand mean centering of students' writing-experience variables afforded understandings about students with average writing experience on each variable. Finally, the uncentering of the state-level variable made it possible to interpret the results for states with the same writing assessments as the NAEP (i.e., no distance between state and NAEP writing assessments).
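The synthesis across plausible values described above can be sketched as follows: estimate the quantity of interest once per plausible value, average the estimates, and inflate the standard error for the variation between plausible values, following Rubin (1987). This is a minimal illustration with hypothetical numbers; it does not reproduce AM's or HLM 7.0's weighting or jackknife procedures.

```python
import numpy as np

def combine_plausible_values(estimates, variances):
    """Pool one estimate per plausible value using Rubin's (1987) rules.

    estimates: point estimates of the same parameter, one per plausible value
    variances: squared standard errors from each of those analyses
    Returns the pooled estimate and its pooled standard error.
    """
    est = np.asarray(estimates, dtype=float)
    var = np.asarray(variances, dtype=float)
    m = len(est)
    pooled = est.mean()                      # average of the m estimates
    within = var.mean()                      # average sampling variance
    between = est.var(ddof=1)                # variance across plausible values
    total = within + (1 + 1 / m) * between   # Rubin's total variance
    return pooled, float(np.sqrt(total))

# Hypothetical example: five estimates of the same coefficient and their SEs.
coefs = [-0.141, -0.145, -0.139, -0.146, -0.144]
ses = [0.066, 0.068, 0.067, 0.066, 0.068]
print(combine_plausible_values(coefs, [s ** 2 for s in ses]))
```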
3.9 Statistical Models
To answer the research questions, this study utilized four statistical models. Similar to Lubienski and Lubienski's (2007) data analysis design, this study first ran an unconditional model, then added demographic variables, then students' writing-experience variables, and finally the state-NAEP distance variable. This procedure allowed the researcher to examine the extent of additional variance in the outcome variable that the inclusion of each group of variables explained. The total variance in students' NAEP performances was decomposed into a between-states component (state level) and a between-students component (student level).

Unconditional model (Model 1). State- and student-level variables did not enter the model. The unconditional model measures whether there was a significant difference between states' mean scores on the NAEP.

Y = \beta_0 + e + \varepsilon

where Y is one of the students' five plausible values, e is the random error between states, and \varepsilon is the random error between students. When discussing the results, special attention was paid to var(e), to see whether it was significant. A significant var(e) means that there are significant differences among states in terms of students' performance; therefore, the differences among states can be further modeled.

Main effect model (Model 2). Student-level demographic variables entered the model as fixed effects.

Level 1: Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon
Level 2: \beta_0 = \gamma_{00} + e
Combined model: Y = \gamma_{00} + \beta_1 X_1 + \cdots + \beta_k X_k + e + \varepsilon

where Y is one of the students' five plausible values, X_k represents the students' demographic variables, e is the random error between states, and \varepsilon is the random error between students.

Main effect model (Model 3). Both student-level demographic variables and writing-experience variables entered the model as fixed effects.

Level 1: Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon
Level 2: \beta_0 = \gamma_{00} + e
Combined model: Y = \gamma_{00} + \beta_1 X_1 + \cdots + \beta_k X_k + e + \varepsilon

where Y is one of the students' five plausible values, X_k represents the students' demographic and writing-experience variables, e is the random error between states, and \varepsilon is the random error between students.

Main effect model (Model 4). Both the state-level variable (i.e., the distance between NAEP and state writing assessments) and the student-level variables (i.e., demographic variables and writing-experience variables) entered the model as fixed effects.

Level 1: Y = \beta_0 + \beta_1 X_1 + \cdots + \beta_k X_k + \varepsilon
Level 2: \beta_0 = \gamma_{00} + \gamma_{01} d + e
Combined model: Y = \gamma_{00} + \gamma_{01} d + \beta_1 X_1 + \cdots + \beta_k X_k + e + \varepsilon

where Y is one of the students' five plausible values, X_k represents the students' demographic and writing-experience variables, d is the distance between a state's assessments and the NAEP's, e is the random error between states, and \varepsilon is the random error between students. When discussing the results, special attention was paid to \gamma_{01}, to determine whether it was significant. A negative \gamma_{01} indicates that the more state assessments differ from the NAEP, the lower students' NAEP performances will be; a positive \gamma_{01} indicates that the more state assessments differ from the NAEP, the higher students' NAEP performances will be.
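For readers who want to experiment with this sequence of models outside HLM 7.0, the sketch below fits a two-level random-intercept model (students nested within states) for a single plausible value with the statsmodels library. It is an approximation only: the column names are hypothetical, a single stand-in covariate replaces the full set of writing-experience variables, and the NAEP sampling weights, jackknife variance estimation, and five-plausible-value synthesis (sketched earlier) are omitted.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical data file: one row per student, with a state identifier, one
# plausible value (pv1), binary demographic indicators, a grand-mean-centered
# writing-experience composite, and the state-NAEP distance d(X, Y).
df = pd.read_csv("naep_hlm_sample.csv")  # assumed file name, not from the study

# Model 1 (unconditional): random intercept for state only.
m1 = smf.mixedlm("pv1 ~ 1", data=df, groups=df["state"]).fit()

# Model 4: demographics, writing experience, and the state-NAEP distance
# entered as fixed effects, with the state random intercept retained.
m4 = smf.mixedlm(
    "pv1 ~ female + black + hispanic + asian + am_indian + ell + iep"
    " + frl + writing_experience_c + distance",
    data=df,
    groups=df["state"],
).fit()

print(m1.cov_re, m1.scale)    # between-state and within-state variance components
print(m4.params["distance"])  # analogue of the gamma_01 coefficient for d(X, Y)
```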
4. Results
This study utilized four hierarchical linear models to examine whether the distance between NAEP and state writing assessments can predict students' performances on the NAEP. Table 11 in Appendix A shows the raw and unweighted descriptive statistics for all the variables used in the HLM analyses. The HLM results can be found in Table 5 below. Because the main interest of this study is whether the difference between state and NAEP writing assessments can predict students' NAEP performances, standard errors are provided for the intercept and for the state-NAEP difference variable.

The unconditional model (model 1) showed that the average writing performance of all students was 155.5. It also showed that 54.863% of the variance was between states and 45.137% of the variance was within states. Because the between-state variance was highly significant, a multi-level model was warranted.

Model 2 added student-level demographic variables, all of which were significant. The intercept of 160.387 was the estimated mean achievement of a student who was at the level of 0 on all the binary predictors (i.e., male, White, non-ELL, without an IEP, and not eligible for free/reduced-price lunch). Except for Asian students, students from the other minority ethnicities examined (i.e., Black, Hispanic, and American Indian students) had an average score lower than the estimated mean achievement of a student with the above level-0 characteristics. Similarly, students who were ELLs, had IEPs, or were eligible for free/reduced-price lunch also had lower average scores. Female students had higher average scores than male students. Student-level demographics explained an additional 33.185% of the variance between states and an additional 33.151% of the variance within states. The between-state variance remained highly significant.

Model 3 included the student-level demographic variables and added student-level writing-experience variables. Almost all student writing-experience variables were significant except the following: how often students wrote a letter or essay for school, and their perception of the importance of success on the writing test they were undertaking. The intercept of 161.692 was the estimated mean achievement of a student at level 0 on all the binary predictors (i.e., male, White, non-ELL, without an IEP, and not eligible for free/reduced-price lunch) and at the mean of all the continuous predictors (i.e., students' writing-experience variables). Students' attitudes towards writing and their perceptions of the difficulty of the NAEP writing test were positively related to their NAEP performance. More specifically, students who enjoyed writing, thought that writing helped to share ideas, and considered the NAEP writing assessment easier than other tests tended to get higher scores. However, students' perceptions of their efforts and of the importance of success on the NAEP writing test were negatively related to their NAEP performance. More specifically, students who believed that they tried harder and considered their success on the NAEP writing assessment more important tended to get lower scores.
Student-level writing-experience variables explained an additional 10.397% of the variance between states (43.582% instead of 33.185%) and an additional 10.285% of the variance within states (43.436% instead of 33.151%). The between-state variance remained highly significant.

Model 4 included both the student-level demographic variables and the writing-experience variables, and added the variable of primary interest—the state-NAEP difference variable. It showed that when differences in students' backgrounds and writing experiences were controlled, state and NAEP direct writing assessment differences were significant. The intercept of 163.148 was the estimated mean achievement of a student at level 0 on all the binary predictors (i.e., male, White, non-ELL, without an IEP, and not eligible for free/reduced-price lunch), at the mean of all the continuous predictors (i.e., students' writing-experience variables), and from a state with the same writing assessment as the NAEP (i.e., no distance between state and NAEP writing assessments). More specifically, 163.148 was the predicted mean achievement of a White, non-IEP, non-ELL, subsidized-lunch-ineligible male student with average frequency of certain writing practices, average amounts of feedback from teachers, and average perceptions of the difficulty and importance of the NAEP writing assessment, from a state with the same writing assessment as the NAEP. The state-NAEP distance variable was statistically significant, with a coefficient of -0.143 and a standard error of 0.067. With every one-unit difference between a state's writing assessment and the NAEP writing assessment, the predicted achievement of such a student would be a significant 0.143 points lower. All student-level demographic variables remained highly significant. Almost all the student-level writing-experience variables were significant, except the two that were insignificant in model 3.

The variables that were positively related to students' NAEP performances in model 3 remained positively related in model 4: whether students considered writing stories or letters a favorite activity and thought writing helped share ideas; the frequency with which teachers talked to students about writing; how often students wrote thoughts or observations, simple summaries, and analyses of essays; how frequently students organized papers and made changes when writing for school; the frequency of students' use of computers for changes and for accessing the internet when writing papers for school; how frequently students wrote one paragraph in English, science, and social studies or history classes; how often teachers asked students to write more than one draft; and whether teachers in their grading emphasized the importance of paper organization and quality or creativity.

Table 5 HLM Model Results

| Variable | Model 1: Unconditional Model | Model 2: Student Demographics | Model 3: Student Demographics + Writing Experience | Model 4: Student Demographics + Writing Experience + State Difference |
|---|---|---|---|---|
| Fixed effects | | | | |
| Intercept | 155.5*** | 160.387*** | 161.692*** | 163.148*** |
| (S.E.) | 0.144 | 0.197 | 0.189 | 0.693 |
| State level | | | | |
| Distance between NAEP and state assessments | | | | -0.143* |
| (S.E.) | | | | 0.067 |
| Student level: Demographics | | | | |
| Black | | -13.436*** | -13.843*** | -13.832*** |
| Hispanic | | -8.951*** | -8.07*** | -8.016*** |
| Asian | | 10.438*** | 7.978*** | 8.076*** |
| American Indian | | -11.885*** | -9.234*** | -9.276*** |
| Female | | 18.143*** | 12.23*** | 12.229*** |
| ELL | | -25.766*** | -22.208*** | -22.189*** |
| IEP | | -33.929*** | -30.253*** | -30.238*** |
| Free/reduced-price lunch | | -13.059*** | -10.488*** | -10.468*** |
| Student level: Writing experience in school | | | | |
| Writing stories/letters is a favorite activity | | | 1.64*** | 1.646*** |
| Writing helps share ideas | | | 2.562*** | 2.566*** |
| How often teacher talks to you about writing | | | 0.654* | 0.658* |
| How often write thoughts/observations | | | 0.569*** | 0.568*** |
| How often write a simple summary | | | 1.534*** | 1.547*** |
| How often write a report | | | -0.905*** | -0.897*** |
| How often write an essay you analyze | | | 1.97*** | 1.983*** |
| How often write a letter/essay for school | | | -0.024 | -0.05 |
| How often write a personal/imagined story | | | -0.453** | -0.459** |
| How often write business writing | | | -2.55*** | -2.552*** |
| How often when writing: brainstorm | | | -0.982*** | -0.977*** |
| How often when writing: organize papers | | | 0.71*** | 0.698*** |
| How often when writing: make changes | | | 6.031*** | 6.034*** |
| How often when writing: work with other students | | | -1.449*** | -1.452*** |
| Write paper: use computer from beginning | | | -0.951*** | -0.941*** |
| Write paper for school: use computer for changes | | | 3.615*** | 3.627*** |
| Write paper for school: use computer for internet | | | 1.677*** | 1.681*** |
| How often write one paragraph in English class | | | 4.204*** | 4.209*** |
| How often write one paragraph in science class | | | 0.959*** | 0.946*** |
| How often write one paragraph in social studies/history class | | | 0.926*** | 0.942*** |
| How often write one paragraph in math class | | | -2.735*** | -2.738*** |
| How often teacher asks to write more than 1 draft | | | 1.174*** | 1.181*** |
| Teacher grades important for spelling/punctuation/grammar | | | -0.863*** | -0.871*** |
| Teacher grades important for paper organization | | | 2.743*** | 2.739*** |
| Teacher grades important for quality/creativity | | | 3.1*** | 3.105*** |
| Teacher grades important for length of paper | | | -1.257*** | -1.263*** |
| Difficulty of this writing test | | | -2.644*** | -2.644*** |
| Effort on this writing test | | | -0.378* | -0.389* |
| Importance of success on this writing test | | | -0.269 | -0.278 |
| Random effects | | | | |
| Intercept (variance between states) | 638.408 | 426.552 | 360.18 | 360.137 |
| Level 1 (variance within states) | 525.226 | 351.111 | 297.089 | 297.062 |
| Intraclass correlation (proportion of variance between states) | 0.548633 | 0.548505 | 0.547995 | 0.547988 |
| Variance in achievement between states explained (%) | NA | 33.185% | 43.582% | 43.588% |
| Variance in achievement within states explained (%) | NA | 33.151% | 43.436% | 43.441% |

Note. *p<.05. **p<.01. ***p<.001.
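The variance-explained rows of Table 5 are proportional reductions of the Model 1 variance components. As a worked check using the figures reported above (not an additional analysis), the model 2 entries and the intraclass correlation follow directly:

```latex
% Proportion of between-state variance explained by model 2
\frac{638.408 - 426.552}{638.408} \approx 0.33185 = 33.185\%
% Proportion of within-state variance explained by model 2
\frac{525.226 - 351.111}{525.226} \approx 0.3315 = 33.15\%
% Intraclass correlation for the unconditional model
\rho = \frac{638.408}{638.408 + 525.226} \approx 0.5486
```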
The variables that were negatively related to students' NAEP performance in model 3 remained negatively related in model 4: how frequently students wrote a report for school, a personal or imagined story, and business writing; the frequency with which students brainstormed or worked with other students when writing for school; how often students used a computer from the beginning when writing; the frequency of students writing one paragraph in math class; whether teachers in their grading emphasized the importance of spelling, punctuation, or grammar and of length of paper; and students' perceptions of their efforts and of the importance of success on the NAEP writing assessment. A few students' writing-experience variables consistently had large, statistically significant coefficients in both model 3 and model 4. These variables were the frequency with which students made changes when writing for school, used computers for changes when writing papers for school, wrote one paragraph in English class, and had teachers who in their grading emphasized the importance of quality or creativity and paper organization, as well as whether students thought that writing helped share ideas. State-NAEP differences explained an additional 0.006% of the variance between states (43.588% instead of 43.582% in model 3) and an additional 0.005% of the variance within states (43.441% instead of 43.436% in model 3). The between-state variance remained significant.

5. Discussion
The main finding of this study is that students' preparedness for the NAEP tasks, namely their home states' assessments' similarity to the NAEP, also plays a role in students' performance on the NAEP. Students from states with writing assessments more similar to the NAEP perform significantly better than students from states with writing assessments that are less similar to the NAEP. However, this predictor explains only a little of the variance in the outcome variable—students' NAEP performances; thus, it does not negate the interpretation of NAEP scores as an indicator of students' writing abilities. Research has shown that students' demographic backgrounds have a significant impact on students' writing quality (Gabrielson, Gordon, & Engelhard, 1999; Ball, 1999; Kanaris, 1999; Silva, 1993; Ferris). This study's results confirm these assertions. All of the students' demographic variables were found to be statistically significant in all models. More specifically, students who were ELLs, had IEPs, or were eligible for free/reduced-price lunch performed significantly more poorly than students without those characteristics. Students who were Black, Hispanic, or American Indian performed significantly more poorly than students who were White. Asian students performed significantly better than White students, and female students performed significantly better than male students. Research has shown that students' attitudes and motivations have a significant impact on their writing achievements (Graham, Berninger, & Fan, 2007). More specifically, students' positive beliefs and attitudes about writing contribute to their levels of motivation to write (Bruning & Horn, 2000). This study's results confirm this assertion by finding that students who thought that writing helped to share ideas performed better than students who did not. However, this study also finds that students' perceptions of the importance of the NAEP writing test were not significantly related to their writing performances.
Moreover, students who believed that they exerted more effort on the NAEP writing test did not perform as well as those who did not. It is possible that students who found they needed to devote more effort were also those students who found the writing test more difficult, which would explain why they did not perform as well. Research has also shown that students' writing activities inside classrooms, such as how often they write, have a positive effect on students' compositional quality (Graham, Kiuhara, McKeown, & Harris, 2012). In this study, almost all of students' writing activities inside the classroom were found to be significantly related to their writing performance, except the frequency with which students wrote letters or essays for school. However, some of the students' writing activities were found to be negatively related to their writing performance, including how frequently students wrote reports, personal/imaginative stories, and business writing; the frequency with which they brainstormed and worked with other students when writing; and the frequency with which they wrote one paragraph in math class. It is unclear why these activities were negatively related to students' writing performances. Among the positively related variables, how often students revised and how often they wrote in English class were consistently associated with large coefficients in all models. This finding seems to confirm the assertion that the frequency with which students write has a positive effect on their writing quality. Research has also shown that students' writing experiences have a significant impact on their writing quality. All variables regarding students' writing experiences were found to be significantly related to their performance. However, some of the students' writing experiences were found to be negatively related to their writing performance, including the frequency of using computers from the beginning when writing papers, and whether teachers emphasized the importance of spelling/punctuation/grammar and length of papers in their grading. Perhaps teachers' overemphasis on the mechanics of students' compositions distracted them from improving the organization and overall quality of their compositions. Among the positively related variables, whether teachers emphasized quality or creativity and paper organization in their grading was consistently found to have large coefficients in all models. This finding suggests that though teachers' feedback tends to improve students' writing quality (Rogers & Graham, 2008), the things teachers emphasize in their feedback also matter.

6. Implications
The results of this study show that state and NAEP assessment differences play a role in students' performances on the NAEP. This finding has three implications. First, it should raise awareness that students' NAEP performances are a result of many factors, including the similarity of students' home state assessments to the NAEP. Because the NAEP is a low-stakes assessment, students are unlikely to prepare for it; however, high-stakes assessments in students' home states tend to shape the instruction and writing experience students get in school. When states' assessments are more similar to the NAEP, students have indirectly prepared for it; as a result, their performance on the NAEP is slightly better than that of students whose home state assessments are more dissimilar.
Therefore, when students' performances on the NAEP are compared, we have to be aware of their different levels of preparedness resulting from their home states' writing assessments' similarities to and differences from the NAEP. Second, this finding does not suggest that state and NAEP assessments should be designed to be more similar. Instead, both the NAEP and states' assessments can move forward by incorporating more evidence-based writing assessment practices, which are likely to shrink the differences between the NAEP and states' assessments. As a result, students' performances on the NAEP would be less likely to be impacted by their different levels of preparedness for the NAEP's tasks. Third, the large amount of unexplained variance remaining between states suggests that there are still more state-level variables to be explored, such as the alignment between states' standards and assessments and the stringency of states' accountability policies.

7. Limitations
This study only controlled for students' characteristics in the multilevel modeling. It did not study teacher characteristics and school characteristics. Teachers' characteristics (such as their educational backgrounds and teaching experiences) and schools' characteristics (e.g., staff opportunities for professional development in writing, and the existence of and extent to which writing was a school-wide initiative) are both likely to impact students' performances on the NAEP. However, investigation of these groups of characteristics was beyond the scope of this project. In this study, the main variable of interest was at the state level and the outcome variable was at the student level; thus, the state and student levels were the two essential levels for investigating the research question of this study. It is assumed that, compared with the impact of states' assessment characteristics and students' backgrounds and experiences in writing, the impact of differences among teachers and schools on students' NAEP performances is relatively small. While limited research has used NAEP data to study teachers' and schools' effects on students' achievement, Lubienski and Lubienski (2006) examined the NAEP 2003 data with hierarchical linear models to study whether the disparities in mathematics achievement were a result of schools' performances or student demographics. Their study found that when students' demographic differences are controlled for, private school advantages no longer exist. This suggests that students' demographic variables have more impact on students' performances than one of the central characteristics of schools. The assumption referred to above is also made for two computational reasons. First, it simplifies the model and increases the precision and efficiency of estimation, as well as allowing a focused investigation of the research question. Second, unless there is strong evidence supporting teacher-level and school-level effects, it is better not to include these two levels because doing so causes computational difficulties and can produce meaningless and inaccurate estimates as a result of small variances. Nevertheless, it is acknowledged that teachers' and schools' characteristics are important components of students' experiences with schooling. Therefore, future research should be conducted to investigate state-level differences when teachers' and schools' characteristics are accounted for in addition to students' characteristics.

CHAPTER 3: Genre Demands in State Writing Assessments
1. Introduction
Since the implementation of the No Child Left Behind Act of 2001, state assessments have been a heated topic for discussion given their important role in states' accountability systems. As state assessments tend to influence curricula, student promotion and retention, and ratings of teacher effectiveness (Conley, 2005), their validity has also been explored (Beck & Jeffery, 2007; Carroll, 1997). A validity concern raised regarding state writing assessments is the level of ambiguity in prompts. Beck and Jeffery (2007) examined 20 state exit-level writing assessment prompts from Texas, New York, and California, and found that the terms "discuss" and "explain" appeared in 20% of the prompts. However, words like "discuss" do not necessarily align with conventional genre categories. For example, a prompt may ask a student to "discuss" something; depending on what follows "discuss," however, such a prompt can be requesting either an explanation or an argument. Because "discuss" can be used for eliciting a range of rhetorical purposes, it becomes "an ambiguous directive that does little to help students understand what is expected of them" (Beck & Jeffery, 2007, p.65). Meanwhile, besides the traditional meaning of "explain," which asks the writer to explain how something works and often leads to an expository essay, "explain" has been used in two other ways: as an indication that students should take a position and argue for it, which can be classified as argumentative, and as an indication that they should give the definition and classification of something, which can be considered descriptive. Thus, there is a lack of precision in these writing prompts. Jeffery's (2009) study of 68 prompts from 41 state exit-level direct writing assessments, in which students produced texts in response to prompts, also suggested that verbs such as "explain" generated more than one genre category depending on the objects of "explain." These objects "varied with respect to the degree of abstraction and the extent to which propositions were presented as arguable" (Jeffery, 2009, p.8). Moreover, in Beck and Jeffery's (2007) study, 14 prompts out of the 20 examined specified multiple rhetorical purposes. For example, one prompt asked students to "discuss two works of literature," choose to "agree or disagree with the critical lens," and then "support" their opinions (p.68). Beck and Jeffery (2007) suggested that, although this prompt was categorized as "argumentation," the expectation that students should produce an argument was implicit, thus making the prompt ambiguous. Ambiguity in prompts and implicit expectations in prompts can be viewed as two separate problematic features, rather than as the unitary concept treated in Beck and Jeffery's (2007) study. Ambiguity is defined in this paper as the presence of two or more conflicting genre demands in a prompt. For example, consider the following prompt: "You find something special. Describe what it is and what you do with it." The initial statement that "You find something special" can be understood as setting the stage for a narrative account. "Describe what it is" suggests a descriptive text is expected. "Describe ... what you do with it" can be interpreted in two ways. The first interpretation is that the writer should "explain what you do with it," which suggests an expository text is expected; the second interpretation is "tell us what you decide to do with it," which, along with "you find something special," again suggests a narrative text is expected.
Because these three genre demands compete for an examinee's attention, this prompt can be considered ambiguous. Critics in the genre studies and writing rhetoric communities may argue that there are very few "pure" genre structures invoked in real communicative contexts; rather, there is often blending. In that case, perhaps we should encourage students to do this kind of blending. This might be a valid approach to prepare students for real communicative tasks; however, there are often high stakes involved in large-scale assessments and time constraints imposed on students during testing. Therefore, we have to be aware of the additional cognitive demands we place on students, as well as threats to the validity of the assessments, when prompts can be interpreted from multiple perspectives. The second potentially problematic feature of writing assessment prompts is implicit expectations. A prompt's implicit expectation is defined in this paper as the prompt's lack of verbs (e.g., "argue," "convince") or nouns (e.g., "story") that explicitly signal the genre desired in response to the writing prompt. For example, consider the following prompt: "Write about an important lesson that children should learn." This prompt could also be phrased, "Explain an important lesson that children should learn," which suggests an expository text is expected. However, none of the words in either version of the prompt explicitly signals the desired genre. Thus, this prompt would be considered to have an implicit rather than explicit genre expectation. When discussing possible reasons for the confusing signals about genre expectations in the prompts they examined, Beck and Jeffery (2007) suggested that test designers may assume that students have limited experience with different genres, and thus lack sufficient vocabulary knowledge to associate key verbs, nouns, and phrases with responding in specific genres. As a result, test designers resort to terminology they feel will be familiar to students, such as "support." However, practice is ahead of research in this area. Little research has been done to examine the thinking processes that students adopt when reading writing prompts. Students' vocabulary precision is one potential area for future research using procedures such as think-aloud protocols and interviews. A prompt can be ambiguous, or contain implicit expectations, or both. Therefore, tools are needed to examine prompts for ambiguity and lack of explicit genre expectations. Glasswell, Parr, and Aikman (2001) have outlined conventional genre classifications with six genres: "to explain," "to argue or persuade," "to instruct or lay out a procedure," "to classify, organize, describe, or report information," "to inform or entertain through imaginative narrative," and "to inform or entertain through recount" (p.5). They also specified these genres' purposes, functions, types, features, text organization/structure, and language resources. Their work can serve as a reference for identifying genre demands in prompts. Meanwhile, by identifying demand verbs and corresponding objects (e.g., "convince" and "your friend" in "convince your friend to try something new"), syntactic analysis (Jonassen, Hannum, & Tessmer, 1999) can be used to spot words that signal rhetorical processes that can be matched with genre demands (Beck & Jeffery, 2007; Jeffery, 2009). The basis of syntactic analysis is the sentence, in which each word is assigned a label (e.g., subject, verb, object of that verb). Such labeling allows the key verbs and the objects of those verbs to be spotted and matched with genre demands.
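A minimal sketch of such labeling is given below, using the spaCy dependency parser to pull out candidate demand verbs with their objects and to tally the verbs across prompts. This is only an illustration under the assumption that an off-the-shelf parser is acceptable; the cited studies do not name a tool, and the sample prompts here are drawn from examples in this chapter or are hypothetical.

```python
from collections import Counter
import spacy

nlp = spacy.load("en_core_web_sm")  # small English model; assumed to be installed

prompts = [
    "Convince your friend to try something new.",
    "Describe what it is and what you do with it.",
    "Write to explain why or why not.",
]

verb_object_pairs = []
for prompt in prompts:
    doc = nlp(prompt)
    for token in doc:
        if token.pos_ == "VERB":
            # Direct objects and clausal complements attached to the demand verb.
            objects = [child.text for child in token.children
                       if child.dep_ in ("dobj", "ccomp", "xcomp")]
            verb_object_pairs.append((token.lemma_, objects))

# Tally how often each candidate demand verb appears across the prompts,
# analogous to the concordance-style frequency counts described later.
print(Counter(verb for verb, _ in verb_object_pairs))
print(verb_object_pairs)
```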
The ambiguities and implicit expectations in writing prompts may be attributable to the following factors: (a) test designers using terminology that they consider most familiar to students, such as "support," rather than adopting more explicit verbs for genres, such as "argue," and (b) test designers purposefully including conflicting genre demands to give students choices in their compositions (Beck & Jeffery, 2007). However, such ambiguities and implicit expectations pose threats to the validity of state writing assessments for the following reasons: (a) different interpretations of writing prompts can lead to students producing compositions that are not representative of their writing abilities, and (b) a lack of consensus among test designers as well as scorers of the responses may lead to unclear expectations of student writing, which will result in unfair judgments of students' writing competence. This is especially problematic when a prompt is ambiguous or has implicit expectations while being paired with a rubric that emphasizes genre mastery. Therefore, it is important to examine this phenomenon. Jeffery's (2009) five-criteria coding scheme provides just such a tool for examining rubrics for genre mastery. This coding scheme was developed through an inductive analysis of rubrics for exit-level writing assessment prompts. The coding scheme includes rhetorical, genre-mastery, formal, expressive, and cognitive rubrics. Rhetorical rubrics focus on "the relationship between writer, audience, and purpose across criteria domains" (p.10). Genre-mastery rubrics emphasize "criteria specific to the genre students are expected to produce" (p.11). Formal rubrics conceptualize proficiency "in terms of text features not specific to any writing context" (p.11). Cognitive rubrics target "thinking processes such as reasoning and critical thinking across domains" (p.12). Expressive rubrics portray "good writing" as "an expression of the author's uniqueness, individuality, sincerity and apparent commitment to the task" (p.12). Jeffery (2009) suggested that one way to illuminate the underlying proficiency conceptualizations in large-scale writing assessments is to analyze the relationships between genre demands and scoring criteria. Using the above coding framework, 40 rubrics were coded in Jeffery's (2009) study with inter-rater agreement of .83. When state writing assessment prompts are ambiguous or contain implicit expectations, it brings into question whether students are expected to demonstrate mastery of the demands of the genre(s) presented in the prompts; if not, what genres are students expected to master? State standards provide an answer by specifying what students are expected to learn. Moreover, state standards tend to have a significant impact on classroom instruction—teachers have been reported to increase their instructional emphasis on writing for specific genres in response to changes in standards (Stecher, Barron, Chun, & Ross, 2000). For these reasons, an examination of genre expectations in the state standards that correspond with state writing assessments will help identify the range of genres middle school students are expected to master in different states.
It will not only present the state of alignment between genre expectations in standards and assessments using a representative sample, but also help answer which genres students are expected to master when ambiguity or implicit expectations arise. Troia and Olinghouse's (2010) coding taxonomy, with its comprehensive coverage of 21 genres, provides just such a tool for identifying the genre expectations in state standards. Their taxonomy was derived from several theoretical frameworks, including Hayes' cognitive model of writing (Flower & Hayes, 1981; Hayes, 1996), socio-cultural theory (Prior, 2006), genre theory (Dean, 2008), linguistic models of writing (Faigley & Witte, 1981), and motivation theories of writing (Troia, Shankland, & Wolbers, 2012). The indicators found within the "writing purpose" strand in their coding taxonomy cover a variety of communicative intentions accomplished through different genres. While a small number of studies have examined the ambiguity or genre demands of high school exit-level writing prompts (Beck & Jeffery, 2007; Jeffery, 2009), no research has examined the genre demands of middle school state writing assessment prompts, or issues with ambiguity and implicit expectations in those prompts. Nevertheless, writing in middle school is important because middle school students start to be able to think abstractly and use language in more complex ways (De La Paz & Graham, 2002). A study of genre expectations in the prompts for middle school students thus becomes necessary because it will make an important part of the writing expectation explicit and thereby help better prepare students for writing tasks. The NAEP assesses students' writing at grade 8, and seventh and eighth graders are also frequently assessed in state writing assessments. It is therefore important that these large-scale assessments be examined in terms of their writing constructs to ensure their validity. The fact that both the NAEP and many states assess students' writing at grade 8 also provides a large sample with which to compare national and state writing assessments at the same grade level, which has not yet been extensively studied. This study aims to fill that gap by examining genre expectations in seventh and eighth grades. In addition to classifying state writing assessment prompts into different genre categories, this study will use syntactic analysis to investigate multiple competing or conflicting genre demands within each prompt to shed light on the problems of ambiguity and implicit expectations in writing prompts for middle school students. For each prompt, the demand verbs and corresponding objects will be identified and the rhetorical purposes signaled will be matched with the existing genre demands outlined in Glasswell, Parr, and Aikman (2001). This study will also highlight the connection between genre demands in writing prompts and genre-mastery expectations in rubrics and state standards to discuss the validity of state writing assessments.

2. Research Questions
Through analyses of state writing assessment prompts, writing rubrics, and state writing standards, this paper aims to answer the following questions:
1. How many state writing prompts possess the problematic features of ambiguity and/or implicit genre expectations? Which key words in prompts are associated with ambiguity and implicit genre expectations, and how frequently do they appear?
2. What is the relationship between prompts' genre specification and rubrics' genre-mastery expectations?
3. What is the relationship between genre expectations in state standards and writing assessment prompts?

3. Method
3.1 State Direct Writing Assessments and Standards
This study was carried out using data from a prior IES-funded study—the K-12 Writing Alignment Project (Troia & Olinghouse, 2010-2014). In the K-12 Writing Alignment Project, email inquiries and phone calls were used to request documents from appropriate assessment personnel located through states' Department of Education websites. Because the K-12 Writing Alignment Project examined the alignment between state writing standards and assessments prior to the adoption of the CCSS and used the NAEP 2007 assessment for its inclusion of state-level data, state direct writing assessments were gathered mainly from between 2001 and 2006 to represent that time period. Representative state writing assessment documents, including a representative writing prompt, its rubric, and the administrative manual for each genre in each grade being assessed, were collected from each time span between major revisions of state assessments. This study examined 78 prompts and 35 rubrics from 27 states3 (see Appendix C for details). No NAEP data existed for Alaska, Nebraska, Oregon, and South Dakota for the chosen time period. State writing standards or writing assessments were not available for Connecticut, Iowa, Pennsylvania, Montana, and New Mexico between 2001 and 2006. No 7th grade or 8th grade writing assessment existed in Ohio during the period 2001-2006. As a result, this study did not include these states' direct writing assessments.

3 The following chose not to participate in the study: Colorado, Delaware, the District of Columbia, Georgia, Hawaii, Maryland, Minnesota, Mississippi, New Hampshire, New Jersey, North Dakota, South Carolina, Utah, and Wyoming.

The collected state direct writing assessment documents were compiled. Each compiled file contains the verbal directions from administration manuals for the direct writing assessments, the actual prompts, the supporting materials provided (e.g., a dictionary, a writer's checklist), the sessions arranged for the writing tests, the time given, page limits, and whether (and what kinds of) technology was used. There were as many compiled documents for each state as written responses expected from students each year. In other words, if students responded to only one prompt with rotated genres each year, there would be a single compiled document for that state containing a representative prompt from each rotated genre to represent the scope of genres assessed. These compiled documents and rubrics were later coded with the coding taxonomy. Similar procedures were applied to gathering state standards. Within each state and grade, all standards closely related to writing were coded. To ensure the reliability of coding within and across states, the unit of content analysis (i.e., the smallest grain size for a set of standards) was determined to be the lowest level at which information was presented most consistently in a set of standards and was designated level A. The next level of organization was designated level B, the next level C, and so forth. Each individual code was applied within level A only once to avoid duplication, but multiple different codes could be assigned to any given unit.
To accommodate the potential for additional information presented at higher levels of organization for a set of standards, unique codes were assigned at these superordinate levels (levels B, C, and so on), but duplication of codes from the lower levels was not allowed. Therefore, states' writing standards were rendered comparable regardless of their different organizations. In this study, genre expectations in state standards were examined only for grades 7 and 8.

3.2 Data Coding
Genre demands in prompts. To distinguish genre demands within each prompt, this study used syntactic analysis (Jonassen, Hannum, & Tessmer, 1999) to identify demand verbs and their corresponding objects in prompts. Key words such as main verbs were recorded, tallied, and treated as signals of rhetorical purposes. These signals were compared with the conventional genre classifications outlined in Glasswell, Parr, and Aikman (2001). When there were two or more genre demands within a prompt, the prompt was recorded as ambiguous. When there were no explicit verbs or nouns signaling genres, the prompt was recorded as containing an implicit expectation. All explicit verbs/nouns for genres (e.g., "argue," "convince") were recorded. Concordance software was used to count the frequencies of all explicit verbs/nouns.

Genres of prompts. This study used a seven-category genre coding scheme adapted from the third strand (purposes) of Troia and Olinghouse's (2010) coding taxonomy and Jeffery's (2009) genre coding scheme to code the genres of the prompts. Troia and Olinghouse's (2010) coding taxonomy ensured comprehensive coverage of writing purposes with 21 indicators. A preliminary frequency analysis of state writing prompts' genres coded with this coding taxonomy indicated that only seven genres were assessed in state writing assessments—expository, descriptive, persuasive, response-to-literature, descriptive, narrative, and summary. Jeffery's (2009) coding taxonomy was derived from an inductive analysis of state exit-level direct writing assessments and differentiated similar genre categories, such as persuasive and argumentative prompts and expository and informative prompts. Such differentiations were helpful in distinguishing similar genres in this study. Therefore, a seven-category genre coding scheme was used. These seven categories were: descriptive, persuasive, expository, argumentative, informative, narrative, and analytic. The author of this dissertation served as one of the two raters. A graduate student in Digital Rhetoric & Professional Writing served as the second rater. The two raters first practiced coding with a training set. When they reached 85% inter-rater agreement, they moved on to coding the actual prompts and reached an inter-rater reliability of .93. Differences were resolved through discussion.

Genre expectations in rubrics. This study used the five-criteria coding scheme developed by Jeffery (2009) to examine rubrics for genre-mastery expectations. While the coding scheme includes rhetorical, genre-mastery, formal, expressive, and cognitive rubrics, special attention was paid to the connection between genre demands in prompts and the genre-mastery category as coded in rubrics.
Genre-mastery rubrics emphasized criteria specific to the genre expected in the prompts; though these rubrics might contain descriptions that also signify other categories such as expressive, cognitive, or formal, all the descriptions were "framed by the specific communicative purpose that characterizes the genre" (Jeffery, 2009, p.11). Jeffery (2009) gave this example from a 6-point rubric in Nevada: "clarifies and defends or persuades with precise and relevant evidence." This example signified a genre-mastery category because of the expectation of effective persuasive writing. These rubric types represented what different "discourses of writing"—"constellations of beliefs about writing, beliefs about learning to write, ways of talking about writing, and the sorts of approaches to teaching and assessment which are likely to be associated with these beliefs" (Ivanic, 2004, p.224)—value as assessment criteria. The relationships between genre demands in prompts and rubric types illuminated the underlying proficiency conceptualizations contained in large-scale writing assessments (Jeffery, 2009). The two raters who coded the prompts followed the same procedure and coded the rubrics. They reached an inter-rater reliability of .86 and resolved differences through discussion.

Genre expectations in state standards. Genre expectations in state standards had been coded with the third strand (purposes) of Troia and Olinghouse's (2010) seven-strand coding taxonomy in the K-12 Writing Alignment Project. The genre expectations that appeared in those 27 states' grade 7 and grade 8 standards were recorded. The inter-rater reliability was .87 for standards coding. To allow genre expectations in state standards and writing prompts to be comparable using Jeffery's (2009) genre coding taxonomy, when the persuasive and expository genres were coded in the writing standards according to Troia and Olinghouse's (2010) coding taxonomy, they were further categorized as either persuasive or argumentative and either expository or informative, following Jeffery's (2009) genre coding taxonomy. As a result, genre expectations in state standards were coded with the third strand (purposes) of Troia and Olinghouse's (2010) seven-strand coding taxonomy, modified to accommodate Jeffery's (2009) genre coding scheme. In the current study, the "purposes" strand of Troia and Olinghouse's (2010) taxonomy was modified by breaking out persuasion and argumentation to accommodate Jeffery's (2009) genre coding taxonomy; the 21 writing purposes in the strand thus became 22 purposes. The author of this dissertation and a doctoral student in English Literature served as raters. The two raters coded the standards following the same procedure used for coding prompts and rubrics. They reached an inter-rater reliability of .86 and resolved differences through discussion.

3.3 Data Analyses
The percentages of prompts that were either ambiguous or contained implicit expectations were recorded. The key verbs/nouns associated with ambiguity and implicit expectations and their frequencies were also recorded. The connections between the ambiguity and implicit expectations of prompts and their rubrics' categories were examined, with special attention to the genre-mastery category. Genre expectations in standards were obtained from the coding of standards using Troia and Olinghouse's (2010) coding taxonomy modified to accommodate Jeffery's (2009) genre coding scheme. Ambiguity and implicit genre expectations in prompts were determined through the syntactic analysis of prompts in the data coding step described above. Genre expectations from state standards were presented alongside the genres assessed in state writing prompts. The genres assessed in state writing prompts were identified by the two raters using the seven-category genre coding scheme adapted from the third strand (purposes) of Troia and Olinghouse's (2010) coding taxonomy and Jeffery's (2009) genre coding scheme. When there was ambiguity in a prompt, the state's identification of the genre of the prompt was taken into consideration.
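Because the analyses that follow rest on these agreement figures, a small sketch of how such statistics can be computed is given below. It uses simple percent agreement plus scikit-learn's Cohen's kappa as one common chance-corrected index, with hypothetical rater codes; it is not the specific reliability calculation used in the K-12 Writing Alignment Project.

```python
from sklearn.metrics import cohen_kappa_score

# Hypothetical genre codes assigned to the same ten prompts by two raters.
rater_1 = ["expository", "narrative", "persuasive", "expository", "descriptive",
           "narrative", "informative", "persuasive", "expository", "analytic"]
rater_2 = ["expository", "narrative", "persuasive", "expository", "narrative",
           "narrative", "informative", "argumentative", "expository", "analytic"]

# Simple percent agreement, the criterion used during rater training.
agreement = sum(a == b for a, b in zip(rater_1, rater_2)) / len(rater_1)

# Cohen's kappa corrects that agreement for chance.
kappa = cohen_kappa_score(rater_1, rater_2)

print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")
```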
4. Results
4.1a. How many state writing prompts possessed the problematic features of ambiguity or implicit genre expectations?
Among the 78 prompts, 11 prompts from seven states were considered ambiguous, and seven prompts from four states were determined to have implicit genre expectations. In other words, 14% of prompts were ambiguous, and 9% of prompts had implicit genre expectations. Together, 23% of prompts possessed one of the two problematic features. Ambiguous prompts were mostly expository, narrative, argumentative, and informative prompts. The genre coding was based on the syntactic analysis of the prompts; however, in the case of ambiguity, states' identification of the prompts' genres was taken into consideration. There were six expository prompts that were ambiguous. For example, the Massachusetts 2002 prompt asked students to "think of someone who is [their] personal hero," "describe this person," and "explain two qualities they most admire about him or her" in "a well-developed composition." If students were only expected to "describe this person," the prompt could easily be categorized as descriptive, or if students were only expected to "explain two qualities," the prompt could easily be categorized as expository; however, when the two demand verbs were used in a parallel way, without any specific noun (e.g., "descriptive," "expository") to indicate the genre, it was hard to determine which genre was expected. A state contact in Massachusetts confirmed that the genre the prompt was written to assess was expository. Narrative prompts often had explicit directions for students, for example, "write a story" or "tell about a time when..." However, there were three cases in which narrative prompts employed demand verbs in a way that made the genre expectation ambiguous. For example, in a response-to-literature prompt from Indiana, students were provided with the situation that "if Bessie had kept a journal about her flying experiences, how might she have described her thoughts and emotions?" and directed to "write an essay in which you describe one of Bessie's flying experiences." Though "describe" might appear to suggest a descriptive text, "one of Bessie's flying experiences" indicated a particular experience; moreover, the "journal" context seemed to suggest a narrative retelling of what had happened. Furthermore, because "describe" was used across many different genres, it was hard to make a judgment about the expected genre based on the verb "describe" alone. Consequently, this prompt may have made it difficult for students to figure out whether they should spend more time describing Bessie's thoughts and emotions from her flying experience or telling a story about one of her flying experiences. Similarly, in the other two cases, "describe" and "explain" were used in an ambiguous way to prompt students' narrative skills.
There were only four argumentative prompts in this sample. None of them used “argue” as a demand verb; instead, these prompts used “explain” and “describe.” Moreover, the way in which a prompt from Virginia used the demand verb “explain” could lead students to interpret it as calling for expository composition. This prompt read, “Your school is planning to issue laptop computers to ninth graders next year. Do you think this is a good idea? Write to explain why or why not.” Unlike expository prompts, which often asked students to select or identify an item, an event, or a phenomenon to be explained, this prompt asked students to take a position on a two-sided issue and use reasons to support that position. It was therefore classified as an argumentative prompt; however, the use of “explain” as the demand verb made its genre expectation ambiguous.

There were only five informative prompts. These prompts also often used “explain” and “describe” as the demand verbs, with one exception. The prompt from Arizona read, “Your class has joined a pen pal program. You have selected a pen pal who lives in another state. Write a letter to your new pen pal introducing yourself and telling about your interests.” The verb “tell” is a rhetorical term often used in narrative writing to mean entertaining the reader through the recounting of experiences and happenings. In this prompt, however, “tell” was used as a synonym of “inform,” directing students to provide information about their interests rather than to construct or reconstruct a view of the world as a narrative often does.

Prompts with implicit expectations were mostly persuasive, expository, and argumentative prompts. Persuasive prompts often had explicit verbs such as “convince” or “persuade.” However, one persuasive prompt did not have any explicit verbs. This Kentucky prompt read, “Select one current issue that you feel people should be concerned about. Write a letter to the readers of the local newspaper regarding this issue. Support your position with specific reasons why the readers should be concerned about this issue.” This prompt did not have a demand verb that explicitly indicated any genre. However, “support” and “position” were often employed by persuasive and argumentative prompts, and, in contrast to argumentative prompts, persuasive prompts often contained an explicit reference to their audience; in this case, it was the readers of the local newspaper. Nevertheless, the lack of a demand verb left this prompt’s genre expectation implicit rather than explicit.

Two argumentative prompts also lacked explicit demand verbs. These two response-to-literature prompts from Michigan had very similar structures. One prompt read, “Is this a good example of seventh-grade writing? Why or why not? Use details from the student writing sample to support your answer.” There was no demand verb that explicitly indicated the genre to which students’ writing should conform. However, students were expected to take a position, arguing either that it was a good example or that it was not, and to use details to support that position. Such a genre expectation was considered implicit.

Though the majority of the expository prompts used the demand verb “explain,” there were still cases where students were given a topic and directed to write about it without a clear indication of the genre.
For example, a prompt from Arkansas read, “What advice would you consider the best? Why? Write an essay about the best advice. Give enough detail.” There was no explicit verb indicating the genre of this prompt. The noun “essay” also did not specify the genre because it could be used to refer to all kinds of writing, including persuasive, narrative, argumentative, and literary analysis essays. Though the prompt might be categorized as expository, because writing about a topic frequently requires explaining information about it, without an explicit demand verb the genre expectation remained implicit.

4.1b. Which key words in prompts were associated with ambiguity and implicit genre expectations, and how frequently did they appear?

The key words mentioned in the previous section that were associated with ambiguity and implicit genre expectations were “explain,” “describe,” “essay,” “support,” “discuss,” and “tell.” Table 6 below reports the frequencies of these words and the percentages of prompts in each genre in which they were used.

“Explain” was used in 69% of expository prompts and 83% of literary analysis prompts. It was also used in 22% of persuasive, 6% of narrative, 25% of argumentative, and 40% of informative prompts. In other words, “explain” was used in prompts of every genre except descriptive. Some of these uses evoked unconventional meanings of “explain.” For example:

(1) Write a fictional story about a day during your favorite season. Create a main character or characters and describe the action that takes place during that day. Explain where and when the story takes place (Indiana 2002 8th grade).

(2) Explain how someone lost a privilege as a result of not being responsible (Michigan 2006 8th grade).

(3) Compare your social life as a teenager with your social life as a young child. Explain how it is different and how has it remained the same? Support your main points with examples (Kansas 2004 8th grade).

In these three examples, “explain” was used in different ways. In the first case, “explain” was a synonym of “describe.” In the second case, it could be interpreted as “give an account of how someone loses a privilege,” while “lost” in the past tense also seemed to suggest that students should “tell a story of how someone lost a privilege.” In the third case, it was used in the traditional sense of providing information about the given topic.

“Describe” was also widely used in all genres except persuasive prompts.
Table 6
Frequency (F) and Percentage (P) of Key Word Usage in Genres

Key word      Persuasive (n=18)  Expository (n=26)  Narrative (n=16)  Argumentative (n=4)  Descriptive (n=3)  Informative (n=5)  Analysis (n=6)
explain       4 (22%)            18 (69%)           1 (6%)            1 (25%)              0 (0%)             2 (40%)            5 (83%)
detail        6 (33%)            11 (42%)           5 (31%)           3 (75%)              0 (0%)             2 (40%)            3 (50%)
support       7 (39%)            8 (31%)            0 (0%)            3 (75%)              0 (0%)             0 (0%)             5 (83%)
describe      0 (0%)             6 (23%)            6 (38%)           1 (25%)              3 (100%)           1 (20%)            2 (33%)
essay         4 (22%)            9 (35%)            1 (6%)            1 (25%)              0 (0%)             0 (0%)             3 (50%)
reason        8 (44%)            4 (15%)            0 (0%)            1 (25%)              0 (0%)             0 (0%)             0 (0%)
convince      8 (44%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
story         0 (0%)             0 (0%)             7 (44%)           0 (0%)               0 (0%)             0 (0%)             1 (17%)
tell          0 (0%)             1 (4%)             7 (44%)           0 (0%)               0 (0%)             1 (20%)            0 (0%)
persuade      6 (33%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
answer        0 (0%)             2 (8%)             0 (0%)            2 (50%)              0 (0%)             0 (0%)             2 (33%)
position      3 (17%)            0 (0%)             0 (0%)            1 (25%)              0 (0%)             0 (0%)             0 (0%)
idea          0 (0%)             1 (4%)             1 (6%)            0 (0%)               0 (0%)             1 (20%)            1 (17%)
conclusion    3 (17%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
persuasive    3 (17%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
response      0 (0%)             2 (8%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             1 (17%)
opinion       3 (17%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
compare       0 (0%)             2 (8%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
discuss       1 (6%)             2 (8%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
justify       1 (6%)             0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             1 (17%)
argue         2 (11%)            0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
point         0 (0%)             2 (8%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             0 (0%)
evidence      0 (0%)             0 (0%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             1 (17%)
theme         0 (0%)             2 (8%)             0 (0%)            0 (0%)               0 (0%)             0 (0%)             1 (17%)

Note. Each cell reports the number of prompts in that genre containing the key word (F) and the corresponding percentage of prompts in that genre (P).

As shown in Table 6, “describe” was used in 100% of descriptive, 23% of expository, 38% of narrative, 25% of argumentative, 20% of informative, and 33% of literary-analysis prompts. When used alone as the only demand verb in a prompt, “describe” often indicated a descriptive prompt; however, some states also used it by itself to indicate a narrative prompt. It was also used in combination with other demand verbs such as “explain” to indicate genres other than descriptive and narrative. Consider the following examples:

(1) Describe a time when you or someone you know had a difficult experience but learned a valuable lesson from it (Michigan 2006 7th grade).

(2) Think of a teacher that you will always remember. Describe this teacher (Alabama 2004 7th grade).

(3) Think of someone who is your personal hero. In a well-developed composition, describe this person and explain two qualities you most admire about him or her (Massachusetts 2002 7th grade).

The meaning of “describe” differed across these examples. In example (1), “describe” was used as the equivalent of “tell a story about a time when…,” while in examples (2) and (3) it was used in the traditional sense of “provide details and attributes about something.” Unlike example (2), however, example (3) used “describe” in conjunction with “explain” to indicate another genre. When “describe” was used alone to indicate genres other than descriptive, or was used with other demand verbs in a parallel manner, ambiguity in genre expectations often resulted.

“Essay” was another popular word used in prompts; however, its lack of genre specification made it similar to other abstract nouns such as “writing,” “composition,” or “answer.” Among its eighteen occurrences, only twice was it preceded by a word that explicitly indicated a genre, such as “expository” or “persuasive.” At other times, “essay” was used with demand verbs that clearly indicated genres. However, when “essay” was used alone, as in the example “What advice would you consider the best? Why? Write an essay about the best advice. Give enough detail,” it did not add much to the genre specification of the prompt.
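The tallies reported in Table 6 can be reproduced with a simple counting routine over the coded prompts. The sketch below is a minimal illustration, assuming hypothetical prompt records (an assessed-genre label plus the key words found in each prompt); it is not the coding instrument itself.

    from collections import defaultdict

    # Hypothetical per-prompt records produced by the coding step: each record
    # carries the assessed genre and the key words found in the prompt text.
    prompts = [
        {"genre": "expository",  "key_words": {"explain", "essay", "detail"}},
        {"genre": "persuasive",  "key_words": {"convince", "reason", "support"}},
        {"genre": "narrative",   "key_words": {"story", "tell"}},
        {"genre": "descriptive", "key_words": {"describe"}},
        # ... one record per prompt in the sample
    ]

    key_words = ["explain", "detail", "support", "describe", "essay", "story", "tell"]

    # Group prompts by genre, then count how many prompts in each genre use each key word.
    by_genre = defaultdict(list)
    for p in prompts:
        by_genre[p["genre"]].append(p)

    for genre, group in sorted(by_genre.items()):
        for word in key_words:
            freq = sum(word in p["key_words"] for p in group)
            if freq:
                pct = 100 * freq / len(group)
                print(f"{genre:12s} {word:10s} F={freq} P={pct:.0f}%")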
“Support” was another word used widely across genres; it appeared in persuasive, expository, argumentative, and literary-analysis prompts. The term “support” was traditionally used in persuasive or argumentative prompts in combination with words such as “position,” “points,” and “evidence.” However, this study showed that “support” was used in 31% of expository prompts. Across these uses, “support” was asked to reinforce a variety of things: “opinion,” “ideas,” “position,” “theme,” “points,” “response,” “answer,” “details,” “reasons,” and “conclusions.” The use of “opinion,” “conclusions,” and “position” was strongly associated with persuasive and argumentative essays. The use of “reasons” was strongly associated with persuasive writing; however, it was also used with expository writing. Surprisingly, “points,” which is traditionally more associated with persuasive writing, was used only in expository prompts. “Answer” and “details” were used more often with expository writing than with any other genre.

“Discuss” was used only three times in the 78 prompts. “Discuss” did not signify any specific genre by itself; however, in each case it was used in conjunction with other demand verbs. Here are the three examples:

(1) Describe a special privilege or right that people your age are sometimes given and discuss the responsibilities that go with it (Michigan 2006 8th grade).

(2) Your teacher has asked you to write an essay discussing what you would do if you could be President for one day… Now write an essay about what you would do if you could be the President for one day… Explain your ideas clearly so that your teacher will understand (Arkansas 2007 8th grade).

(3) The Television Advertisers Association is sponsoring an essay contest for students. Students are invited to submit essays that discuss ONE thing about television advertising they believe should be changed. Write an essay for the contest identifying the change that should be made and persuading your reader why this change is important (Wisconsin 2007 8th grade).

In these three prompts, “discuss” was used with “describe,” “explain,” and “persuade.” The use of “discuss” alone did not indicate the genre of the prompt; thus, the genre specification depended on the interaction between “discuss” and the other demand verbs. In examples (2) and (3), “explain” and “persuade” were used in the traditional sense and “discuss” reinforced the rhetorical purpose expected, so each prompt could be easily categorized (as expository and persuasive, respectively). In example (1), however, “discuss” added another task beyond “describe” without specifying the genre, which made the prompt ambiguous.

“Tell” was a verb that often explicitly indicated narrative writing. However, in this study, “tell” was also found in expository and informative prompts. Consider the following examples:

(1) Write a narrative composition telling about ONE time you observed something that was really strange or weird (Illinois 2010 8th grade).

(2) Write an editorial for the local newspaper about the importance of being kind to others. Tell about a time when you observed or participated in an act of kindness. Support your response with details or examples (Kentucky 2007 8th grade).

(3) Write a letter to your new pen pal introducing yourself and telling about your interests (Arizona 2005 7th grade).
(4) Think about a person who has had an influence on you and your life … Write an essay telling who this person is and explaining why he/she has had such an influence on you (Alabama 2004 7th grade).

In examples (1) and (2), “tell” was used in the conventional way. Example (1) was an explicit narrative prompt. In example (2), by contrast, students were expected to explain the importance of being kind to others while also telling about an event; the expectation that the event should be “told” to support the explanation was implicit, which resulted in ambiguity. In example (3), “tell” was used as a synonym for “provide details,” and in example (4) as a synonym for “identify.”

These results show that these genre-associated key words were often used in ambiguous ways. There was little consensus about how they should be used to make genre expectations clear and explicit for students.

4.2. What is the relationship between prompts’ genre specification and rubrics’ genre-mastery expectations?

Among the 32 prompts that were used with genre-mastery rubrics, five prompts from three states possessed problematic features (i.e., ambiguity or implicit expectations). In other words, among the 15 prompts that possessed problematic features, five were used with genre-mastery rubrics. These genre-mastery rubrics directed raters to evaluate students’ compositions primarily in terms of whether they demonstrated mastery of the genres. Table 7 shows the five prompts with problematic features that were used with genre-mastery rubrics, including each prompt’s rhetorical purposes, key words, genre assessed as a result of prompt coding, and problematic feature.

Table 7
Prompts with Problematic Features and Used with Genre-mastery Rubrics

IN 2002 G8
  Rhetorical purposes: write a fictional story; create a main character or characters; describe the action; explain where and when; details; an event or series of events
  Key words: fictional, story, character, describe, action, explain, detail, event
  Genre assessed: Narrative
  Problematic feature: Ambiguity

IN 2003 G8
  Rhetorical purposes: write an essay; describe one of Bessie’s flying experiences; include two ideas from the poem
  Key words: essay, describe, experience, idea
  Genre assessed: Narrative
  Problematic feature: Ambiguity

KY 2008 G8
  Rhetorical purposes: select one current issue; write a letter to the readers of the local newspaper; support your position with specific reasons
  Key words: select, issue, letter, support, position, reason
  Genre assessed: Persuasive
  Problematic feature: Implicit genre expectation

KY 2007 G8
  Rhetorical purposes: write an editorial for the local newspaper about the importance of being kind to others; tell about a time; support your response with details or examples
  Key words: editorial, tell, time, support, response, detail, example
  Genre assessed: Expository
  Problematic feature: Ambiguity

VA 2011 G8
  Rhetorical purposes: write to explain why or why not
  Key words: explain
  Genre assessed: Argumentative
  Problematic feature: Ambiguity

The rubrics used with these five prompts all encompassed, implicitly or explicitly, the genres to be assessed. However, the interplay between the ambiguity in the prompts and the criteria in the rubrics might further complicate the writing assessments, as illustrated below. In Indiana’s writing rubrics, students were assessed on whether their compositions fully accomplished tasks such as supporting an opinion, summarizing, storytelling, or writing an article. They were also assessed on whether they “included vocabulary to make explanations detailed and precise, description rich, and actions clear and vivid.” In other words, the writing rubrics included a range of genres.
However, the ambiguity in prompts can interfere with students’ understanding of what the task entails. One example was the 2002 prompt, which was intended to assess students’ storytelling ability. It asked students to include key elements of narrative composition such as “main character or characters,” “actions,” “where and when,” and “event,” implying that to fully accomplish the storytelling task students had to include these elements. The prompt used language emphasizing “describe actions” and “explain when and where.” However, the loose use of “explain” as a synonym of “describe” might have led students to believe they were expected to provide reasons for choosing the place and the time of the event instead of simply describing them. This ambiguity could have interfered with students’ ability to accomplish the task as assessed in the rubrics. If students, following this interpretation, provided “detailed and precise” explanations of “where and when,” which the rubrics also rewarded, how should their compositions be evaluated?

Similarly, the 2003 prompt also sought to assess students’ storytelling ability. Its main demand verb, “describe,” could have distracted students from telling about one of Bessie’s flying experiences using key elements of narrative composition and instead directed them to provide a “rich” description of Bessie’s flying experiences in general. In this case, how should their compositions have been evaluated? Could students’ “detailed and precise” explanations and “rich” descriptions compensate for seemingly off-task performance? These two examples illustrate that ambiguity in writing prompts can lead students to write compositions in an unexpected way yet still meet the evaluation criteria of the rubrics, which complicates the evaluation of students’ writing abilities.

In Kentucky’s 8th grade writing rubrics, students were assessed on whether they skillfully applied characteristics of the genre; the rubrics did not identify the specific genres corresponding to the prompts. The 2008 prompt did not explicitly specify the genre to be assessed, which left interpretation of the intended genre to students and raters. The 2007 prompt directed students to tell about an event while explaining the importance of being kind to others. Such an arrangement is atypical for the expository genre on which students were assessed; thus, it would be challenging for raters to judge whether students skillfully applied characteristics of the assessed genre. This example shows that when prompts are ambiguous, there is little agreement on the genre that was meant to be assessed; even though the rubrics directed raters to assess students’ genre-mastery skills, raters cannot know what characteristics of the genre to look for in students’ writing. The ambiguity in prompts therefore undermines the rubrics’ emphasis on assessing students’ genre-mastery skills.

In Virginia’s writing test composing rubrics, students’ narrative organization was expected to be intact: minor organizational lapses might be permissible in other modes of writing, but in all types of writing a strong organizational plan was expected to be apparent. The rubrics included a range of genres but did not identify what those “other modes of writing” were.
The rubrics still expected an apparent, strong organizational plan while giving students some flexibility in structuring their texts. The 2011 prompt assessed students’ argumentative writing. The prompt’s use of “explain” rather than “argue” might have led students to interpret it as an expository prompt. As a result, students might have organized their texts to make their explanations detailed and precise instead of focusing on employing strong and relevant evidence to support their positions on whether it was a good idea for their schools to “issue laptop computers to ninth graders next year.” Depending on their interpretations of the prompt, students’ organizational plans would differ, and students were assessed on these organizational plans. This example echoes the examples from Indiana and Kentucky and shows that 1) the ambiguity in writing prompts might lead students to write compositions in an unexpected way or in a different mode that nevertheless still meets certain criteria of the rubrics, thus complicating the evaluation of students’ writing abilities; and 2) the ambiguity in prompts undermines the rubrics’ emphasis on assessing students’ genre-mastery skills.

4.3. What is the relationship between genre expectations in state standards and writing prompts?

Table 12 in Appendix A shows the relationship between genre expectations in state standards and writing prompts. It includes the genres expected to be mastered at grades 7 and 8 in state standards, the percentage of each genre out of the total genre occurrences in that state’s standards, the genres assessed, and the percentage of the genres in the state standards that were actually assessed (e.g., if a state’s standards included five genres but only two were assessed, the percentage would be 40%). Genres that accounted for more than 10% of all genre occurrences in a state’s standards were bolded to highlight the more frequently mentioned genres.

Among the seven genres assessed, the most widely referenced genre was narrative, which appeared in 25 states’ writing standards; it was followed by persuasive (24 states), expository (23 states), informative (22 states), descriptive (12 states), analysis (7 states), and argumentative (4 states). Another 12 states’ standards implicitly referred to the argumentative genre by describing argumentative genre features without distinguishing argumentative from persuasive texts, and 11 states’ standards implicitly referred to features of literary analysis without labeling it as such.

Among the 27 states evaluated in this study, 12 covered all the genres they assessed in their writing standards and also referred to those genres more frequently than other genres in their standards. Another nine states covered all the genres they assessed in their writing standards but referred to those genres less frequently than some other genres. Most importantly, six states’ writing standards did not cover all the genres they assessed. Alabama and North Carolina included persuasive writing in their writing assessments, but persuasive writing was not covered in their writing standards. Maine included descriptive writing in its writing assessments, but descriptive writing was not addressed in its writing standards. Oklahoma assessed expository writing, which was not included in its writing standards.
Virginia assessed argumentative writing, which was not covered in its writing standards. Finally, West Virginia’s writing assessments contained both descriptive and narrative compositions, yet neither of these genres was covered in its writing standards.

The percentage of the genres in state standards that were actually assessed ranged from 0% to 60%, with an average of 18%. For example, North Carolina included the following writing purposes in its standards: narrate, express, explain, inform, analyze, reflect, and evaluate. None of these purposes was assessed; instead, persuasive composition was assessed in its writing assessments, so 0% of the genres in North Carolina’s writing standards were assessed. Vermont included the following writing purposes in its standards: respond to literature (potentially covering literary analysis), direct, narrate, persuade, and inform; among them, literary analysis, persuasive, and informative composition were assessed in the New England Common Assessment Program (NECAP) direct writing assessment, so 60% of the genres in Vermont’s writing standards were assessed.

5. Discussion

5.1 Ambiguity in Prompts

The results identified five scenarios that create ambiguity and implicit expectations in state writing prompts: a) the meanings of demand verbs are evoked in unconventional ways, such as “describe a time” or “explain where and when”; b) demand verbs are absent from prompts, for example, “write an essay about the best advice and give enough detail”; c) two demand verbs that signal different genres are used in a way that competes for writers’ attention, e.g., “describe a person and explain two qualities”; d) demand verbs such as “describe,” “explain,” “support,” and “discuss,” which are widely used across a variety of genres, are used on their own without supplemental information to specify the genre; and e) nouns like “writing,” “response,” “essay,” or “paragraph” are used by themselves to denote the type of writing expected, without any other genre-specific demand verbs or nouns.

The findings suggest that “explain,” “describe,” “essay,” “support,” “discuss,” and “tell” were often used in ambiguous ways or were employed to refer to genre implicitly. These findings confirm Beck and Jeffery’s (2007) assertion about the lack of consensus in the use of the demand verbs “explain” and “discuss,” as well as terms such as “support,” and they further point to unspecified uses of terms such as “describe,” “essay,” and “tell.” “Discuss” appeared much less frequently in middle school prompts than in exit-level high school prompts, whereas “describe” appeared much more frequently.

The introduction section of this chapter discussed possible reasons why the above five scenarios occur: test designers use terminology considered most familiar to students, or they purposefully include conflicting genre demands to give students a choice in their compositions (Beck & Jeffery, 2007). These reasons cannot justify the threats such ambiguities and implicit expectations pose to the validity of state writing assessments. When writing prompts can be interpreted in multiple ways, students may produce compositions that are not representative of their writing abilities. This may also lead to unclear expectations about writing performance, resulting in unfair judgments of students’ writing abilities.
5.2 Genre Expectations in Standards, Rubrics, and Prompts

About 45% of state writing standards covered all the genres the state assessed and referred to those genres more frequently than other genres; 33% covered all the genres assessed but referred to them less frequently than some other genres. The alarming fact is that 22% of state writing standards did not cover the genres that were assessed in the corresponding state writing assessments. When state writing standards do not cover the genres included in state writing assessments, teachers are left to determine whether those genres are important enough to be taught. As a result, students receive different levels of preparation to write in those genres, and state writing assessments may end up assessing not only students’ writing abilities but also their preparedness for the tests.

Genre-mastery rubrics allow states to emphasize students’ mastery of genres in their evaluation criteria. When attention is given to genres and explicit direction is included in these rubrics, writing expectations are more likely to be concrete. Thus, utilizing genre-mastery rubrics with explicit genre-component directions for raters is helpful. If genres are well specified in prompts and evaluated with genre-mastery rubrics, students’ abilities to accomplish the tasks are more likely to be fairly assessed. It is especially problematic when a prompt is ambiguous or has implicit expectations and is paired with a rubric that emphasizes genre mastery (five prompts out of 68 in this study). In this scenario, not only are students given a prompt that can be interpreted in multiple ways, but their compositions are also assessed using criteria about which they are not provided enough information or explicit direction. Therefore, when students’ mastery of genre is an important criterion in rubrics, it is even more important that prompts be explicit with respect to genre expectations.

5.3 Validity of State Writing Assessments

The aspects described above posed potential threats to the validity of state writing assessments: the standards do not cover what is to be assessed, and the prompts do not explicitly specify genres while their rubrics assess students’ mastery of genres. Standards for test development emphasize that “the instructions presented to test takers should contain sufficient detail so that test takers can respond to a task in the manner that the test developer intended” (AERA/APA/NCME, 2011, p. 47). When writing rubrics assess students’ mastery of genres but the prompts do not explicitly specify the genres being assessed, students lack sufficient information to respond to those prompts in the way that test developers intended. If test designers purposefully include conflicting genre demands to give students choices in their compositions, there is little evidence that this practice actually helps “increase students’ engagement and allow them to demonstrate their best possible writing performance” (Beck & Jeffery, 2007, p. 76). Therefore, aligning assessments with state standards, aligning rubrics’ criteria with prompts’ genre expectations, and making prompts’ genre expectations explicit will help ensure the valid interpretation of state writing assessments.

6. Implications

State assessments should be aligned with state standards to ensure that the genres being assessed are also covered in the standards. This is important because state standards specify what students are expected to learn.
Teachers have been reported to increase their instructional emphasis on writing for specific genres in response to changes in standards (Stecher, Barron, Chun, & Ross, 2000). When genres are assessed without being specified in state standards, it is left to teachers to decide whether those genres are important for students to learn; as a result, students receive different levels of preparation to write in those genres and have to shoulder the consequences of high-stakes testing.

Prompts should make their assessed genres explicit. To avoid the five scenarios that tend to cause ambiguity and/or implicit genre expectations in state writing prompts, the following recommendations should be considered:
a) Try to include relevant demand verbs in prompts. For example, use “tell” in narrative prompts; use “persuade” in persuasive prompts (along with an explicit audience); use “argue” in argumentative prompts; and so forth.
b) Make sure that the meanings of demand verbs such as those above are evoked in conventional ways.
c) When two or more demand verbs that signal different genres have to be used in the same prompt, make their relationship explicit; in other words, make explicit how those rhetorical processes should work together to achieve a specified purpose. For example, if students are expected to explain the importance of being kind to others, tell about a time when they observed or participated in an act of kindness, and support their response with details or examples, then the prompt should specify the role of the narrative event in students’ compositions, such as “Write to explain the importance of being kind to others. In your expository essay, include details and an example in which you tell about a time when you observed or participated in an act of kindness to elaborate your idea.”
d) When demand verbs such as “describe,” “explain,” “support,” and “discuss,” which are widely used across a variety of genres, are used on their own, provide supplemental information giving more detail about genre expectations.
e) Use more concrete nouns that signify genres, such as “story,” “description,” “exposition,” “persuasion,” and “argument,” to indicate the expected responses.

These practices will help make genre expectations in prompts explicit. Future research can investigate whether state writing assessments are more likely to be fair assessments of students’ writing abilities under these circumstances—when the genres explicitly assessed in prompts are covered by state writing standards and genre-mastery rubrics are used to evaluate whether students’ compositions accomplish the specified task demands. More research is needed to examine the thinking processes that students adopt when reading writing-assessment prompts. Students’ understanding of prompt vocabulary is also a potential area for future research using procedures such as think-aloud protocols and interviews.

7. Limitations

This study explored only the coverage of genres in prompts, rubrics, and state standards. It did not explore the attributes of those genres students are expected to master, though such a study would contribute to our understanding of the genre knowledge specified in schooling. Meanwhile, genre expectations in state standards were examined only at grades 7 and 8.
On the one hand, this might have caused underrepresentation of genre expectations in some states, when genres expected and assessed at lower grades did not appear again in the seventh- and eighth-grade standards. On the other hand, the rationale for including only seventh- and eighth-grade standards was that if states intended certain genres to be mastered by seventh and eighth graders, they should include those genres in the standards for those grades regardless of whether those genres had appeared in earlier grades. It would be even more important for those genres to be specified in the standards for those grades if they were also assessed in the state’s writing assessments.

CHAPTER 4: Summary and Moving Forward

The three pieces of research presented in this dissertation have investigated the writing constructs underlying state and national writing assessments, explored the relationship between the differences in state and national assessments and students’ NAEP performances, and examined important components of writing assessments in depth. This chapter reviews the major findings, highlights implications for state writing assessments and the NAEP as well as for writing prompt design, and offers some future directions for research.

1. Major Findings

1.1 Prevalent Writing Practices

Among the 27 states examined, only three gave students a choice of prompts, indicating that this was not a popular practice (at least by 2007). The writing process approach influenced the writing assessments: the majority of states (26/27) directed students to plan, and more than half directed students to revise and edit. However, few states provided separate planning, revision, and editing sessions. Only seven states gave students two prompts. The one exception to the one- or two-prompt pattern was New York, which gave students four integrated writing tasks that included responding after both listening and reading activities. The integrated writing tasks in New York’s assessment suggest a potential path for increasing students’ writing opportunities by integrating listening and reading assessments with writing assessments.

The majority of states (20/27) specified an audience in their writing prompts, and at least 30% of writing rubrics emphasized the importance of authors’ consideration of the intended audience in their compositions. However, the writing prompts incorporated a wide range of audiences, including general “readers,” pen pals, and students’ classes, classmates, or teachers.

An emphasis on organization, content, and detail was a feature of almost all writing rubrics; word choice, sentence fluency, style, and grammar, including sentence construction, were also highly prized aspects of students’ papers. General conventions, such as capitalization, punctuation, and spelling, were also assessed by the majority of states. This shows that, regardless of rubric type, these aspects are considered necessary for demonstrating writing proficiency by most states. Only ten states included genre-specific components in their rubrics; components of persuasive essays were specified most often. While expository was the most frequently assessed genre (16/27 states), only four states specified components of expository essays in their rubrics. By 2007, only West Virginia had online writing sessions for its state direct writing assessments.
1.2 Genre Demands in Direct Writing Assessments

The most popular prompt genre in middle school assessments was expository, followed by persuasive, narrative, informative, analytic, argumentative, and lastly descriptive. Half of the rubrics were genre-mastery rubrics. Few rubrics emphasized creativity and critical thinking. Genre-mastery rubrics were used with all genres, while rhetorical rubrics were not used with descriptive prompts. About the same number of states used genre-mastery rubrics as used rhetorical rubrics. Only six states had genre-mastery rubrics that contained genre-specific components. This finding suggests that the genre evaluation criteria that states place on students’ writing are either vague or not fully utilized to assess students’ genre mastery.

1.3 State and National Alignment

State writing assessments and the NAEP align in their adoption of the writing process approach, their attention to audience and students’ topical knowledge, their accommodations through procedure facilitators, and their inclusion of organization, structure, content, details, sentence fluency, and semantic aspects, as well as general conventions such as punctuation, spelling, and grammar, in their assessment criteria. However, the NAEP writing assessment differs from many states’ writing assessments by having explicit directions for students to review their writing, giving students two timed writing tasks, making the informative genre—rarely assessed in state assessments—one of the three genres assessed, and including genre-specific components in its writing rubrics. One of the biggest differences between the NAEP and most state writing assessments is that all of the NAEP’s writing rubrics are genre-mastery rubrics with genre-specific components. Thus, when state and national writing assessment results are compared, these two assessments differ in the genres they assess, the time and the number of tasks they give to students, and the level and specificity of genre demands they emphasize in their evaluation criteria.

1.4 The Relationship between State-National Assessment Variability and Students’ NAEP Performance

Students’ preparedness for the NAEP tasks, namely the similarity of their home states’ assessments to the NAEP, was found to play a marked role in students’ performance on the NAEP. Students from states with writing assessments more similar to the NAEP performed significantly better than students from states with writing assessments more different from the NAEP. However, this predictor explains only a small amount of the variance in the outcome variable (students’ NAEP performance); consequently, it does not negate the interpretation of NAEP scores as an indicator of students’ writing abilities.

1.5 The Relationship between Students’ Characteristics and Their NAEP Performance

All of the students’ demographic variables were statistically significant in all models. More specifically, students who were English Language Learners, had IEPs, or were eligible for free/reduced-price lunch performed significantly worse than students without those characteristics. Black, Hispanic, and American Indian students performed significantly worse than White students, Asian students performed significantly better than White students, and female students performed significantly better than male students. Students who thought that writing helped share ideas performed better than students who did not.
Students’ perceptions of the importance of the NAEP writing test were not significantly related to their writing performance. Moreover, students who believed that they exerted more effort on the NAEP writing test did not perform as well as those who did not. Almost all of students’ writing activities inside the classroom were significantly related to their writing performance, the exception being the frequency with which students wrote letters or essays for school. However, some of these writing activities were negatively related to writing performance, including the frequency with which students wrote reports, personal/imaginative stories, and business writing, how regularly they brainstormed and worked with other students when writing, and how often they wrote one paragraph in math class. The frequency of students’ revision and writing in English class was consistently and strongly positively related to their writing performance.

All variables regarding students’ writing experiences were significantly related to their performance. However, some writing experiences were negatively related to writing performance, including how frequently students had used computers from the beginning when writing papers and whether teachers emphasized the importance of spelling/punctuation/grammar and the length of papers in their grading. Among the positively related variables, whether teachers emphasized papers’ quality or creativity and paper organization in their grading was consistently found to have a strong positive relationship with students’ writing performance.

1.6 Ambiguity in Prompts and Genre-mastery Rubrics

Among 78 prompts, 11 prompts from seven states were considered ambiguous, and seven prompts from four states were considered to have implicit genre expectations. In total, 23% of prompts possessed one of the two problematic features: 14% were ambiguous, and 9% had implicit genre expectations. Ambiguous prompts were mostly expository, narrative, argumentative, and informative prompts, while prompts with implicit expectations were mostly persuasive, expository, and argumentative prompts. Key words associated with ambiguity and implicit genre expectations included “explain,” “describe,” “essay,” “support,” “discuss,” and “tell.” Among the 15 prompts that possessed these problematic features (i.e., ambiguity or implicit expectations), five prompts from three states were used with genre-mastery rubrics. In other words, these three states expected students to show their mastery of genres that were assessed, but not clearly or directly signaled, in the prompts.

1.7 Genre Expectations in Standards and Genres Assessed

Among the seven genres assessed, the most widely referenced genre was narrative, which appeared in 25 states’ writing standards; it was followed by persuasive (24 states), expository (23 states), informative (22 states), descriptive (12 states), analytic (7 states), and argumentative (4 states). Another 12 states’ standards implicitly referred to the argumentative genre by describing argumentative genre features without distinguishing argumentative from persuasive writing, and 11 states’ standards implicitly referred to features of literary analysis without labeling it as such.
About 45% of state writing standards (12/27 states) covered all the genres assessed in those states and referred to those genres more frequently than other genres; 33% of state writing standards (9/27 states) covered all the genres those states assessed but referred to those genres less frequently than some other genres. Around 22% of state writing standards (6/27 states) did not cover all of the genres that were assessed in the corresponding state writing assessments.

2. Implications for Writing Assessment Practices

2.1 For State Writing Assessments and the NAEP

State assessments should be aligned with state standards to ensure that the genres assessed are also covered in state standards. Prompts should make their assessed genres more explicit. When states intend to evaluate students’ genre-mastery skills, it is helpful to include specific genre components in their rubrics so that their expectations are more explicit to students, raters, and educators. As time and resources allow, more writing opportunities should be provided to students so that their writing abilities can be assessed more accurately. These recommendations are also applicable to the new CCSS-aligned K-12 assessments developed by the SBAC and the PARCC.

Differences between state and NAEP assessments play a role in students’ performance on the NAEP. Students’ NAEP performances are a result of many factors, including the similarity of their home state assessments to the NAEP. When students’ performances on the NAEP are compared, we have to be aware of their different levels of preparedness resulting from their state writing assessments’ similarities to and differences from the NAEP. Instead of focusing on the differences between state and NAEP assessments, both the NAEP and states’ assessments can move forward by incorporating more evidence-based writing assessment practices, which are likely to shrink the differences between them. As a result, students’ performances on the NAEP would be less likely to be affected by their different levels of preparedness for NAEP tasks.

2.2 Writing Prompt Design

To make the assessed genres more explicit in writing prompts, the following practices are recommended:
a) Include relevant demand verbs in prompts whenever possible.
b) Make sure that the meanings of those demand verbs are evoked in conventional ways.
c) When two or more demand verbs that signal different genres have to be used in the same prompt, specify how those rhetorical processes should work together to achieve a specified purpose.
d) When demand verbs that are widely used across a variety of genres are used on their own, provide supplemental information giving more detail about genre expectations.
e) Use more concrete nouns and adjectives that signify genres in prompts.

3. Implications for Writing Instruction

Research has shown that process writing instruction, including gathering information, prewriting or planning, drafting, and editing, has a positive impact on the quality of students’ writing. Because some writing assessments also directed students to follow parts of the writing process, teachers should continue to adopt a writing process approach in their instruction. In addition to the writing process, teachers should also pay attention to contextual factors in writing instruction. By 2007, only West Virginia had online writing sessions for its state direct writing assessments.
However, the new generation of assessments developed by the two multi-state consortia is delivered on computer. Teachers can therefore provide students with more computer-based writing opportunities and use research to inform their awareness of the impact of word-processing software on the quality of students’ writing.

The results of this research suggest that prompts often did not specify genre expectations and that rubrics tended to emphasize different aspects of the writing construct. Teachers can therefore use rubrics in their writing instruction so that students not only gain a more explicit understanding of the writing expectations but also learn to use rubrics to inform their planning of writing.

In terms of writing components, organization, structure, content, and detail were emphasized in almost all writing rubrics. Teachers can provide paragraph structure and text structure instruction, because research has shown that this kind of instruction improves the quality of students’ writing. Because word choice, sentence fluency, style, and grammar, including sentence construction, were also highly prized aspects of students’ papers, teachers can use text models to direct students to examine specific attributes of texts and use sentence-combining exercises to improve students’ sentence construction and writing performance. Teachers should generally avoid traditional grammar instruction involving worksheets and decontextualized practice (Graham & Perin, 2007; Hillocks, 1984) and instead use students’ own writing as examples in their instruction and provide students with authentic editing opportunities. General conventions, such as capitalization, punctuation, and spelling, were also assessed by the majority of states. These conventions should be taught in developmentally and instructionally appropriate ways. In terms of spelling, previously taught words should be reinforced in written work and reviewed periodically to promote retention. Students should be encouraged to correct their own capitalization, punctuation, and spelling mistakes after practice and assessment occasions.

Certainly, none of this suggests that teachers should teach to the test, because large-scale writing assessments can incorporate only the measurable portion of the writing construct. Some expectations for writing performance in real-life contexts cannot be addressed in large-scale writing assessments due to various constraints, and even those expectations that are addressed might still raise the question of whether they can provide a valid and reliable assessment of students’ writing abilities. For example, integrated writing tasks are celebrated for their similarity to the writing tasks that students are likely to encounter in real life, but issues exist with their psychometric properties, such as how to distinguish students’ reading and writing abilities in such tasks. Therefore, a constant struggle in test design is to balance the content dimension of a test against its psychometric dimension. Because of this limitation of large-scale assessments, teachers’ instruction should not be limited to the content and format of large-scale assessments but should also provide students with learning opportunities that prepare them for real-life writing demands.

4. Next Steps for Research

More research is needed to investigate different methods of writing assessment, such as using integrated writing tasks.
More research is also needed on students’ assessment behaviors, such as their interactions with writing prompts, especially the thinking processes that students adopt when reading writing prompts. Students’ understanding of prompt vocabulary could be another area for future research using procedures such as think-aloud protocols and interviews.

Future research can investigate state-level differences when school- and teacher-level variables are entered as part of a multi-level model (a brief illustrative sketch of such a model is given below). The large amount of unexplained variance between states found in this study suggests that there are still more state-level variables to be explored, such as the alignment between states’ standards and assessments and the stringency of states’ accountability policies. Future research can also examine how subgroups are affected by alignment variability and whether other factors in the NAEP database might explain higher-than-expected achievement for students in subgroups. Another potentially fruitful area for future research is to investigate whether state writing assessments are more likely to be fair assessments of students’ writing abilities under the recommended circumstances—when the genres explicitly assessed in prompts are covered by state writing standards and genre-mastery rubrics are used to evaluate whether students’ compositions accomplish the specified task demands. Moreover, experimental research can examine connections between prompt design and student outcomes within states using generalizability theory, by varying aspects of prompt design.

It is hoped that these findings can advise test designers about which central characteristics of the writing construct have been valued in the past and can continue to be incorporated into future assessments, and which pitfalls to avoid when designing writing prompts. It is also hoped that these findings can raise the general public’s awareness that students’ performances on the NAEP reflect both their writing abilities and how well they are prepared for the type of assessment the NAEP uses. Furthermore, it is hoped that these findings will draw the assessment and writing research communities’ attention to validity-related issues in large-scale writing assessments and encourage more research investigating the components of these large-scale writing assessments in depth.
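To make the multi-level modeling mentioned above concrete, the following is a minimal sketch of a random-intercept model with students nested within states, predicting NAEP writing scores from a state-level alignment-difference index and a few student-level indicators. The data file and column names are hypothetical, and the sketch ignores NAEP sampling weights and plausible values, so it illustrates the general approach rather than reproducing the analyses reported in Chapter 2.

    import pandas as pd
    import statsmodels.formula.api as smf

    # Hypothetical analysis file: one row per student, with a state identifier,
    # a state-level alignment-difference index, and student-level indicators.
    df = pd.read_csv("naep_grade8_writing.csv")

    # Random-intercept model: students (level 1) nested within states (level 2).
    model = smf.mixedlm(
        "naep_score ~ alignment_diff + ell + iep + frl + female",
        data=df,
        groups=df["state"],
    )
    result = model.fit()
    print(result.summary())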
115 APPENDICES 116 Appendix A Tables Table 8 NAEP Coding & Frequency Counts and Percentage of States Strand States' Frequency Counts (n) and Percentage (p) G7 (N=15) G8 (N=18) Total (N=27) Indicators n p n p n p 101 General Writing Process 1 0.067 3 0.167 4 0.148 102 Topic/Genre Selection 2 0.133 3 0.167 3 0.111 103 Gather Information 2 0.133 4 0.222 5 0.185 G8 NAEP Writing 104 Pre-Writing/Planning 13 0.867 18 1 26 0.963 X Process 105 Drafting Text 15 1 18 1 27 1 X 106 Revising 9 0.6 9 0.5 15 0.556 X 107 Editing 9 0.6 12 0.667 18 0.667 108 Publishing 8 0.533 4 0.222 10 0.37 109 Strategies 2 0.133 9 0.5 10 0.37 201 Purpose 15 1 18 1 27 1 X 202 Task 15 1 18 1 27 1 X 203 Audience 14 0.933 13 0.722 20 0.741 X 204 Collaboration 0 0 0 205 Sharing 0 0 0 206 Feedback 0 0 0 207 Text Models 0 0 0 Writing 208 Guidance/Support 0 0 0 Context 209 Computer Technology 1 0.067 0 1 0.037 210 Procedural Facilitator 12 0.8 13 0.722 19 0.704 211 Reference Materials 8 0.533 6 0.333 11 0.407 212 Source Materials 4 0.267 5 0.278 7 0.259 213 Disciplinary Context 1 0.067 1 0.056 2 0.074 214 Writing In/Writing Out of School 0 215 Length of Writing 0 X 0 13 0.867 16 0.889 25 0.926 216 Quantity of Writing 3 0.2 6 0.333 7 0.259 X 217 Time for Writing 6 0.4 10 0.556 14 0.519 X 218 Sophistication 0 0 0 401 General Organization 15 1 18 1 27 1 X 402 General Structure 11 0.733 14 0.778 20 0.741 X 117 Table 8 (cont’d) 403 General Content 15 1 18 1 27 1 X 404 Elaboration/Detail 405 Genre Specific Organization & Content/Ideas 14 0.933 18 1 26 0.963 X 0 0 405A Narrative 3 0.2 3 0.167 5 0.185 Writing 405B Expository 3 0.2 1 0.056 4 0.148 Component 405C Persuasive 4 0.267 3 0.167 6 0.222 405D Poetic 0 405E Response to Writing 0 0.133 2 0.111 3 0.111 406 Sentence Fluency 12 0.8 17 0.944 24 0.889 407 Style 13 0.867 17 0.944 24 0.889 4 0.267 6 0.333 7 0.259 14 0.933 16 0.889 24 0.889 1 0.067 1 0.056 1 0.037 409 Semantic Aspects 410 Citations and References 411 Multimedia 0 501 General Conventions 502 Capitalization-General 0 0.6 16 0.889 22 0.815 11 0.733 12 0.667 19 0.704 0 0 0 503A Sentence Beginning 0 0 0 503B Word Level 0 503C Text Level 0 504 Punctuation-General 11 0.733 1 0.056 1 0 12 505 Punctuation-Specific 0.667 0.037 19 0.704 0 505A Sentence Ending 1 0.067 4 0.222 4 0.148 505B Clausal Linking 1 0.067 4 0.222 4 0.148 0 505D Word Level 0 0 1 0.056 0 1 0.037 506 Quotes/Dialogue 0 0 0 507 Handwriting-General 0 0 0 508 Handwriting-Manuscript 0 0 0 Writing 509 Handwriting-Cursive 0 0 0 Convention 510 Keyboarding 0 0 0 118 X 0 0 505C Parenthetical X 0 9 503 Capitalization-Specific X 0 2 408 Figurative Language X X Table 8 (cont’d) 511 Spelling-General 10 0.667 12 512 Spelling-Specific 0.667 18 0 512A Graphophonemic Elements 512B High-Frequency Words 2 0.667 0 0 1 0.056 1 0.037 0.133 5 0.278 6 0.222 512C Graphomorphemic Elements 0 0 0 512D Common Spelling Rules 0 0 0 512E Other Elements 0 1 0.056 1 0.037 0.867 15 0.833 24 0.889 513 Grammar-General 13 514 Grammar-Specific 0 2 0.133 2 0.111 3 0.111 514B Verbs & Verb Phrases 514C Pronouns & Pronominal Phrases 5 0.333 6 0.333 7 0.259 2 0.133 2 0.111 4 0.148 1 0.067 2 0.111 3 0.111 0 1 0.056 1 0.037 2 0.133 4 0.222 4 0.148 11 0.733 13 0.722 19 0.704 515 Formatting-General 2 0.133 2 0.111 2 0.074 516 Formatting-Specific 6 0.4 8 0.444 12 0.444 601 Topic Knowledge 9 0.6 13 0.722 19 0.704 514E Adverbs 514F Modifiers 514G Sentence Construction Writing 602 Genre Knowledge 0 0 0 Knowledge 603 Linguistic Knowledge 0 0 0 604 Procedural Knowledge 0 0 0 605 Self-Regulation 0 0 0 119 X 0 514A 
Nouns & Noun Phrases 514D Adjectives X X X

Table 9
Sample Sizes, Achievement, and Student Demographics, 27 State Grade 8 NAEP Reporting Sample
Columns: State; n; Weighted N; Mean Student Achievement; SE(Mean); % Black; % Hispanics; % Asian; % American Indian; % Female; % LEP; % With IEPs; % Free/reduced-price lunch

Alabama  2710  55739  147.579  1.346  35.9%  2.2%  0.8%  0.3%  50.5%  1.4%  10.7%  50.1%
Arizona  2644  69384  148.227  1.441  5.8%  38.8%  2.7%  6.6%  49.1%  9.3%  7.9%  45.6%
Arkansas  2369  33196  150.634  1.162  23.8%  7.3%  1.2%  0.5%  48.2%  3.7%  11.3%  52.7%
California  8121  461402  147.889  0.971  7.3%  48.0%  11.8%  1.2%  48.3%  20.1%  7.9%  48.8%
Florida  3903  186141  158.042  1.313  23.0%  23.8%  2.5%  0.3%  49.5%  5.2%  12.3%  42.9%
Idaho  2807  20291  154.248  1.177  1.0%  12.8%  1.4%  1.5%  47.3%  4.9%  8.1%  38.7%
Illinois  3870  146929  159.927  1.489  19.1%  17.8%  4.5%  0.1%  48.7%  2.6%  12.2%  40.1%
Indiana  2623  77274  154.758  1.339  12.6%  6.4%  1.0%  0.2%  50.2%  2.5%  11.1%  34.9%
Kansas  2660  32160  156.263  1.386  8.1%  11.6%  1.9%  1.5%  49.8%  3.7%  10.3%  36.1%
Kentucky  2491  43056  151.443  1.373  10.2%  1.6%  1.0%  0.1%  50.4%  1.1%  8.1%  46.5%
Louisiana  2336  46721  146.693  1.258  43.6%  2.3%  1.2%  0.9%  48.2%  0.9%  11.5%  59.9%
Maine  2520  14596  161.034  1.066  1.6%  0.8%  1.5%  0.2%  49.4%  1.6%  15.0%  33.7%
Massachusetts  3437  64751  166.754  1.567  9.0%  10.5%  5.4%  0.2%  47.9%  3.5%  13.9%  26.5%
Michigan  2526  116199  151.058  1.338  18.6%  2.8%  2.3%  0.9%  49.8%  1.6%  11.2%  32.5%
Missouri  2776  69320  152.83  1.201  18.8%  2.6%  1.5%  0.2%  49.5%  1.7%  11.0%  37.5%
Nevada  2525  27139  143.094  1.046  10.5%  34.8%  8.2%  1.5%  49.2%  9.4%  10.7%  38.0%
New York  3647  199919  154.181  1.262  19.0%  17.9%  6.7%  0.3%  49.8%  4.1%  13.8%  48.0%
North Carolina  4042  101678  152.833  1.242  30.2%  7.1%  2.4%  1.3%  49.3%  3.7%  13.5%  44.0%
Oklahoma  2527  41091  152.789  1.2  9.4%  8.3%  2.2%  19.8%  49.5%  3.2%  12.8%  48.4%
Rhode Island  2566  11446  153.816  0.768  7.9%  17.2%  3.0%  0.5%  49.7%  2.8%  17.0%  31.4%
Tennessee  2725  71516  156.156  1.301  25.6%  4.7%  1.5%  0.1%  49.2%  1.8%  8.6%  45.3%
Texas  6783  278798  151.059  1.165  15.8%  43.8%  2.9%  0.2%  49.4%  6.4%  7.2%  50.4%
Vermont  1955  6679  161.534  1.078  1.6%  1.0%  1.4%  0.5%  47.2%  2.2%  16.7%  27.6%
Virginia  2631  84978  156.931  1.259  27.8%  5.7%  4.4%  0.2%  49.1%  2.8%  9.9%  26.9%
Washington  2840  73881  157.735  1.417  6.1%  12.7%  9.6%  2.4%  48.0%  4.7%  8.6%  34.4%
West Virginia  2818  21229  146.265  1.177  5.3%  0.8%  0.6%  0.1%  50.1%  0.6%  14.2%  47.4%
Wisconsin  2585  59616  157.71  1.411  9.6%  6.2%  3.3%  1.2%  49.0%  3.4%  11.3%  29.9%
Total  85437  2415129

Note. The means and percentages reported are for the samples weighted to represent U.S. students.
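For readers unfamiliar with how figures like the weighted means in Table 9 are produced, the short sketch below illustrates a design-weighted state mean computed from NAEP-style plausible values. It is an illustration under assumed column names (plaus1 through plaus5, weight, state), not the NCES estimation code; the standard errors reported in Table 9 additionally rely on NAEP's jackknife replicate weights and the between-plausible-value variance component, which this sketch omits.

# Illustrative sketch only; column names are hypothetical placeholders.
import numpy as np
import pandas as pd

def weighted_state_mean(df, pv_cols, w="weight"):
    """Design-weighted mean writing score per state, averaged over the plausible values."""
    def one_state(g):
        # Weighted mean for each plausible value, then the average across the five.
        pv_means = [np.average(g[c], weights=g[w]) for c in pv_cols]
        return float(np.mean(pv_means))
    return df.groupby("state").apply(one_state)

# Hypothetical usage:
# df = pd.read_csv("naep_grade8_writing.csv")
# print(weighted_state_mean(df, ["plaus1", "plaus2", "plaus3", "plaus4", "plaus5"]))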
120 Table 10 Comparison of Sample Sizes and Student Demographics for 27 State Grade 8 NAEP Reporting Sample and HLM Sample Full Sample State AL AZ AR CA FL ID IL IN KS KY LA ME MA MI MO NV NY NC OK RI TN TX VT VA n 2710 2644 2369 8121 3903 2807 3870 2623 2660 2491 2336 2520 3437 2526 2776 2525 3647 4042 2527 2566 2725 6783 1955 2631 % Black % Hispa nics 35.9 2.2 % Asian % Ameri can Indian % Female 0.8 0.3 50.5 HLM Sample % ELLs % With IEPs % Free/ reducedprice lunch 1.4 10.7 50.1 5.8 38.8 2.7 6.6 49.1 9.3 7.9 45.6 23.8 7.3 1.2 0.5 48.2 3.7 11.3 52.7 7.3 23.0 48.0 23.8 11.8 2.5 1.2 0.3 48.3 49.5 20.1 5.2 7.9 12.3 12.8 1.4 1.5 47.3 4.9 8.1 38.7 17.8 4.5 0.1 48.7 2.6 12.2 40.1 6.4 1.0 0.2 50.2 2.5 11.1 11.6 1.9 1.5 49.8 3.7 10.3 36.1 10.2 1.6 1.0 0.1 50.4 1.1 8.1 46.5 1.6 2.3 0.8 1.2 1.5 0.9 0.2 48.2 49.4 0.9 1.6 11.5 15.0 10.5 5.4 0.2 47.9 3.5 13.9 26.5 2.8 2.3 0.9 49.8 1.6 11.2 32.5 1.5 0.2 49.5 1.7 11.0 34.8 8.2 1.5 49.2 9.4 10.7 38.0 19.0 17.9 6.7 0.3 49.8 4.1 13.8 48.0 9.4 7.1 8.3 2.4 2.2 1.3 19.8 49.3 49.5 3.7 3.2 13.5 12.8 17.2 3.0 0.5 49.7 2.8 17.0 31.4 4.7 1.5 0.1 49.2 1.8 8.6 45.3 1.6 27.8 1.0 5.7 2.9 1.4 4.4 0.2 0.5 0.2 49.4 47.2 49.1 6.4 2.2 2.8 7.2 16.7 9.9 2380 2251 2059 2243 2944 2195 2495 2136 3050 3452 2233 48.4 25.6 43.8 2309 44.0 7.9 15.8 3337 37.5 10.5 30.2 2460 33.7 18.6 2.6 3302 59.9 9.0 18.8 6361 34.9 8.1 43.6 2081 42.9 19.1 12.6 2199 48.8 1.0 n 2360 2248 2436 5951 50.4 1744 27.6 2301 26.9 121 % Asian % Ameri can Indian % ELLs % With IEPs % Free/ reducedprice lunch % Female 1.9 0.8 0.4 51.1 1.1 9.2 47.6 5.4 37.9 2.6 22.3 7.1 1.0 6.5 49.8 8.7 7.1 42.6 0.3 48.5 3.7 11.2 51.6 6.4 46.4 12.9 1.3 50.3 18.4 6.6 47.4 21.7 23.2 2.4 0.3 49.9 4.7 11.5 41.9 1.0 12.8 1.5 1.6 48.9 5.1 8.0 38.8 17.7 17.7 4.6 0.1 49.4 2.5 11.8 38.7 11.5 5.9 1.2 0.2 50.4 2.3 10.6 33.5 7.7 11.7 1.9 1.5 50.1 3.7 10.3 35.7 9.9 1.6 1.0 0.0 50.9 0.9 8.0 46.5 41.7 2.2 1.2 1.0 49.1 0.8 11.2 59.1 1.5 0.7 1.4 0.2 49.9 1.5 14.1 33.0 8.4 9.7 5.4 0.2 48.9 3.1 13.5 25.4 17.1 2.6 2.4 0.9 50.3 1.5 10.8 31.3 17.6 2.7 1.6 0.1 50.2 1.6 10.6 36.1 9.4 33.3 8.8 1.6 51.0 8.4 9.2 36.7 16.9 17.3 6.8 0.3 50.9 3.6 13.3 46.1 28.0 6.9 2.4 1.3 50.2 3.7 13.7 42.5 8.9 8.2 2.2 20.0 50.0 3.2 12.6 47.5 7.6 16.6 3.0 0.5 50.4 2.2 16.0 30.5 23.9 4.7 1.5 0.0 50.7 1.7 8.2 43.9 15.3 43.3 3.1 0.2 49.9 5.7 6.6 49.3 1.6 1.0 1.6 0.4 47.9 2.3 16.2 26.7 27.3 5.6 4.6 0.2 49.7 2.9 9.7 26.7 % Black % Hispa nics 33.2 Table 10 (cont’d) WA WV WI 2840 2818 2585 6.1 12.7 9.6 2.4 48.0 4.7 8.6 34.4 5.3 0.8 0.6 0.1 50.1 0.6 14.2 47.4 9.6 6.2 3.3 1.2 49.0 3.4 11.3 2418 2537 2272 29.9 85437 Total Note. The means and percentages reported are for the samples weighted to represent U.S. students. 
122 73754 5.3 12.7 9.4 2.3 48.9 4.4 8.0 33.4 4.8 0.9 0.7 0.2 50.8 0.7 13.7 46.7 8.4 6.4 3.3 1.2 49.3 3.4 11.7 28.9 Table 11 Raw Unweighted Descriptive Statistics of Variables in HLM Models VARIABLE NAME (N=73754 Students from 27 States) MEAN SD MIN MAX 9.97 1.53 7.48 15.2 Plausible Value 1 154.8 34.05 4.72 285.24 Plausible Value 2 154.8 34.12 0 284.44 Plausible Value 3 154.9 34.08 0 300 Plausible Value 4 154.9 34.18 0 283.28 Plausible Value 5 155.1 34.15 0 293.82 Black 0.16 0.36 0 1 Hispanic 0.18 0.38 0 1 Asian 0.04 0.2 0 1 American Indian 0.02 0.13 0 1 0.5 0.5 0 1 0.05 0.22 0 1 0.1 0.3 0 1 Free/Reduced-priced Lunch 0.44 0.5 0 1 Writing stories/letters is a favorite activity 2.17 0.94 1 4 2.6 0.89 1 4 State level Distance between NAEP and state writing assessments Student level Female English Language Learners (ELLs) Individualized Education Plan (IEPs) Writing helps share ideas How often teacher talk to you about writing 2.4 0.6 1 3 2.34 1.24 1 4 2.6 1.07 1 4 How often write a report 2.55 0.84 1 4 How often write an essay you analyze 2.53 0.93 1 4 How often write a letter/essay for school 2.38 0.92 1 4 How often write a story personal/imagine 2.43 0.96 1 4 How often write business writing 1.6 0.81 1 4 How often when writing-get brainstorm 1.9 0.62 1 3 How often when writing-organize papers 2.21 0.74 1 3 2.6 0.59 1 3 How often when writing-work with other students 2.09 0.68 1 3 Write paper-use computer from begin 1.97 0.74 1 3 Write paper for school-use computer for changes 2.24 0.75 1 3 Write paper for school-use computer for internet 2.49 0.63 1 3 How often write one paragraph in English class 3.56 0.76 1 4 How often write one paragraph in science class 2.86 1.01 1 4 How often write one paragraph in social studies/history class 3.13 0.95 1 4 How often write one paragraph in math class 1.98 1.1 1 4 How often teacher asks to write more than 1 draft 2.26 0.63 1 3 Teacher grades important for spelling/ punctuation/ grammar 2.59 0.57 1 3 How often write thoughts/observation How often write a simple summary How often when writing-make changes 123 Table 11 (cont’d) Teacher grades important for paper organization 2.55 0.59 1 3 2.6 0.57 1 3 Teacher grades important for length of paper 2.09 0.65 1 3 Difficulty of this writing test 1.47 0.71 1 4 Effort on this writing test 2.05 0.81 1 4 Importance of success on this writing test 2.67 1 1 4 Teacher grades important for quality/creativity 124 Table 12 Genre Expectations in Standards and Genre Assessed State Grade AL 7 AR 7 8 AZ 7 8 Genre Expectations % Total Genre Genre % Genre in Standards Occurrences Assessed Assessed Respond Narrative Poetic Express Exchange Expository Describe Research Respond Narrative Poetic Persuade Expository Describe Summarize Reflect Research Respond Narrative Poetic Persuade Expository Describe Reflect Research Record Respond Direct Narrative Poetic Exchange Persuade Expository Inform Describe Summarize Functional Record Respond Direct Narrative Poetic Exchange Persuade Expository Inform Describe Summarize Functional 9.1% 9.1% 9.1% 27.3% 9.1% 9.1% 9.1% 9.1% 10% 20% 5% 10% 20% 15% 5% 5% 10% 12.5% 12.5% 6.3% 18.8% 18.8% 12.5% 6.3% 12.5% 3.7% 3.7% 3.7% 14.8% 3.7% 14.8% 11.1% 11.1% 14.8% 7.4% 7.4% 3.7% 3.8% 3.8% 3.8% 15.4% 3.8% 15.4% 7.7% 11.5% 15.4% 7.7% 7.7% 3.8% Descriptive Expository Narrative Persuasivea 38% Persuasive Expository 22% Expository 13% Informative 8% Narrative 8% 125 Table 12 (cont’d) CA 7 FL (1996) 6-8 FL (2007) 8 ID 7 IL 8 Respond Narrative Persuade Expository Describe Summarize Research Argumentative* 
Record Respond Direct Narrative Express Exchange Persuade Expository Inform Reflect Argumentative* Analysis* Record Remind Direct Narrative Poetic Express Exchange Persuade Expository Inform Summarize Research Argumentative* Respond Direct Express Persuade Expository Inform Analyze Evaluate Research Record Remind Direct Narrative Poetic Express Exchange Persuade Expository Inform Analyze Synthesize 14.3% 14.3% 14.3% 14.3% 14.3% 14.3% 14.3% -------13.6% 22.7% 4.5% 9.1% 9.1% 4.5% 9.1% 13.6% 9.1% 4.5% --------------5.6% 5.6% 16.7% 11.1% 5.6% 5.6% 5.6% 5.6% 11.1% 16.7% 5.6% 5.6% -------7.7% 7.7% 7.7% 15.4% 15.4% 23.1% 7.7% 7.7% 7.7% 1.6% 1.6% 1.6% 15.6% 7.8% 1.6% 10.9% 14.1% 10.9% 17.2% 3.1% 3.1% 126 Narrative Persuasive Analysis Informative (Summary) 57% Expository Persuasive 20% Expository Persuasive 17% Expository 11% Narrative Persuasive 13% Table 12 (cont’d) IN 7 8 KS 8 KY (1999) 7 8 KY (2006) 7 Evaluate Research Functional Argumentative Remind Respond Narrative Exchange Persuade Expository Inform Describe Summarize Research Argumentative* Analysis* Remind Respond Direct Narrative Exchange Persuade Expository Inform Describe Synthesize Summarize Research Argumentative* Analysis* Direct Narrative Exchange Persuade Expository Inform Argumentative* Record Respond Express Summarize Reflect Respond Synthesize Reflect Respond Narrative Poetic Express Exchange Persuade Expository Inform Describe 4.7% 1.6% 3.1% 1.6% 7.1% 7.1% 14.3% 14.3% 7.1% 14.3% 14.3% 7.1% 7.1% 7.1% --------------5.9% 5.9% 5.9% 11.8% 17.6% 5.9% 11.8% 11.8% 5.9% 5.9% 5.9% 5.9% --------------21.4% 7.1% 7.1% 35.7% 7.1% 21.4% -------16.7% 16.7% 16.7% 16.7% 33.3% 25% 25% 50% 4.8% 4.8% 4.8% 14.3% 4.8% 4.8% 9.5% 14.3% 4.8% 127 Narrative Persuasive Analysis 30% Expository Narrative 17% Expository Informative 33% Persuasivea Narrativea 0% Persuasivea Expositorya 0% Persuasive Narrative 13% Table 12 (cont’d) 8 LA 7 8 MA 7 ME 5-8 Analyze Synthesize Summarize Reflect Research Functional Respond Narrative Poetic Express Exchange Persuade Expository Inform Analyze Synthesize Summarize Reflect Evaluate Research Functional Respond Narrative Exchange Persuade Expository Inform Describe Analyze Evaluate Research Functional Argumentative* Respond Narrative Exchange Persuade Expository Describe Analyze Evaluate Research Argumentative* Respond Narrative Poetic Expository Inform Research Analysis* Narrative Express Exchange Persuade 4.8% 4.8% 4.8% 9.5% 4.8% 4.8% 5.3% 5.3% 5.3% 15.8% 5.3% 5.3% 5.3% 10.5% 5.3% 5.3% 5.3% 10.5% 5.3% 5.3% 5.3% 18.8% 12.5% 6.3% 12.5% 6.3% 6.3% 6.3% 6.3% 6.3% 12.5% 6.3% -------7.1% 14.3% 7.1% 14.3% 14.3% 7.1% 14.3% 7.1% 14.3% -------10.0% 20.0% 10.0% 30.0% 20.0% 10.0% -------27.3% 9.1% 9.1% 9.1% 128 Persuasive Expository 13% Narrative Expository 18% Narrative Expository 22% Expository 17% Persuasive Expository Descriptivea 22% Table 12 (cont’d) MI 6-8 MO 7 NC 7 RI 8 VT 8 NV 8 Expository Inform Summarize Reflect Research Argumentative* Respond Narrative Poetic Persuade Expository Inform Synthesize Reflect Research Argumentative* Respond Narrative Exchange Persuade Expository Describe Summarize Argumentative* Analysis* Narrative Express Expository Inform Analyze Reflect Evaluate Respond Direct Narrative Poetic Persuade Expository Inform Describe Reflect Analysis* Respond Direct Narrative Persuade Inform Analysis* Respond Narrative Exchange Persuade Inform Describe 9.1% 9.1% 9.1% 9.1% 9.1% -------4.8% 19.0% 14.3% 23.8% 4.8% 19.0% 4.8% 4.8% 4.8% -------12.5% 12.5% 25% 12.5% 12.5% 12.5% 12.5% -------12.5% 12.5% 12.5% 12.5% 
12.5% 12.5% 25% 7.4% 11.1% 18.5% 11.1% 11.1% 7.4% 22.2% 7.4% 3.7% -------15.4% 23.1% 15.4% 23.1% 23.1% -------9.1% 9.1% 9.1% 9.1% 9.1% 9.1% 129 Persuasive Narrative Argumentative Expository 44% Expository 14% Persuasivea 0% Analysis Persuasive Informative 33% Analysis Persuasive Informative 60% Narrative 10% Table 12 (cont’d) NY 8 OK 8 TN 8 TX 7 Summarize Evaluate Research Functional Argumentative* Analysis* Record Respond Narrative Poetic Exchange Expository Inform Analyze Summarize Research Argumentative Record Respond Direct Narrative Exchange Persuade Inform Synthesize Summarize Reflect Evaluate Research Argumentative* Analysis* Draw Record Respond Direct Narrative Poetic Express Exchange Persuade Expository Inform Describe Synthesize Reflect Research Functional Argumentative* Analysis* Draw Record Respond Direct Request 9.1% 9.1% 18.2% 9.1% --------------4.2% 20.8% 8.3% 8.3% 12.5% 4.2% 12.5% 16.7% 4.2% 4.2% 4.2% 5.9% 5.9% 5.9% 11.8% 11.8% 5.9% 5.9% 5.9% 17.6% 5.9% 5.9% 11.8% --------------2.9% 2.9% 11.8% 5.9% 8.8% 5.9% 2.9% 5.9% 8.8% 14.7% 5.9% 5.9% 5.9% 5.9% 2.9% 2.9% --------------5.6% 16.7% 2.8% 5.6% 2.8% 130 Expository Analysis 18% Argumentative Expositorya 8% Expository 6% Narrative 6% Table 12 (cont’d) Narrative 8.3% Poetic 5.6% 11.1% Express Exchange 8.3% Persuade 2.8% Expository 2.8% 11.1% Inform Describe 2.8% Summarize 2.8% Reflect 2.8% Evaluate 2.8% Research 2.8% Argumentative 2.8% Analysis* -------VA 8 25% Narrative 25% Persuade 25% Expository 25% Inform WA 7 Record 4.4% Remind 1.5% Respond 2.9% Direct 2.9% 11.8% Narrative Poetic 8.8% Express 4.4% Exchange 4.4% 20.6% Persuade 10.3% Expository 13.2% Inform Describe 1.5% Analyze 1.5% Reflect 2.9% Evaluate 1.5% Research 4.4% Functional 1.5% Argumentative 1.5% WI 5-8 16.7% Respond 33.3% Narrative 16.7% Exchange 16.7% Persuade 16.7% Expository Argumentative* -------Analysis* -------WV 7 11.1% Poetic 11.1% Express 11.1% Exchange 11.1% Persuade 11.1% Expository 22.2% Inform 22.2% Research Note.*genres potentially covered by state standards. a assessed genres not covered by state standards. 131 Argumentativea 0% Expository Persuasive 11% Persuasive 20% Descriptivea Persuasive Narrativea Expository 29% Appendix B Coding Taxonomies Table 13 Prompt Coding—Troia & Olinghouse’s (2010) Coding Taxonomy 100s Writing Processes: Any aspect of the stages or specific strategies that one uses when producing a piece of writing Guiding Question: Is this something that relates to the writer’s actions in composing the text? Actions are things that the writer does. Actions are differentiated from the purpose guiding those actions, the products of those actions, or the knowledge required to initiate those actions. 
Indic ator 101 102 103 104 105 106 107 108 109 Definition Examples General Writing Process: A general reference to the writing process Topic/Genre Selection: The process of determining the general topic, theme, focus, point of view, or genre of the writing Gather Information: The process of collecting relevant information as it pertains to the topic Pre-Writing/Planning: The process of using activities prior to writing to generate, structure, or organize content Drafting Text: The process of producing written text that is later expected to be altered Revising: The process of altering existing text in order to better achieve communicative aims with content, organization, and style Editing: The process of altering existing text to better match expectations for writing conventions Publishing: The process of preparing the final form of a text possibly for public distribution Strategies: The process of using steps or supports in order to problem solve during the writing process proceed through the writing process, produce a well written paper using the writing process, the process of writing [Prewrite] establish a controlling idea or focus, generate and narrow topics, Develop a comprehensive and flexible search plan, selecting appropriate information to set context, research (for the purpose of gathering information) outlining, brainstorming, [Prewrite] generating ideas, [Prewrite] organize ideas, Draft: complete a draft demonstrating connections among ideas, Revise, rewrite (if clear that changes are being made to draft), Proofreading, revise for spelling, revise for capitalization, revise for punctuation, final copy, final draft, final product re-reading, time management, test-taking 200s Writing Context: The social, physical, or functional circumstances outside the writer that influence text production. Guiding Question: Is this something that is located outside the writer’s text and outside the writer’s mind? 
201 202 Purpose: General reference to the objective or intent in creating a piece of writing Task: General reference to the writing task given the writing task, writing is appropriate for the task at hand, writing in different genres, 132 Table 13 (cont’d) appropriate for the given topic, format requirements, context 203 204 205 206 207 208 209 210 211 212 213 214 215 216 217 218 Audience: General reference to a reader or readers for a piece of writing Collaboration: Cooperatively working with others to produce a piece of writing Sharing: Telling or showing ideas, plans, or a piece of writing to others that may or may not elicit a response; sharing can occur at any point during the writing process Feedback: Verbal or written information in response to an author's work at any point in the writing process received from peers or adults Text Models: Examples of structures, forms, or features used as explicit cues for text production Guidance/Support: Verbal or written assistance, aside from feedback, tailored to the needs of students during writing from peers or adults Computer Technology: Using a computer as a tool in the process of writing Procedural Facilitator: External material used to support the process of writing, Reference Materials: Sources of information consulted to support writing mechanics and formatting Source Materials: Reference to source materials that are integrated into the written content Disciplinary Context: The general or particular academic setting (content area) in which a piece of writing is produced is specified Writing In/Writing Out of School: The general place in which a piece of writing is produced is specified Length of Writing: Length of a piece of writing is specified Quantity of Writing: The number of pieces of writing is specified Time for Writing: Duration and/or frequency of sustained student writing is specified Sophistication: Expectations for complexity in a given text tell a peer ideas for writing peer conferencing to elicit suggestions for improvement Use literary models to refine writing style. with the help of peers, with teacher modeling, with assistance, in response to a prompt or cue, using dictation digital tools, use appropriate technology to create a final draft. rubric, checklist, graphic organizer, story map dictionaries, thesauruses, style manual web sites, articles, texts, documents, encyclopedic entries writing across the curriculum, writing for a range of discipline specific tasks, writing a procedural text in science, writing in the content areas Brief, multi-page, short, long, # paragraphs specified portfolio, several, numerous 60 minutes, over two sessions, routinely multiple perspectives, sensitivity to cultural diversity 300s Writing Purposes: The variety of communicative intentions that can be accomplished through many different genres. Guiding Question: Is this something that relates to why the writer is writing and does not appear in the actual text? 
133 Table 13 (cont’d) 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 319 320 321 Draw: Producing a picture or diagram for the purpose of communicating Record: Copying text or taking notes on information Remind: Bringing attention to something for the purpose of recall Respond: Responding to a stimulus, such as a question, prompt, or text, through writing Direct: Giving directions, commands, or procedures Request: Asking for information or action Entertain/Narrate: Giving an account, either fictional or factual that often provides amusement and enjoyment Poetic: Evoking imagination or emotion through intentional manipulation of form and language Express: Conveying thoughts, feelings, or beliefs for personal reasons Exchange: Conveying thoughts, feelings, or beliefs for social reasons Persuade: Convincing an identified audience to act on a specific issue Exposit/Explain: Explaining, clarifying, or expounding on a topic; this may be done generally or in depth through elaboration Inform: Giving facts about a subject which may or may not be integrated Describe: Giving details/attributes about an object or event Analyze: Systematically and intentionally examining something through details and structure Synthesize: Combining various things into one coherent, novel whole Summarize: Using a brief statement or paraphrase to give the main points Reflect: Thinking deeply and carefully about something after the fact, often using written text to learn Evaluate: Examining the match between others’ writing intent and form using criteria Research: Using systematic investigation to obtain information/knowledge for a piece of writing Functional: Completing forms, applications, and other fill-in types of documents illustration, picture, diagram, drawing note taking, copy reminder, list response, personal response, on-demand writing, text as stimulus for writing something new, response to literature how-to, procedure, instructions, manual, technical text request, solicitation narrative, personal narrative, story, memoir, recount, biography, autobiography, fiction, fantasy, fable, folktale, myth, legend, adventure, mystery, tall tale, fairytale, drama, short story poetry, free verse, haiku, lyric, ballad, rhyme, sonnet, couplet, cinquain, limerick, dactyl, ode journal writing, diary writing email, blog, letter, editorial persuasive essay, explanation, essay, exposition informational piece, article, report description, descriptive text critique, literary criticism synthesis, lab report, summary, synopsis, paraphrase reflections, reflective writing, writing-to-learn book review experiments checks, resumes 600s Writing Metacognition & Knowledge: Knowledge resources within the writer that are drawn upon to compose a written text and/or knowledge that is the focus of development during 134 Table 13 (cont’d) instruction (explicit reference to knowledge, recognition, distinguishing, identifying, recognizing, learning, or understanding must be made) or reflection on the knowledge one possesses. Guiding Question: Is this something that is happening in the student’s mind (e.g., thinking about or analyzing their writing)? If it is something that the student is doing, or that is revealed in their writing, it cannot be a 600. 
601 602 603 604 605 Topic Knowledge: Knowledge of facts, information, or experiences pertaining to a particular subject that are within the writer and used to compose a written text Genre Knowledge: Knowledge about the purposes of writing and/or the macrostructures of a text that are within the writer and used to compose a written text Linguistic Knowledge: Knowledge of the microstructures of a text that are within the writer and used to compose a written text Procedural Knowledge: Knowledge of the procedures or processes of writing that are within the writer and used to compose a written text Self-Regulation: The process of explicitly managing, reflecting upon, and/or evaluating one's behaviors, performance, thoughts, or feelings 135 use personal experience to develop content for an essay, through experimentation, develop knowledge about natural phenomena for writing text attributes, elements, structure common to specific types of writing sound-symbol relationships, spelling rules, grammatical rules, vocabulary knowledge of how to plan or revise, knowledge of how to use specific things during the writing process (e.g., knowing how to use a dictionary) Table 14 Rubric Coding—Troia and Olinghouse’s (2010) Coding Taxonomy 400s Writing Components: Features, forms, elements, or characteristics of text observed in the written product Guiding Question: Is this something that you can observe in the text itself? Is this something you can see without the writer(s) being present? Indicat or Definition Examples 401 General Organization: How written content for a whole text is organized to achieve an intended purpose 402 General Structure: Portions of a text that bridge content and organization through structural representation 403 General Content: Topical information or subject matter presented within the text or content that is a more specific example of a structural representation 401/403 Rubric descriptors that will receive both a general organization code [401] and a general content code [403]. 
136  Order and Organization o out of order o writing progresses in an order that enhances meaning o logical organization o progression of text may be confusing or unclear  Unifying theme  Clear structure  Coherence  Central idea  Controlling idea  Introduction  Beginning  Middle  End  Conclusion o beginning, middle, end may be weak or absent • Ideas and content o topic/idea development o ideas are fresh, original, or insightful o content goes beyond obvious • References to the topic o the writer defines the topic o topic may be defined, but not developed • Main idea o the writer states main idea o writing lacks main idea • Topic sentence • Information is very limited  Control of topic  Establishing a context for reading • References to addressing the task o fully accomplishes the task o accomplishes the task o minimally accomplishes the task o does not accomplish the task o addresses all parts of the writing task • References to addressing the prompt o addresses all of the specific points in the prompt Table 14 (cont’d) o • • •  404  Elaboration/ Detail: Information that illustrates, illuminates, extends, or embellishes general content 137 addresses most of the points in the prompt References to purpose o demonstrates a clear understanding of purpose o demonstrates a general understanding of purpose o demonstrates little understanding of purpose References to addressing/awareness of genre o response is appropriate to the assigned genre o uses genre-appropriate strategies o response does not demonstrate genre awareness o organization appropriate to genre o awareness of genre/purpose Organizing Ideas o ideas are organized logically o meaningful relationships among ideas o related ideas are grouped together o ideas go off in several directions o ideas may be out of order o writing does not go off on tangents Focus o stays focused on topic and task o may lose focus o lapse of focus o writing may go off in several directions o the writing is exceptionally clear and focused o consistent focus on the assigned topic, genre, and purpose o sustained focus and purpose o stays fully focused on topic/purpose o sustained or consistent focus on topic o clarity, focus, and control o sustained focus on content o maintains consistent focus on topic o clear focus maintained for intended audience Details o supporting details are relevant o writer makes general observations without specific details o examples, facts, and details o concrete details o minimal details o omits details o includes unrelated details o list of unrelated specifics without extensions o anecdotes Table 14 (cont’d) o 405 405A Elaborate/ elaborated/ elaboration/ elaborating ideas that are fully and consistently elaborated o minimal elaboration Genre Specific Organization & Content/Ideas: Structural elements and/or information that is canonical for a specific genre          Narrative 405B Expository/ Procedural/ Descriptive/ Informational 405C Persuasive 405D Poetic 405E Response to Writing                Story line Plot Dialogue Setting Characters Goals Tells a story Events Sequence of events o thoroughly developed sequence of significant events o lacks a sequence of events Reactions Structure showing a sequence through time Chronology Chronological sequence of ideas References to canonical text structures of the genre o cause/effect o similarity and difference o compare/contrast Thesis Anticipates reader’s questions Supports an opinion Question and answer Reasons Points Sub-points Position o 
maintains position/logic throughout o subject/position (or issue) is clear, identified by at least an opening statement o subject/position is vague o subject/position (or issue) is absent o defends a position Evidence Rhyme  Connections to experience or texts  Interpretation  Connects text to self, the outside world, or another text  Supports a position in response to the text 138 Table 14 (cont’d) 406 Sentence Fluency: The variety, appropriateness, and use of sentences in the text 407 Style: Language intentionally used to enhance purposes, forms, and features 139  Demonstrates understanding of literary work o demonstrates clear understanding of literary work o demonstrates a limited understanding of literary work o demonstrates little understanding of literary work  Supports judgments about text o provides effective support for judgments through specific references to text and prior knowledge o provides some support for judgments through references to text and prior knowledge o provides weak support for judgments about text o fails to provide support for judgments about text Interpretation  Sentence variety o variety of sentence structures o sentences vary in length and structure o uses an effective variety of sentence beginnings, structures, and lengths o includes no sentence variety o writer uses varied sentence patterns o sentences are purposeful and build upon each other  Style  Voice  Tone  Register o writer chooses appropriate register to suit task  Repetition o writing is repetitive, predictable, or dull reader senses person behind the words  Audience o reader feels interaction with writer o indicates a strong awareness of audience’s needs o communicates effectively with audience o displays some sense of audience o some attention to audience o little or no awareness of audience  Language o writer effectively adjusts language and tone to task and purpose o language is natural and thoughtprovoking o attempts at colorful language often come close to the mark, but may seem overdone or out of place Table 14 (cont’d) o 408 Figurative Language: Words, phrases or devices used to represent non-literal connections to objects, events, or ideas 409 Semantic Aspects: Words, phrases, or devices used to enhance the meaning of the text from a literal standpoint                410 411 Citations and References: Attributions for contributed or borrowed material for writing, including quotations Multimedia: The integration of various mediums of expression or communication as part of writing, including illustrations, photos, video, sound, and digital archival sources to accomplish communicative aims that could not be accomplished using any single medium 140 vivid, precise, and engaging language that is appropriate to the genre o writer uses language that is easy to read o writer uses language that is difficult to read Metaphor Simile Personification Symbolism Hyperbole Onomatopoeia Imagery Word Choice o words are accurate and specific o uses different beginning words for sentences Transitions o ideas are connected with transitions o varied transitions o paper is linked with transitions o smooth transitions between ideas, sentences, and paragraphs o connectives Vocabulary o accurate, precise vocabulary o chooses vocabulary precisely o control of challenging vocabulary o academic words o domain-specific vocabulary o technical vocabulary Descriptive words o descriptive language o rich description Imagery Humor Synonyms Sensory details Table 15 Rubric Coding—Jeffery’s (2009) 
Coding Taxonomy Rubric Types Rhetorical Definition Examples Focusing on the relationship between writer, audience, and purpose across criteria domains, and containing terms framed within the context of appropriateness, effectiveness, and rhetorical purpose Genremastery Emphasizing criteria specific to the genre students are expected to produce by identifying a specific rhetorical purpose, such as to convince an audience to take action or to engage an audience with a story, and varying rubric content to match prompt types, as well as containing terms framed by the specific communicative purpose that characterize the genre Formal Conceptualizing proficiency in terms of text features not specific to any writing context with features not framed by any particular considerations, such as the author’s thinking or creativity, and with characteristics that might be applicable to a variety of writing contexts, as well as defining good writing in relatively broad terms by focusing on features such as coherence, development and organization Targeting thinking processes such as reasoning and critical thinking across domains, and explicitly valuing clarity of ideas, logical sequencing, and other features that implicate students’ cognitions Emphasizing writing as a product of the author’s processes, especially creativity, and conceptualizing “good writing” as an expression of the author’s uniqueness, individuality, sincerity, and apparent commitment to the task, as well as containing terms framed by an overarching concern with personality and perspective Cognitive Expressive 141  Successfully addresses and controls the writing task with a strong sense of audience and purpose o reader o audience o purposefully o effectively o appropriately  The writing is focused and purposeful, and it reflects insight into the writing situation o the writing situation o the rhetorical context  A persuasive composition states and maintains a position, authoritatively defends that position with precise and relevant evidence, and convincingly addresses the readers concerns, biases, and expectations o “logically” and “clearly” with persuasive or argumentative writing  Clarifies and defends or persuades with precise and relevant evidence; clearly defines and frames issues • Is well organized and coherently developed; clearly explains or illustrates key ideas; demonstrate syntactic variety • A typical essay effectively and insightfully develops a point of view on the issue and demonstrates outstanding critical thinking o Explicit emphasis on “critical thinking”  Approach the topic from an unusual perspective, use his/her unique experiences or view of the world as a basis for writing, or make interesting connections between ideas o Interesting connection between ideas Table 16 Seven-Genre Coding Scheme for Prompts—Adapted from Jeffery (2009) and Troia & Olinghouse (2010) Genre Categories (P) Persuasive Characteristics       (A) Argumentative     (N) Narrative         (E) Explanatory      (I) Informative     Directed students to convince or persuade an audience Identified a local audience as target for persuasion Often specified a form for persuasion (e.g. letter, newspaper article, speech) Specified a relatively concrete issue with clear implications (e.g. 
attendance policy) Called for one-sided perspective (did not invite consideration of multiple perspectives Key terms: “convince”, “persuade”, “agree or disagree”, “opinion” Directed students to argue a position on an issue Did not identify a specific audience Did not specify form Addressed relatively abstract philosophical issue without clear implications Called for consideration of multiple perspectives Key terms: “position”, “point of view” Directed students to tell real or imagined stories Sometimes directed students to connect stories to themes (e.g. provided quotation) Did not identify a context (e.g. audience) for writing Might direct the student to engage the reader Used words like “event”, “experience” or “a time” to evoke memories Key terms: “tell”, “describe”, “story”, “narrative”, “imagination” Directed students to explain why something is so or what is so Might present arguable propositions as inarguable (e.g. importance of homework) Do not explicitly identify a proposition as arguable But may allow for choice (e.g. explain qualities are important in a sport) Might include language consistent with argument or persuasion (e.g. “support”) Typically asked students to address relatively abstract concepts Typically do not identify a target audience Key terms: “explain”, “what”, “why” Directed students to explain a process or report on concrete, factual information 142 Table 16 (cont’d)  (AN) Analytic (D) Descriptive           Similar to Explanatory in except for object of explanation (relatively concrete) Typically do not identify a target audience Key terms: “explain”, “how”, “procedure” Directed students to analyze pieces of literature Did not identify a target audience May provide pieces of literature for analysis Included discipline-specific language and Referred the work’s author or speaker Key terms: “describe”, “show”, “author”, “elements” Direct students to give details/attributes about an object or event Key terms: “describe”, “description”, “descriptive text” 143 Table 17 Standards Genre Coding—Troia and Olinghouse’s (2010) Coding Taxonomy Modified to Accommodate Jeffery’s (2009) Genre Coding Taxonomy 300s Writing Purposes: The variety of communicative intentions that can be accomplished through many different genres. Guiding Question: Is this something that relates to why the writer is writing and does not appear in the actual text? 
Indicator 301 302 303 304 305 306 307 308 309 310 311 312 313 314 315 316 317 318 Definition Draw: Producing a picture or diagram for the purpose of communicating Record: Copying text or taking notes on information Remind: Bringing attention to something for the purpose of recall Respond: Responding to a stimulus, such as a question, prompt, or text, through writing Direct: Giving directions, commands, or procedures Request: Asking for information or action Entertain/Narrate: Giving an account, either fictional or factual that often provides amusement and enjoyment Poetic: Evoking imagination or emotion through intentional manipulation of form and language Express: Conveying thoughts, feelings, or beliefs for personal reasons Exchange: Conveying thoughts, feelings, or beliefs for social reasons Persuade: Convincing an identified audience to act on a specific issue Exposit/Explain: Explaining, clarifying, or expounding on a topic; this may be done generally or in depth through elaboration Inform: Giving facts about a subject which may or may not be integrated Describe: Giving details/attributes about an object or event Analyze: Systematically and intentionally examining something through details and structure Synthesize: Combining various things into one coherent, novel whole Summarize: Using a brief statement or paraphrase to give the main points Reflect: Thinking deeply and carefully about something after the fact, often using written text to learn 144 Examples illustration, picture, diagram, drawing note taking, copy reminder, list response, personal response, on-demand writing, text as stimulus for writing something new, response to literature how-to, procedure, instructions, manual, technical text request, solicitation narrative, personal narrative, story, memoir, recount, biography, autobiography, fiction, fantasy, fable, folktale, myth, legend, adventure, mystery, tall tale, fairytale, drama, short story poetry, free verse, haiku, lyric, ballad, rhyme, sonnet, couplet, cinquain, limerick, dactyl, ode journal writing, diary writing email, blog, letter, editorial persuasive essay, explanation, essay, exposition informational piece, article, report description, descriptive text critique, literary criticism synthesis, lab report, summary, synopsis, paraphrase reflections, reflective writing, writing-to-learn Table 17 (cont’d) 319 320 321 322 Evaluate: Examining the match between others’ writing intent and form using criteria Research: Using systematic investigation to obtain information/knowledge for a piece of writing Functional: Completing forms, applications, and other fill-in types of documents Argue: Supporting a position on an abstract proposition 145 book review experiments checks, resumes opinion piece, argument, position piece Appendix C Table 18 State Direct Writing Assessments Prompts State Alabama Assessment Year Range Grades Assessed How many Direct/OnDemand Test responses were there? Rubrics What genre(s) were the Direct/On-Demand Test? What year was the Direct/OnDemand Test gathered from? What kinds of Scoring Rubrics are used? What year were the Scoring Rubrics gathered from? 
2002-2010 G7 1 Narrative, Descriptive, Expository, Persuasive 2004 Holistic and Analytic 2009 2005-2010 G7 1 Informational 2005 Analytic 2003 2005-2010 G8 1 Narrative 2005 Analytic 2003 2004-2006 G7 2 No set genre 2005, 2006, 2007 Analytic 2006, 2007 Arizona Arkansas 146 Table 18 (cont’d) 2004-2006 G8 2 No set genre 2005, 2006, 2007 Analytic 2006, 2007 2002 Holistic 2002 California 2002-2008 G7 1 Randomly chosen (Response to literature, persuasive, summary, narrative) Florida 2001-2009 G8 1 Expository or persuasive 2007 Holistic 2007 Idaho 2003-2008 G7 1 Expository 2006 4 point holistic scale 2006 2010 Analytic. There were two rubrics (one for narrative, and one for persuasive). 2010 2001-2006 Holistic rubrics for Writing Applications and Language Conventions for grades 3-8. The response to literature also had a Reading Comprehension rubric in addition to the WA and LC 2003, 2005,2006 Illinois Indiana 2006 fall2010 2001-2009 G8 G7 2 2 Narrative and persuasive Narrative, response to literature, persuasive 147 Table 18 (cont’d) Kansas 2001-2009 G8 2 Narrative, response to literature, persuasive 2001-2006 Holistic rubrics for Writing Applications and Language Conventions for grades 3-8. The response to literature also had a Reading Comprehension rubric in addition to the WA and LC 1998-2007 G8 1 Expository 2004 6 traits analytic unknown 2006-2009 G8 1 Informative, narrative, persuasive 2006, 2007,2008 Analytic 2006-2009 2001-2005 G7 1 Persuasive 2004 Holistic 2001-2005 2007, 2008 Two rubrics during this time period: one measured the dimension of composing and the other measured style/audience awareness. Each dimension was worth 4 points, for a possible total of 8 points. 2006-2011 Kentucky Louisiana 2006 spring-2011 G7 1 Expository or narrative 148 2003, 2005,2006 Table 18 (cont’d) Maine Massachusetts Michigan 1999-2011 Spring (LEAP) G8 1 Narrative or expository 2003, 2006 Always the same rubrics: one measuring composing; another measuring style/audience awareness and a third measuring the conventions of sentence formation, usage, mechanics, and spelling (each dimension worth one point for a total of 4 points). 
Spring 2001 - Spring 2007 G8 1 Rotates between Narrative and Persuasive 2002, 2004 Analytic 2004 fall 20012010 G7 1 Personal narrative and expository 2007 Analytic (Development, Conventions) 2007 1 Writing from experience and knowledge 2003 winter, 2004 winter, 2005 winter Holistic six-point rubric 2003 winter, 2004 winter, 2005 winter 2003 winter-2005 winter G7 149 2001-2006 Table 18 (cont’d) 2005 Fall2007 Spring G7 & G8 1 Writing from experience and knowledge 2005 fall, 2006 fall Holistic six-point rubric 2005 fall, 2006 fall Missouri Spring 2006Spring 2010 G7 1 Exposition 2006 Holistic 2006 Nevada 2001-2007 G8 1 Narrative 2007 Holistic and analytic for voice, organization, ideas, and conventions 2007 Spring 2006 - Spring 2007 G7 2 long, 6 short Not specified 2006 Holistic 2006 Spring 2006 - Spring 2007 G8 2 long, 6 short Not specified 2006 Holistic 2006 North Carolina 2003-2008 G7 1 Argumentative 2006 Holistic 4 Point Rubrics for content and 2 point rubrics for conventions 2006 Oklahoma 2006-2010 G8 1 Vary (narrative, expository, persuasive) 2010 Analytic 5 traits 2010 Rhode Island 2005-2010 G8 3 short, 1 long No set genre (persuasive, responseto-text, informational) Fall 2006 Short = 4 pt Holistic Long = 6 pt Holistic Fall 2006 Tennessee 2002-2007 G8 1 Expository/informative 2004 Holistic 6 Point Rubrics 2002-2007 1 Unspecified (student's can respond however they like) 2009 Holistic 2009 New York Texas 2003-2010 G7 150 Table 18 (cont’d) Vermont 2005-2010 G8 3 short, 1 long No set genre (persuasive, response to text, informational) 2006 short = 4 pt Holistic long = 6 pt Holistic 2006 Virginia 2006present G8 1 Not specified 2011 Analytic - 3 Domains 2006 Fall 1998Spring 2007 G7 2 Narrative & Expository 2011 Holistic - 2 Domains 2009, 2010 2005 Analytic 2005 2007 Holistic 2012 Washington West Virginia 2005present G7 1 Randomly chosen (descriptive, persuasive, informative, narrative) Wisconsin 2003present G8 1 Not specified 151 BIBLIOGRAPHY 152 BIBLIOGRAPHY American Educational Research Association/American Psychological Association/National Council of Measurement in Education. (2011). Standards for educational and psychological testing. Washington, D.C.: American Educational Research Association. Ball, A. F. (1999). Evaluating the writing of culturally and linguistically diverse students: The case of the African American vernacular English speaker. In C. R. Cooper & L. Odell (Eds.), Evaluating writing (pp.225-248). Urbana, IL: National Council of Teachers of English. Bangert-Drowns, R. L. (1993). The word processor as an instructional tool: A meta-analysis of word processing in writing instruction. Review of Educational Research, 63, 69-93. Bawarshi, A.S., & Reiff, M.J (2010). Genre: An introduction to history, theory, Research, and Pedagogy. Reference Guides to Rhetoric and Composition. Fort Collins, CO: WAC Clearinghouse. Beck, S. & Jefery, J. (2007). Genres of high-stakes writing assessments and the construct of writing competence. Assessing Writing, 12(1), 60-79. Berkenkotter, C., & Huckin, T. N. (1995). Genre knowledge in disciplinary communication. Hillsdale, New Jersey: Erlbaum. Brunning, R., & Horn, C. (2000). Developing motivation to write. Educational Psychologist, 35, 25-37. Carroll, W. M. (1997). Results of third-grade students in a reform curriculum on the Illinois state mathematics test. Journal for Research in Mathematics Education, 28(2), 237–242. Chen, E., Niemi, D., Wang, J., Wang, H., & Mirocha, J. (2007). 
Examining the generalizability of direct writing assessment tasks. CSE Technical Report 718. Los Angeles, CA: National Center for Research on Evaluation, Standards, and Student Testing (CRESST). Chesky, J. & Hiebert, E. H. (1987). The effects of prior knowledge and audience on high school students’ writing. Journal of Educational Research, 80, 304-313. Chiste, K. B., & O’Shea, J. (1988). Patterns of question selection and writing performance of ESL students. TESOL Quarterly, 22, 681-684. Cohen, M. & Riel, M. (1989). The effect of distant audiences on students’ writing. American Educational Research Journal, 26(2), 143-159. 153 Conley, M. W. (2005).Connecting standards and assessment through literacy. Boston, MA: Pearson. Crowhurst, M. (1988). Research review: Patterns of development in writing persuasive/argumentative discourse. Retrieved from ERIC database. (ED299598) Dean, D. (1999). Current-traditional rhetoric: Its past, and what content analysis of texts and tests shows about its present (Doctoral dissertation, Seattle Pacific University). Dean, D. (2008). Genre theory: Teaching, writing, and being. Urbana: National Council of Teachers of English. De La Paz, S., & Graham, S. (2002). Explicitly teaching strategies, skills, and knowledge: Writing instruction in middle school classrooms. Journal of Educational Psychology, 94(4), 687-698. Devitt, A. (1993). Generalizing about genre: New conceptions of an old concept. College Composition and Communication, 44, 573-586. Devitt, A. (2009). Teaching critical genre awareness. (Bazerman, C., Bonini, A., & Figueriredo D., Ed.). Genre in a Changing World. 337-351. Fort Collins, CO: WAC Clearinghouse and Parlor Press. Devitt, A., Reiff, M., & Bawarshi, A. (2004). Scenes of writing: Strategies for composing with genres. New York: Pearson/Longman, 2004. Donovan, C., & Smolkin, L. (2006). Children’s understanding of genre and writing development. In C. A. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of writing research (pp. 131-143). New York: Guilford. Dryer, D. B. (2008). Taking up space: On genre systems as geographies of the possible. JAC, 28.3-4:503-534. Faigley, L., & Witte, S. P. (1981). Coherence, cohesion, and writing quality. College Composition and Communication, 32(2), 2-11. Ferretti, R., Andrews-Weckerly, S., & Lewis, W. (2007). Improving the argumentative writing of students with learning disabilities: Descriptive and normative considerations. Reading & Writing Quarterly: Overcoming Learning Difficulties, 23(3), 267-285. Ferris, D. (1994). Lexical and syntactic features of ESL writing by students at different levels of L2 proficiency. TESOL Quarterly, 28(2), 414-420. Flower, L. S., & Hayes, J. R. (1981). Plans that guide the composing process. In C. H. Friderksen & J. F. Dominic (Eds.), Writing: The nature, development, and teaching of written communication (pp. 39-58). Hillsdale, NJ: Lawrence Erlbaum Associates. 154 Gabrielson, S., Gordon, B., & Englehard, G. (1995). The effects of task choice on the quality of writing obtained in a statewide assessment. Applied Measurement in Education, 8(4), 273290. Gearhart, M. & Herman, J.L. (2010). Portfolio assessment: Whose work is it? Issues in the use of classroom assignments for accountability. Educational Assessment, 5(1), 41-55. Gilliam, R., & Johnston, J. (1992). Spoken and written language relationships in language.learning-impaired and normally achieving school-age children. Journal of Speech and Hearing Research, 35, 1303-1315. Glasswell, K., Parr, J., & Aikman, M. (2001). 
Development of the asTTle writing assessment rubrics for scoring extended writing tasks (Technical Report 6). Auckland, New Zealand: Project asTTle, University of Auckland. Goldstein, H. (1987). Multilevel models in educational and social research. London: Griffin. Gomez, R., Parker, R., Lara-Alecio, R., & Gomez, L. (1996). Process versus product writing with limited English proficient students. The Bilingual Research Journal, 20(2), 209-233. Graham, S., Berninger, V.W., & Fan, W. (2007). The structural relationship between writing attitude and writing achievement in first and third grade students. Contemporary Educational Psychology, 32, 516-536. Graham, S. & Harris, K. (2005). Improving the writing performance of young struggling writersTheoretical and programmatic research from the center on accelerating student learning. Journal of Special Education, 39(1), 19-33. Graham, S., McKeown, D., Kiuhara, S. A., & Harris, K. R. (2012). A meta-analysis of writing instruction for students in elementary grades. Journal of Educational Psychology, 104(4), 879-896. Graham, S., & Perin, D. (2007). A meta-analysis of writing instruction for adolescent students. Journal of Educational Psychology, 99(3), 445-476. Hayes, J. R. (1996). A new model of cognition and affect in writing. In M. Levy & S. Ransdell (Eds.), The science of writing (pp. 1-27). Hillsdale, NJ: Erlbaum. Hillocks, G. (2002). The testing trap: How state writing assessments control learning. New York: Teachers College Press. Ivanic, R. (2004). Discourses of writing and learning to write. Language and Education, 18(3), 220-245. Jeffery, J. (2009). Constructs of writing proficiency in US state and national writing assessments: Exploring variability. Assessing Writing, 14, 3-24. 155 Jennings, M. Fox, J., Graves, B., & Shohamy, E. (1999). The test takers’ choice: An investigation of the effect of topic on language-test performance. Language Testing, 16(4), 426-456. Jonassen, D. H., Tressmer, M., & Hannum, W. H. (1999). Task analysis methods for instructional design. Mahwah, NJ: Lawrence Erlbaum. Kanaris, A. (1999). Gendered journeys: Children’s writing and the construction of gender. Language and Education, 13(4), 254-268. Lee, J. Grigg, W.S., & Donahue, P. L. (2007). The nation’s report card: Reading 2007 (No. NCES 2007496). Washington, DC: US Department of Education. Linn, R., Baker, E., & Betebenner, D. (2002). Accountability systems: Implications of requirements of the No Child Left Behind Act of 2001. Educational Researcher, 31(6), 316. Lubienski, S. T., & Lubienski, C. (2006). School sector and academic achievement: A multilevel analysis of NAEP Mathematics Data. American Educational Research Journal, 43(4), 651698. Moss, P. (1994). Validity in high stakes writing assessment: Problems and possibilities. Assessing Writing, 1(1). 109-128. National Assessment Governing Board. (2007). Writing framework and specifications for the 2007 National Assessment of Educational Progress. Washington, DC: U.S. Department of Education. National Assessment Governing Board. (2010). Writing framework for the 2011 National Assessment of Educational Progress. Washington, DC: U.S. Department of Education. National Commission on Writing for America’s Families, Schools, and College. (2003, April). The neglected R: The need for a writing revolution. New York, NY: College Entrance Examination Board. Retrieved from www.writingcommission.org/pro_downloads/writingcom/neglectedr.pdf National Commission on Writing for America’s Families, Schools, and College. 
(2003, April). Writing: A ticket to work…or a ticket out. A survey of business leaders. New York, NY: College Entrance Examination Board. Retrieved from www.writingcommission.org/pro_downloads/writingcom/writing-ticket-to-work.pdf Newcomer, P. L., & Barehaum, E. M. (1991). The written composing ability of children with learning disabilities: A review of the literature from 1980 to 1990. Journal of Learning Disabilities, 24, 578-593. Olinghouse, N., Santangelo, T., & Wilson, J. (2012). Examining the validity of single-occasion, 156 single-genre, holistically scored writing assessments. In E. V. Steendam, M. Tillema, G. Rijlaarsdam, & H. V. D. Bergh (Eds.), Measuring writing: Recent insights into theory, methodology and practices (pp. 55-82). New York: Guilford. Pasquarelli, S. L. (2006). Teaching writing genres across the curriculum: Strategies for middle school. Charlotte, NC: IAP-Information Age Publishing, Inc. Polio, C. & Glew, M. (1996). ESL writing assessment prompts: How students choose. Journal of Second Language Writing, 5(1), 35-49. Powers, D. E., & Fowles, M. E. (1998). Test takers’ judgments about GRE writing test prompts. ETS Research Report 98-36. Princeton, NJ: Educational Testing Service. Powers, D. E., Fowles, M. E., Farnum, M., & Gerritz, K. (1992). Giving a choice of topics on a test of basic writing skills: Does it make any difference? ETS Research Report No. 92-19. Princeton, NJ: Educational Testing Service. Prior, P. (2006). A sociocultural theory of writing. In C. A. MacArthur, S. Graham, & J. Fitzgerald (Eds.), Handbook of writing research (pp. 54-66). New York: Guilford. Prosser, R., Rasbash, J., & Goldstein, H. (1991). Software for three-level analysis. Users’ guide for v.2. London: Institute of Education. Raudenbush, S. W., & Bryk, A. S. (2002). Hierarchical linear models: Applications and data analysis methods (2nd ed.). Thousand Oaks, CA: Sage. Reiff, M. J. & Bawarshi, A. (2011). Tracing discursive resources: How students use prior genre knowledge to negotiate new writing contexts in first-year composition. Written Communication, 28, 3: 312-337. Redd-Boyd, T. M. & Slater, W. H. (1989). The effects of audience specification on undergraduates’ attitudes, strategies, and writing. Research in the Teaching of English, 23(1), 77-108. Resta, S., & Eliot, J. (1994). Written expression in boys with attention deficit disorders. Perceptual and Motor Skills, 79, 1131-1138. Rogers, L., & Graham, S. (2008). A meta-analysis of single subject design writing intervention research. Journal of Educational Psychology, 100, 879-906. Rubin, D. B. (1987). Multiple imputations for nonresponse in surveys. New York: John Wiley and Sons. Salahu-Din, D., Persky, H., & Miller, J. (2008). The nation’s report card: Writing 2007. U. S. Department of Education, Institute of Education Sciences. Washington, DC: National Center for Education Statistics. 157 Silva, T. (1993). Toward an understanding of the distinct nature of L2 writing: The ESL research and its implications. TESOL Quarterly, 27, 657-676. Stecher, B. M., Barron, S. L., Kaganoff, T., & Goodwin, J. (1998). The effects of standards based assessment on classroom practices: Results of the 1996-1997 RAND survey of Kentucky teachers of mathematics and writing (CRESST Tech. Rep. No. 482). Los Angeles: University of California, National Center for Research on Evaluation, Standards, and Student Testing (CRESST). Troia, G. A., & Olinghouse, N. (2010-2014). K-12 Writing Alignment Project. IES funded. Troia, G. A., Shankland, R. 
K., & Wolbers, K. A. (2012). Motivation research in writing: Theoretical and empirical considerations. Reading and Writing Quarterly, 28, 5-28. US Department of Education. (2004). Charting the course: States decide major provisions under No Child Left Behind. Retrieved from http://www.ecs.org/html/Document.asp?chouseid=4982. U.S. Department of Education, National Center for Education Statistics. (2010). Teachers' Use of Educational Technology in U.S. Public Schools: 2009. National Center for Education Statistics. Retrieved April 2014, from http://nces.ed.gov/pubs2010/2010040.pdf Zabala, D., Minnici, A., McMurrer, J., & Briggs, L. (2008). State high school exit exams: Moving toward end-of-course exams. Washington, DC: Center on Educational Policy. Zimmerman, B. J., & Risemberg, R. (1997). Become a self-regulated writer: A social cognitive perspective. Contemporary Educational Psychology, 22, 73-101. 158