HOW PRINCIPALS’ COGNITIVE SCHEMAS IMPACT THEIR IMPLEMENTATION OF TEACHER EVALUATION POLICIES AND TEACHER EVALUATION SYSTEMS

By David B. Reid

A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Educational Policy – Doctor of Philosophy

2017

ABSTRACT

Since 2009, the United States (US) federal government has spearheaded a nationwide teacher evaluation reform effort, encouraging states to change their process of evaluating teachers with a focus on evaluations that better distinguish teacher performance and provide better information on what makes a high-quality teacher (US Department of Education, 2009). The US Department of Education enacted this reform through the Race to the Top (RTTT) initiative and by granting many states Elementary and Secondary Education Act (ESEA) waivers if they changed their teacher evaluation systems to align with ESEA priorities, such as evaluating teachers in part based on student assessment data. These two levers are the primary reasons that, since 2009, more than two-thirds of states have made significant changes to how teachers are evaluated (The Center for Public Education, 2014). However, the passage of the 2015 Every Student Succeeds Act (ESSA) allows for an increased role of states and districts in shaping teacher evaluation policies and teacher evaluation systems. All of these changes have the potential to impact how states, districts, schools, and individual principals make sense of and use teacher evaluation policies and systems. Organizational and individual sensemaking of teacher evaluation policies and systems is of particular importance because of the high stakes most states attach to these policies and systems (e.g., using them for the hiring, firing, and tenure decisions of teachers).
Because school principals play a pivotal role in how teacher evaluations look in practice, I argue policymakers, researchers, and practitioners must better understand how principals think about and ultimately enact these complex policies and systems. In this dissertation, I answer the following research questions: (1) How do principals’ cognitive schemas influence how principals come to understand and implement teacher evaluation policies and systems? (2) What role does external pressure play in shaping principal learning and enactment of teacher evaluation policies and systems? and (3) In what ways, if any, do principals’ experience and external pressure interact during the implementation process? This exploratory multi-case study examines data from school principals (N=12) in Michigan, including interviews (n=36), observations (n=24), and questionnaires (n=12) collected in 2016 and 2017. Additionally, teacher interviews (n=12) and specific teacher evaluation district documents inform this study. Results show: (1) principals in high-pressure environments perceive a pressure to differentiate their teachers’ final evaluation ratings, which typically results in these principals rating their teachers more critically than their peers who work in low-pressure environments; (2) principals with high-levels of experience engage in situational leadership, while principals with low-levels of experience engage in relational leadership, which impacts how teacher evaluation policies and systems look in practice; and (3) principals with high-levels of experience are less likely than their more inexperienced peers to use previous teacher data (evaluations/test scores) when evaluating teachers. Implications for policy and practice are discussed.

Copyright by DAVID B. REID 2017

ACKNOWLEDGMENTS

Country music artist Kenny Chesney wrote a song called “I Didn’t Get Here Alone,” in which he thanks the many people who helped make him successful. I am in no way comparing myself to Mr.
Chesney, but many of the lyrics from the song ring true when I think about my career in education and my time at Michigan State University (MSU). I certainly did not get to this point of my life and career alone, so in the paragraphs that follow, I aim to thank the individuals who have supported me along the way. Without the support of these people I likely would not have been able to complete this program and dissertation. I will apologize in advance, as I am sure I will leave out the names of some people who were very supportive and influential during my time working in and studying the field of education.

First, I would like to thank my wife, Molly, who left her job and moved across the country to support me. For that I will be forever grateful. Second, I would like to thank my advisor, Anne-Lise Halvorsen, whose guidance and support are unparalleled. Her constant mentorship during my time at MSU was more than I ever could have asked for or imagined. Additionally, I would like to thank Josh Cowen, who provided great guidance as the co-chair on this dissertation and always helped me consider ideas and concepts in a more critical way. I also want to thank Rebecca Jacobsen, who, although I was not her advisee, treated me as one, and took a great deal of time out of her schedule to support me throughout my entire time at MSU. Additionally, I would like to thank Chris Torres for joining my dissertation committee and always providing thoughtful feedback on my writing. Finally, thank you to Michael Sedlak, who supported me greatly during my time at MSU.

I most certainly need to thank my graduate student colleagues who have been there for me through the entire program. I owe thanks to many people and I am sure I will not mention everyone, but I would like to mention a few by name. I am indebted to Jason Burns, Ben Creed, Kim Jansen, Alyssa Morley, John Lane, and Rachel White. Thank you all for your constant engagement and support.
To my parents, brother, and sister, who have supported me for as long as I can remember. It is extremely reassuring to know I always have your unwavering support. Thank you. Finally, thank you to my former students from Phoenix, Arizona. You are the reason I decided to pursue this degree. Your perseverance, humor, and brilliance serve as a constant reminder of why I do this work.

TABLE OF CONTENTS

LIST OF TABLES

Chapter 1: Introduction
  Research Questions
  Contribution of the Dissertation
  Outline of the Dissertation

Chapter 2: Literature Review
  Part I: Education Policy Implementation Overview
  Part II: Teacher Evaluation Policy Implementation
  Part III: Principal Cognition and Policy Implementation
  The Co-Evolution of Teacher Evaluation Policies and the Role of Principals
  The Role of Principal Cognition During Teacher Evaluations
  Experience and Context: Why and How They Matter for Policy Implementation
  Gap in the Literature

Chapter 3: Framing the Research
  Cognitive Schemas
  Sensemaking Theory
  Individual vs. Collective Sensemaking
  The Usefulness of Sensemaking

Chapter 4: Research Design and Methodology
  Research Design and Research Questions
  Rationale for a Case Study
  Study Context: Educator Evaluations in Michigan
  Participants and Sampling Strategy
  Data Collection
  Data Analysis
  Establishing Validity
  Limitations

Chapter 5: How Principals’ Cognitive Schemas Impact Their Implementation of Teacher Evaluation Systems
  Overarching Theme: Individual vs. Collective Sensemaking
  Subtheme One: Principal Leadership
  Nuances
  Subtheme Two: Use of Prior Evaluation Data
  Nuances
  Subtheme Three: Accurate Reflection of Teacher Effectiveness
  Nuances
  Subtheme Four: Hiring Decisions
  Nuances
  Chapter Summary

Chapter 6: The Role of External Context and Experience in Principal Learning and Implementation of Teacher Evaluation Systems
  Theme One: Differentiating Teacher Evaluation Ratings
  Nuances
  Theme Two: What do Teacher Evaluations Measure?
  Nuances
  How do Experience and External Pressure Interact during the Implementation Process?
  Chapter Summary

Chapter 7: Discussion, Implications, and Conclusions
  The Goals of Teacher Evaluation Policy
  Principals’ Role in Teacher Evaluations
  Implications: For Policymakers
  Implications: For Practitioners
  Limitations and Future Research
  Conclusions

APPENDICES
  Appendix A Principal Questionnaire
  Appendix B Principal Interview Protocol 1
  Appendix C Principal Interview Protocol 2
  Appendix D Principal Interview Protocol 3
  Appendix E Teacher Interview Protocol
  Appendix F Observation Protocol

WORKS CITED

LIST OF TABLES

Table 1.1. Dissertation Participants (N=12)
Table 4.1. Timeline of Educator Evaluation Changes in Michigan Since 2009
Table 4.2. Principal Participant Sample
Table 4.3. Principal Background Information (TPS or Charter, Principal Experience)
Table 4.4. Principal Data Collected

Chapter 1: Introduction

Researchers and program evaluators often find that, in practice, policies and systems differ significantly from what policy designers had envisioned (Elmore, 1980; Honig, 2006; Honig & Hatch, 2004; Lipsky, 1980; Odden, 1991; Spillane, 2000; Spillane, Reiser, & Reimer, 2002; Weatherly & Lipsky, 1977). Initially, researchers concluded that inconsistent policy and system implementation was due to a lack of practitioner will and capacity (McLaughlin, 1987; Odden, 1991), but more recently education scholars have concluded that the process of how policies and systems enter organizational environments is much more complex (Honig, 2006; Spillane, 2000; Spillane et al., 2002).
In an effort to better understand this complex process, scholars who study education policy and system implementation have turned to cognitive and learning science theories to examine what factors shape how policies and systems are interpreted and ultimately implemented by practitioners (Cohen & Hill, 2001; Halverson & Clifford, 2006; Spillane et al., 2002). Originating in cognitive psychology, theories of cognition examine the interaction between one’s psychological processes and the information that comes in contact with and passes through one’s psychological network (Grider, 1993). Through the lens of cognitive theory, an individual’s ability to learn is dependent upon how he or she receives, organizes, and processes new and existing information (Grider, 1993). Research that uses the cognitive frame finds a complex cognitive process occurs when local actors (e.g., school principals, teachers, or district administrators) attempt to reconcile their previous understandings, habits, and current situational context with new policy demands (Halverson & Clifford, 2006; Honig, 2006; Spillane et al., 2002). Theories of cognition can help scholars of policy and system implementation understand how and why there is often a disconnect between policymakers’ desired outcomes and goals and what happens in practice.

One specific cognitive learning theory that is particularly useful when looking at how individuals interpret evolving policies and systems is sensemaking theory. Sensemaking theory acknowledges that past experiences and prior knowledge shape learning and that learning occurs through our social and situational context (Greeno, 1998; Weick, 1995). Scholars use sensemaking theory to attempt to explain how individuals and organizations interpret the policies, systems, and reforms with which they come in contact (Coburn, 2005; Halverson & Clifford, 2006; Spillane, 2000; Spillane et al., 2002).
This research generally finds individuals’ prior knowledge, beliefs, and experiences greatly impact how individuals think about and make sense of new and changing information. Central to using a sensemaking theory frame in policy research is the idea that cognition does not simply explain how individual actors interpret information, policies, and systems, but also explains how these individuals respond to changes in their environment (Spillane et al., 2006). Spillane et al. (2006) further describe this idea by concluding that an individual’s prior knowledge, experience, and beliefs all serve as a lens for what he or she notices in their environment and how information is processed, organized, and interpreted (p. 49). In short, how an individual makes sense of information has a strong relationship to how this information ultimately enters, remains in, or disappears from this individual’s environment of practice.

Education is a particularly interesting and timely field in which to study how cognition impacts policy and system interpretation and implementation because it is arguably one of the most active policymaking arenas today, with multiple agencies, including federal, state, and local governments, creating new policies at an ever-increasing pace (Honig, 2006; Spillane & Kenney, 2012). These governments ask school leaders and their faculty to implement a dizzying number of new and reformed policies and systems year after year. For example, the No Child Left Behind (NCLB) Act of 2001 required all students in grades K-8 be tested annually in reading and math. This resulted in new testing policies, standards policies, and accountability policies for both teachers and school leaders. Of the many policies and systems schools must implement, one that is a prime candidate for study is teacher evaluation. Currently within the field of education there is perhaps no more polarizing issue than teacher evaluation policies and teacher evaluation systems.
The polarizing nature of these policies and systems is due in part to the increasing acknowledgement that teacher quality can positively impact student outcomes, such as achievement and attendance (Aaronson, Barrow, & Sander, 2007; Chetty, Friedman, & Rockoff, 2014; Rockoff, 2004) and in part to research suggesting teacher evaluation systems have historically done a poor job distinguishing teacher performance (Donaldson, 2009; US Department of Education, 2009; Weisberg, Sexton, Mulhern, & Keeling, 2009). In an effort to address deficient teacher evaluation systems, the 2009 Race to the Top (RTTT) initiative encouraged states to change the process of evaluating teachers, with a focus on evaluations that better distinguished teacher performance as well as provided better information on what makes a high-quality teacher (US Department of Education, 2009). Additionally, the US Department of Education granted many states Elementary and Secondary Education Act (ESEA) waivers if they changed their teacher evaluation policies to align with ESEA priorities, such as evaluating teachers in part based on student assessment data. These two levers are the main reasons that, since 2009, more than two-thirds of states have made significant changes to their teacher evaluation policies and systems (The Center for Public Education, 2014). This nationwide reform effort has resulted in school leaders having to make sense of new teacher evaluation policies and systems at a rapid pace. Additionally, for many school leaders, teacher evaluation policies and systems are continuing to evolve, tasking school leaders with making sense of multiple editions of these policies and systems.

To date, much of the research on teacher evaluation policy reform has focused on teachers (Aaronson et al., 2007; Donaldson, 2013; Taylor & Tyler, 2012).
This makes sense given the widespread evidence that teachers are the most important school-based factor that can positively impact student achievement (Aaronson et al., 2007; Chetty et al., 2014; Rockoff, 2004). However, more recent research has begun to acknowledge the important role principals play in student outcomes, such as attendance, achievement, and graduation rates (Beteille, Kalogrides, & Loeb, 2009; Branch, Hanushek, & Rivkin, 2009; Clark, Martorell, & Rockoff, 2009; Grissom & Loeb, 2009). Additionally, and of particular importance to this work, scholars and policymakers have begun to acknowledge that school principals play a crucial role in how policies and systems, including teacher evaluation systems, play out in practice (Donaldson & Papay, 2014; Halverson, Kelley, & Kimball, 2004; Koyama, 2014; Rigby, 2015). How principals interpret, communicate, and ultimately carry out new teacher evaluation policies and systems has great implications for how these policies and systems look in practice. Although principals have always assumed the responsibility of evaluating their staff, the process is now higher-stakes because in many cases new teacher evaluation policies and systems have tied a teacher’s evaluation score to career-defining decisions, such as hiring, firing, and tenure decisions.

How principals make sense of teacher evaluation policies and systems matters because: (1) this sensemaking process demonstrates to policymakers and researchers the ways in which policies and systems may be interacting and working in schools; (2) principals’ sensemaking has the potential to highlight the unintended consequences of these policies and systems; and (3) perhaps most importantly, studying how principals make sense of teacher evaluation policies and systems offers feedback to policymakers regarding the ultimate impact the policy is having in practice.
Examining principal policy and system interpretation and implementation through the lens of cognition is a useful approach to better understand how different people charged with implementing the same policy make sense of this process. Although school leaders undoubtedly think about and ultimately implement policies and systems differently (Lipsky, 1980; Weatherly & Lipsky, 1977), focusing on the cognition of these leaders has the potential to shed light on how they make sense of and use the policies that are entering their systems of practice. Specifically, in this dissertation I study principals’ cognitive schemas, which scholars define as cognitive frameworks that help individuals interpret and organize information (Weick, 1995). Schemas are useful when studying how individuals interpret and implement policies and systems because schemas focus on how characteristics of individuals impact how they process, interpret, and make sense of new and evolving information (Spillane, 2000; Spillane & Lee, 2014; Spillane et al., 2002). Given the amount of information school leaders are asked to understand and make sense of in their environments, using the cognitive frame to examine this process has the potential to yield difficult-to-capture findings. Additionally, in educational research the cognitive frame is being used with increasing frequency to better understand how individuals, including principals, shape, mold, influence, ignore, prioritize, and interpret information, policies, and systems.

In this study I focus on principals who have high-levels and low-levels of experience and who are facing high-levels and low-levels of external pressure. For the purposes of this dissertation, I define a principal with low-levels of experience as an individual who has held the position of school principal for four years or fewer. I define a principal with high-levels of experience as an individual who has been a principal for at least nine years.
Research shows that principals, like teachers, improve their practice significantly in the first five years of their practice (Leithwood, Seashore-Louis, Anderson, & Wahlstrom, 2004; Seashore-Louis, Wahlstrom, Leithwood, & Anderson, 2010). Given this research I chose to study principals with fewer than five years of experience and more than five years of experience. Additionally, I wanted to leave a gap in years of experience because I felt this would strengthen the results of this work (as opposed to studying principals with four versus five years of experience). I define principals who face high external pressure as principals whose schools are labeled either red or orange on Michigan’s 2014 Accountability Report Card. Principals who work in schools with green, lime, or yellow ratings are considered to have low external pressure (see Table 1.1). A more detailed rationale for using this scorecard is provided later in this work. The goal of using these attributes of school leaders is to make the process of sensemaking more predictive. For example, do principals with low-levels of experience in high-pressure environments make sense of policies and systems differently than principals with low-levels of experience in low-pressure environments? We know that sensemaking happens and impacts policy and system implementation (Coburn, 2005; Halverson et al., 2004; Rigby, 2015; Spillane et al., 2002), but we do not have a predictive theory beyond saying “sensemaking happens.”

Table 1.1. Dissertation Participants (N=12)

Principal          High-Pressure   Low-Pressure
High-Experience    3               3
Low-Experience     3               3

Research Questions

The goal of this dissertation is to better understand how principal experience and external pressure impact how principals implement evolving teacher evaluation policies and systems.
To assist in answering these questions I reviewed education policy and system implementation research, which helped me construct an analytic framework from which I developed my research questions. Specifically, this study asks the following: (1) How do principals’ cognitive schemas (i.e., highly developed background knowledge due to experience) influence how they come to understand and implement teacher evaluation policies and systems? (2) What role does external context (i.e., high-pressure vs. low-pressure environments) play in shaping principal learning and enactment of teacher evaluation policies and systems? and (3) In what ways, if any, do principals’ experience and external pressure interact during the implementation process?

Throughout this dissertation I refer to both teacher evaluation policies and teacher evaluation systems. When referring to teacher evaluation policy I mean the specific legislation of the state of Michigan. For example, at the time of data collection, Michigan required all teachers be evaluated annually (in most circumstances), and that 25 percent of a teacher’s final evaluation score be based on student growth and assessment data. Subsequent references to teacher evaluation policy (and principals’ sensemaking and cognition of this policy) refer to the actual state-legislated policy in place at the time of data collection. A requirement of Michigan’s teacher evaluation policy was that schools must use a teacher evaluation system. At the time of data collection the four state-approved systems were: Charlotte Danielson’s Framework for Teaching, the Marzano Teacher Evaluation Model, The Thoughtful Classroom, and 5 Dimensions of Teaching and Learning. Although a majority of districts opted to use one of these systems, districts were allowed to choose another teacher evaluation system if it met certain requirements laid out by Michigan’s teacher evaluation policy.
Any subsequent reference to a principal’s thinking, use, or enactment of their teacher evaluation system refers to their specific system, and not necessarily the policy. At times I use both “policy” and “system” when discussing principals’ cognition and sensemaking, but other times I intentionally use either “policy” or “system” when describing the results of this work. This is a small, but important difference to keep in mind as you continue to read this dissertation.

Contribution of the Dissertation

Although research on how cognition impacts policy and system implementation is increasing, additional research should seek to better understand how and why specific characteristics impact how these policies and systems play out in practice. Few studies look at the impact of specific implementer characteristics and how the cognitive schemas of individuals with these characteristics impact how these individuals interpret policies and systems. By exploring these nuanced and complex ideas, this dissertation will make two primary contributions. First, this dissertation contributes to the work of sensemaking theory by providing insights and hypotheses about how two specific implementer characteristics, school leader cognitive schemas (particularly focusing on background knowledge developed due to experience) and the amount of external pressure facing a school, impact policy and system interpretation and implementation. Research shows cognitive schemas impact policy implementation (Halverson et al., 2002; Rigby, 2015) and external pressure impacts policy implementation (Koyama, 2014; Spillane, 2000). However, while we know these characteristics matter for policy interpretation and implementation, we lack an understanding of how and why they matter.
With the goal of moving beyond the notion that “sensemaking happens,” examining these specific implementer characteristics separately constitutes a key contribution of this study, but looking at how these two important characteristics interact during the implementation process is the main theoretical contribution of this work. Second, the overarching goal of this work is to be able to describe and explain individual principal sensemaking by better understanding how individual principals with certain attributes interpret and implement teacher evaluation policies and systems. The results of this work have the potential to inform future practice as schools and districts may be able to better anticipate how certain individuals will interpret teacher evaluation policies and systems. Practitioners may be able to use the results of this work to better design teacher evaluation professional development and training for principals within their school district. Put differently, the results of this work can serve as a template for districts that want to make strategic decisions and individualize training and development to best address the needs of all principals in their district.

Outline of the Dissertation

The chapters that follow describe the relevant literature, frame the research, discuss the design and methods of the study, show the findings of the work, and finally examine the implications and conclusions of this work. Chapter two presents a review of the literature, specifically focusing on the different waves of educational policy implementation in general and then moving into a more nuanced review of policy and system implementation and cognition. This chapter then reviews literature on cognition and teacher evaluation policy and system implementation, focusing on the importance of teacher evaluations and school principals in the implementation process. Chapter three frames the research, including providing a conceptual framework of policy implementation.
The theoretical framework is then discussed, as is the importance of cognitive schemas and sensemaking theory in this work. The chapter concludes with the rationale for why experience and external pressure are important variables to look at when studying policy and system implementation in education. Chapter four discusses the design and methodology of the study, including the research questions, research design, and the rationale for using qualitative methods and a case study approach for this work. This chapter then discusses the site and participant selection, data collection methods, data analysis, and the validity of this data collection. Chapters five and six examine the findings of this work. Finally, chapter seven discusses the findings, focusing on the implications of and the conclusions drawn from this work.

Chapter 2: Literature Review

The purpose of chapter two of this dissertation is to describe and synthesize research that has examined education policy implementation generally as well as review research that examines how cognition and sensemaking impact policy and system implementation. Additionally, this chapter reviews the literature on principal sensemaking and principal teacher evaluation policy and system implementation. This literature review provides context for my study by providing a historical overview of the different waves of policy implementation research, including how this research has evolved over the past several decades to sharpen its focus on how individual sensemaking impacts policy implementation. This chapter also sets up chapter three of this dissertation, which establishes the theoretical framework that guided my data collection and analysis. Finally, this chapter grounds this dissertation in a stream of research and scholarship that extends over many decades and that will help situate the results of this work in the current literature on how sensemaking impacts policy and system implementation.
Part one of this literature review examines the history of education policy and system implementation, including how scholars have increasingly focused on individual cognition when examining what factors impact how policies and systems play out in practice. Part two of this review focuses specifically on research that examines how cognition and sensemaking impact teacher evaluation policy and system implementation. Finally, part three of this chapter reviews literature that examines how principals make sense of and implement policies and systems generally and how principals make sense of and implement teacher evaluation policies and systems specifically. Additionally, part three of this review examines research on how the two primary variables of this study, individual experience and external pressure, impact individual interpretation and implementation of policies and systems.

Part I: Education Policy Implementation Overview

There is a broad literature on how individuals and organizations make sense of and implement policies and reforms. Over the past fifty years scholars have attempted to better understand how and why policy implementation varies in certain contexts and with certain individuals. This has become an increasingly complicated endeavor; as Honig (2006) writes, “the policies under investigation on the whole are significantly more comprehensive and varied than in previous decades” (p. 4). However, despite the complexity of studying how individuals and organizations interpret and make sense of policies and reforms, researchers generally agree on three waves of education policy implementation research (Honig, 2006; Odden, 1991).
Researchers from wave one (approximately 1960-1969; Odden, 1991) of educational policy implementation research generally found that the policies and reforms handed to schools and school leaders often lacked clear expectations and directions for these local implementers, resulting in policy ambiguity (Lipsky, 1980; Weatherley & Lipsky, 1977). This lack of clarity often resulted in unsuccessful attempts by individuals and organizations to implement these policies in ways envisioned by policy designers (Honig, 2006; Lipsky, 1980; Weatherley & Lipsky, 1977). Furthermore, research from wave one concluded that even if local implementers understood how a policy or reform was expected to be implemented, these individuals often lacked the will and capacity to implement these policies and reforms as intended by the designers of the policy (Honig, 2006; McLaughlin, 1987; Weatherley & Lipsky, 1977). The most widely studied policy from wave one was the Elementary and Secondary Education Act (ESEA) passed in 1965. Most of the research examining the implementation of ESEA found local implementers did not faithfully attempt to implement the aspects of ESEA as envisioned by the designers (Murphy, 1971).

Wave two of policy implementation research generally found that local actors often tried to implement policies with the best of intentions and that, given enough time and support, policies and reforms in practice often resembled, at least in part, the original vision of policymakers (Honig, 2006). This wave of research moved beyond the idea that local policy implementers lacked the skill, will, or capacity to implement policies and reforms, acknowledging policy implementation was a more complex process. Although researchers from wave two began to acknowledge that people and places mattered greatly when examining policy implementation, this work did not explore how people and places mattered (Honig, 2006).
Eventually, scholars from wave three of policy implementation research began to examine how specific individuals and their characteristics impacted policy and reform interpretation and implementation. Additionally, these researchers began to focus on the impact of place on policy implementation (Honig, 2006). This focus on person and place began to expand throughout wave three of policy implementation research and near the end of the third wave, scholars began exploring specifically how and why interactions among people and places impacted how policies were interpreted and ultimately how these policies played out in practice (Honig, 2006). Most recently, over the past twenty or so years, education policy implementation research has continued to sharpen its focus on how people matter, specifically how individual cognition and sensemaking impact how policy interpretation and implementation occurs (Coburn, 2001; Coburn, 2005; Halverson et al., 2004; Honig, 2006; Rigby, 2015; Spillane et al., 2002). As Honig (2006) writes, “Whereas past implementation research generally revealed that policy, people, and places affected implementation, contemporary implementation research specifically aims to uncover their various dimensions and how and why interactions among these dimensions shape implementation in particular ways” (p. 14). Specifically, this line of research has focused on how people draw on their various identities, social situations, and prior knowledge and experiences to shape how they make sense of and implement policies and reforms (Honig, 2006; Spillane, Reiser, & Gomez, 2006). The evolution of policy implementation research sets the stage for this dissertation. Although earlier policy implementation research suggested factors such as policy design and local resistance may be the primary causes of a disconnect between policymakers and practitioners, as Spillane et al.
(2006) write, “This work suggests that viewing implementation failure exclusively as a result of poor clarity or deliberate attempts to ignore or sabotage policy neglects the complexity of the human sensemaking processes consequential to implementation” (p. 47). Today, research that focuses on how cognition and sensemaking impact policy interpretation and implementation has examined how many different groups of people make sense of policies and reforms, including: (1) how teacher cognition affects policy interpretation and implementation (Booher-Jennings, 2005; Firestone, Monfils, Schorr, Hicks, & Martinez, 2004; Kennedy, 2010); (2) how principal cognition affects policy interpretation and implementation (Coburn, 2005; Halverson et al., 2004; Halverson & Clifford, 2006; Rigby, 2015); and (3) how the cognition of a host of other individuals, from central office personnel (Honig, 2006) to mayors (Hess, 2008), impacts how policies play out in practice. This line of research generally finds that how individuals make sense of policies and reforms has a strong relationship to how these individuals think about and ultimately implement policies (Spillane et al., 2002). One prominent study by Spillane (2000) that examined school leaders’ implementation of mathematics reform notes:

Cognitive science offers a number of plausible explanations for the dominant patterns in district leaders’ understanding of the mathematics reforms, explanations that are not mutually exclusive. Whereas more conventional implementation accounts might focus on district leaders’ attempts to sabotage the mathematics reforms or their limited capacity to carry out reformers’ proposals, a cognitive frame suggests that implementation failure was due in important measure to what district leaders understood from the reforms (p. 169).
This work suggests individuals make sense of new information through their existing knowledge and beliefs and that individual sensemaking is constantly filtered and updated as new information is learned and processed (Spillane, 2000; Weick, 1995). Subsequent work by Spillane et al. (2002) makes the following argument: “What a policy means for implementing agents is constituted in the interaction of their existing cognitive structures (including knowledge, beliefs, and attitudes), their situation, and policy signals” (p. 388). In sum, research on policy interpretation and implementation and the impact cognition has on policy interpretation and implementation has evolved from the idea that implementers either ignore or modify the wishes of policymakers due to a lack of skill or will. Instead, research today suggests the human sensemaking process plays the larger role in how policies are interpreted and ultimately play out in practice. Specifically, an individual’s knowledge, beliefs, context, and attitude impact policy implementation, and differences in policy implementation occur when individuals make sense of a policy reform and draw connections between these new ideas and their existing understandings and knowledge (Spillane et al., 2002).

Part II: Teacher Evaluation Policy Implementation

Schools began to focus some form of attention on evaluating teachers in the early part of the 20th century. The move towards evaluating teachers was due in part to the growing belief of scholars and practitioners that it was necessary to determine teachers’ impacts on student learning, including how teachers teach students to be successful and productive citizens (Cubberley, 1929). Cubberley’s (1929) work detailed that the evaluation of teachers should include how teachers delivered instruction and how they managed student behaviors.
Despite this early attempt to evaluate teacher effectiveness, the evaluation of teachers was often reduced to completing a mere checklist of teacher responsibilities, such as teacher attendance, timeliness, and professionalism (Darling-Hammond, Wise, & Pease, 1983; Wise, Darling-Hammond, McLaughlin, & Bernstein, 1985). The use of a simple checklist was due in part to the growing responsibilities and demands of school administrators, and this type of evaluation remained a common practice for decades. Because of the widespread evidence on the importance of teacher quality for student outcomes (Chetty et al., 2014; Darling-Hammond, 2000), the past twenty years or so have seen a push to systematically evaluate teachers in an effort to provide better information on what makes a quality teacher. Recent research examining the implementation of teacher evaluation policies and systems has looked at many specific areas, including how principals and teachers initially respond to and accept new policies (Milanowski & Heneman, 2001), how principals communicate feedback to teachers (Kimball, 2003), the impact of teacher evaluations on principals’ human capital decisions (Goldring et al., 2015; Grissom, 2011), and the relationship of teacher evaluation systems to student achievement (Chetty et al., 2014; Derrington, 2013; Donaldson, 2009; Rigby, 2015). Each of these studies suggests individuals’ cognition, experience, and context impact how these policies and systems are interpreted and enacted in local systems of practice. More broadly, research has suggested individuals and schools struggle to implement teacher evaluation policies due to the number of interrelated components (Kennedy, 2010), which is one explanation as to why teacher evaluation policy implementation remains a challenge.
Many scholars believe implementing teacher evaluation policies is becoming even more complex. As Derrington and Campbell (2013) note, “For principals as collaborative instructional leaders, new accountability-driven evaluation policies are affecting the relationships of those principals with their teachers, their sense of what being an instructional leader means, and their capacity to handle the complexities of operating and leading a school” (p. 239). Given the complex nature of evolving teacher evaluation policies, scholars have increasingly turned to studying how individual cognition and sensemaking and organizational context impact how these systems play out in practice. For example, Halverson and Clifford (2006) use a distributed cognitive theory model to show how local context shapes practice, finding that cognitive systems of teacher evaluations are more complicated than envisioned by policy designers. The authors note:

Even practitioners perceived as successful implementers of standards-based teacher evaluation practices need to navigate trade-offs as they adjust the demands of the new policy artifacts to the needs of their existing contexts (Halverson et al., 2004; Kimball, 2003; Milanowski & Heneman, 2001). The tendency of teacher evaluation practices to run headlong into the traditions of local practice provides a prime opportunity to study how practitioners make sense of the new in terms of the old (p. 581).

Since the publication of this article almost 10 years ago, teacher evaluation systems have become more complex and more high-stakes, strengthening the argument for more research to better understand the implementation process. Another study that examined cognition and teacher evaluation policy implementation was conducted by Halverson et al. (2004), who found local implementers vary the implementation of teacher evaluation policies and this variation is shaped by principals’ individual roles, their contexts, and the specific artifacts they are using.
Halverson et al. (2004) conclude:

Principal sensemaking seemed to be primarily a function of principal self-perception of their role as a leader and the knowledge and skills they bring to that role; prior evaluation practices in the school and district; and school context factors such as teacher morale and existing challenges facing the school (e.g., student population risk factors, external accountability pressures) (p. 39).

Other work by Rigby (2015) examines how cognition affects teacher evaluation policy implementation and finds first-year principals receive a variety of messages from colleagues, supervisors, and teachers about how to conduct teacher evaluations. This work finds that as principals build their professional identity they come to understand similar policies differently than their colleagues, and it highlights the variations in implementation among the principals asked to conduct teacher evaluations. In sum, researchers are continuing to sharpen their focus on how individual cognition impacts how policies and systems enter systems of practice. This is particularly true of teacher evaluation policies and systems, which have received unprecedented attention from scholars, policymakers, and practitioners alike. However, despite this escalation of research, researchers and policymakers agree teacher evaluation systems continue to fall short of their intended goals, including identifying high-quality teachers, distinguishing teacher performance, and improving teacher performance through feedback and support (Kennedy, 2010; US Department of Education; Weisberg et al., 2009).
This lack of progress towards providing better information on teacher quality and better identifying high-quality teachers appears to be the case for this dissertation’s context as well: in 2015, 97% of teachers in Michigan were rated as effective or highly effective, and of the 96,000 teachers in the state, only 19 have been dismissed due to poor evaluations over the past five years (Michigan Department of Education, 2015). These statistics are concerning given Michigan’s overall low student achievement on state assessments (Chetty et al., 2014; Darling-Hammond, 2000; Michigan Department of Education, 2015). Because there is still much to understand about the factors that influence how teacher evaluation systems play out in practice, this study will fill an important gap in the current literature by providing insights as to how certain policy implementer characteristics influence policy interpretation and implementation. The high-stakes nature of teacher evaluation policies makes it more important to understand individual cognition and behavior than is the case for lower-stakes policies, because so much is attached to these policy outcomes with respect to future teacher employment and ultimately student learning and achievement. While other policies are more transient, teacher evaluation policies are here to stay (in some form). How the people charged with making sense of and implementing these policies do so is of unique and great importance and must be better understood.

Part III: Principal Cognition and Policy Implementation

The third stream of research that guides this work focuses on one of the most important actors charged with implementing teacher evaluation policies and systems: school principals.
Although early research suggested principals lacked the power and influence to change school and teacher practices (Bidwell, 2001), more recent research suggests principals play a key role in how initiatives and reforms play out in practice (Coburn, 2005; Donaldson & Papay, 2014). How principals think about and implement education reforms is of particular importance, as research suggests principals are second only to teachers as the educational resource who can most positively impact student outcomes, such as learning, increased attendance, and increased graduation rates (Leithwood, Seashore-Louis, Anderson, & Wahlstrom, 2004; Seashore-Louis, Wahlstrom, Leithwood, & Anderson, 2010). The type of sensemaking in which an individual engages has great implications for how policies and systems permeate through an organization (Coburn, 2005; Weick, 1995). This is particularly true of people who are in positions of leadership, as leaders impact how other individuals within an organization receive, think about, and are forced to act upon a policy reform (Coburn, 2005; Ganon-Shilon & Schechter, 2016; Spillane et al., 2002). As Ganon-Shilon and Schechter (2016) note:

Leaders play an important role in shaping what and how teachers learn about educational change and reform, so school principals and middle leaders, particularly, influence teachers’ sense-making both directly and indirectly. Directly, they influence what teachers find themselves making sense of, by facilitating access to some reform messages rather than others. Providing teachers with interpretive frameworks and ways of understanding reform demands, formal leaders enable the educational staff to adopt strategies that develop and construct their understanding of the reform’s intent. School leaders also influence teachers’ sense-making indirectly as they participate with the teachers in a collective learning process through formal meetings and informal conversations (p. 6).
Weick and Sutcliffe (2007) argue school leaders play an important role in ensuring everyone within a school can make sense of their responsibilities, and as a result school leaders’ approach to sensemaking directly impacts how policies play out in practice (Ganon-Shilon & Schechter, 2016). For example, a study from Spillane et al. (2002) found novice principals typically prioritized establishing legitimacy with their peers and staff before trying to implement new policies and reforms and in doing so took on a form of collective sensemaking, in an effort to make teachers feel included in the policy implementation process. The type of sensemaking in which a principal engages is particularly important to note when examining the implementation of a policy as important as teacher evaluation policies. Teachers and administrators both understand the importance of these policies, but each group looks at the goals, uses, and purposes of the same policy quite differently. For example, teachers look at these policies individualistically, as their careers are in large part dependent on successful evaluations (Ganon-Shilon & Schechter, 2016). On the other hand, administrators think of the teacher evaluation process holistically, with the overall success of their school constantly in the forefront of their thinking (Ganon-Shilon & Schechter, 2016). In short, while teachers and other school staff are left to their own devices in assigning meaning to a particular policy reform, principals play a large role in guiding teacher sensemaking. Because of this, how principals choose to navigate their own sensemaking process has great implications for how teachers assign meaning to their own evaluations and ultimately how teachers are evaluated.
As Spillane et al. (2002) write, “While teachers often encounter district and state accountability mechanisms through media reports, policy directives, and union newsletters (among other sources), their evolving perceptions and understanding of these policies are likely to be mediated through participation in their school community” (p. 732). This community includes, importantly, school principals. Distinguishing between individuals who make sense of information, policies, and reforms individually and those who engage in collective sensemaking is an important nuance for understanding how policies look in practice. Although undoubtedly some of the characteristics of these two groups of individuals overlap during the sensemaking process, such as drawing on prior knowledge and experiences and current context, there is a distinction in how these groups of people think about policy implementation. Even within the sub-group of collective sensemaking one can hypothesize that there are two forms: deliberate collective sensemaking, where a school principal collaborates with others to make sense of a policy, discuss how the policy will be implemented, and so on, and informal collective sensemaking, which is more along the lines of social context or network sensemaking. The latter could still be very individualistic and seems to fall more along the lines of an individual influence like cognitive schema. For example, if I am a principal and have seven teacher friends, then both my cognitive schemas and my individual discussions with these seven teacher friends about teacher evaluation policy will influence how I think about the policy. This is very different than getting together seven teachers in an organization and collectively making sense of the policy. In short, how a principal engages in sensemaking likely influences how policies play out in practice.
The Co-Evolution of Teacher Evaluation Policies and the Role of Principals

One specific responsibility of the school leader is to evaluate the teaching staff, which principals have done in some form for the past century. Early research showed principals played a more hands-off role during evaluations, as principals rarely evaluated classroom instruction and instead completed a checklist of teacher responsibilities, such as whether a teacher showed up to work on time (Darling-Hammond et al., 1983; Wise et al., 1985). As time progressed and principals began to observe teacher classroom instruction as part of a teacher’s evaluation, they often did so haphazardly, using protocols that were not supported by theory or research (Porter, Youngs, & Odden, 2001). However, as teacher evaluation systems transitioned into more high-stakes policies, principals began to take a more active role in the evaluation process, and over the past several decades principals have been asked to become competent evaluators of classroom instruction and provide meaningful and critical feedback to teachers, taking on the dual role of coach and evaluator (Duke & Stiggins, 1986; Duke & Stiggins, 1990). As their role in the teacher evaluation process became more active, the amount of time principals devoted to evaluating their staff increased drastically (Halverson et al., 2004). Increasingly, principals are tasked not only with observing more teacher classroom instruction, but also with meeting with these teachers outside of instructional time to discuss their instruction and progress. Additionally, the observation rubrics and evaluation forms principals must complete have become complex and time consuming, as principals must document evidence to support their claims (Halverson et al., 2004). Finally, in most circumstances, principals evaluate all teachers in their school.
Because of all of these increased time demands, some research has found that, in an effort to streamline evaluations and efficiently move through this process, principals scale back aspects of the policy, such as how long they observe teachers and the type and amount of feedback they provide teachers (Halverson et al., 2004). Most recently, principals have been tasked with taking on the role of an instructional leader, where the principal is charged with supporting teacher instruction and is held accountable, along with teachers, for student learning (Blasé, Blasé, & Phillips, 2010; Smylie, 2010). Now, more than ever, principals are expected to be “educational experts” and understand what good teaching and learning look like (Blasé et al., 2010; Halverson et al., 2004). Principals are not only charged with running the school and managing aspects of a complex organization, but new teacher evaluation policy requirements also ask principals to understand the ins and outs of teaching and learning. For example, principals must evaluate things such as classroom management, student engagement, and strong lesson delivery from teachers. The level of expertise expected of a principal becomes more demanding each time teacher evaluation policies evolve. Currently, as most schools use rigorous teacher evaluation systems, which typically include a student achievement based component as well as an observational component with a detailed and structured observation rubric, the role of the principal in the evaluation process is much more defined than in previous years (Goldring et al., 2015; Steinberg & Donaldson, 2016). For example, in most situations principals are given specific directions for how and when to observe teachers, how to score teachers, and how to provide feedback to teachers (Goldring et al., 2015). In sum, the role of the principal in teacher evaluations has changed drastically as the teacher evaluation policy landscape has evolved.
Now, in most cases part of a principal’s own evaluation includes student performance data and how they evaluate their staff. As a result, principals are incentivized to take a more active role in the evaluation and development of the teachers in their building. Given the research that suggests principals make many school-based decisions, including human capital decisions, based on teacher evaluation scores (Goldring et al., 2015; Jacob, 2011), much recent attention in education research has been given to how principal cognition shapes their interpretation and implementation of these important policies.

A growing body of research reveals how school principals are implementing evolving teacher evaluation policies and ultimately evaluating the teachers in their building. For example, a majority of a teacher’s evaluation is based on principal observations (Steinberg & Donaldson, 2016), so research has begun to document how principals evaluate teachers with more detailed and structured observational rubrics. The research overwhelmingly finds principals assign teachers high evaluation ratings. One explanation for this is that any judgement of teaching is inherently subjective and allows the observer much leeway as to how to actually evaluate instruction (Donaldson & Papay, 2014). As a result, researchers have found principals find it difficult to separate teacher instruction and everything else they know about a teacher (for example, a teacher’s contribution outside of the classroom) from their evaluation of that teacher (Donaldson, 2013; Papay & Johnson, 2012). As Donaldson and Papay (2014) note, “Although having clear standards, using highly qualified and well-trained evaluators, and focusing on evidence can help remove much of the subjective bias in observation measures, separating the personal from the professional can be difficult” (p. 2).
Additional research suggests teacher effectiveness varies substantially, yet principals’ evaluations of teachers fail to differentiate this effectiveness (Grissom & Loeb, forthcoming). Research also suggests that principals tend to rate teachers more harshly in low-stakes evaluations compared to high-stakes evaluations, like those used for human capital decisions (Grissom & Loeb, forthcoming). During high-stakes evaluations, such as a teacher’s official evaluation score used for many important career-defining decisions, including tenure and retention decisions, principals overwhelmingly rate teachers highly, which may explain in part why there is so little variation documented in teacher performance.

The Role of Principal Cognition During Teacher Evaluations

Weatherley and Lipsky (1977) stressed the importance of “street-level bureaucrats,” the individuals who impact how policies are ultimately implemented. These individuals and their cognition, including their beliefs, skill, will, resources, time, context, and capacity, impact how policies look in practice. Although principals are experiencing more clarity and structure around how they are to evaluate their teaching staff, principal cognition still greatly impacts how these policies play out in practice (Coburn, 2005; Halverson et al., 2004; Spillane et al., 2002). Principals have the potential to drive school improvement through policy implementation more than most other individual actors. As Spillane and Kenney (2012) write,

While federal, state, and local government policy makers have gone to considerable lengths over the past several decades to target their policies at the technical core of schooling – specifying what teachers should teach, at times how they should teach, and acceptable levels of mastery for students – their initiatives, which represent a considerable shift in the policy environment of schools, ultimately depend on school administration for their successful implementation (p. 546).
There is mounting evidence suggesting principal cognition is mediated through individual background characteristics and local context (Coburn, 2005; Hallinger & Heck, 1996; Rigby, 2015; Spillane, Halverson, & Diamond, 2004; Spillane et al., 2002). Prior literature on principals’ cognition and sensemaking of policies provides two broad strands of findings: (1) principals’ prior experiences greatly influence how they understand and make sense of new policies (Harris, Ingle, & Rutledge, 2014; Jacob & Lefgren, 2008; Nelson, Sassi, & Grant, 2001); and (2) principals implement policies based on what they believe is in the best interest of their local school and context (Coburn, 2005; Cohen & Hill, 2001; Koyama, 2014; Matsumura & Wang, 2014; Spillane et al., 2002). Additional research focusing on school leader cognition looks at how these leaders interpret, make sense of, and communicate policy messages they receive in their local context, finding principals receive and deliver the same policy messages differently and this impacts how policies are implemented in their local contexts (Anagnostopoulos & Rutledge, 2007; Coburn, 2005; Rigby, 2015). Several studies have looked at how cognition impacts teacher evaluation policy and system implementation, concluding principals navigate trade-offs and adjust and negotiate the demands of evaluating teachers in their building, based on their prior knowledge and personal context (Halverson & Clifford, 2006; Halverson et al., 2004; Rigby, 2015).

Experience and Context: Why and How They Matter for Policy Implementation

Research shows as teachers gain experience they become more effective at raising student achievement, increasing student attendance, and managing their classroom (Clotfelter, Ladd, & Vigdor, 2006; Papay & Kraft, 2014; Rockoff, 2004).
Evidence also suggests principals become more effective at their jobs as they gain experience, and this is particularly true in their first three years as a school leader (Clark, Martorell, & Rockoff, 2009). Research also suggests the length of a principal’s tenure at one school, no matter how long they have served as a principal, impacts their role. This line of research suggests it takes five years for a principal to secure relationships with staff, improve staff effectiveness, fully implement policies and practices, and make significant education improvements (Coburn, 2001; Seashore-Louis et al., 2010). Research has long documented that prior knowledge aids learning by enabling the learner to make connections and thus deepen their understanding (Harris et al., 2014; Jacob & Lefgren, 2008; Nelson et al., 2001). This line of thinking is true for principals and their work. For example, some research suggests teachers’ and administrators’ prior knowledge and experience influence their ideas about changing instructional practice (Cohen & Barnes, 1993; Halverson & Clifford, 2006). Research specifically on principals suggests principals’ experience greatly influences how they understand and make sense of new policies and reforms (Harris et al., 2014; Jacob & Lefgren, 2008; Nelson et al., 2001). This line of research suggests school leaders build mental models that shape what they think about when receiving new information and how these individuals perceive this information (Halverson et al., 2004). As principals continue to gain experience, these models shape what they notice when encountering new reforms and policies, which impacts how principals interpret information, accept or reject new ideas or information, and think about and ultimately enact policies and reforms in their systems of practice (Cohen & Barnes, 1993; Halverson et al., 2004).
Although prior knowledge and experience may be the single most important variable that impacts individual cognition, another important variable is current situational context. Specifically, the amount of outside pressure applied on individuals who are charged with interpreting and implementing new and evolving reforms greatly impacts how individuals think about this process (Grissom, 2011; Hill & Barth, 2004). For example, research suggests when individuals are introduced to and attempt to implement a new policy, individual actors’ behavior changes when outside pressures enter their environment, which impacts how these individuals attempt to implement policies and reforms (Grissom, 2011; Hill & Barth, 2004). There is a growing amount of research that examines specifically how external pressure, from federal, state, and local levels, impacts how principals conduct their work and more specifically, how they make sense of and implement policies (Booher-Jennings, 2005; Coburn, 2005; Halverson & Clifford, 2006; Matsumura & Wang, 2014; Rigby, 2015). Specifically, research shows external pressure impacts how principals reallocate instructional time (Diamond & Spillane, 2004; Firestone et al., 2004), use school-based resources (Dee, Jacob, & Schwartz, 2013), staff subject areas (Booher-Jennings, 2005), and even alter school lunches (Figlio & Winicki, 2003). Although much of this prior research has focused on attempts to raise student achievement, less work documents how external pressure impacts how principals work with and evaluate their teaching staff (an approach that has the potential to raise student achievement, given the widespread belief in the importance of teacher quality). Although schools and school leaders have always been responsible for outside policy implementation, the passage of NCLB in 2001 increased accountability and external pressure placed on schools.
As Koyama (2014) writes, “Principals are mediators between external accountabilities, like those of NCLB, and school practices” (p. 283). Outside pressure matters for school principals, particularly new principals who are under pressure of increased accountability (Spillane & Lee, 2014). As Halverson et al. (2004) write:

The sense we make of new information is also shaped by our social and situational context (Greeno, 1998). Organizations and institutions routinize existing models through policies, programs, and traditions. Thus, the intended effects of innovations are not necessarily altered by the malice or laziness of implementers, but instead by the best efforts of local actors seeking to satisfice conflicting goals (Spillane, Reiser, & Reimer 2002, Fischoff 1975; March & Simon, 1958). Actors make sense of new practices within their existing social and situational context, and often adjust the meaning of the new in terms of their established context of meaning (pp. 4-5).

Research that has examined how outside pressures impact how principals make sense of and implement policies finds principals mediate external district and state accountability policies and demands in strategic ways, including “gaming the system” and implementing policies in ways consistent with local values and beliefs (Spillane & Kenney, 2012, p. 18). Additionally, research finds that principals respond to external pressures by restructuring the formal routines of their local instructional programs (Koyama, 2014). Work by Seashore-Louis and Robinson (2012) finds that when external pressures and policies align with the values and perspectives of principals, they will internalize these policies and attempt to implement them faithfully, but when external policies and demands do not align with a school leader’s vision, they are less likely to make this effort (pp. 42-43). Federal level pressure has increased immensely over the past several decades and particularly since the passage of NCLB in 2001.
Specifically, schools have faced an increase in federal pressure over matters including what should be taught, how much of it should be taught, how long it should be taught, and teacher quality and evaluation (Spillane & Kenney, 2012). Research suggests that federal level pressure can impact how principals make sense of and implement policies including what teachers teach (typically focusing on math and reading while deemphasizing other subjects) and how long they teach these subjects as well as increasing the amount of time and resources devoted towards preparing for tests (Booher-Jennings, 2005; Diamond & Spillane, 2004; Spillane & Kenney, 2012). Pressure from the state level also impacts how principals make sense of and implement school policies. Particularly since the passage of the Elementary and Secondary Education Act (ESEA) in 1965 and NCLB in 2001, states have experienced an increase of responsibilities and resources which expanded state level control of schools (Spillane & Kenney, 2012). Although the federal government can ask things of state and local schools, it ultimately depends on state and local governments to develop and implement policies that are in line with federal requirements (Spillane & Kenney, 2012). In most circumstances state level pressure comes from annual testing, school ratings, and teacher evaluations. Research shows state level pressure impacts principal decision making with things such as hiring and firing of teachers (Cohen-Vogel, 2011). The majority of pressure school leaders face is from the state level and this pressure shapes implementation of state level policy (Booher-Jennings, 2005; Coburn, 2005; Matsumura & Wang, 2014). Finally, local level pressure also has the potential to shape how principals implement policies. Even as federal and state level pressure has increased, there has not been a decline in local level policy making (Spillane & Kenney, 2012).
Local level pressure is the most pressing type of pressure administrators face, as they constantly have to meet with district administrators to show they are in compliance with district level goals and policies. Much of the prior research on how local level pressure impacts how principals make sense of and implement policies has focused on how administrators respond to outside pressure to raise student test scores, finding principals become more “data-driven” and make strategic decisions about whom to teach and where to devote resources (Booher-Jennings, 2005). Booher-Jennings (2005) found that local school administrators responded to institutional pressure by emphasizing a singular measure of accountability (student achievement on test scores). Additionally, Booher-Jennings (2005) finds that schools make intentional decisions about their resources to help students on the “bubble” including providing these students more one on one or small group time, providing after school programs to these students, moving special area teachers (e.g. music, art, and gym) to teach test preparation activities, and providing these students with access to summer school (pp. 241-242).

Gap in the Literature

Although the amount of research on policy implementation is growing rapidly, this study aims to fill two specific gaps that currently exist. First, the research above suggests individual cognition greatly impacts how policies play out in practice. However, research currently lacks a more predictive form of how cognition impacts how individuals make sense of teacher evaluation policies. For example, do school principals with high-levels of experience make sense of and ultimately implement teacher evaluation policies differently than their less experienced peers? If so, what does this look like and what does this say about how principals evaluate teachers?
Although the research on teacher evaluation policies and how principals are interacting with these policies is growing, there is a gap in the literature as to how exactly principals with different experience levels and facing different accountability pressures think about and ultimately enact teacher evaluation policies. My dissertation will fill this gap in the literature by addressing what factors impact principals’ sensemaking of these policies and systems and answer what this ultimately means for how these policies and systems look in practice. In the end my goal is to be able to state hypotheses about how principals with certain characteristics think about and ultimately implement teacher evaluation policies. The second gap in the literature this study aims to fill is a lack of focus on the sensemaking of school principals and how this sensemaking impacts policy implementation. Research on educational policy implementation has focused heavily on teachers and how a teacher’s practice can impact policy implementation. This line of work sees teachers as the individuals who bear the most responsibility for what happens in their individual classrooms (e.g. if students are learning). As such, policy has often focused on teacher accountability mechanisms. However, we know very little about how principals’ cognitive schemas and the context within which they work impact their understanding and implementation of policies, particularly teacher evaluation policies. Where teachers’ understanding and willingness to implement a policy may impact an individual classroom, a school leader’s cognition and sensemaking impacts how policies enter and diffuse throughout the entire school building (Coburn, 2005; Derrington, 2013). Prior research has shown even the most diligent school leaders and policy implementers adjust their sensemaking based on their local context and the meaning they give to a policy (Coburn, 2005; Halverson et al., 2004; Rigby, 2015).
Given the uncertain nature of policy implementation and the ever expanding policy interpretation opportunities handed to school leaders, additional research should begin to understand how school leader characteristics affect their interpretation of policies and ultimately how this interpretation impacts policy implementation. Because education policies are implemented very differently in different contexts with different individuals, it is important to move beyond preconceived notions of policy implementation and begin looking at specific answers to questions of how and why policy implementation is executed in certain contexts with certain types of people.

Chapter 3: Framing the Research

The purpose of this chapter is to describe cognitive schemas and sensemaking theory as well as to explain why these two sub-categories of cognitive theory are appropriate and useful lenses through which to view how principals think about and ultimately implement teacher evaluation policies and systems. Part one of chapter three examines cognitive schemas, an idea which derives from cognitive science and which I define as the pattern of how individuals think about collecting, organizing, and processing information (Piaget & Inhelder, 1958). This review will examine how cognitive research uses cognitive schemas generally and within the field of education specifically. Part two of this chapter focuses on sensemaking theory, including distinguishing between individual and collective sensemaking and making the argument for why sensemaking theory is best suited to help guide this work. One’s cognition, including their cognitive schema(s), impacts their subsequent sensemaking of a task or event. In this way, one’s cognitive schema(s) and an individual’s sensemaking work together and form the basis of the framing of this work.
Researchers define cognitive schemas as specific knowledge structures individuals use to make sense of information and interpret this information in their environment (Piaget & Inhelder, 1958; Spillane et al., 2006). Sensemaking theorists believe past experiences and prior knowledge shape an individual’s learning and acknowledge that learning occurs through our social and situational context (Greeno, 1998; Weick, 1995). In this way, these two cognitive frameworks intersect and are useful when trying to explain the phenomenon of how individuals attempt to process, understand, interpret, and implement policies and systems. Research that studies individuals’ cognitive schemas and uses sensemaking theory to explain policy and system implementation suggests that even if individual actors receive the same message regarding how a policy should be implemented, these individuals will construct different interpretations of this message based on what they already know and believe (Grider, 1993; Halverson et al., 2004; Spillane et al., 2006; Weick, 1995). This past work suggests that studying policy implementation through the lens of sensemaking can explain how principals make sense of policies and help explain how principals enact and make decisions around implementing policies and systems.

Cognitive Schemas

The origins of cognitive theory research date back to the late 1800s and the work of William James, who believed human thinking consisted of non-repetitive thoughts that continually evolved as new information and experiences entered their thinking (James, 1890). Research using cognitive theory grew in the early part of the 20th century with the work of Wilhelm Wundt, who found that human experiences consist of measurable mental functions including awareness, perception, and reaction (Wundt, 1902).
Wundt found that as individuals have more experiences, the range of ways they make sense of these experiences increases, because these individuals have more information to draw on when attempting to make sense of these experiences (Wundt, 1902). Research focusing on how individuals receive, interpret, and act upon information continued to grow with the work of John Dewey, who argued understanding one’s cognition was imperative to understanding the actions of that individual (Dewey, 1938). As cognitive theory continued to gain prominence, the idea of cognitive schemas developed. This work largely began in the 1930s with Frederic Bartlett, who introduced the concept of the cognitive schema (Bartlett, 1958). Bartlett conducted research focusing on how individuals interpret, remember, and make sense of full and incomplete information. As Grider (1993) writes:

Another of Bartlett’s classic experiments involved the relaying of a story from person to person. When the story reached the tenth individual it had virtually become an entirely different tale from the original version. The people had unknowingly changed segments of the story to fit their expectations (i.e. existing schemas) (Bell-Gredler, 1986). Bartlett’s findings helped to develop and ungird the key cognitive concepts of perception and mental processing (p. 9).

Finally, Jean Piaget contributed greatly to the field of cognitive theory with his work on how individuals collect and organize information (Piaget, 1964). Like Bartlett, Piaget emphasized the importance of schemas in cognitive development, arguing cognitive schemas help individuals create mental representations and when linked together help one understand the world and respond to situations (Piaget, 1964). Both Bartlett and Piaget found that as individuals learn, they create cognitive schemas that help them code, process, understand, and respond to information (Bartlett, 1958; Piaget, 1964).
In general, cognitive theorists, including Bartlett and Piaget, believe individuals use cognitive schemas to sort information in long-term memory, as well as to understand new information that enters their system of thinking (Bartlett, 1958; Grider, 1993; Piaget, 1964). Currently, research using cognitive theories to study individual and organizational behaviors is expanding. This expansion is due in part to the interest in examining how individuals make sense of outside policies and interventions entering their organizations and systems of practice. Early research on how individual cognitive schemas impact how policies and interventions play out in organizations suggests that when an individual encodes new information, their already existing cognitive schema mediates how new information is received, organized and processed (Bartlett, 1958; Dewey, 1938; Piaget, 1964). Essentially, one uses previous knowledge to interpret new ideas. In this process, individuals sometimes even change new information to fit their existing cognitive schema (Grider, 1993; Piaget, 1964). This is, in part, why new policies or programs often are implemented in different ways even when local actors are working earnestly to faithfully implement the policy or program. Like many organizations, schools and school districts are experiencing an influx of new policies entering their systems of practice, resulting in individuals at some level having to make sense of this information (Honig, 2006; Spillane & Kenney, 2012). As a result, research has begun to examine how individual organizational members’ cognitive schemas impact how policies and outside interventions play out in schools. In educational research, Spillane et al. (2006) defined cognitive schemas as “specific knowledge structures that link together related concepts used to make sense of the world and to make predictions” (p. 49).
For example, an experienced school leader will draw on his or her developed schema when attempting to make sense of good classroom instruction. This school leader’s past experiences influence what this leader expects to see in the classroom and ultimately impacts how they interpret or understand what is happening in the classroom. Individuals have varying cognitive schemas which is why focusing on this aspect of cognition is an appropriate way to address questions of policy and system implementation. Even if individual actors all receive the same message regarding a reform, individuals will construct different interpretations of this message based on what they already know and believe (Grider, 1993; Spillane et al., 2006). This process is not unidirectional, however. Just as existing schemas shape how new information is processed, so too does new information shape existing schemas. When confronted with new ideas that challenge preexisting understanding, individuals update their schema to reflect this new knowledge. In a seminal study in educational research that examines how a leader’s cognitive schema impacts policy implementation Halverson et al. (2004) note:

Our cognitive models, however, are not rigid structures that determine what we notice and name. Rather, our models interact with our perceptions and experience in an iterative process through which new experiences can come to shape our existing models. In organizations, new policies and programs can provide this jolt to existing practice, encouraging practitioners to reframe their practice in terms of the new expectations. The ways that practitioners make sense of new initiatives in terms of pre-existing models make the implementation of new, complex programs a far from linear and predictable process (p. 5).

To be clear, each individual may have a unique schema. However, this is not to say that generalizations are impossible.
In particular, there are some common factors that can be used to hypothesize about how new information is likely to be interpreted by individuals who share certain key characteristics. For example, the extent of one’s prior experience has been shown to be a key determinant shaping cognition. Experience shapes how individuals learn and enact new policies in multiple ways. First, prior knowledge and experience shape what individuals notice when conducting a process (Weick, 1995). In this way, we might predict that more experienced school principals will have an easier time conducting teacher evaluations because they have experience doing this in the past. Even if the policy they are implementing is different, their past experiences can shape things such as how they communicate information, what they notice during observations of teachers’ instruction, and how they might navigate the process of evaluating teachers. This line of research suggests school leaders build mental models based on past experiences and these models impact what they notice and how they enact new versions of old policies (Halverson et al., 2004). However, other research suggests individuals who are familiar with the task at hand will implement new tasks in old, familiar ways. Therefore, individuals with more experience might be less likely to implement new policies faithfully, instead transforming a new policy into iterations with which they are familiar and that make sense to them. It is well established that as principals gain experience, these mental models shape what individuals notice when encountering new reforms and policies, which impacts how principals interpret information, accept or reject new ideas or information, and how principals think about and ultimately enact policies and reforms in their systems of practice (Cohen & Barnes, 1993; Halverson et al., 2004).
Therefore, individual experience is likely to have some impact on individual sensemaking and, specific to this study, how principals think about and ultimately evaluate the teachers in their building. Second, the degree of pressure one feels when trying to learn something new shapes how our preexisting schemas shape new information (Grissom, 2011; Hill & Barth, 2004). As was mentioned in the previous chapter, research suggests principals make sense of and implement policies as they attempt to mediate external district and state accountability policies and demands and they do so in strategic ways, including “gaming the system” and implementing policies in ways consistent with local values and beliefs (Spillane & Kenney, 2012, p. 18). Therefore, it is logical to predict that the amount of pressure facing school leaders may lead to some predictable ways individual principals think about and ultimately evaluate teachers. For example, we might imagine principals facing large amounts of accountability pressure from the state level may be more likely to implement a new teacher evaluation policy with fidelity than their peers who work in environments with fewer pressures. In sum, cognitive schemas are key to consider when studying questions of policy and system implementation in order to understand how individuals make sense of new and existing information and situations. Using the lens of one’s cognitive schemas moves past the assumption that “sensemaking happens” and instead has the potential to examine questions related to why sensemaking happens and how individual characteristics affect the policy implementation process (Halverson et al., 2004; Spillane et al., 2006).

Sensemaking Theory

There is a growing body of literature in education that uses sensemaking theory, a specific type of cognitive theory, to address questions of how people and organizations interpret and implement policies and reforms (Coburn, 2005; Halverson et al., 2004; Rigby, 2015; Spillane et al., 2002).
The goal of much of the prior research using sensemaking theory has been to attempt to explain how individual and organizational sensemaking impacts how policies look in practice. Specifically, research that uses sensemaking theory has examined how individuals come to understand and enact policies and how this process is influenced by prior knowledge, the social context within which they work, and the nature of their connections to the policy or reform message (Coburn, 2005; Cohen & Hill, 2001; Spillane et al., 2002). Sensemaking theorists believe past experiences and prior knowledge shape individual and collective learning and this learning occurs through our social and situational context (Greeno, 1998; Weick, 1995). Sensemaking theory seeks to understand how people process, understand, and respond to change (Halverson et al., 2004; Spillane et al., 2002; Weick, 1995) and attempts to explain how and why social learning occurs (Weick et al., 2005). When there is a mismatch between what an individual expects and what an individual experiences, individuals are left to assign meaning to what has happened (Ganon-Shilon & Schechter, 2016) and sensemaking helps rationalize these experiences (Weick, 1995). As Ganon-Shilon and Schechter (2016) note:

Structuring the unknown through sense-making enables individuals to act in ways that make sense. It involves coming up with a map of a shifting world as well as testing this map with others through data collection, conversation, and action. Individuals, then, actively construct meaning by relating new information to preexisting cognitive frameworks labeled by scholars as working knowledge, cognitive frames, enactments or cognitive maps. (p. 4).

Sensemaking theory is particularly useful when attempting to answer questions of how individual actors attempt to reconcile conflicting policy demands and implement policies and systems.
For example, we might imagine school principals are faced with conflicting demands of how to evaluate teachers. Should the principal use the teacher evaluation policy as a means to hold teachers accountable for their performance, rank these teachers, and either reward or dismiss these teachers based on these ratings? Or should principals use teacher evaluations as a means of support and feedback in an effort to help teachers improve their instructional practice? Principals are faced with these scenarios and must decide how they will think about the teacher evaluation process. The multiple paths one may take while making sense of a new and evolving policy is one reason why sensemaking theory provides another critical lens to analyze these data. In short, where one’s cognitive schemas impact how they think about implementing teacher evaluation policies, sensemaking theory is useful to study the factors that actually impact how this thinking plays out in practice. For example, a principal will come into an observation of teacher instruction with prior knowledge that will impact his or her focus (imagine a former math teacher who is focused on the instructional strategies of a math teacher). However, what differentiates this existing schema from the sensemaking frame is that sensemaking theory includes the factors which influence the actions this principal will take during the observation. Therefore, we can hypothesize that a principal with a preexisting cognitive schema that prioritizes clear and concise mathematical instruction will focus on this during the observation of a teacher, but how this principal ultimately rates this teacher is influenced by other contextual factors, which impact the sense this principal will make, regardless of their cognitive schema.
Weick (1995) argues there is a strong reflexive component of sensemaking that is particularly useful as people are navigating their way through a process, making sense of this process, and then updating their sensemaking as they make further sense of the ongoing process (p. 15). Weick et al. (2005) write:

Explicit efforts at sensemaking tend to occur when the current state of the world is perceived to be different from the expected state of the world, or when there is no obvious way to engage the world. In such circumstances there is a shift from the experience of immersion in projects to a sense that the flow of action has become unintelligible in some way. To make sense of the disruption, people look first for reasons that will enable them to resume the interrupted activity and stay in action (p. 409).

This description of sensemaking fits squarely into this study’s focus on policy and system implementation in schools. Individuals within schools, in this case school principals, expect to experience a certain event or go through a certain process when evaluating teachers. For example, principals with high-levels of experience have evaluated teachers in some form for their entire careers as a principal. Therefore, these past experiences shape how principals think about this process, including how they observe teacher instruction, how they communicate with teachers about the evaluation process, and how to use the results of evaluations. Principals with low-levels of experience also have some understanding of how teacher evaluations might look in practice as the vast majority have recently left the classroom where they were evaluated as a teacher (this is the case for the six principals in this study with low-levels of experience). What these principals expect to see during the evaluation process comes from their existing schema.
However, new teacher evaluation policies are disrupting what principals expect to see by introducing new ideas, concepts, routines, and expectations of what teacher evaluations should look like. This creates an opportunity for principals to make sense of this disruption.

Individual vs. Collective Sensemaking

Within the theory of sensemaking there are two general schools of thought regarding how individuals make sense of information. One is individual sensemaking where individuals make sense of unfamiliar situations on their own by relying on their own personal experiences, beliefs, and values in an effort to bring clarity to an uncertain situation (Ganon-Shilon & Schechter, 2016; Klein, Moon, & Hoffman, 2006). These individuals typically create mental models based on their previous and current cognition in an effort to explain uncertainties in their environment (Ganon-Shilon & Schechter, 2016). The individual vein of sensemaking theory comes from the broader cognition literature described above including how individuals’ cognitive schemas influence how individuals make sense of unclear or ambiguous situations (Bingham & Kahl, 2013; Fiss & Zajac, 2006; Maitlis & Christianson, 2014). Individual sensemaking suggests individuals make sense of situations individually based on their personal cognitive schemas, which are constantly evolving as they receive new and updated information (Maitlis & Christianson, 2014). The other vein of sensemaking theory is collective sensemaking. Collective sensemaking is rooted in studies of social interaction, which argue sensemaking occurs between individuals rather than within one individual (Maitlis & Christianson, 2014). Individuals who engage in collective sensemaking rely not only on their individual thoughts, beliefs, and experiences, but also on the thoughts, beliefs, and experiences of other individuals within their environment.
This results in a shared social process of sensemaking (Ganon-Shilon & Schechter, 2016; Weick et al., 2005). Scholars of collective sensemaking believe in a co-constructed sensemaking process between the people within an organization (Maitlis & Christianson, 2014). In recent research, sensemaking is becoming more widely acknowledged as a social process. Some scholars argue even when individuals act on and interpret information by themselves this individual sensemaking is most often embedded in a social context where individuals’ thoughts, feelings, and behaviors are influenced by other people within their social context (Maitlis & Christianson, 2014; Weick et al., 2005). As Maitlis and Christianson (2014) write:

When sensemaking is seen as taking place within individuals, then collective meaning making occurs as individuals advocate for a particular view and engage in influence tactics to shape others’ understandings. In contrast, when sensemaking is regarded as unfolding between individuals, intersubjective meaning is constructed through a more mutually co-constituted process, as members jointly engage with an issue and build their understanding of it together (p. 78).

In short, while individual sensemaking occurs in one’s head, collective sensemaking occurs among multiple people in an organization or an environment. Based on these definitions one might expect people to make sense of information, events, and processes differently based on the type of sensemaking with which they engage. For example, do people who engage in individual sensemaking make sense of information in a similar or different way than those who engage in collective sensemaking?
For those who engage in collective sensemaking, is how they make sense of a situation different based on who they collectively make sense with, such as collective sensemakers who make sense in structured professional developments with their staff versus collective sensemakers who draw on their informal networks with their close friends or certain teachers/groups? The type of sensemaking with which an individual engages has great implications for how policies permeate through an organization (Coburn, 2005; Weick, 1995). This is particularly true for people who are in positions of leadership as leaders impact how other individuals within an organization receive, think about, and are forced to act upon a policy reform (Coburn, 2005; Ganon-Shilon & Schechter, 2016; Spillane et al., 2002). As Ganon-Shilon and Schechter (2016) note:

Leaders play an important role in shaping what and how teachers learn about educational change and reform, so school principals and middle leaders, particularly, influence teachers’ sense-making both directly and indirectly. Directly, they influence what teachers find themselves making sense of, by facilitating access to some reform messages rather than others. School leaders also influence teachers’ sense-making indirectly as they participate with the teachers in a collective learning process through formal meetings and informal conversations (p. 6).

Weick and Sutcliffe (2007) argue school leaders play an important role in ensuring everyone within a school can make sense of their responsibilities and as a result school leaders’ approach to sensemaking directly impacts how policies play out in practice (Ganon-Shilon & Schechter, 2016). For example, a study from Spillane et al.
(2002) found novice principals typically prioritized establishing legitimacy with their peers and staff before trying to implement new policies and reforms and in doing so took on a form of collective sensemaking, in an effort to make teachers feel included in the policy implementation process. The type of sensemaking in which a principal engages is particularly important to note when examining the implementation of a policy as important as teacher evaluation policies. Teachers and administrators both understand the importance of these policies, but each group looks at the goals, uses, and purposes of the same policy quite differently. For example, teachers look at these policies individualistically, as their careers are in large part dependent on successful evaluations (Ganon-Shilon & Schechter, 2016). On the other hand, administrators think of the teacher evaluation process holistically, with the overall success of their school constantly in the forefront of their thinking (Ganon-Shilon & Schechter, 2016). In short, while teachers and other school staff are left to their own devices of how to assign meaning to a particular policy reform, principals play a large role in guiding teacher sensemaking. Because of this, how principals choose to navigate their own sensemaking process has great implications for how teachers assign meaning to their own evaluations and ultimately how teachers are evaluated. As Spillane et al. (2002) write, “While teachers often encounter district and state accountability mechanisms through media reports, policy directives, and union newsletters (among other sources), their evolving perceptions and understanding of these policies are likely to be mediated through participation in their school community” (p. 732). This community includes, importantly, school principals.
Distinguishing between individuals who make sense of information, policies, and reforms individually and those who engage in collective sensemaking is an important nuance in understanding how policies look in practice. While undoubtedly some of the characteristics of these two groups of individuals overlap during the sensemaking process, such as drawing on their prior knowledge and experiences and current context, there is a distinction in how these groups of people think about policy implementation. Even within the sub-group of collective sensemaking, one can hypothesize that there is deliberate collective sensemaking, where a school principal collaborates to make sense of a policy, discuss how the policy will be implemented, etc., and informal collective sensemaking that is more along the lines of social context/network sensemaking. The latter could still be very individualistic and seems to fall more along the lines of an individual influence like cognitive schema. For example, if a principal has seven teacher friends, then individual discussions with these seven teacher friends about teacher evaluation policy have the potential to influence how this principal thinks about the policy. This is very different from getting together seven teachers in an organization and collectively making sense of the policy. In short, how a principal engages in sensemaking likely influences how policies and systems play out in practice.

The Usefulness of Sensemaking

Sensemaking theory is a useful approach to study questions of how principals implement teacher evaluation policies for several reasons. First, Weick (1995) describes sensemaking as distinct from other approaches, such as social action theory, instructional leadership theory, principal-agent theory, and organizational/institutional theory, because sensemaking is most useful when looking at a sustained activity or an ongoing process (p. 13).
Principals are constantly participating in the “sustained activity” of evaluating teachers throughout the school year. They do so both formally, through their district’s teacher evaluation process, and informally, through conversations, walkthroughs, and other ways they collect data on teachers. This process, and their sensemaking of it, is ongoing and likely changes as they become more familiar with teacher evaluation policies and gain experience conducting evaluations. Consequently, the way a principal approaches his or her first or second teacher observation will likely differ from his or her fifteenth or sixteenth teacher observation. Observations of the same teacher at different points of the school year may also vary. For example, a teacher will likely experience a different evaluation process from his or her principal in September when compared to the process that ensues in May, as principals are likely to understand these policies better, or at least differently, as the year progresses. Additionally, Weick (1995) argues there is a strong reflexive component of sensemaking that is particularly useful as people are navigating their way through a process, making sense of this process, and then updating their sensemaking as they make further sense of the ongoing process (p. 15). This is useful in asking questions about how principals evaluate teachers, because principals likely do not completely understand new policies the first time they come in contact with them. Instead, principals gain additional, new, or different knowledge as they become more familiar with these policies, and these newfound insights impact their sensemaking process. The ongoing nature of teacher evaluations and how principals make sense of these policies fits squarely into the sensemaking theory framework.
As the roles and responsibilities of principals continue to change as teacher evaluation policies evolve, sensemaking theory serves as a useful framework to look at the ever-changing expectations of principals and how principals interpret their new roles. For example, because teacher evaluations are high-stakes, what are the consequences for teachers as principals learn to use these systems? How much space is there in these policies for principals’ learning, and how does this impact teachers who are evaluated by principals early in the school year or the first time the principal uses this system, compared to teachers who are evaluated by the same principal after he or she has gained experience using these systems? Sensemaking theory is also a useful approach to study how outside interventions change an existing model (Halverson et al., 2004; Weick, 1995). Research suggests one important factor in principal interpretation of new teacher evaluation policies is how and what principals understand from prior teacher evaluation policies (Halverson et al., 2004). Therefore, as new teacher evaluation policies and evaluative requirements permeate the walls of schools, sensemaking theory provides a useful lens to study how principals interact with the changes. As opposed to other theories that look more closely at the impact that an entirely new policy has on an organization, sensemaking theory helps explore what is likely to occur when new iterations of an existing policy enter a system of practice. Almost all school principals were previously evaluating teachers, and as a result they already have an idea of what teacher evaluation looks like. Therefore, as new policies aim to change systems of teacher evaluation, sensemaking theory will serve as a useful approach to better understand how changes to previously existing policies impact how these policies play out in different contexts.
Finally, sensemaking theory is a particularly useful approach when looking at texts, written language, and artifacts (Weick, 1995). The study most closely aligned to principals’ sensemaking of teacher evaluations comes from Halverson et al. (2004), who found that the potential effectiveness and usefulness of an educational artifact, such as a new teacher evaluation system, is dependent on how principals filter their understandings through pre-existing knowledge and structures (p. 38). As Halverson et al. (2004) write:

Affordances are an actor’s perception of the ways the artifact can be used in practice. The actual use of a complex artifact, such as a teacher evaluation policy, depends not only on the features built into the design of the artifact, but also on affordances of artifact use perceived by actors (p. 6).

These affordances, such as whether principals look for student engagement, classroom management, or strong lesson delivery, are likely to be widely different depending on the person who is interacting with the artifact (Halverson et al., 2004). Therefore, a sensemaking approach is uniquely suited to examine how various individuals perceive the same document and how these documents play out in different contexts. This is particularly useful when studying how principals use teacher evaluations, as most recent teacher evaluation reform has focused on creating a less subjective, more standardized way to evaluate teachers, and sensemaking theorists argue this is unlikely to occur. In sum, sensemaking theory provides a unique lens to study how principals interpret and implement teacher evaluation policies. Past studies confirm principals have an important place in policy implementation and that how principals make sense of policies impacts not only their implementation but also all people with whom these policies come in contact.
Although a growing body of literature has provided data on many important questions around this topic, there is a lack of scholarship documenting how principals with specific cognitive schemas make sense of new and evolving teacher evaluation policies.

Chapter 4: Research Design and Methodology

The purpose of this chapter is to describe the research design of this dissertation and provide rationale for a case study method. Additionally, I describe the context of this study and provide details and rationale explaining why Michigan is a timely state in which to examine how principals make sense of and implement teacher evaluation policies and systems. I then introduce the participants of this study and explain my sampling strategy, my data collection methods and sources, and my approach to data analysis. The chapter concludes by describing how I established validity for the results of this work.

Research Design and Research Questions

The goal of this dissertation is to better understand how principals’ experience and external pressure impact how principals implement evolving teacher evaluation policies and systems. Specifically, this study answers the following: (1) How do principals’ cognitive schemas (i.e., highly developed background knowledge due to experience) influence how they come to understand and implement teacher evaluation policies and systems?; (2) What role does external context (i.e., high-pressure vs. low-pressure environments) play in shaping principal learning and enactment of teacher evaluation policies and systems?; and (3) In what ways, if any, do principals’ experience and external pressure interact during the implementation process? To assist in answering these questions, I relied on decades of policy implementation research, specifically focusing on teacher evaluation policy implementation. Grounding my analysis in previous research helped me construct an analytic framework.
The data analysis that follows assisted me in describing and explaining the data collected for this study.

Rationale for a Case Study

This dissertation took on the design of a case study, as case studies have proven to be a good design for understanding how multiple variables interact in an environment (Derrington, 2013; Halverson & Clifford, 2006; Miles, Huberman, & Saldana, 2014). For example, variables such as environmental context, principal and teacher knowledge and skills, local, state, and federal accountability measures, and pressures from the district office or parents all come together and interact within the school environment. These variables create a complex environment in which a case study research approach has the potential to clarify the impact of these variables on organizational processes and policy implementation. Additionally, a qualitative research design is best suited to help answer these research questions because good qualitative research does the following: (1) takes place in natural settings in an attempt to make sense of or interpret a phenomenon (Denzin & Lincoln, 2003); (2) is grounded in the lived experiences of people (Marshall & Rossman, 1999); and (3) asks questions about how one variable interacts with another variable and why these variables act the way they do (Maxwell, 2005). Finally, according to Yin (2013), case studies are a preferred approach to answering “how” and “why” questions regarding a particular phenomenon. In its most general terms, a case study analysis takes on the form of in-depth data collection in an effort to compare a similar phenomenon across different contexts (Patton, 2014). The goal of case study research is to collect comprehensive, systematic, and in-depth information about each case of interest (Patton, 2014).
Three major types of case studies are commonly used for social science research: (1) exploratory case studies (used to help the researcher develop an idea or project); (2) descriptive case studies (used to help the researcher describe causal relationships within a phenomenon); and (3) explanatory case studies (used to help the researcher understand what influences behavior in a case) (Berg, 2007). This study takes on the design of an explanatory multi-case study in an effort to provide answers to my research questions and in an attempt to better understand how and why certain individual traits and characteristics affect policy and system implementation. Research should meet three conditions for a reliable explanatory case study: (1) the research must seek to explain how or why a phenomenon occurs; (2) the research must examine a contemporary phenomenon; and (3) the researcher(s) must have no control over the phenomenon (Yin, 2013). My study meets each of these conditions. The individual cases in this research represent K-8 school principals throughout the state of Michigan tasked with implementing teacher evaluation policies and systems. The principals in their specific contexts and their cognitive schemas bound the larger multi-case study, and principals’ thinking about and enactment of teacher evaluation systems and policies is the primary unit of analysis of this work. Specifically, I began by developing a theory of what factors might influence how school principals make sense of and ultimately implement teacher evaluation policies and systems. From this theory, I selected individual cases that fit the criteria of the theory (more information on my sampling appears later in this chapter). After designing all data collection protocols, I conducted the individual case studies and then wrote individual case study reports. Finally, I analyzed each of these individual reports.
Study Context: Educator Evaluations in Michigan

Michigan’s effort at reforming teacher evaluation laws throughout the state began in 2009 when the state first applied for Race to the Top (RTTT) funding. In an effort to make its application more competitive, Michigan began to make changes encouraged by RTTT, including making student growth a significant part of a teacher’s evaluation (Keesler & Howe, 2015). Michigan did not receive RTTT funding in 2009 or again when it applied in 2010; however, the legislation that passed set in motion new teacher evaluation systems. In 2010 Michigan did receive an NCLB waiver and, as a condition of receiving this waiver, the state was required to rework its teacher evaluation system (Keesler & Howe, 2015). Although the 2010 teacher evaluation legislation had certain expectations of districts, such as making student growth a significant part of teachers’ evaluations, the legislation still gave individual districts a lot of autonomy when determining how to evaluate teachers in their district. A larger shift occurred in Michigan’s teacher evaluation landscape in 2011, when new legislation increased the probationary period of beginning teachers from four years to five years and provided that an untenured teacher, if rated effective or highly effective, could not be removed from his or her current teaching placement solely based on seniority (Michigan Department of Education, 2015). These changes aimed to improve the teacher workforce throughout the state by keeping the best teachers in classrooms. Additionally, the legislation said the state would put in place a teacher evaluation system beginning in the 2013-14 school year. In order to assist in creating a statewide system, the Governor of Michigan created the Michigan Council for Educator Effectiveness (MCEE).
MCEE consisted of educational researchers, educational experts, school principals, and members of the Michigan Department of Education (MDE) charged with developing a fair, rigorous, and transparent state-wide system for evaluating teachers and administrators. Together, these educational experts spent 18 months reviewing the most recent research from across the country and the globe regarding the most effective and fair way to evaluate teachers. In July of 2013, MCEE released its final proposal to overhaul Michigan’s teacher evaluation system. Based on these recommendations, House Bills 5223 and 5224 were written and originally scheduled to go into effect during the 2013-14 school year. However, despite initial bipartisan support, HB5223 and HB5224 stalled for more than two years. The House and Senate could not reconcile several areas of contention, including what percentage of a teacher’s evaluation should be tied to student test scores and what tests should be used to determine student achievement. As the legislation continued to stall, disagreements over teacher evaluation rubrics surfaced, with the Senate suggesting the type of rubric used to evaluate teachers should be a decision made by Local Education Agencies (LEAs), rather than limiting the rubric to one of the four originally recommended by MCEE.

Table 4.1. Timeline of Educator Evaluation Changes in Michigan Since 2009

Year | Event | Brief Summary
2009-10 | RTTT Applications; NCLB Waiver | Michigan applied for but did not receive RTTT funding in 2009 and 2010. The state received an NCLB waiver in 2010.
2011 | Public Act 101 | Revised teacher tenure laws, established new requirements for teacher evaluations, new limits on collective bargaining.
2011 | Public Act 102 | Establishes MCEE to reform Michigan’s educator evaluation system.
2013 | MCEE submits final recommendations | MCEE recommends four teacher evaluation frameworks, using student assessment data in teacher evaluations.
2013-14 | HB5223/5224 | Bills drafted based on MCEE recommendations. Stalled in legislation until 2015.
2015 | Senate Bill 103 | New requirements for teacher evaluations, including 40% use of student assessment data by 2018-19.

During this time of stalled legislation, Michigan continued to have an ineffective system of distinguishing between levels of teacher effectiveness. For example, since the reform of tenure laws in 2011, of the almost 100,000 teachers in the state, only 19 were dismissed due to poor evaluations (Michigan Department of Education, 2015). Additionally, teachers in Michigan continued to be overwhelmingly rated effective or highly effective; 97% of teachers in the state met these criteria (Michigan Department of Education, 2015). In 2015, Senate Bill 103, a new attempt at teacher evaluation reform, was proposed and passed the Senate. After some changes, the House agreed to approve SB103, and more than four years after Michigan began the process of overhauling the state’s teacher evaluation policies, SB103 passed, changing the evaluation measures of teachers and administrators. Beginning in 2018-19, 40% of a teacher’s evaluation will be based on student achievement data. Additionally, in most circumstances, multiple observations of teachers will occur annually. These changes are consistent with what many educational researchers and experts consider “smart” teacher evaluation policy. For example, these groups agree that using multiple measures to evaluate teachers, such as observations and student assessment data, weighting these measures evenly (i.e., 50% student assessment data and 50% observational data), and observing teachers multiple times is the best way to evaluate teachers (Darling-Hammond, 2012; MET Project, 2013).
Many policymakers and educational leaders believe these new evaluation laws have the potential to improve student achievement in the state by providing current teachers with effective feedback to help them improve their practice and by identifying and keeping effective teachers in the workforce. However, critics argue the large amount of discretion given to LEAs calls into question how these individual entities will implement these new policies. These critics argue that while this new legislation guides and advises school districts on best practices, it lacks the legislative authority to truly impact how districts evaluate teachers. Given the tumultuous nature of Michigan’s teacher evaluation policy reform effort, Michigan is a timely case to study. At the time the data in this study were collected, the participants were learning new teacher evaluation systems, and as a result their sensemaking will help shed light on how principals, in general, might be navigating these new, complex systems. This study has broader implications as other states continue to rework teacher and principal accountability systems in an effort to meet criteria set forth in the Elementary and Secondary Education Act (ESEA) and its subsequent revisions. In many states principals have had to negotiate and navigate impending teacher evaluation policy changes, changing evaluation rubrics, student growth models, and other evaluation logistics. It is plausible that principals in other states are experiencing similar teacher evaluation reforms and may have similar thoughts and beliefs as principals in Michigan.

Participants and Sampling Strategy

For this dissertation I targeted 12 public elementary school principals and 12 public school teachers.
I targeted three principals who have minimal experience and face high outside pressure, three principals who have extensive experience and face high outside pressure, three principals who have minimal experience and face low outside pressure, and three principals who have extensive experience and face low outside pressure (see Table 4.2). After securing these principals, each principal asked for a volunteer teacher whom we could observe during the evaluation process and whom I could interview near the end of the data collection. I selected these 12 participants using criteria-based sampling. The 12 participants met my criteria of experience and current context.

Table 4.2. Principal Participant Sample

Principal                         1   2   3   4   5   6   7   8   9   10  11  12
High Experience / Low Pressure    X   X   X
High Experience / High Pressure               X   X   X
Low Experience / Low Pressure                             X   X   X
Low Experience / High Pressure                                        X   X   X

According to Marshall and Rossman (2006), when finding participants for a qualitative study it is important to consider: (1) if entry is possible; (2) if there is a high probability that a rich mix of the processes, people, programs, interactions, and structures of interest is present; (3) if the researcher is likely to be able to build trusting relations with the participants in the study; (4) if the study can be conducted and reported ethically; and (5) if data quality and credibility of the study can be reasonably assured (p. 62). After completing a list of my ideal principal criteria, I began to reach out to principals who met the aforementioned experience and pressure criteria. I solicited principal participation by phone and email. In the end, I was able to achieve my goal of 12 principals, three from each of the aforementioned categories (see Table 4.3 for complete participant background information). I designed this sampling scheme to capture variation among the principals in each category.
Although this type of embedded design was not able to capture all important variables in each context, the design was useful to provide insights into the different perspectives offered by these principals (McLaughlin & Talbert, 2001). The goal of this type of sampling is not to make generalizable statements about all principals with similar characteristics, but instead to begin hypothesis and theory building about principals with these types of characteristics and how these characteristics may impact policy implementation.

Table 4.3. Principal Background Information (TPS or Charter, Principal Experience)

Principal      TPS/Charter  School Rating*  Years as Principal  Years at Current School  Years as Teacher  Level of Education
Mr. Bania      TPS          Yellow          10                  10                       6-10              M.A.
Ms. Goldstein  TPS          Yellow          10                  4                        6-10              M.A.
Dr. Wexler     Charter      Lime            9                   3                        6-10              Ed.D
Mr. Bookman    TPS          Red             10                  1                        6-10              M.A.
Ms. Hamilton   TPS          Red             10+                 3                        6-10              M.A.
Ms. Cohen      Charter      Red             10+                 3                        6-10              M.A.
Mr. Jarmel     TPS          Red             1                   1                        6-10              M.A.
Ms. Robbins    TPS          Red             3                   3                        10+               M.A.
Mr. Ramon      TPS          Red             3                   1                        10+               M.A.
Ms. Steinman   TPS          Yellow          1                   1                        10+               M.A.
Ms. Chang      TPS          Lime            4                   4                        10+               M.A.
Mr. Sherman    Charter      Lime            4                   4                        1-5               M.A.

*Michigan’s 2014 Accountability Report Card Ratings: Green (highest), 85% or greater of possible points; Lime, between 70-84% of possible points; Yellow, 60-69% of possible points; Orange, 50-59% of possible points; and Red (lowest), less than 50% of possible points.

Data Collection

According to Yin (2013), case studies typically draw information from sources including interviews, direct observations, participant observations, documentation, archival records, and artifacts.
In this study I rely on four sources of information: (1) principal questionnaires; (2) interviews with principals and teachers; (3) observations of principals conducting evaluations of teacher instruction and observations of pre- and post-evaluation conferences with principals and teachers; and (4) artifacts, including district teacher evaluation policies, teacher evaluation observation rubrics, principal observation notes of teacher instruction, and final teacher evaluation ratings. My research questions were best addressed by these types of data, and this type of data collection is consistent with other work in the field that has tried to better understand how school leaders make sense of and implement school policies (Coburn, 2005; Derrington, 2013; Koyama, 2014; Rigby, 2015). Additionally, collecting these types of data allowed me to validate my data and findings through data triangulation by showing that diverse data collection methods confirm the findings (Miles et al., 2014). In this study my goal was to learn how principals in different environments with different experience levels make sense of and implement teacher evaluation policies and to generate hypotheses about how and why these types of characteristics affect policy implementation. The coding for this study came from the four sources mentioned above. I administered a questionnaire to all principal participants at the beginning of data collection. (I began data collection in February of 2016 and collected the final data in January of 2017.) The first part of the questionnaire asked participants about their relevant work experience, years serving as a principal (and teacher), level of education, and how long they had served as a principal in their current school, along with additional school-context questions. The second part of the questionnaire asked principals a variety of questions about their school’s teacher evaluation system and policy.
I used the questionnaire as a screening process to get background data on the principals to ensure the principals met the aforementioned criteria. Additionally, I used the questionnaire to generate some of the interview questions. Finally, I used data source triangulation, specifically the questionnaire, interviews, and observations, to strengthen the validity of my findings. For example, I compared and contrasted the answers principals gave on part two of the questionnaire to the answers they gave to me during interviews and to what I observed the principals doing in practice. A complete version of the questionnaire is found in Appendix A. I interviewed the principals in this study three times each between February of 2016 and January of 2017. I conducted the interviews in one-on-one settings and focused on the principals’ experiences using and perceptions of teacher evaluation policies. I audio-recorded all interviews and took notes during the conversations. Each interview lasted between 30 and 60 minutes. The interviews took place at three points during the data collection: once at the beginning, again during the middle, and then near the end. I conducted the interviews in an effort to triangulate the data sources and strengthen the findings of this work. The purpose of the three interviews was to examine how principals make sense of teacher evaluation policies. The first interview focused on principals’ understanding and knowledge of the design of their current teacher evaluation system and principals’ beliefs about these systems. The second interview focused on principals’ experience implementing these systems. The final interview focused on reflecting on the observation of the teacher we co-observed. The full principal interview protocols are found in Appendices B, C, and D. I conducted teacher interviews (with available teachers) near the end of data collection in the spring, summer, and fall of 2016.
I conducted the interviews in a one-on-one setting and focused on the teachers’ experiences with and perceptions of teacher evaluation policies. Each interview lasted between 30 and 45 minutes. I audio-recorded the interviews and took notes during the conversations. I conducted the interviews in an effort to get the teachers’ perspective on their experiences with teacher evaluations, including how teachers perceive their principals’ implementation of teacher evaluations and how their work is impacted by the teacher evaluation process. The semi-structured interviews focused on three main areas: (1) teachers’ understandings and knowledge of the design of their current teacher evaluation system; (2) how they feel their principal implemented this system; and (3) how their practice is impacted by these evaluations. A complete teacher interview protocol is found in Appendix E. I observed each principal conducting a teacher observation that was used for a teacher’s final evaluation score. I collected these observations in the spring of 2016 and fall of 2016, as principals conducted official evaluations of their teachers. Each observation lasted between 30 and 60 minutes. Additionally, when available, I observed principals at the required teacher evaluation pre- and post-conferences. As I observed, I took field notes, completing them immediately following each observation to ensure accuracy. I shared the notes I took with both the principal and teacher to ensure I accurately represented their thinking and conversations during each observation. The purpose of the observations was to better understand how principals observe teachers in practice and how principals and teachers communicate about the evaluation and evaluation process. Additionally, as was previously mentioned, it is important in qualitative work to observe people in their natural environments (Yin, 2013). A complete field notes template for the observation is found in Appendix F.
I collected district- and school-based teacher evaluation documents as provided by principals. These documents included district-wide and/or school-specific teacher evaluation policies, observation and conference protocols, and other documents principals used while conducting teacher evaluations. Additionally, if given permission by principals and teachers, I collected final teacher evaluation scores and principal observation notes (see Table 4.4 for full data collection details). The purpose of collecting these documents was to better understand what principals were asked to do by their district and school and to understand how principals were making sense of what they were being asked to do. Finally, the purpose of collecting these documents was to better understand what principals would be looking for during teacher observations and to better understand the type of feedback principals were giving teachers, including how principals actually rated individual teachers. I collected this information in an effort to see what these principals noticed during teacher observations and if and when this information was addressed during teacher evaluation post-conferences and in principal feedback to teachers.

Table 4.4. Principal Data Collected

Principal Mr. Bania Ms. Goldstein Dr. Wexler Mr. Bookman Ms. Hamilton Ms. Cohen Mr. Jarmel Ms. Robbins Mr. Ramon Ms. Steinman Ms. Chang Mr. Sherman Quest. X X X X X X X X X X X X Interview #1 X X X X X X X X X X X X Interview #2 X X X X X X X X X X X X Interview #3 X X N/A X X X X X X X X X Observe X X N/A X X X X X X X X X Post- Teacher Conf Interview N/A X X X N/A N/A X X X X X X X X X X X N/A N/A X X N/A X X

Data Analysis

I used the questionnaire as a screening process to get background data on the principals to ensure the principals met the aforementioned criteria and to generate some of the interview questions used in this study (Miles et al., 2014).
Using Atlas.ti software, I first analyzed all of the questionnaires, comparing participant responses to school-based questions about the participants’ beliefs, thoughts, and knowledge of their current teacher evaluation system. Additionally, I coded the background data gathered from the participants in an effort to look for themes, commonalities, and differences between participants with similar and different characteristics. The characteristics I coded include: 1) the age of the participant; 2) the participant’s level of education; 3) years of experience as a principal; 4) years of experience as a principal at their current school; and 5) the number of years each participant spent as a classroom teacher prior to becoming a principal. After coding all of the data from the questionnaire, I coded individual participant interviews. I waited until I had collected fifty percent of the interviews to begin coding these data. Then I randomly selected three of these interviews to begin the coding process. These interviews were Mr. Sherman Interview #2, Dr. Wexler Interview #1, and Ms. Hamilton Interview #1. I began the coding process by looking for overarching themes within the data. In qualitative research, themes are more general terms, phrases, or sentences which encapsulate larger groups of more specific codes (Miles et al., 2014). Once I documented these themes, I began generating specific codes, which relate to these overarching themes but are more specific data points and generally include the language of the participants (Miles et al., 2014). I developed the codes inductively, and as themes emerged from the coding process I grouped the codes together by theme (Miles et al., 2014). After developing these codes, I coded each of these interviews a second time, noting any discrepancies. Once I developed the initial codes, I began coding the additional interviews and added these codes to my code book.
The initial themes that emerged were: 1) communication; 2) data use; 3) principal and teacher prior knowledge and experience; 4) relationships; and 5) the teacher evaluation system/policy. From these themes I developed a larger codebook. For example, under the theme of communication, specific codes included: 1) how principals communicate information about the teacher evaluation process; 2) how principals give feedback/scores to teachers; 3) how principals and teachers address discrepancies/disagreements during the evaluation process; and 4) how principals communicate new/changing teacher evaluation policies/systems to their staff. I coded all of these data using Atlas.ti software to analyze and interpret patterns, trends, commonalities, and links among the participants (Miles et al., 2014). I then reviewed all codes, looking for common excerpts that highlighted similar themes and ideas. I then checked the validity of the coding process by recoding the data a second time. I noted any discrepancies and addressed them in order to refine and justify assertions and to look for possible alternative interpretations of the data (Guba & Lincoln, 1994; Miles et al., 2014). After I completed the coding process, I compared quotations to the original interview text, making sure these data were taken in context and accurately represented what the participants attempted to articulate. I completed the same process of coding for all of the observations collected during data collection. Specifically, I randomly sampled three observations to begin the process of developing codes. The observations I selected were: 1) Ms. Cohen’s post-conference; 2) Ms. Goldstein’s pre-conference observation; and 3) Mr. Bania’s post-conference observation. Finally, I coded all other documents and data, including principal observation notes, principal final evaluation ratings of teachers, and district documents (e.g., observation rubrics), using Atlas.ti software.
I analyzed these documents individually, looking for patterns, trends, commonalities, and links among the participants (Miles et al., 2014). After completing the coding process outlined above for all collected data, I ran frequency checks using Atlas.ti software to further ensure the themes and codes I developed accurately represented the overall tone, scope, and information presented in the data. Additionally, after individually coding all interview data, I trained a colleague in how I coded these data, including providing my colleague my code book and explaining my thinking about how I coded these data. I then provided a random sample of one interview to this colleague, who coded the interview on her own. I then compared my colleague’s coding to my own to look for discrepancies and instances where we coded some or all of the data differently. In the end, we had 81 percent agreement on this sample. I then provided this colleague one additional interview and one field notes observation from my sample of data. Again, we compared the results of my colleague’s coding to my own and had 82 percent agreement after coding each of these two documents. Finally, I provided my colleague an additional 10 principal interviews, one teacher interview, and three field notes observations. Upon completion, we compared all of my original coding and my colleague’s coding. In the end, we had 80 percent agreement on all of the coding.

Establishing Validity

According to Miles et al. (2014), the first step to establishing validity is to thoroughly prepare for the research. The researcher should have some familiarity with the setting and phenomena under study, strong conceptual interest, a multidisciplinary approach, and good investigative skills (Miles et al., 2014). I have five years’ worth of experience working in public schools, first as a teacher and then on the administrative side as an instructional coach.
Additionally, my coursework at Michigan State University, specifically my work in EAD991A (Teachers and Teaching in an Era of High-Stakes Testing) and TE931 (Introduction to Qualitative Research Methods), has helped me hone my qualitative research skills and prepared me for work on this important topic. Finally, my practicum, which focused on how principals make sense of teacher evaluation policies, helped prepare me for the role of the researcher. For my practicum, I developed skills in the following areas: (1) developing interview protocols; (2) interviewing participants; (3) coding qualitative research data; and (4) analyzing qualitative data and writing a complete academic paper.

Maxwell (2005) defines validity as “the correctness or credibility of a description, conclusion, explanation, interpretation, or other sort of account” (p. 106). In qualitative research there are several threats to validity, including: (1) researcher bias; (2) reactivity; and (3) manipulation of the data (Maxwell, 2005; Miles et al., 2014). Researcher bias has the potential to influence the data the researcher identifies as important and/or the conclusions drawn from the data. Reactivity can take place when the simple act of conducting research changes the behavior of the participants in the study. Finally, data manipulation may occur when the researcher tries to find data that fit his or her existing theory or hypothesis. To combat these potential threats to validity, the researcher must thoroughly explore and explain his or her biases and how these biases will be dealt with throughout the duration of the study (Maxwell, 2005). I addressed my potential biases by ensuring all participants were allowed to read transcripts of recorded information and notes and were afforded an opportunity to address any discrepancies they felt did not accurately portray what they were trying to say or do.
Specifically, to establish validity for all interview and observation data, I left room to ask participants about any comments they made, making sure I clarified their statements before drawing any conclusions. Additionally, I contacted all participants to clarify any questions that arose during the transcribing and coding of the data. I also solicited critical feedback from colleagues throughout the data collection and writing process. Additionally, I constantly acknowledged how my past experiences may have impacted data collection and writing, and I made every effort to remain neutral by asking non-leading questions, asking for clarifying comments, and collecting and using the data completely and in context. Finally, Lincoln (1995) identified eight standards for evaluating the quality of qualitative research: (1) standards set in the inquiry community; (2) positionality; (3) community; (4) participant voice; (5) critical subjectivity; (6) reciprocity; (7) respect; and (8) sharing privileges. Throughout the research design, data collection and analysis, and writing, I attempted to meet each of these eight standards of quality by reviewing similar studies and dissertations that used a similar research design approach. I reviewed these studies early in the dissertation design process, dating back to the drafting of my dissertation proposal. Throughout this entire process I referred back to these studies in an effort to make sound methodological decisions and research design choices. Given the sensitive nature of observing teachers while they were being observed and learning about their teacher evaluation scores in the post-conference, I made efforts to acknowledge my role in these environments. I acted simply as an observer and used the final interviews with both the teacher and the principal as a chance to answer follow-up questions.

Limitations

There are two main limitations to the design and methodology of this study.
First, this study is limited by the participants. The participants in this study were not randomly selected. Additionally, the number of participants does not allow me to make generalizable statements about all principals. Had I collected data from 12 other principals, those principals could have provided different insights and thoughts, resulting in a different interpretation or analysis of the data. In this way, the principals in this study shape the findings by their experiences, thoughts, and beliefs. To address this participant limitation I used criteria-based sampling, soliciting principals from four different subcategories: (1) principals with high-levels of experience in high-pressure environments; (2) principals with high-levels of experience in low-pressure environments; (3) principals with low-levels of experience in high-pressure environments; and (4) principals with low-levels of experience in low-pressure environments. Although this type of sampling cannot account for all differences, the goal of this work is to begin to hypothesize about how principals with certain characteristics think about and enact teacher evaluation policies. Therefore, this sampling scheme was necessary and appropriate to answer my research questions. The second limitation is that, although principals were observed in their natural environment implementing their teacher evaluation policy, I did not observe each principal multiple times, with a variety of teachers, or during every interaction the principal had while attempting to implement the policy. In this way, my presence as a researcher during data collection may not have captured exactly how principals were conducting teacher evaluations in all circumstances. To account for this limitation, I spoke with teachers when available to see if what I observed was an accurate or consistent representation of how these principals navigated the process of teacher evaluations.
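The four-subcategory sampling frame described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the procedure used in the study: the experience cut-offs follow the definitions given in this dissertation (nine or more years as a principal for high experience, four or fewer for low experience), while the function names, the candidate list, and the assumption that each candidate already carries a "high" or "low" pressure label are hypothetical.

```python
from collections import defaultdict

# Experience cut-offs as defined in the study: nine or more years as a
# principal is "high" experience; four or fewer years is "low" experience.
HIGH_EXPERIENCE_MIN_YEARS = 9
LOW_EXPERIENCE_MAX_YEARS = 4


def experience_level(years_as_principal):
    """Return 'high', 'low', or None when a candidate falls between the cut-offs."""
    if years_as_principal >= HIGH_EXPERIENCE_MIN_YEARS:
        return "high"
    if years_as_principal <= LOW_EXPERIENCE_MAX_YEARS:
        return "low"
    return None


def build_sampling_frame(candidates):
    """Group candidates into the four experience-by-pressure subcategories.

    `candidates` is an iterable of (name, years_as_principal, pressure) tuples.
    The pressure label ('high' or 'low') is assumed to be assigned beforehand;
    how pressure is determined is outside the scope of this sketch.
    """
    frame = defaultdict(list)
    for name, years, pressure in candidates:
        level = experience_level(years)
        if level is None:
            continue  # meets neither experience criterion, so not sampled
        frame[(level, pressure)].append(name)
    return dict(frame)


# Hypothetical candidates for illustration only (not the study's participants).
candidates = [
    ("Principal A", 10, "high"),
    ("Principal B", 12, "low"),
    ("Principal C", 1, "high"),
    ("Principal D", 3, "low"),
    ("Principal E", 6, "low"),  # excluded: between the two cut-offs
]
frame = build_sampling_frame(candidates)
```

Keying each cell by an (experience, pressure) pair keeps the four subcategories explicit and makes it easy to see whether any cell is empty before recruitment proceeds.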
Despite the aforementioned limitations, the data collected in this study provide insight into how the principals in this study navigate teacher evaluation policy implementation. Although not generalizable to the entire principal community, the results and analysis of this work can serve as a basis for hypothesis building and testing in future research. As principals across the United States continue to make sense of evolving teacher evaluation policies, the results of this work have the potential to be explored with different principals in similar contexts in varying locations throughout the United States.

Chapter 5: How Principals’ Cognitive Schemas Impact Their Implementation of Teacher Evaluation Systems

“My prior knowledge as an administrator is huge. My first evaluations in my early years of being an administrator were probably based on a lot of feelings. As you grow as an administrator your feelings die.”
- Ms. Cohen (10 years of experience)

My research questions and my theoretical framework guide all of the findings in chapters five and six. This chapter answers my first research question: How do principals’ cognitive schemas influence their implementation of teacher evaluation systems? The overarching theme that weaves throughout my analysis of these findings is that principals’ cognitive schemas influence the type of sensemaking in which they engage, impacting how they think about the overall process and purpose of teacher evaluations. In addition to this overarching theme, four dominant subthemes emerged through my analysis of the data using the lens of cognition and specifically sensemaking theory. The first subtheme is that principals’ cognitive schemas influence the type of leadership in which they engage, which impacts how principals navigate the process of teacher evaluations. The second subtheme suggests principals’ cognitive schemas influence how they use previous teacher evaluation information during the teacher evaluation process.
The third subtheme suggests principals’ cognitive schemas guide individual principal perceptions and beliefs of the accuracy of their current teacher evaluation system (in terms of accurately capturing a teacher’s effectiveness). This finding impacts how principals use and at times take liberties with these systems. The final subtheme suggests principals’ cognitive schemas affect how principals think about using the results of teacher evaluation scores when hiring new teachers.

Overarching Theme: Individual vs. Collective Sensemaking

After an analysis of these data (again, principal questionnaires, interviews with the principals and teachers, and observations of principals conducting a teacher evaluation), I found that principals in this study with high-levels of experience (nine or more years as a principal) engaged in “individual sensemaking,” while principals with low-levels of experience (four or fewer years as a principal) engaged in “collective sensemaking.” To review, individual sensemaking is a type of sensemaking generated from an individual’s thoughts, experiences, and beliefs. Collective sensemaking occurs among multiple people who are attempting to make sense of a similar or the same situation (in this case, the teacher evaluation process). My findings suggest a principal’s cognitive schema influences how he or she comes to understand and implement teacher evaluation policies based on the amount of experience he or she has as an administrator. As was mentioned previously, research shows experience impacts how principals think about and ultimately implement policies and reforms (Coburn, 2001; Seashore Louis et al., 2010). The findings from this study support these earlier findings, while providing more nuanced reasons as to how and why principal experience matters for teacher evaluation policy implementation.
The four subthemes that follow further explain how principal experience influenced how these principals made sense of teacher evaluation policy implementation as well as provide nuanced information as to the thoughts, actions, and beliefs of these principals.

Subtheme One: Principal Leadership

All of the principals in this study said their leadership style impacted how they thought about and implemented their school’s teacher evaluation policy. However, as evidenced by talking with and observing these principals, the leadership style of principals varied. Principals with high-levels of experience primarily engaged in situational leadership. Social science researchers define situational leadership as a leadership style which requires a rational, adept individual who understands situations and responds and reacts to situations in ways that are beneficial to the organization (Grint, 2011; McCleskey, 2014). Situational leaders vary the amount of support, direction, and goals they provide individuals, understanding that individuals need different levels of support when attempting to accomplish a task (Gates, Blanchard, & Hersey, 1976). When describing their leadership style, some of the principals with high-levels of experience in this study said specifically they were situational leaders, while others demonstrated the characteristics of this leadership style in both their thoughts and actions. For example, Ms. Cohen (10 years of experience) said:

Situational leadership is what I subscribe to. I think it falls in line actually as an educator too. I think every person is a unique individual. They need to be treated that way. Every teacher needs something different. I have one teacher down the hallway. She’s only been teaching like three years. She’s amazing. I can say, “I’d like to look at your student data.” She’s like, “Okay.” She comes back with this spreadsheet and it’s all tallied and averaged at the bottom and red and color coded.
I have another teacher, “I want to look at your data.” She brings me this binder with tests in it. Each one needs something different.

When asked how her leadership style impacts how she evaluates the teachers in her building, Ms. Cohen said the way that she conducts observations of teachers and directs conferences with her teachers varies based on the individual. For example, Ms. Cohen explained she knew the strengths and weaknesses of the teachers in her building and therefore she was able to tailor their evaluations situationally. She referenced one teacher who she knew struggled with reading instruction in particular and because she knew of this struggle, Ms. Cohen focused her attention on reading instruction during the observation process. The teacher admittedly thought her reading instruction needed support and while both Ms. Cohen and this teacher believed the teacher had strengths in other areas, reading instruction was an area of concern. Because this teacher struggled with reading instruction Ms. Cohen made sure to observe this teacher during a reading lesson; she provided feedback and support based solely on the teacher’s reading instruction; and she focused her conversations with this teacher almost entirely on how the teacher could improve as a reading instructor. In another example, Ms. Cohen said the teacher that she and I observed during her official evaluation struggled in using student data to make instructional decisions. As a result, the examination of student data was the focus of her observation. Prior to the observation, Ms. Cohen looked at this teacher’s lesson plan to see if she provided evidence of how she used data to inform the lesson. During the observation, Ms. Cohen looked around the teacher’s classroom to see whether student data were on display.
During this teacher’s official teacher evaluation post-conference, aside from discussing the teacher’s final evaluation rating, the conversation focused exclusively on how this teacher was using student assessment data to guide her instruction and promote student growth. Ms. Cohen led all of her teacher evaluations this way, situationally evaluating teachers based on her personal understanding of where she believed the teachers needed support to become better educators. Mr. Bookman (10 years of experience) also described his leadership style as situational, particularly when evaluating his teaching staff. He said:

I would describe my leadership style, honestly, as situational. Especially if I’m going to observe and then instruct a teacher, be an instructional coach. I’ll be honest with you, it makes some people uneasy because it’s like there’s this element of unpredictability, and they want to be able to predict what I’m doing all the time, but I can’t even tell them what I might do in a situation because it’s like, “Well, tell me more about the situation.” I haven’t found anything that’s black or white in education in my experiences, and I’ve dealt with them all. I’ve dealt with the parents. I’ve dealt with the students. I’ve dealt with the teachers. There doesn’t ever seem to be anything that’s black or white.

Mr. Bookman went on to explain he did not believe in a “one size fits all” leadership approach. This philosophy impacted how he approached evaluating the teachers in his building as he believed that all teachers needed something different. For example, Mr. Bookman described one of his teachers as having a particularly challenging class in terms of student behavior. Mr. Bookman said he factored in the challenging nature of this teacher’s class by increasing this teacher’s score on the professionalism part of the evaluation (although doing so was technically not permitted in his evaluation system). Mr.
Bookman constantly referenced how he situationally evaluated teachers throughout the entire year and did not rely only on the official teacher evaluation when assigning final teacher evaluation ratings. For example, after our co-observation of one teacher, Mr. Bookman was quick to note that the lesson we observed was not an accurate reflection of the quality of the teacher. Mr. Bookman addressed what he deemed a subpar lesson with the teacher in the post-conference, but ultimately rated this teacher highly effective, because he knew this one lesson observation, although technically the official observation used for evaluative purposes, was not a true reflection of this teacher’s performance. Ms. Cohen and Mr. Bookman provide two illustrative examples of leadership that are representative of the majority of principals in this study with high-levels of experience. Ms. Goldstein (more than 10 years of experience) said, “My leadership style changes daily based on what’s happening in the building. At the end of the day, I have to make the decision of what’s best for our students and our staff.” This sentiment further highlights the finding that principals with high-levels of experience led situationally, which impacted how they evaluated their teachers. Ms. Goldstein went on to explain that her district had trained all principals to evaluate their teachers in a very structured way, in an effort to make sure all teachers were receiving consistent evaluations throughout the district. However, because she thought about each teacher situationally, Ms. Goldstein found evaluating all teachers the same way a difficult task to accomplish. Ms. Goldstein said in her building teacher evaluations might look different for individual teachers and in her mind this variation was fine, because all teachers need something different.
In contrast to their more veteran peers who engaged in situational leadership, principals with low-levels of experience overwhelmingly described their leadership style as relational. Briefly defined, relational leadership “expresses the degree to which a leader shows concern and respect for their followers, looks out for their welfare, and expresses appreciation and support” (Bass, 1990a, 1990b). The characteristics of relational leaders include working with their followers in an effort to achieve a goal and valuing the input and emotional needs of their followers. In this study, principals with low-levels of experience exhibited the characteristics of this definition. Because these principals were new or relatively new in their current role, they all wanted to make sure they established and improved relationships in their building. Wanting to secure positive relationships with their staff impacted how these principals thought about, constructed meaning around, and ultimately implemented teacher evaluation policies. Ms. Steinman (one year of experience) was representative of the larger group of principals with low-levels of experience. She said:

I am very relational in my leadership style. I have tried to consciously be more relational with them (teachers) in the non-evaluation sense because I don’t want them to feel like I’m picking on them or targeting them (when performing their official evaluation). That just might be my own insecurity. I might get over that later. It is still just really hard for me to feel like I am giving someone a bad score and then not having a relationship with them. Relationships allow you to have those tough conversations, and if you don’t have that relationship, that tough conversation can’t occur.

When asked if her leadership style impacted how she thought about teacher evaluations Ms. Steinman said, “Yes, because evaluating a teacher does put a strain on your relationship. Especially the relational part.
There is a fine balance there. I really want them to be able to just have their own voice and speak their own truth.” Ms. Steinman went on to say that she and her teachers co-developed how teacher evaluations would occur during the school year. Although Ms. Steinman was quick to point out that there were many logistical things that she could not change (such as the documents she needed to provide teachers and the time she spent observing teachers), she thought working with teachers to construct an understanding of how to best use their district’s teacher evaluation system would be a good approach and especially an effective approach when trying to secure positive relationships. Ms. Steinman went on to explain she has seen benefits of her relational leadership style, and, in her opinion, it was important for her staff to see she was working with them, especially on something as important as their evaluation. She said:

I did a survey with the staff so I could get some feedback also. I think that I was reaffirmed in the idea that relationships are strong because almost all of them commented about how much they appreciated being treated like a professional, being allowed to have a voice, and feeling like I really took time to get to know them on an individual basis. I think that reiterates that they also feel that I’m a relational leader in that way.

Ms. Steinman did not suggest she let her staff take advantage of her when it came to their official evaluation; instead, she explained that the way she and her staff thought about this process was co-constructed. The co-construction of the evaluation process between Ms. Steinman and her teachers resulted in teachers having a say in what was valued during observations. For example, Ms. Steinman said that her staff very much valued classroom routines and procedures. As a result, Ms. Steinman paid close attention to this section of the observation rubric during evaluations.
She looked for evidence of clear routines and procedures during observations of teacher instruction and talked about routines and procedures during conversations with her staff. Ms. Robbins’s (three years of experience) actions provide further evidence that principals with low-levels of experience invested in relational leadership, especially when it came to teacher evaluations. Ms. Robbins explained she spent much of her time thinking about her leadership and how her leadership style impacted evaluations of the teachers in her building. Ms. Robbins constantly mentioned how she wanted to work with teachers so they would be successful during the evaluation process; for her, working together with her teachers meant communicating with them during the process to make sure they agreed on what was happening. She said:

I do a lot of thinking before I conference with teachers. I really firmly believe that it’s so important to say the right thing the first time. You really can’t take things back. You could do a lot of damage. By not phrasing the things the right way, you can shut somebody down and discourage somebody that doesn’t need to be discouraged. You could give false hope. I do think it’s really important to make sure that the message that teachers are getting is right, and that you’re being as fair as you can to them and that you’re not shutting somebody down or ruining a relationship.

For Ms. Robbins, maintaining a strong relationship with the teachers was essential to the overall climate of her school. She noted that strained relationships could compromise things such as communication between her and her staff and the overall climate of her school. Ms. Robbins continued:

I’m very positive, very supportive. We try to work as a team. I try to take advantage of the expertise that’s in the building and to encourage people that may not be sticking their necks out and showing what they know and sharing their good ideas.
I’m a new leader, so sometimes I’m not - I don’t always feel sure of myself. I just try to make sure that I’m keeping that - what’s best for the kids in mind. That it is important for me to do what’s good for teachers too, to make sure they feel taken care of and valued. I feel like our job as teachers is to continue - I’m still calling myself a teacher - it’s just to continue to grow ourselves and to improve so that we can meet the needs of the students as they change.

Ms. Robbins went on to say that she was very positive in all her conversations with the teachers in her building when discussing their evaluation. Even if she was giving critical feedback, Ms. Robbins always tried to think back to when she was a teacher and how tough these conversations were. Ms. Robbins said at times she might not be as critical of teachers as she needs to be, but she thought support and positive affirmation were better approaches than criticism and negative feedback. Ms. Steinman and Ms. Robbins provide two examples that are representative of principals with low-levels of experience in this study. These principals typically engaged in relational leadership in an effort to secure positive relationships with their staff. However, this type of leadership style also impacted teacher evaluations as these principals were quick to give teachers the benefit of the doubt and at times avoided difficult conversations around teacher performance because they wanted to secure these relationships. Principals with low-levels of experience also empathized with teachers more often than their more experienced peers as they were more recently in the classroom and had recently gone through the teacher evaluation process themselves as former teachers. It is interesting to note that I did not provide any examples or definitions of leadership when asking principals how their leadership impacted their implementation of teacher evaluation policy.
The principals in this study knew the terms “situational leadership” and “relational leadership” and used these terms unprompted to define their leadership style. Finally, I think it is important to note that while situational leadership and relational leadership are two distinct leadership approaches, these leadership styles are not mutually exclusive. For example, relational leaders show concern for their followers and value their followers’ thoughts, ideas, and opinions. Situational leaders also have these characteristics. However, the ways in which situational and relational leaders approach evaluating teachers differ. As evidenced by the principals in this study, principals who engaged in situational leadership varied how they evaluated teachers, while principals who engaged in relational leadership typically evaluated all teachers similarly.

Nuances. One way in which principals’ cognitive schemas impacted the way in which they thought about and ultimately evaluated teachers was the leadership style to which they subscribed. The principals in this study with high-levels of experience typically completed evaluations situationally, completing each on a case-by-case basis, while factoring in all they knew about the individual teacher. The principals in this study with low-levels of experience typically engaged in relational leadership, which included co-constructing how teacher evaluations looked in practice. Although there was a clear demarcation in leadership style between experience levels, the relationship between leadership style and experience did not hold for all principals. For example, one principal, Mr. Sherman (four years of experience), described his leadership style as situational. Mr. Sherman explained that given the current needs of his teaching staff, he did not think it was beneficial for all of his teachers to be evaluated in the same way.
He explained that his teaching staff varied greatly in level of experience and as a result how he evaluated veteran teachers looked much different than how he evaluated pre-tenure teachers. In his mind this was a perfectly legitimate approach to teacher evaluations because, given the varying experience levels of his staff, each of his teachers was best supported by an evaluation specific to their current experience level. In the other exception, Dr. Wexler (nine years of experience) said she thought relationships were the single most important factor when leading a school. Dr. Wexler explained her school experienced a low retention rate of teachers and she wanted to change this. Dr. Wexler believed focusing on building strong relationships would help decrease the number of teachers leaving her school and as a result she prioritized relational leadership, particularly during the teacher evaluation process. However, aside from these two examples, five of the six principals in each experience group fit into the aforementioned categories (i.e., high experience principals were situational leaders and low experience principals were relational leaders). One other nuanced difference in how principals’ cognitive schemas are impacted by experience is that three principals with high-levels of experience (Mr. Bookman, Ms. Cohen, and Ms. Hamilton) admitted their leadership style changed as they gained more experience. For example, Mr. Bookman said:

Where I’ve gone wrong in the past, is not having the guts to do it. Not having the guts to have the tough conversations, and when you’re having those tough conversations keeping the emotion out of it. It’s a matter of fact thing, and it’s always in the guise of so that you can be a better teacher, and a better person, and you can be successful. You’ve got to have some tough conversations, and you got to ask some tough questions, but people respect that a heck of a lot more than they do someone who doesn’t address it.

Mr.
Bookman went on to say that wanting to develop positive and trusting relationships with his staff impacted how he had these “tough conversations” and how he ultimately evaluated his teachers. However, as he gained experience and became more comfortable in his role as a school leader, he was much more willing to have these tough conversations and critique teacher performance. Ms. Cohen also reflected on how she evolved as a leader and said:

As a new administrator I did have a hard time. Now I’m pretty cut and dry. I say what I think. I think that’s just something you learn as you get older. Those strong-willed teachers in the beginning, some of them are scary. I’m evaluating them and I’m think, oh my gosh, they’re going to hate me when we get done. If you want to be a good administrator you need to forget about what people think about you. You’ll never make a good administrator if you waffle. My prior knowledge as an administrator is huge. My first evaluations I think, in my early years of being an administrator, were probably based on a lot of feelings. As you grow as an administrator your feelings die and you don’t have them anymore. I think you can look at things more objectively, taking the subjectivity out of it as much as possible.

Subtheme Two: Use of Prior Evaluation Data

Another way in which principals’ cognitive schemas impacted how they implemented teacher evaluation policy was how these individuals thought about and used prior teacher evaluation data during the teacher evaluation process. Six principals in this study (Mr. Bania, Ms. Goldstein, Dr. Wexler, Mr. Bookman, Ms. Hamilton, and Ms. Cohen) had at least nine years of service as a school principal. The principals in this study with high-levels of experience overwhelmingly stated they did not rely on previous teacher evaluation data (including prior teacher evaluation ratings and prior student assessment/achievement data) when evaluating teachers in the current year.
When asked if he reviewed teachers’ previous evaluation data, Mr. Bania (10 years of experience) said:

Nope. Fresh each year. I mean, I kind of know where they’re at. If someone’s highly effective for three years in a row they didn’t have to be evaluated, like every other year. It’s broken down in here like who those people are. When I go in I know like those are the highly effective teachers that particular year, and these were those that were not highly effective three years in a row. We do look at percentages too as an administrative team to see how many highly effective teachers I had here versus how many effective versus other things.

When I asked why he did not consult any prior information when evaluating teachers in the current year, Mr. Bania said he felt that whatever a teacher had done in the past should not be reflected in their current teacher evaluation. Mr. Bania also said that he wanted to rely on what he saw and heard from teachers in the current school year. He did not want to rely on past teacher performance data and only wanted to evaluate a teacher based on his impressions of that teacher within the current school year because, according to him, evaluations are “year to year.”

Mr. Bookman also indicated he did not use previous evaluation information when evaluating teachers in the current year. He said:

With teachers I don’t. I want no preconceived notions. You know what I mean? I’m smart enough and experienced enough that I can figure out what kind of teacher they are. I don’t want to see any letters that may have happened in their file or any past evaluations that three different evaluators did because that’s all arbitrary to me.

Mr. Bookman was new to his current school, although he had been a principal for 10 years. Because he was new to his school, Mr. Bookman did not want to have his mindset influenced by the previous administration when evaluating his new staff. Additionally, Mr.
Bookman was confident that he would be able to accurately assess a teacher’s performance given his prior experience as an administrator. A third principal with high-levels of experience, Ms. Hamilton (more than 10 years of experience), also stated she never looked at previous teacher evaluation data, including student assessment data or teacher observation ratings. Ms. Hamilton explained:

Not at all. Nope. Each year we have tabula rasa. I have a personal relationship with all of them and I know their peculiarities, I know their strengths, I know their areas of improvement, and I’m helping them with it all.

Six principals in this study (Mr. Jarmel, Ms. Robbins, Mr. Ramon, Ms. Steinman, Ms. Chang, and Mr. Sherman) had four or fewer years of service as a school principal. When compared to their more experienced principal peers, these principals were much more likely to call upon, review, and use prior teacher evaluation data (including prior teacher evaluation ratings, both those they had provided and those other principals had provided, and student assessment data) during the current year teacher evaluation process. Ms. Robbins said:

When I first started as a principal I felt like I kind of needed to know what they had been working on, what the previous principal felt their strengths were and their weaknesses. When you take over sometimes, you may have somebody that had a personality conflict with the previous person. You may have somebody that was just really chummy with the previous principal that may have been getting something that she wouldn’t have given them as a score. I want to get an idea of what happened before I came here, just to see what you’re working on and then we’re going pick up together and let’s see what we do from here.

When asked how reviewing this information impacted, if at all, how she approached evaluating her staff, Ms. Robbins said:

I think it definitely, in some ways, it influenced the conversations that we had.
Without saying, “Listen, I know that you and the previous principal were working on this,” I kind of had an idea of what that teacher was really focused on, what somebody had said to them before. It’s aware of, that is something that I’ve wanted to look for as I’ve been observing. If somebody had mentioned that we really have some very surface-level, we don’t have deep questioning happening in this classroom. My antenna’s up for that when I go in. In some ways, yeah, it is definitely going to have an impact.

For Ms. Robbins, and other principals with low-levels of experience, it was important to get a complete picture of the teachers they would be evaluating. Ms. Steinman added:

I think it (reviewing previous teacher evaluation information) might help focus me a little bit and then maybe help focus in on what I want to provide feedback to them on. I think in that sense, that’ll be nice, to have that knowledge prior.

Ms. Robbins and Ms. Steinman provide two examples of principals with low-levels of experience who want to make sure they have a complete picture of the teachers they are evaluating. Both Ms. Robbins and Ms. Steinman said this desire to have as much information as possible goes back to attempting to establish positive and trusting relationships with their staff. During the post-conference with her teacher, Ms. Robbins brought up and discussed the teacher’s previous evaluation at length. Ms. Robbins brought up this information in an effort to show the teacher that she understood what happened in the past and how she would be structuring evaluations moving forward. Additionally, Ms. Robbins brought up specific scores and feedback from the prior evaluation, providing her thoughts on these areas and whether she was in agreement with the previous assessment of this teacher’s performance. Another principal with low-levels of experience, Mr.
Ramon (three years of experience), said that while he initially tried not to review any prior teacher evaluation information, he did end up reviewing teachers’ previous evaluation data, which he believed helped him feel more justified in assigning ratings to teachers. Mr. Ramon said:

I actually tried not to. I know they talk about diminished returns or anything that could potentially impact your evaluation, so I know I initially said I wasn’t, and then I did actually go back. I definitely tried to not let it impact the rating I would give the teacher but definitely it was interesting to see some of the feedback. I had some instances where I saw the same evaluation that they got last year was very similar to what they got this year. I think that’s where the consistency in the evaluation tool really comes out and shows that if you have proper training, you could potentially see similar variables when you’re going in and doing evaluation.

Nuances. Another way in which principals’ cognitive schemas impacted how principals thought about and ultimately evaluated the teachers in their building was these principals’ beliefs in the need to consult previous teacher evaluation data when evaluating teachers within the current school year. How principals thought about and ultimately decided whether or not to consult this information impacted these principals’ thoughts and behaviors during the evaluation process. For example, overwhelmingly, principals with high-levels of experience said they did not consult prior teacher evaluation information or data when conducting teacher evaluations in the current school year. When asked why they did not want to look at these data, these principals stated that each teacher deserved to start a year fresh, without any previous information influencing the thinking of the evaluator.
Additionally, principals with high-levels of experience were confident that their prior knowledge and experiences as a principal were enough to judge a teacher’s performance in the current year. The principals in this study with low-levels of experience overwhelmingly stated they looked at previous teacher evaluation information before and during the process of evaluating teachers in the current year. The reasons these principals gave for wanting this information included wanting to know what the previous evaluator had noticed in previous years and getting a more complete picture of individual teachers before providing their own evaluation. It is interesting to note that even principals who were not new evaluators and were in their second or third year of evaluating the same teachers in their school looked back at how they had rated teachers in previous years. For example, Mr. Sherman, who has been a principal for four years, all at the same school, said he looks at how he rated teachers in prior years because he wants to make sure he is consistent in his approach to evaluations from year to year. None of the principals with high-levels of experience looked back at how they had rated teachers in prior years (at least, not while conducting a teacher’s current evaluation).

Although there was a distinction between principals with high-levels of experience and their less experienced peers in how these principals used prior teacher evaluation information, the findings were not unanimous. For example, Ms. Cohen (10 years of experience) said that while she does not review a teacher’s past evaluation scores before evaluating in the current year, over the summer months she does review this information to see if her teachers are progressing. She said:

I use student assessment data from the entire year. Yes. I do look at that data. As far as their previous evaluation, not when I’m evaluating. I had a teacher last year that was on probation.
I put her in a new spot this year. She’s not knocking it out of the park. I wouldn’t want that to sway the way I’m thinking I guess. (Over the summer) I do look at it to see are they moving forward, did they go backwards. Then I’ll sort of prepare myself because they know what it is. If it went down I need to validate why did it go down. I will let them argue a point. If I don’t give them credit for something and they can tell me, “I did do that, Ms. Cohen. This is how I did it.” You can tell when someone’s making something up unless they’re a really good liar. If they are that’s going to come out in the end sometime. Everything comes to the surface eventually.

One principal with low-levels of experience, Mr. Jarmel (one year of experience), said he never looks at or reviews previous teacher evaluation information. Mr. Jarmel believed he should not be influenced by what prior evaluators had written or observed and that he wanted to form his own opinions about the teachers in his building. However, Mr. Jarmel provides the only example of a principal with a low-level of experience who expressed the opinion of not wanting to review prior teacher evaluation information before or during the evaluation of teachers in the current year.

Subtheme Three: Accurate Reflection of Teacher Effectiveness

A third way in which principals’ cognitive schemas impacted how they thought about and implemented teacher evaluation policies was individual principals’ beliefs about the accuracy of their teacher evaluation system. Principals with high-levels of experience overwhelmingly believed their current teacher evaluation system was an accurate representation of teacher effectiveness. For example, when asked if she believed the final evaluation score a teacher received was an accurate representation of that teacher’s effectiveness, Ms. Goldstein said:

Yeah, I do.
I truly do, because of the way I do it, because of the dialogue I’ve had, and the things that I’ve observed to provide evidence to support why I feel they were where they’re at. Yes, I do feel like it’s a pretty accurate reflection.

Ms. Goldstein was very confident in her ability to evaluate teacher performance and, because of her belief in her ability as an evaluator, Ms. Goldstein felt any system that she used would produce an accurate evaluation of teacher effectiveness. Mr. Bania (10 years of experience) added to this sentiment and said:

Well, I think, yes, the teachers that are effective get marked as effective or highly effective. In the observations I’ve done and the rubric scores I’ve given them, it pretty much, when I see the score I’m like, “Yeah, I think that’s what I’ve observed as them as a teacher.”

Mr. Bania, much like Ms. Goldstein, was confident that his teacher evaluation system and his teacher evaluation ratings were an accurate representation of teacher effectiveness because the scores matched what he cognitively thought was effective teaching. When asked if he felt his experience as an evaluator aided in his confidence in the accuracy of his teacher evaluation system, Mr. Bania said that was a fair statement. He added that he faithfully implemented his teacher evaluation system and the end result was an accurate rating of teacher effectiveness for all of his teachers.

Dr. Wexler (nine years of experience) perhaps best describes the sentiments felt by all principals with high-levels of experience regarding the accuracy of her school’s teacher evaluation system. Dr. Wexler believed her teacher evaluation policy and system were very subjective because she believed any evaluation done by human beings has the potential to be subjective. However, because she has been a principal for nine years, Dr. Wexler felt she knew how to eliminate this subjectivity and accurately and fairly evaluate all of the teachers in her building. Dr.
Wexler explained that through her principal training and her experience observing teacher classroom instruction, she was able to make accurate determinations of teacher quality. In short, Dr. Wexler was confident in her ability to evaluate teachers accurately, regardless of the system she was using.

Overall, because principals with high-levels of experience had confidence in themselves as evaluators, they had confidence that final teacher evaluation scores were an accurate representation of teacher effectiveness. Additionally, these principals believed strongly that their current teacher evaluation system produced accurate teacher evaluation scores and results. This belief resulted in principals with high-levels of experience typically following these systems with fidelity, or at least with what these individuals believed to be fidelity to the system.

While their more veteran peers were confident in the accuracy of their teacher evaluation system, an analysis of these data suggests that principals with low-levels of experience do not think their current teacher evaluation system is an accurate reflection of teacher effectiveness. Perhaps most pointedly, when asked if he felt his school’s current teacher evaluation system was an accurate reflection of teacher effectiveness, Mr. Jarmel said:

No, because I think it’s so much more than just a rubric. I know that we’ve been working hard. I think it’s more than just going in for the 40 minutes to sit and do an evaluation on them. If I’m in the classrooms every day, and I see what’s going on, that’s more to me valuable to the teachers because I can stop right then and offer suggestions and supports. Principals that don’t go into the classrooms regularly, I don’t see how any evaluation you do could be fair or consistent for teachers.

Mr.
Jarmel went on to articulate that his policy did not allow for these extra observations to be counted towards his teachers’ evaluations, so while he knew that he was helping teachers, he also knew that if a teacher had a bad lesson during an “official” evaluation, this bad lesson would be reflected in their evaluation rating and the rating might not be the best reflection of that teacher’s effectiveness. If he had control of how to evaluate teachers, Mr. Jarmel “would use many short visits to teacher’s classrooms” to evaluate instruction throughout the school year.

Ms. Robbins also was not confident that her district’s teacher evaluation system was an accurate representation of teacher effectiveness. She said:

Sometimes I feel like there’s pieces on there that are not, there’s pieces I’d like to see there that really aren’t there. For example, again, coming back to the tone that a teacher uses with a child, if they’re respectful with the child or not. I’m not sure that that’s really there. I feel like it’s really important. It frustrates me sometimes when I do have an issue with a teacher who isn’t addressing students with respect. That’s something that we’re really working on. It’s not really there. It’s not there explicitly. I feel like I’m having to work it into something where it really, the verbiage isn’t there. That gets a little frustrating sometimes. But it doesn’t stop the conversation, because that’s an expectation in the building out. We’ll work it through it in another way. It will definitely be part of the evaluation, the observation, and it will be something that we’re going discuss every time, because it’s an area of focus for that teacher.

For Ms. Robbins, something she valued greatly and thought was a measure of teacher effectiveness was not included in her district’s teacher evaluation system. As a result, she lacked confidence in the accuracy of this system. Ms.
Steinman was also not confident that her school’s teacher evaluation system captured all that the teachers were doing or that it produced an accurate representation of teacher effectiveness. Ms. Steinman’s lack of confidence impacted how she ultimately scored the teachers in her building. Ms. Steinman said:

I’m still very sensitive. I will admit it’s very hard for me to give a minimally effective or ineffective or missed opportunity. I still feel that. I don’t know if that’s good or bad. Maybe I will always feel that. As a teacher, I always strive to be highly effective. Then you still have to base things on reality and what you’re seeing and really trying to use it for growth versus bashing. It’s not a tool to be bashed with. It’s a fine line. I haven’t arrived there yet. I think that I still think more like a teacher than an administrator yet. That will come later.

Ms. Steinman went on to articulate that, partly because of her lack of confidence in the accuracy of her school’s teacher evaluation system, she was hesitant to rate teachers critically and she typically defaulted to higher evaluation scores. Giving teachers the benefit of the doubt and defaulting to more favorable ratings was a common sentiment amongst principals with low-levels of experience. Partially because they lacked complete confidence in their evaluation system and partially because they lacked complete confidence in their abilities as evaluators, these principals were more likely to rate teachers higher than their more veteran peers would rate their teachers. For example, Ms. Steinman (one year of experience) said:

I think that it has been a transition for me in general to switch my mindset from teacher to administrator. I’m still not there yet. My admin team will tell me frequently that I still think very much like a teacher. I don’t know that I think that’s bad. It has allowed me to create some very good relationships with my staff this year, which has been great.

Nuances.
Another example of principals’ cognitive schemas impacting how these principals evaluated their teachers was the principals’ belief about whether or not their teacher evaluation system was an accurate representation of teacher effectiveness. Individual principals’ beliefs were clearly divided between experience levels. Principals with high-levels of experience generally took on the mindset of an administrator who believed in the accuracy of their current evaluation system. Principals with low-levels of experience generally questioned the accuracy of their system, typically thinking more from the teachers’ perspective.

Although principals with high-levels of experience overwhelmingly said they believed their current teacher evaluation system was an accurate reflection of teacher effectiveness, one principal with high experience, Mr. Bookman, did not believe his system was an accurate reflection of teacher effectiveness. Mr. Bookman did not think his system “account[ed] for all that teachers did” and believed his district’s current system was missing certain components. Mr. Bookman explained he was able to provide final teacher evaluation ratings that he felt were accurate because of his ability to “work around” his teacher evaluation system.

The majority of principals in this study with low-levels of experience stated they did not think their current teacher evaluation system was an accurate reflection of teacher effectiveness. In all, four of six principals did not believe in the accuracy of their system, while two principals, Mr. Sherman and Ms. Chang, believed their system was an accurate reflection of teacher effectiveness. Mr. Sherman and Ms. Chang thought along the same lines as their more veteran peers. It is interesting to note that both Mr. Sherman and Ms. Chang were in their fourth year as school principals at the time of data collection. These two principals were the most experienced principals in the low-experience group.
Subtheme Four: Hiring Decisions

All principals in this study reported their districts used teacher evaluation scores to make hiring and layoff decisions. As a result, how principals rated teachers had the potential to impact these teachers’ future employment. The direct association of teacher evaluation ratings and future teacher employment was not lost on principals, who consistently referred back to “the enormity” of these evaluations. However, an analysis of these data shows a clear distinction between how principals with high and low experience levels think about considering teacher evaluation data when making hiring decisions. Principals with high-levels of experience generally did not look at or consider a teacher’s prior evaluation score when thinking of hiring a new teacher. However, principals with low-levels of experience asked for and looked at prior teacher evaluation data before making hiring decisions.

All principals in this study noted that information such as credentials, conversations with former employers, interviews, and how candidates made “data-driven instructional decisions” influenced who they hired. However, the use of previous teacher evaluation scores varied by experience level. For example, Mr. Bania said:

We always say when we go to hire somebody it is a million dollar decision. We want to make sure that they understand our philosophy and how they answer (interview) questions will depend on whether or not I hire. It’s the interview. It’s whether I can talk to different community members. We call references who know that person. I don’t look at evaluation scores at all.

For Mr. Bania, and all of the principals with high-levels of experience, a teacher’s previous evaluation score was not considered a central or important aspect of hiring that teacher.
Principals with high-levels of experience were much more likely to say they relied on things such as the interview with the candidate, whether a candidate “fit” with their school and philosophy, and whether the teacher had the right credentials. Another example of a principal with high-levels of experience not using previous teacher evaluation scores when making hiring decisions comes from Mr. Bookman, who explained he too never looks at this information. Mr. Bookman explained that he knew how to select the right teachers for his schools based on interviewing potential candidates and simply talking to them about their teaching beliefs, mindset, and philosophy. Mr. Bookman said he did not put much stock into a teacher’s prior evaluation score for a number of reasons, including the relationship that teacher might have had with a previous evaluator (good or bad) and the context of the school. Mr. Bookman noted that even if a teacher had a previous score of highly effective, that means very little to him because everyone in that school may have been rated highly effective. Mr. Bookman noted that as he gained experience as a principal he relied on past evaluation scores less and less when making hiring decisions, and currently he does not look at this information at all.

While their more experienced peers tended not to rely on previous teacher evaluation data when hiring teachers, principals with low-levels of experience were much more likely to seek out these data before making hiring decisions. For example, Ms. Chang (four years of experience) said, “We don’t want an ineffective teacher teaching our students. Yes, we do look at previous scores if they’ve been teaching…its part of the puzzle.” For Ms. Chang, it was important to know as much about a teacher as possible before making a hiring decision. Ms.
Chang wanted as much information as possible before filling any vacancies she had, and therefore she would look at previous teacher evaluation information, mostly for the evaluation score and for any comments and/or feedback provided by the previous evaluator. Ms. Chang would use this information to make her hiring decision. Mr. Sherman said, “I mean yeah (he does look at and consider prior teacher evaluation scores when hiring a new teacher). We try to get effective teachers. We try to gauge our school and look at our school and say where there’s the biggest need.” Mr. Sherman went on to articulate that he would not consider hiring a teacher unless this teacher’s previous evaluation score was effective or highly effective. Mr. Sherman felt that anything less than a rating of effective reflected poorly on the teacher and therefore he did not want this teacher working in his school. One final example illustrating that principals with low-levels of experience tended to rely or consider relying on previous teacher evaluation information when making hiring decisions comes from Ms. Robbins, who said, “I would. It hasn’t come up yet, but I wouldn’t accept somebody that was minimally effective if I had a choice.”

Nuances. An analysis of the data reveals several nuances regarding how principals’ cognitive schemas impact how principals consider prior teacher evaluation information when making hiring decisions. Although this finding is not directly related to teacher evaluation policy implementation, it does speak to what principals think about and value while making hiring decisions based on teacher evaluation information. As evidenced by the analysis of these data, principals with high-levels of experience typically believed in their ability to identify a high-quality teacher without needing to review that teacher’s previous evaluation scores.
These principals believed they could look at teacher credentials and, most importantly, learn about these teachers through interviews in order to decide who would be a good fit for their school. However, their less-experienced principal peers almost always looked at prior teacher evaluation scores before hiring any teacher in their building. These less-experienced principals looked at these data in part because they wanted to have a complete picture and perhaps some validation that they were making a strong hiring decision. However, one principal with high experience, Ms. Hamilton, said her district required her to review and consider this information before hiring a teacher. Ms. Hamilton said that while she looked at previous teacher evaluation ratings, she used this information as a “tie-breaker” when two candidates seemed equal (based on interviews and credentials). All principals in this study with low-levels of experience stated they did look at previous teacher evaluation information before making hiring decisions (or at least they would once an opportunity to hire teachers came up, as some new principals had not yet experienced hiring any teachers to their building).

Chapter Summary

The analyses in this chapter demonstrate that principals’ cognitive schemas influence how they think about implementing their teacher evaluation system, in part, through the type of sensemaking in which they engaged. Specifically, principals with high-levels of experience typically engaged in individual sensemaking, where they made sense of their teacher evaluation system by themselves with little other support or outside information. The type of sensemaking in which these principals engaged had implications for how these principals thought about, communicated, and ultimately carried out their school’s teacher evaluation policies.
Principals with high-levels of experience were less likely to use previous teacher evaluation data when evaluating teachers, were less likely to use teacher evaluation information when making hiring decisions, and were more likely to believe their teacher evaluation system was an accurate reflection of teacher effectiveness. All of these characteristics fit into the individual sensemaking framework, as the principals engaged in each of these tasks relying on their own sensemaking of the process of evaluating teachers (Ganon-Shilon & Schechter, 2016). These findings support prior literature suggesting principals rely on their prior knowledge when attempting to implement school-level policies (Coburn, 2005; Spillane et al., 2002). In short, principals with high-levels of experience draw on their experience as an administrator because these experiences are what make the most sense to them.

While their more veteran peers engaged in individual sensemaking, principals with less experience typically engaged in collective sensemaking. These principals were more likely to engage in relational leadership and include teachers in their thought process and discussion of teacher evaluation policy implementation. These principals were also more likely to look at previous teacher evaluation scores when evaluating teachers in the current year, were more likely to look at previous teacher evaluation scores when making hiring decisions, and were less likely to believe their teacher evaluation system was an accurate reflection of teacher effectiveness. Principals with low-levels of experience tended to draw on their experiences as a teacher because these experiences make up a majority of their professional educational experience.
In summary, the principals in this study with high-levels of experience engaged in individual sensemaking, drawing on their own experiences and beliefs about the goals of education, which impacted how these principals thought about the process and purpose of teacher evaluations and how these individuals actually evaluated teachers in their building. Their less experienced peers were more likely to collectively navigate the teacher evaluation process in part because they were more sympathetic to their teachers. It makes sense that individual cognition may change as principals gain experience. However, these findings suggest how teachers are evaluated varies by the amount of experience of the evaluator. This variation may be one explanation why consistent teacher evaluation policy implementation remains a challenge. For example, teachers with identical skill sets, identical instructional practices, and identical classroom impact could receive vastly different evaluation ratings simply based on the experience level of the principal who does the evaluation. Additionally, based on these findings one might assume that teachers who work in schools with less experienced principals may receive more favorable teacher evaluation ratings than their peers in schools with more experienced principals. A complete discussion of the implications of these findings is found in chapter seven.

Chapter 6: The Role of External Context and Experience in Principal Learning and Implementation of Teacher Evaluation Policies and Systems

“I feel the pressure for the teachers because they want to be highly effective, but it’s really hard to be highly effective when your students are failing. Just getting them to understand that. If we weren’t a priority school, it would be different.” – Mr. Jarmel (high-pressure environment)

This chapter answers my second and third research questions: What role does external context (e.g. high-pressure vs. 
low-pressure environments) play in shaping principal learning and enactment of teacher evaluation systems, and how, if at all, do principal experience and context interact during the policy implementation process? The first section of this chapter answers my second research question. When analyzed through the lens of cognition and specifically sensemaking theory, two important themes emerge from an analysis of the data. First, principals who work in high-pressure environments perceive a pressure to differentiate teacher evaluation ratings among teachers in their building. Second, principals in high-pressure environments did not believe their evaluation system accounted for the challenges their teachers faced (e.g. working in low-income communities, working with transient student populations, and teaching students who enter their classroom several grade levels behind academically). The remainder of this chapter answers the final research question, which examines more closely how experience and context interact, if at all, during the implementation of teacher evaluation policies and systems. An analysis of the data provides evidence that experience and context do interact and influence principals’ thoughts and actions around teacher evaluation policy implementation in several meaningful ways.

Theme One: Differentiating Teacher Evaluation Ratings

The first theme that emerged from an analysis of the data was that principals who work in high-pressure environments perceive a pressure to differentiate teacher evaluation ratings among teachers in their building. In this study, all 12 principals were the sole evaluators of their teaching staff. 
The school district or charter school authorizer of each of these 12 principals tasked these principals with implementing a formal teacher evaluation policy, including providing specific directives of how and when to observe teacher instruction, how to account for student assessment data in evaluations, and how to use the results of these evaluations for human capital decisions. Despite the formal and prescriptive nature of these policies, an analysis of the data suggests external context played a prominent role in how principals thought about teacher evaluation policy implementation as well as how they ultimately evaluated the teachers in their building. One way in which external pressure impacted how principals in this study evaluated teachers is how principals thought about and ultimately assigned teacher evaluation ratings. Principals in high-pressure environments were more likely to rate teachers critically than their peers in low-pressure environments. The principals in these high-pressure environments provided several explanations as to why they rated teachers critically. First, some principals felt an added pressure from district administrators to have some form of differentiated ratings among the teachers in their building. Specifically, these principals perceived a pressure to limit the number of teachers they rated as effective or highly effective. Second, principals in high-pressure environments reported feeling a pressure from teachers in their building to differentiate ratings amongst teachers because these teachers were aware of the consequences of these scores for their future employment. Although none of the principals in this study said they received directives to differentiate the ratings they gave their teachers, these principals did suggest that they received some type of message from district administrators, including superintendents, that there should be some distribution of teacher effectiveness ratings. 
Finally, principals in high-pressure contexts put pressure on themselves to critique teachers’ performance because they knew the status of their school (in terms of their state ranking on Michigan’s Accountability Scorecard) did not reflect a school where all teachers were effective or highly effective. For example, Mr. Jarmel said: I know I feel the pressure for the teachers because they want to be highly effective, but it’s really hard to be highly effective when your students are failing. Just getting them to understand that. If we weren’t a priority school, it would be different. Mr. Jarmel went on to say because his school’s test scores were so low in previous years he would not be able to justify rating all teachers effective or highly effective. Mr. Jarmel went on to explain that simply knowing his school was underperforming on state assessments was enough for him to know some teachers needed to be rated less than effective. In his mind, low student achievement on state assessments equated to less than effective teaching. Mr. Ramon said while his administration was supportive of how he assigned teacher evaluation ratings, he understood from informal conversations that there was an expectation to differentiate teacher evaluation ratings in the district. Mr. Ramon recalled conversations with his superintendent about the observational rubric and the “high expectations” of the rubric. Mr. Ramon said, “When you look at what highly effective is, those are some really, really high expectations.” Mr. Ramon took these conversations with his superintendent to mean that he should look very carefully at the domains of the evaluation rubric to make sure teachers met these criteria. Mr. Ramon explained he perceived a pressure to make sure if he scored a teacher highly effective, he could point to adequate evidence to validate this rating. Mr. Ramon also recalled having several contentious conversations with his teaching staff about their final ratings. 
He said: We’ve had some interesting conversations about the rating of ineffective, effective, and highly effective. You know, the conversations among professionals will occur, and you might have a teacher who got highly effective who might tell a teacher who got effective, and they’re like, oh, why I didn’t get it? You do have a lot of interesting dialog and dynamics in regards to explaining the process. Your employment really can be contingent upon the results (of your evaluation), so I had several teachers who we really debated what their final ranking ended up being. I think I only changed one and it was really, really tough for me because even as I told them, when you look at the specifications that are listed within our model for evaluation, when you look at what highly effective is, those are some really, really high expectations. I potentially have a union action that I will be dealing with in the next couple weeks about an evaluation as well. For Mr. Ramon, these challenging conversations with his teaching staff occurred each year leading up to and during teacher evaluations. Mr. Ramon felt pressure from his staff to make a clear hierarchy of teacher evaluation scores within his building. Another principal who worked in a high-pressure context, Ms. Hamilton, told a story of how a teacher in her building came into her office and vigorously debated her evaluation rating of “effective.” Ms. Hamilton explained that after about an hour of debate and going back and forth with this teacher on specific rubric scores, this teacher directly told Ms. Hamilton that she knew a fellow teacher, who she considered less effective, received the same evaluation rating. In this teacher’s mind, this rating was not only unfair, but had the potential to impact this teacher’s career. 
This teacher was at risk of losing her job if this district experienced layoffs, because she had fewer years of experience (which was this district’s tie-breaker if teachers received the same evaluation rating). Ms. Hamilton explained teachers thinking about their job security was a real concern for many teachers in her district, because her district almost always experienced teacher layoffs. Ms. Hamilton said she was confident she correctly rated each teacher, but she understood why this teacher was arguing for a higher score. Although Ms. Hamilton reported she ultimately did not change this teacher’s score, when reflecting on this meeting Ms. Hamilton acknowledged teachers’ future employment was something that was always in the back of her mind when she evaluated teachers in the future. She said: I think it (thinking about comparing final evaluation ratings of teachers) does reflect my reality and maybe my reality isn’t somebody else’s reality. I use this (the evaluation process) as a tool to grow them. In order to grow them I have to grow myself. I always go back and I look, what did I do? What could I have done different? Sometimes we place the wrong person, to me, in the job. Ms. Hamilton concluded this story by suggesting that she perceived a pressure from almost all of the teachers in her building to differentiate among their final evaluation scores due to the number of layoffs experienced by her district. Ms. Hamilton explained she had constant conversations with teachers about their evaluation scores as compared to their peers and these conversations led Ms. Hamilton to thoroughly examine and at times reconsider her final evaluation ratings of the teachers in her building. Unlike their peers working in high-pressure environments, the principals in this study who worked in low-pressure environments did not perceive pressure to rate any set number of teachers in any category (effective, highly effective, etc.) 
and said that although it would be unlikely for all teachers to receive a highly effective rating, if that is what they all earned, that is what they would be rated. As Ms. Chang said, “The chips fall where they may. It is what it is.” Mr. Sherman explained that he did not feel any pressure to assign any specific score to any teacher in his building. He said: Again, the evaluation is going be what it is. Even if it’s Miss X and Miss X is one of our great teachers. If she starts going down (in terms of her evaluation rating), it’s going be because that’s what she was doing in her classroom. That’s my mind. Now if somebody wants to engage me then that’s fine we can talk. Then if they say something that I believe in, I have changed them (the evaluation score). It could be something like, oh I’m sorry you’re right I missed that. If they fight for it and it’s right, I will change. If it’s not, it stays the same. Mr. Sherman went on to explain that he let his teachers have some form of conversation with him regarding their final evaluation score, but he was not under any pressure to assign teachers certain evaluation ratings. Therefore, he was comfortable adjusting these scores if the teacher made a compelling case. Ms. Goldstein also indicated she felt no pressure to differentiate amongst the evaluation ratings she provided the teachers in her building. She said, “Evaluations are everything, but I say that tongue-in-cheek because your evaluation isn’t everything. If you’re doing the job you were hired to do and doing it well, your evaluation is going be highly effective.” For Ms. Goldstein, final teacher evaluation ratings were a result of teacher actions, student assessment data, and overall professionalism. Therefore, depending on the results of these teacher actions, all teachers might be rated similarly. Although she said she has never given all teachers highly effective, Ms. 
Goldstein said that if all of the teachers in her building met the criteria to be highly effective she would not hesitate to assign all of her teachers a rating of highly effective. Nuances. One way in which external context impacted how principals thought about and ultimately rated teachers was the pressure perceived by principals when assigning final teacher evaluation ratings. However, principals who worked in high-pressure environments were not the only principals who perceived a pressure from teachers to differentiate evaluation scores. This perceived (and real) pressure did come up in several interviews with principals in low-pressure environments, although much less often. For example, Ms. Steinman said she felt a pressure from teachers to rate them as effective or highly effective. In her opinion she had many strong-willed teachers who believed they were highly effective, and these teachers knew how to make a compelling case for themselves. These teachers also seemed to know all of the teachers’ evaluation ratings and although they might be okay with an effective, they would argue their score if they felt it was not accurate compared to other teachers in the building. Other principals in low-pressure environments (and the teachers I interviewed) certainly referenced the importance of their final evaluation scores, as in almost all instances employment decisions were based on these scores. However, principals in high-pressure contexts felt this pressure much more strongly, in part due to these districts laying off teachers annually. The low-pressure contexts in this study rarely experienced teacher layoffs. In summary, one way in which external pressure impacted the way in which principals thought about and ultimately rated the teachers in their building was the perceived (and real) pressure administrators felt from teachers and district level superiors. 
Principals who worked in low-pressure environments reported experiencing some perceived pressure; however, they never experienced pressure from the district level to distribute teacher evaluation ratings and the pressure these principals felt from teachers was different from the pressure perceived by their high-pressure peers, because of the lack of teacher layoffs generally experienced in low-pressure schools.

Theme Two: What do Teacher Evaluations Measure?

Another way in which external pressure impacted how principals thought about and ultimately implemented teacher evaluation policies was principals’ beliefs about the efficacy of these systems in measuring the true performance of the teachers in their building. Principals in high-pressure environments did not believe their teacher evaluation systems accounted for all of the challenges the teachers in their contexts were facing. This belief resulted in principals looking for creative ways to increase a teacher’s final evaluation score. For example, Mr. Bookman said: It’s tough because teachers aren’t going do it (be at their best) all the time. They’re just not. It’s human nature. I’m not looking to lambaste anybody, but I feel like sometimes that’s what the evaluation process does. I don’t think there’s any tool that’s going accurately evaluate all the things that teachers are doing. It’s that human factor that you just can’t evaluate. Mr. Bookman, along with other principals in high-pressure environments, reported that teachers in their buildings had more challenges than other teachers with whom they have worked in contexts with less pressure. For example, Mr. Bookman told a story of one teacher who had a goal of perfect attendance (or as close as possible) for her whole class for the entire year. Aiming for perfect attendance was an ambitious goal, as many of this teacher’s students were chronically absent. 
However, in the second part of the year, her students rarely, if ever, missed a day of school. Mr. Bookman said the relationships this teacher developed with students and parents and her efforts to make school enjoyable caused the increase in attendance. Mr. Bookman went on to say that “obviously attendance is associated with learning and other growth”, but there is nothing on his teacher evaluation system that can “reward” teachers for increasing student attendance. As a result, Mr. Bookman said he would try to factor increased student attendance into this teacher’s final evaluation rating in the professionalism part of the observational rubric. Although evaluating a teacher in this way may have been stretching what the professionalism part of his system meant (and he knew this – it was supposed to include things such as number of teacher absences and the number of professional development sessions a teacher attended and if they were involved in student-related activities beyond teaching, such as coaching a sports team or mentoring other teachers), he gave this specific teacher the highest possible professionalism score because of her accomplishment of increasing student attendance. In his mind, Mr. Bookman believed it was his job as an evaluator to account for all that his teachers did, even if his current evaluation system did not. Mr. Bookman was not alone in his belief that his school’s current teacher evaluation system did not account for all of the challenges faced by the teachers in his building. Ms. Cohen agreed that her evaluation system did not capture the relational part of teaching. She said: Here in the urban setting relationship is everything. I have a hard time measuring those soft pieces. Our tool was great at measuring the data and those things that you can see like how is your classroom set up, is your classroom organized, is it functioning so that people can travel from place to place, are your transitions good. 
All that stuff that you can see is easy to evaluate. The tool was still missing that relationship piece. Mr. Bookman and Ms. Cohen felt their evaluation system, particularly given their context, did not include crucial pieces that reflected teachers’ impact in the classroom, and as a result these principals looked for ways to credit their teachers on other areas of the evaluation. Mr. Jarmel also thought his system did not do an adequate job capturing the “whole teacher” and everything a teacher was doing on a daily basis. He said: I mean you have to hold people accountable for their jobs, but I think it’s so much more than that. I know we all need to be held accountable for our jobs, but I don’t quite know what that is. I think it’s more than just going in for the 40 minutes to sit and do an evaluation on them. If I’m in the classrooms every day, and I see what’s going on, that’s more to me valuable to the teachers because I can stop right then and offer suggestions and supports. Do you want the development of your staff and to have high quality, or is it just a dog-and-pony show that you get a couple of times a year? Mr. Jarmel explained that he factored in all of the visits he had with teachers throughout the school year. Although technically he was supposed to evaluate teachers twice yearly, in 30-45 minute observations of their instruction, Mr. Jarmel used observation data he collected throughout the school year when assigning his teachers’ final evaluation scores. In his mind, this data point was more valuable both to him and his staff than two scheduled observations of teacher instruction where the teacher might perform above or below their actual ability. While principals who worked in high-pressure environments thought their evaluation system did not accurately encapsulate all that a teacher did, principals in low-pressure environments were more likely to report their teacher evaluation system did capture all that their teachers did. For example, Ms. 
Steinman said: We do the dimension ten (which accounts for everything outside of the classroom). Do they complete things on time? Do they attend work? All of those things. How do they conduct themselves during parent interviews, parent conferences, parent contacts with home? Their attendance based on sick versus leave versus conferences that they’ve attended, and any disciplinary actions that may have occurred have to go in that, the professional learning number ten area. Ms. Steinman believed that because of the way her district set up their current teacher evaluation system and used dimension ten of her evaluation rubric, the system did in fact account for everything that the teacher did in the current year, including things both inside and outside of the classroom. Mr. Bania also suggested his district’s teacher evaluation policy accounted for everything teachers in his school did. He said his policy had certain aspects, such as professionalism, that allowed him to feel confident in the accuracy of his teachers’ final evaluation scores. He said, “The teachers that are effective that get marked as effective or highly effective. I believe we do a good job in this district here. It’s something we look at as a whole, not just by building, but as a whole district.” Mr. Bania was confident that his district had taken the necessary steps to ensure that their current teacher evaluation system accounted for everything that the teachers were expected to do as teachers in their district. Dr. Wexler made another argument suggesting that principals in low-pressure contexts felt their teacher evaluation policy was comprehensive and accounted for all that their teachers were asked to do. Dr. Wexler said: Teacher effectiveness seems to be so much bigger than just a piece of paper. I think it (her school’s teacher evaluation policy/system) captures in essence what they do. How they interact with their kids, raise scores, raise the self-worth of our kids. 
You can’t really measure that, but you can see it in the kids. You can see it in the classrooms and the way that they interact. Dr. Wexler went on to say that she believed her school’s evaluation system did a fine job capturing all that she expects from teachers and as a result, the final evaluation ratings she assigned her teachers were an accurate representation of not only her teachers’ instructional effectiveness and their ability to raise student test scores, but also their ability to relate to students and improve student self-esteem and confidence. Nuances. A second way in which external pressure impacted how principals thought about and rated the teachers in their building was the principal’s belief and perception that his or her teacher evaluation system did or did not account for all that their teachers were asked to do in their context. While principals in low-pressure environments thought the evaluation system did a fair job capturing all that their teachers did throughout the school year, principals in high-pressure environments continually referred to the fact that teacher evaluations did not encapsulate the many challenges faced by their teachers. At times this belief resulted in principals in high-pressure contexts looking to give teachers additional credit, ultimately raising some of these teachers’ final evaluation scores. Principals who worked in high-pressure environments were not the only principals critical of their evaluation system in terms of it capturing all that their teachers did throughout the school year. For example, Ms. Chang (low-pressure context) did not believe her school’s current system accurately accounted for all responsibilities of her staff. Ms. Chang was complimentary of her current evaluation system in many places, but other times, specifically in how her district accounted for student growth, she was critical and did not believe how the district measured growth was an accurate or fair representation for her teachers. 
Additionally, two principals in high-pressure contexts, Ms. Robbins and Mr. Ramon, felt that while their teacher evaluation system had limitations, it did account for all their teachers were asked to do. Other principals in low-pressure environments noted some things they would like to see changed or added to their current teacher evaluation system, but these administrators overwhelmingly believed their current system accounted for most if not all things done by their staff and therefore was an accurate reflection of teacher effectiveness. Their peers in high-pressure environments were much less likely to have a similar mindset. In summary, a second way in which external pressure impacted the way in which principals thought about and ultimately rated the teachers in their building was the belief of principals that their current teacher evaluation system did or did not account for all that their teachers did throughout the school year. This belief resulted in some principals manipulating the final evaluation ratings of teachers.

How do Experience and External Pressure Interact during the Implementation Process?

The final research question that guided this work examines how principal experience and external pressure interact during the process of teacher evaluation policy implementation. Specifically, do principals with certain experience levels and who work in contexts with differing amounts of external pressure think about and implement their school’s teacher evaluation system in similar or different ways? When analyzed through the lens of cognition and specifically sensemaking theory, several themes emerge from each grouping of principals. The thoughts, beliefs and actions of principals with different experience levels and facing different amounts of outside pressure had implications for how these individuals thought about teacher evaluations and ultimately how they rated the teachers in their building. 
When compared to their peers in other categories, principals with high-levels of experience in high-pressure contexts believed (1) all teachers should be evaluated annually, regardless of the effectiveness of the teacher; and (2) it was their responsibility to provide teachers with specific directives of how to teach in an effort to improve their instruction and ultimately student learning. All three principals with high experience in high-pressure environments believed all teachers in their building should be evaluated annually, regardless of the effectiveness of the teacher. These principals believed they needed as much information as possible on these teachers to make sure the teachers were improving their practice and to make sure these teachers were held accountable for their performance. Mr. Bookman said: The whole idea of tenure going out the window where it’s like everybody’s on a level playing field. I was happy about, to tell you the truth, because I don’t think anybody should just be guaranteed a job. If you’re horrible, you shouldn’t be guaranteed a job. That’s not how it works in the real world. That’s not how it works from my chair either. You got to perform, and you got to perform every year. As evaluators, or as evaluators and administrators, we’re evaluated every year. Mr. Bookman and the other principals in this category believed their teachers should be evaluated at least annually in order to hold teachers accountable for their performance. Mr. Bookman’s experience conducting teacher evaluations gave him confidence that the data he collected during teacher evaluations would be beneficial for not only him as the principal, but also for the teacher. Ms. Cohen and Ms. Hamilton, the other two principals in this category, also believed more evaluations were typically beneficial for not only teachers, but for themselves as evaluators to make sure teachers were improving. For example, Ms. 
Hamilton described a teacher who was highly effective and in her words, “absolutely a rock star” one year. However, the following year, for a variety of reasons, this teacher’s performance slipped and she was rated minimally effective. Ms. Hamilton said that if this teacher had not been officially evaluated that year (as is the case with highly effective teachers in some districts) this performance would have gone unchecked, ultimately hurting the students in this teacher’s classroom. The second way in which principals with high-levels of experience in high-pressure contexts differed from their peers in other categories was these principals were most likely to give specific directives of how teachers should teach. While their peers in other categories tended to lean towards giving more support, guidance, or suggestions, these principals believed that they should be telling teachers what to do and these teachers should be following their directives. This belief was in part because of their extensive experience and in part because of the high-pressure context of their school. For example, Ms. Hamilton explained how she meets individually with each teacher prior to the beginning of the school year and they co-develop goals that the teacher will work on throughout the year. Although co-developing goals with teachers was not unique to principals with high-experience in high-pressure contexts, Ms. Hamilton was quick to note that although technically these were co-constructed goals, as the principal, she set the goals, monitored the goals, and made sure by the teacher’s evaluation that he or she had made progress towards these goals. When asked what she did if her teachers disagreed, she said “tough”. Ms. Hamilton continued: It would be less difficult if I didn’t take ownership for their growth, but I take ownership for their growth. I know principals that just go in, score it, and they go to sleep at night. 
If I didn’t take ownership for their growth, I would say it’d be a lot easier. I do take ownership for their growth, so I want the best for them and it’s up to me to differentiate it for them. Part of this belief and action of principals in this category could be due to the fact these schools typically employed less experienced teachers and these principals were aware of this fact and wanted to make sure they were giving these teachers specifics about what works in their context. For example, Ms. Cohen said: Most of my staff has less than three years experience. I’m the first school that they’ve ever been to. They came straight out of college and they’re fine with it (the fact that Ms. Cohen gives them specific directives of how to teach). I think the reason why they’re fine with it is because in staff meetings I promote why we’re doing this. I give them a mission, a vision as to why are the test scores important. I do it because I feel like I should. I realize with new teachers or people that are just new to your school, you have to say it at least four or five times before it sinks in because they’ve got so much on their plate. Ms. Hamilton added, “We have a real attrition problem. The front door is like a turnstile with them (teachers) coming and going. My goal at the end of the day is to have them as successful as they can be.” In sum, the beliefs and actions of principals with high-levels of experience in high-pressure contexts differ from the other principals in this study. Specifically, these principals believe their teachers should be evaluated as often as possible in an effort to provide both the principal and teacher with feedback to improve their practice. This belief impacted how these principals thought about using teacher evaluation information. Additionally, these principals believe in giving their teachers specific directives of how to improve their teaching. 
This belief and this action impacted how much time these principals spent in teachers’ classrooms. Although principals could not increase the number of formal evaluations, they could observe teacher instruction more often and direct the goals their teachers created, the feedback they provided teachers, and what they expected from teachers throughout the school year. These characteristics manifested themselves with these principals much more so than with their peers in other categories.

When compared to their peers in all other categories, principals with low-experience in high-pressure environments (1) spent the most time in teachers’ classrooms; and (2) provided more support and guidance (in terms of official and unofficial observation feedback) than their peers in similar high-pressure contexts with high-levels of experience. These findings suggest these principals provide constant and mostly supportive feedback to their teaching staff throughout the school year. This finding holds true for the type of feedback these principals provide teachers during the official teacher evaluation process. For example, Mr. Jarmel said:

I’m more of an instructional leader. I work extremely hard, so the teachers see that and respect that. I’m on the front lines with the teachers. I don’t expect anything from them that I’m not going to show them how to get there. We need highly effective teachers. Making sure that all professional development surrounded by that differentiated instruction, depth of knowledge, and DEI, which is direct, explicit. Those are the key components because I want them to be highly successful. I want them to be highly qualified.

Mr. Jarmel continued that in order for him to provide the most accurate representation of a teacher’s performance, he needed to spend as much time as possible in the classrooms of these teachers. Mr.
Jarmel was quick to point out that all observations of teacher instruction he conducted counted towards a teacher’s evaluation score, even though only two observations of instruction were used for official purposes. However, if Mr. Jarmel noticed a teacher attending professional development opportunities and taking his feedback and implementing it into practice, this would be reflected in his teachers’ final evaluation scores.

Principals with low-experience levels in high-pressure environments also believed it was not fair to rate their teachers based on one or two 45-minute observations. Like their more experienced peers who also work in high-pressure contexts, principals in this category believed that while their teachers should be evaluated at minimum annually, they should also be evaluated multiple times throughout one academic year. Because of this belief, these principals tended to spend as much time as possible in teachers’ classrooms and observing teachers’ instruction. These principals said they considered all of these informal observations, not just the official observations, in their teacher evaluations. For example, Mr. Ramon said:

One of my goals this year was to visit at least four to five classes every day. I let my staff know what would I look for when I came into the classroom for identifying the goals and objectives to key vocabulary in regards to the subject area that was being covered, and that falls under, once again, informal evaluations. I ended up deciding this year to provide that instantaneous feedback that when I do go in to do regular evaluations.

Mr. Ramon explained that this yearlong feedback played a role in how he ultimately evaluated teachers. During one day of teacher observations, Mr. Ramon and I visited four classrooms in my two-hour visit. He said visiting multiple classrooms was very common for him as he hated staying in his office.
One teacher commented, “He is in here all the time,” providing further evidence that Mr. Ramon prioritized spending time in the classrooms of his teachers. He said that if he saw teachers taking this feedback to heart and making improvements, he would include this effort in their final evaluation. At the same time, if he observed a teacher for their official evaluation and this teacher did not perform as well as Mr. Ramon knew this teacher could, he would not rate this teacher solely based on one below-average performance. He would use all he knew about the teacher throughout the year before making a final evaluation rating.

Principals with low-experience in high-pressure environments were also more likely to provide their teachers with guidance, support, and feedback, when compared to their more experienced peers who worked in similar environments. For example, Ms. Robbins explained she believed it was in her best interest to provide her teachers with support and guidance, as many were beginning career teachers and others had many challenges “outside of their control,” which Ms. Robbins described as challenges in the community, such as poverty. Because of this belief, Ms. Robbins thought it best to support her teachers, as opposed to providing them with mandates and directives about what they must do. In her opinion, her teachers were working very hard to ensure the academic success of all students, and giving them mandates, deadlines, or directives would be counterproductive. Instead, Ms. Robbins focused on providing structured, positive feedback, and as long as she felt her staff was making a good faith effort to address this feedback, she was content.

In sum, principals who worked in high-pressure environments with low-levels of experience spent the most time in classrooms of all of the principal categories, and this time spent in classrooms factored into their thinking about teacher evaluation feedback and ultimately how they rated the teachers in their building.
Typically, these constant observations manifested in these three principals providing support and feedback to their staff throughout the school year. Additionally, these principals used these informal observations when calculating a teacher’s final evaluation score. Spending time in teachers’ classrooms is not unique to this category of principals, but these three individuals referenced using all observations throughout the school year in a teacher’s final official evaluation much more frequently than their peers in other categories.

While principals in high-pressure environments said they took a more active and directive role in the classrooms of their teachers, principals in low-pressure environments with high-levels of experience were more likely to provide suggestions, ideas, and support that teachers could consider to improve their practice. This behavior manifested itself during principal/teacher conversations, as well as in the feedback principals provided teachers. For example, Ms. Goldstein said:

I feel like, to be an educational leader, you’ve got to be on the front line with your people. You’ve got to be in the room. You’ve got to be hearing about what’s working, what’s not working, watching the behaviors of students that are making it impossible to teach. You’ve got to be there. You’ve got to be supportive in the sense that "I’m there with you, not just sitting behind my desk in an office that’s 100 yards from you." That’s not effective.

Ms. Goldstein went on to say that when it came to things such as supporting teaching practices that promoted change, innovation, and teachers taking risks, she would never tell her teachers how to teach or not to try something. She explained:

I think everybody should be (allowed to try new things). It helps you understand what you do well and what you need to look at. To the complexity in which we’re doing it, are we pushing teachers away from teaching because of it?
They walk in, one shot (references how teacher evaluations are currently structured in her district), and that’s your deal. If you did good that day, you have a job. If you didn’t, well, sorry about your luck. I don’t understand how we got to where we’re at with evaluations. I don’t know, 30, 40, 50 years ago, in education, of why it got to be what it is today. Something must have happened, and now we’re reactive to that, instead of being proactive with the teachers that are coming in.

Ms. Goldstein believed that evaluations should be a means of support and a way to let teachers know how they are doing and to help them improve their instructional practice. This belief was a common sentiment from principals with high-levels of experience in low-pressure environments. Principals in this category believed their teachers should be evaluated, but that the emphasis of the evaluations should be changed from punitive to supportive.

Dr. Wexler also stressed that she focused on using evaluations as a way to provide support and suggestions to the teachers in her building. She said, “Well, we look at things like supporting change and innovation, how we communicate as a team, how we communicate with each other, how we communicate with our students. It’s how do they influence students and others in collaborative ways.” Dr. Wexler explained that evaluations were more than a rating system of her teachers. She did believe her teachers should be evaluated and held accountable for their performance, but she believed much more strongly that evaluations should be used as a means of support and feedback for her teachers.

Additionally, principals with high-levels of experience in low-pressure environments believed their teachers were evaluated too often and that the total number of teacher evaluations should be reduced for most teachers. For example, Ms.
Goldstein said:

If you’re doing your job, and you’re moving kids, and your classroom is just ticking along, and I have evidence to support the fact that you’re innovative, and you’re matching our district philosophy, why do I need to evaluate you every year? Why? We’re trying to not punish the best of the best by making them do enormous amounts of tedious paperwork and give them a year off every year. What I’m hearing from them, they don’t like that, because then in the year they’re off, things change, and so the next year when they are evaluated, they’re like, "Oh my gosh. There were so many changes. I don’t even know what I’m doing now." As helpful as it was supposed to be, it’s not becoming helpful. Where’s the common ground there?

Ms. Goldstein thought that unless a teacher was struggling or perhaps new to the district, she should not have to evaluate all teachers in her building. In her opinion, the official evaluation process was a waste of time and resources, as she could better spend her time communicating with her staff in other ways, such as informal walkthroughs and shorter, simpler, point-blank conversations. In short, the principals with high-levels of experience who worked in low-pressure environments were most likely to believe evaluations should provide support for teachers as well as to believe that annual formal evaluations were not necessary.

Finally, when compared to their peers in other categories, principals with low-experience in low-pressure environments were likely to co-construct the evaluation process with their teachers and were the most likely to provide teachers the benefit of the doubt and negotiate with teachers during the evaluation process (when providing teachers their final evaluation rating). For example, Ms. Steinman said:

I want to make sure that I’m open and fair and consistent. I release all my walkthroughs. My teachers come and talk to me about them. "What can I do?" or "Why did you do this?" or "Did you notice this?"
Sometimes I haven’t. I really just try to be fair and open with them so that they know where they’re at, there’s no surprises. It’s not going to be at the end of the year, they’re like, "Oh, my goodness! I didn’t even know that you didn’t think I was an effective teacher." Ultimately, when I go to complete the evaluation, all of them will come up. I still get to pick the final score.

Ms. Steinman went on to explain that she and her teaching staff had detailed conversations throughout the school year about their projected teaching effectiveness and their projected final evaluation rating. She said she also allowed teachers to voice their concerns about how she ultimately rated these teachers. Ms. Steinman was quick to point out that because she was new in her position she tended to want teachers to get the benefit of the doubt in an effort to secure positive and trusting relationships. Ms. Steinman believed building these relationships would be beneficial in the long run, as she would be able to have more difficult conversations with her staff. However, Ms. Steinman did note that her lack of experience as an administrator and her recent experience as a teacher did lend to negotiable teacher evaluation ratings and to her defaulting to higher teacher evaluation scores with her staff.

Ms. Chang also said she worked very hard to co-construct meaning around her school’s teacher evaluation system and ultimately her teachers’ evaluation ratings. Ms. Chang said:

I leave some blanks (in the final evaluation rubric). For certain teachers, I can’t see everything in the evaluation and I ask for them to bring some evidence. Then I fill out as many boxes as I can from the information that I’ve gathered in those walkthroughs and observations. I’ll leave some blank spots of things that we want to talk about, or if they have evidence to bring or show me, then I want to mark those boxes efficiently.

Ms.
Chang went on to say that in her mind she was not negotiating scores, but allowing her teachers to provide evidence and state their case as to why they should receive a higher evaluation score. Ms. Chang said she was fine with this approach to evaluations because as a former teacher she knew an evaluator could not capture everything that was happening in a classroom. Therefore, if she didn’t see something she would not mark this teacher’s evaluation score down. Instead she allowed these teachers to present their case for why they should receive a higher score.

The other principal in this category, Mr. Sherman, talked extensively about how he let teachers “argue” for a higher score. Although letting teachers dispute their final evaluation rating did not mean Mr. Sherman changed a teacher’s score just because they argued, he encouraged his teachers to have these conversations and fight for themselves. If they made a compelling case, he would change the score, because in his mind, the teacher knew best what they did every day in the classroom. Just because Mr. Sherman might not have seen something, as long as the teacher could point to some evidence, he was fine giving teachers the benefit of the doubt and ultimately a higher rating.

In short, the three principals in this category were more likely than their peers in other categories to negotiate teacher evaluation ratings with their staff. These conversations were often ongoing throughout the school year and teachers had a chance to influence their final evaluation rating based on these conversations outside of their classrooms.

Chapter Summary

The results of this chapter suggest external context influences how principals think about implementing their teacher evaluation system in several ways. First, principals who work in high-pressure environments believed they had an added pressure to differentiate among teacher evaluation ratings.
This perceived (and real) pressure typically caused principals in these contexts to rate teachers more critically than principals in low-pressure environments. Second, principals in high-pressure environments did not believe their evaluation system accounted for all that their teachers were doing and the challenges the teachers in their context were facing. This belief resulted in these principals looking for ways to include some of these things, even if the policy did not call for it.

The second section of this chapter highlights differences between different categories of principals. In sum, principals with high-levels of experience in high-pressure environments believe their teachers should be evaluated more often and believe it is their job to provide specific directives to teachers in regard to how to improve their teaching practice. Principals with low-experience in high-pressure environments believed it was their job to provide their teachers with support and guidance. Additionally, these principals spent the most time in teachers’ classrooms compared to principals in all other categories. Principals with high-levels of experience in low-pressure environments believe their teachers should be evaluated less often and believe in providing suggestions and support to teachers. Finally, principals with low-experience in low-pressure environments were most likely to co-construct how their teacher evaluations looked in practice with the teachers in their building and were most likely to give teachers the benefit of the doubt when assigning teacher evaluation ratings.

These findings suggest that how teachers are evaluated varies based on a combination of the pressure faced by, and the experience of, the evaluator. These findings have several implications for practice, including that teachers who work in different contexts may receive different teacher evaluation ratings solely based on their current work environment.
A complete discussion of the implications of these findings is found in chapter seven.

Chapter 7: Discussion, Implications, and Conclusions

Today’s education policy conversation includes an increasing amount of scholarship dedicated to principals’ evaluation of teachers (see for example, Donaldson & Papay, 2014; Goldring et al., 2015; Rigby, 2015; Steinberg & Donaldson, 2016). This dissertation complements and extends this growing body of literature by providing nuanced evidence of how principals’ cognitive schemas impact their implementation of teacher evaluation policy and teacher evaluation systems. Analyzed through the lens of cognition and specifically sensemaking theory, the results of this work indicate that principal experience as well as external context impact how principals think about implementing teacher evaluation policies and systems and ultimately how these policies and systems play out in practice. Specifically, this dissertation fills an important theoretical gap in the literature by suggesting that principals with high-levels of experience engage in individual sensemaking when implementing teacher evaluation policies and systems, while principals with low-levels of experience engage in collective sensemaking when implementing these same policies and systems. Additionally, this dissertation fills a gap in the empirical teacher evaluation literature by providing insights as to how principal experience and external context influence teacher evaluation policy and system interpretation and implementation. In this chapter I situate my findings in the broader teacher evaluation policy scholarship landscape. I then discuss the implications of the findings of this work and provide concluding remarks.
The Goals of Teacher Evaluation Policy

As research continues to show a high correlation between teacher quality and positive student outcomes, such as achievement, attendance, and graduation (Aaronson, Barrow, & Sander, 2007; Chetty, Friedman, & Rockoff, 2014; Rockoff, 2004), ensuring all students have access to high-quality teachers is of critical importance. The pace at which teacher evaluation policies and systems are changing is one indication that governments (nationally and locally), researchers, and practitioners believe carefully and thoughtfully constructed teacher evaluation policies have the potential to realize the goal of high-quality teachers for all students, by identifying high-quality teachers and by providing better information on what makes a quality teacher.

Although teacher evaluation policies and systems have changed dramatically in recent years, the goals and purposes of teacher evaluations have changed very little. Early research suggested teacher evaluations were meant to serve the general purposes of teacher improvement or accountability at either the individual level or the organizational level (Wise et al., 1985). Thirty years later, the two schools of thought regarding the purposes and goals of teacher evaluation remain the same. One is as a means of support and improvement for teachers (Kraft & Gilmour, 2015) and the other as a means of accountability in terms of rating teachers and dismissing ineffective teachers (Hanushek & Rivkin, 2010). Steinberg and Donaldson (2016) put it best when they write:

Most new teacher evaluation systems incorporate measures of student achievement and observations of classroom instruction to assess teacher performance (NCTQ 2013; Hallgren, James-Burdumy, and Perez-Johnson 2014). The espoused goal of these new evaluation systems is to more closely tie the work of teachers to improvements in student learning (Darling-Hammond, Wise, and Pease 1983; Murphy, Hallinger, and Heck 2013).
There are two approaches to satisfying the system’s fundamental goal of improvement in student outcomes: (1) developing teachers’ skills to improve student performance, and (2) evaluating teacher effectiveness for accountability purposes related to tenure, rewards, and dismissal (p. 341).

However, despite these seemingly clear purposes and goals of teacher evaluations, one question situated in the teacher evaluation discussion is whether teacher evaluation policies and systems, as currently constructed, can achieve these aforementioned goals. Put differently, can teacher evaluation serve the dual purpose of providing useful feedback for teachers to help them improve their practice, while holding them accountable for their performance? And can teacher evaluation policies and systems provide policymakers and practitioners better information on what makes a quality teacher? The findings from this dissertation suggest teacher evaluation policies and systems, as currently constructed, are not well-suited to accomplish these goals, in part because principal cognition and external context greatly impact how principals generate teacher evaluation information.

The main theoretical contribution of this dissertation is that the type of sensemaking in which a principal engages is dependent upon the experience level of that principal. Specifically, principals with high-levels of experience engage in individual sensemaking (a type of sensemaking that occurs in one’s head and relies on personal experiences and knowledge to make sense of a situation or task) while principals with low-levels of experience engage in collective sensemaking (a type of sensemaking that occurs among multiple people in an organization or an environment). The results of this analysis suggest principals who engage in individual sensemaking make sense of evaluating teachers differently than principals who engage in collective sensemaking.
For example, principals who engage in individual sensemaking rely primarily on their own definitions of good classroom instruction, whereas principals who engage in collective sensemaking co-construct what good classroom instruction looks like with their informal networks within their school. These findings suggest that teachers will receive evaluations that look quite different simply based on the experience level of the principal performing the evaluation. Ultimately, these findings suggest the information generated by teacher evaluations will vary by the experience level of the principal, which might make it difficult for policymakers to decipher what information truly shows quality teaching.

Distinguishing between principals who engage in individual versus collective sensemaking is one way this dissertation moves past the assumption that “sensemaking happens”. This dissertation suggests sensemaking happens differently among principals with different experience levels, which impacts teacher evaluation policy and system implementation. In short, principals with high-levels of experience evaluate teachers differently than their less experienced peers, which brings into question the consistency of the teacher evaluation information generated by school principals. However, this finding provides potentially significant information to policymakers. For example, if high-experience principals consistently generate high-quality teacher evaluation information, policymakers may be able to design a teacher evaluation system that uses principals with high-experience to conduct all teacher evaluations in a district or state and remove low-experience principals from this process.

One of the practical contributions of this dissertation is that, among my participants, principals with low-levels of experience navigate the process of teacher evaluations with different mindsets and priorities than their more experienced peers.
For example, principals with low-levels of experience typically find it difficult to critique teacher performance. Instead these principals prioritize cultivating positive relationships with their staff. Principals with low-levels of experience use teacher evaluation systems to achieve the goal of providing teachers feedback and support to help teachers improve their practice, but in most cases they avoid using these systems to hold teachers accountable for their performance (except in extreme cases). Therefore, the information generated by these principals has the potential to be quite different than the information generated by their more veteran peers, and looking at teacher evaluation ratings across principals with varying experience levels might make it difficult for policymakers, researchers, and practitioners to determine the accuracy of this information.

The different backgrounds, knowledge, experiences, and contexts of principal evaluators raise questions about the capability of current teacher evaluation systems to accomplish the goals of providing teachers information to help improve their classroom performance and holding them accountable for their performance as educators, while also providing policymakers better information on what makes a quality teacher. For example, this study’s findings build on previous research which suggests principals in high-pressure environments are more likely to use teacher evaluation policies as a way to rank teachers and as a tool to determine who is effective and who is ineffective, while principals in low-pressure environments use the same policies as improvement tools for their teaching staffs (Chingos & West, p. 428; Fuller & Ladd, 2007). Although principals may be working with the best of intentions to both accurately critique teacher performance and provide teachers with actionable feedback to help them improve their practice, principals’ cognition often prioritizes one of these goals over the other.
For example, Mr. Jarmel provided his teachers with feedback to improve their practice, but in his mind, his school’s teacher evaluation system was a way to show which of his teachers did not reflect effective or highly effective teachers. Because the principals in this study worked in very different contexts in terms of the amount of outside pressure they felt from the state of Michigan, as well as from their district-level superiors (and even from the teachers in their building), how these principals evaluated their teachers looked quite different from school to school. The amount of outside pressure facing schools and principals is one reason why relying on principals to generate better information on teacher quality is a challenge.

Another challenge of realizing the goals commonly associated with current teacher evaluation policies is that there is no, or a very limited, consensus on what makes a teacher effective (Donaldson & Papay, 2014; Fenstermacher & Richardson, 2005). Supporting this research, the principals in this study had varying definitions of teacher effectiveness. For example, Mr. Jarmel believed a teacher’s effectiveness was measured by high student achievement on state assessments. In contrast, Mr. Bania did not consider state assessments at all when evaluating teacher performance and instead relied on what he saw in the classroom during observations of teacher instruction. Still another principal, Mr. Bookman, defined teacher effectiveness largely based on teacher-student interactions and the relationships teachers built with students and parents. What an effective teacher looks like and means to one principal may differ from others, even within the same school district. Therefore, how principals evaluate teachers and what they prioritize during evaluations will look quite different.
Because of this lack of a consistent definition of teacher quality, it is often difficult for teacher evaluation policies and systems to provide quality information on what characteristics make a quality teacher. The implication of a lack of consensus on a definition of teacher quality is that teachers receive vastly different evaluation scores, based on the cognition of the evaluator. This is not inherently bad, but given the enormous stakes attached to evaluation scores in terms of teacher employment, this may appear unfair to individual teachers and may lead to unintended consequences, such as teachers leaving schools where principals do not score them favorably. In short, the results of this work suggest teacher evaluation policies and systems, as currently constructed, will likely continue to fall short in providing policymakers, researchers, and practitioners better information on what makes a quality teacher. Instead, this research suggests the information from evaluations will provide these individuals information on what principals with specific amounts of experience and who worked in specific contexts think constitutes quality teaching. I address this further in the following implications section.

Principals’ Role in Teacher Evaluations

Given that most policymakers, practitioners, and researchers agree on the goals and purposes of teacher evaluations, as difficult as these goals may be to accomplish, the next logical question is, is it reasonable for principals to be the primary people charged with achieving these goals? In almost all cases, school principals are the primary school-based actors tasked with enacting teacher evaluation systems and assigning these important teacher evaluation ratings (this was the case for the 12 participants in this study) (Steinberg & Donaldson, 2016). As a result, how these individuals make sense of implementing these policies will affect how teacher evaluations look in practice.
Additionally, school principals’ sensemaking will affect the data produced from these evaluations. The principals in this study were tasked with making sense of external demands while balancing the needs of their specific school and the teachers within the school (Ganon-Shilon & Schechter, 2016). Moreover, the principals in this study “make key decisions that determine which reform demands they bring in, which demands they emphasize with the staff, and which they filter out” (Ganon-Shilon & Schechter, 2016, p. 7), a finding that supports other research that examines how principals make sense of policies that enter their systems of practice.

Given the widespread research that shows teacher quality is a significant factor that leads to the aforementioned desirable student outcomes and other, longer-term positive outcomes, such as increased labor market opportunities and increased employment wages (Hanushek, 2010; Rockoff, 2004), the role of principals in identifying quality teachers and helping teachers improve their craft as educators is arguably a principal’s most important responsibility. Although principals certainly can impact student outcomes in a variety of ways (e.g. fostering a positive working environment, establishing strong communication within a building, supporting parental and community engagement, etc.), one of the most direct ways principals can positively impact student school experiences and outcomes is by identifying, hiring, and retaining high-quality teachers (Boyd et al., 2011; Harris, 2010; Ladd, 2011; Leithwood et al., 2008).

However, principals are responsible for a host of other things outside of the realm of teacher evaluations. For example, principals must manage their building, serve as the instructional leader of the school, and communicate with district administration, parents/guardians, the local community, and various other stakeholders.
Additionally, principals are expected to take on the dual role of coach and evaluator, providing support to their teaching staff, while being competent evaluators of classroom instruction and student learning, in a multitude of subject areas and grade levels. Principals also must manage the school budget and bus schedules, design and deliver professional development for staff, and deal with issues of student discipline, absences, and safety. Given all that is asked of school principals, is it reasonable for policymakers to expect principals to be able to accomplish the goals associated with teacher evaluations in addition to their myriad of other tasks? Since the RTTT initiative in 2009, changes to teacher evaluation systems have occurred at unprecedented rates. Donaldson and Papay (2014) note, “teacher evaluation is a prime policy lever as a conduit to combine accountability and support,” but are principals the individuals best suited to accomplish these goals? Principals are charged with understanding these changes and policymakers rely on principals for successful implementation of these policies. New teacher evaluation systems task principals with rating teachers accurately and differentiating amongst teacher effectiveness, while also supporting teacher instructional improvement. Teacher evaluation systems are time-consuming to implement and this implementation must be done delicately given the enormous stakes attached to current teacher evaluation policies. Put another way, is it reasonable to expect principals are able to fairly critique teacher performance and at the same time provide teachers the support and feedback to help them improve their craft and provide districts, states, and policymakers more accurate information on teacher quality and effectiveness? If the answer to this question is yes, principals must receive increased and more targeted training and professional development when using these complex systems.
For example, increasing evidence suggests ongoing conferences between principals and teachers are crucial to the overall evaluation process because these conferences provide opportunities for teachers to improve their practice and ultimately student achievement (Steinberg & Donaldson, 2016; Steinberg & Sartain, 2015; Taylor & Tyler, 2012). Therefore, principals should receive constant support as to how to structure these conferences, what to include during conversations in these conferences, and how to deliver useful feedback to their teachers. As is suggested in this study, these conferences varied drastically from school to school and in some cases did not occur at all. Principals will likely continue to play an active role in negotiating federal, state, and local policies and initiatives (Ganon-Shilon & Schechter, 2016; Koyama, 2014), but if policymakers could ensure at minimum the essential parts of teacher evaluation policies, in this case conversations around instruction, were consistent between schools, some of this lack of continuity may be abated. Alternatively, as principals continue to be held increasingly accountable for student performance and the performance of their school, giving principals greater discretion over how they evaluate, hire, and work with their staff is something policymakers and district leaders should consider. If principals are the primary people charged with successfully running a school, they potentially should have a larger say in how these people are evaluated and if the teachers are valuable assets to their school. Giving principals greater autonomy over how teachers are evaluated may best support the needs of local schools. Some research shows that principals, at least in part, are able to make strong evaluative and human capital decisions if given the right information (Jacob & Lefgren, 2008; Rockoff & Speroni, 2011).
Additionally, this work suggests that principals’ human capital decisions often correlate with other positive results, such as increased parental and student satisfaction (Jacob & Lefgren, 2005; Rockoff & Speroni, 2011). Therefore, if the goal of teacher evaluation policies is to provide better information on what makes a quality teacher and to provide all students with the best teachers, allowing principals the opportunity to decide what works best for their local context may be a unique approach to teacher evaluations, especially considering in many cases it appears principals are evaluating teachers in various ways already. If principals are not the best people suited to evaluate teachers, then policymakers should consider the use of outside evaluators. Oftentimes, principals already have their minds made up about how they will evaluate a teacher, even before the process begins. Weick (1995) calls this a “decision premise” where an individual, early on in the process of making a judgement, assigns values, beliefs, and meanings to what he or she will be judging (p. 115). In that way, when it comes to the final judgement, these individuals will be able to make sense of what they are seeing. Evaluators having a predetermined mindset about who or what they will evaluate is concerning as Weick (1995) writes, “As facts give way to values, computation gives way to judgement, and sensation is displaced by ideology, all without the member necessarily being any wiser to these shifts” (p. 115). One of the main findings about all principals in this study, but particularly those with low-levels of experience, is that it is difficult for these individuals to separate all that they know about teachers from the official teacher evaluation. The principals in this study constantly referenced the idea that a teacher evaluation was a snapshot in time and did not encapsulate all that teachers did during the school year.
Additionally, principals noted that if they knew that an observation of a teacher, or a teacher’s final student assessment data, did not reflect what the principal believed was the teacher’s true impact on student learning, their teacher evaluation system had wiggle room to evaluate them accordingly, which typically meant rating teachers more favorably. These findings lend some credence to the research that suggests using multiple observers, or observers who know little to nothing about the teachers they are evaluating, may provide more reliable assessments of teacher instructional ability (Kane et al., 2013; Donaldson & Papay, 2014). Research suggests because principals have intimate relationships with the teachers they are charged with evaluating, it is virtually impossible for principals to evaluate teachers objectively. Using outside observers or observing teachers using multiple administrators, possibly the school principal and another individual from the district office, has the potential to alleviate this concern. The principals in this study certainly referenced the relational aspect of evaluating teachers as a challenge and therefore considering the use of evaluators that do not have close relationships with teachers is something policymakers and district leaders can and should consider when designing future teacher evaluation policies. The principals in this study used their own thinking and beliefs to evaluate teachers and based their justification for this choice on the high-stakes nature of teacher evaluation policies. When analyzed through the lens of sensemaking theory, these findings suggest that the ways a principal values or perceives the purposes of teacher evaluations, and the relationships he or she has with staff, shape how he or she interprets and ultimately implements teacher evaluation policies.
One example of principals using their own thinking and beliefs to evaluate teachers relates to teacher evaluation scores being used for human capital decisions. Because some of the principals in this study knew their school district was using teacher evaluations for human capital decisions, some principals were less likely to rate teachers critically. In other words, the principals in this study interpreted and implemented teacher evaluations while always thinking of the future employment of their teaching staff. Recent work from Grissom and Loeb (2016) produced similar findings in which principals were more likely to rate teachers higher on high-stakes evaluations than on low-stakes evaluations. In short, given all that is expected of school principals, I argue that if policymakers want better information on what makes a quality teacher, future teacher evaluation policies should allow principals much greater input and freedom when evaluating the teachers in their building or remove principals from the evaluation process. The suggestion to remove principals from the evaluation process entirely and use outside evaluators is not without limitations. For example, outside evaluators will bring their own cognition to the evaluation process. These individuals will have set expectations and beliefs on what makes a quality teacher and will have a predetermined mindset on what quality teaching and instruction looks like. However, the use of outside evaluators does eliminate the relational aspect of teacher evaluations, which has consistently surfaced as a concern or factor for principals while evaluating the teachers in their building. The first suggestion (and my personal preference) gives principals more say in who teaches in their building and gives principals the power to decide what makes a quality teacher for their specific context.
Giving principals greater autonomy regarding how they evaluate teachers is not without flaws, as surely some teachers and policymakers may object to evaluations that are outright subjective and may look different from teacher to teacher. However, if we operate under the assumption that all principals want what is best for their school and students, allowing principals greater discretion on what constitutes a quality teacher in their specific context may help schools cultivate stronger teaching staffs. Therefore, I suggest principals should have greater professional judgement and say as to how teachers are evaluated in their local context.

Implications: For Policymakers

The results of this dissertation’s analysis have implications for both policymakers and practitioners. First, policymakers indicate that a primary reason teacher evaluation policies continue to change is the need to design policies that provide better information on what makes a quality teacher, as well as hold teachers accountable for their performance in the classroom. However, an analysis of the findings of this dissertation suggests principal cognition greatly impacts the consistency and transferability of this information, putting into question how useful this information is for policymakers. This is not to say that the information collected by principals during evaluations is not valuable. Observation information collected by principals may in fact be very valuable, particularly when evaluating a teacher at a specific school in a certain context. However, this type of data collection makes it difficult to make between-school teacher comparisons, even between schools in the same district. For example, some principals in this work noted that their teacher evaluation systems had room to adjust final scores if they felt such adjustments were called for (e.g.
if a principal felt the outcomes of teacher observations and/or students’ final assessment data did not reflect teachers’ true impact on student learning). If this approach is used consistently by one principal for all of the teachers in a school, this information may be useful when attempting to determine what type of teacher is most effective in that specific context. However, if some principals do this within a district and others do not, policymakers cannot rely on this information to decide which teachers are in fact most effective. Other principals in this study considered factors outside of, but related to, teacher evaluation policies (e.g. future teacher employment) when evaluating teachers. For example, during an official observation, Mr. Bookman and I observed a lesson by a teacher that Mr. Bookman told me did not reflect the effectiveness of this teacher. Mr. Bookman said because this lesson was just one bad 45-minute snapshot, he would not penalize this teacher, even though this observation was technically the official observation used for evaluation purposes. Mr. Bookman ultimately rated this teacher highly effective, even though he admitted the lesson we observed rated more as minimally effective. The implication is that the information provided by Mr. Bookman on this teacher’s performance was not an accurate depiction of what we observed. The observational assessment provided by Mr. Bookman may in fact be a fair representation of this teacher’s quality (Mr. Bookman has observed this teacher throughout the school year and said these observations were very high-quality), but if the observation had been conducted by someone who did not know this teacher so intimately, the evaluation of this teacher would have looked quite different. The implication here for policymakers is that the information generated by teacher evaluation policies is largely dependent upon who does the evaluating.
Individuals may suggest using outside observers or multiple observers who do not know the teachers as intimately has the potential to alleviate the concern of subjective evaluations of teacher performance. However, the use of outside evaluators does not remove the ethical question surrounding the ways teacher evaluations will be used, which may be a concern for any evaluator. For example, if an outside evaluator knows the results of a teacher’s evaluation will be used for employment decisions by the district and this does not align with the evaluator’s personal beliefs, outside evaluators may still rate teachers favorably. Additionally, it is important to note that outside evaluators, like principals, will bring in their own cognition when evaluating teachers, which has the potential to cause similar disruptions to policy implementation efforts. Therefore, I believe if policymakers truly want accurate information on what makes a quality teacher, policies should be designed that allow for increased professional judgement of individual principals. This approach has the potential to produce more useful information on what makes a quality teacher in certain specific contexts. For example, principals who work in high-pressure environments may evaluate teachers in specific ways, by looking for specific characteristics and teaching skills. Policymakers will be able to use the information generated by principals who work in these contexts to better predict the type of teacher that will be successful (and remain teaching) in these high-pressure environments. In short, I argue principals should have more professional judgement when assigning teacher evaluation ratings, particularly given how intimately principals know their own school, teachers, and students.
Research (including this dissertation) suggests principals do this already anyway and perhaps if policymakers and district leaders provided an opportunity for principals to evaluate teachers based on what principals believe is in the best interest of their local school, there might be a better match between teachers and schools, ultimately increasing the quality and length of tenure of teachers in schools. The implication here is that policymakers must understand that principals with different levels of experience and who work in different contexts need different types of teachers in their building and what constitutes an “effective” teacher in one context may not constitute an “effective” teacher in another context.

Implications: For Practitioners

Like policymakers, practitioners (e.g. school and district leaders) hope new teacher evaluation systems will provide better information on what makes a quality teacher, as well as hold teachers accountable for their classroom performance. Additionally, practitioners hope these new policies and systems create opportunities for principals to provide support and feedback to teachers to help teachers improve their classroom practice, resulting in increased student achievement. Given these stated goals, districts and school systems would be well-served to provide principals more structured and intensive training with how to best use these new evaluation systems. The initial training of new teacher evaluation systems is a crucial element to how principals come to understand and implement these systems. While some states and districts have increased the amount and quality of training principals receive on how to implement evolving teacher evaluation systems, the training of the principals in this study varied drastically and in some cases the principals in this work received no training. For example, Ms.
Steinman said:

This year I really—I had one day of training as PD that was actually where they were training teachers, not administrators. I went to the teacher training, so at least I got a little bit of feel for that. The rest of my training really has been on the go, reading by myself, researching online, and then working with my administrative team for consistency. It’s been a limited training.

The training received by the principals in this study varied in length and quality, suggesting principals everywhere do not always receive adequate teacher evaluation training and support. Principals nationwide would likely benefit from more in-depth and detailed initial and ongoing teacher evaluation training when adopting new teacher evaluation systems. If more consistent and ongoing training is provided to principals these individuals may be more likely to use these systems in the ways envisioned by policymakers and district leaders. Policies may still be adapted to local contexts, but strengthening the initial training and support for principals has the potential to provide a more aligned vision between policymakers’ intentions and practitioners’ implementation efforts. Additionally, this training may help principals feel more confident in the accuracy and fairness of these systems. An important idea of these new evaluation systems was to improve student learning by identifying teachers with strong instructional practices and providing constructive feedback in areas where teachers needed improvement. However, the principals in this study often felt their teacher evaluation policy and system was not a good tool for evaluating teacher performance. Districts and the state would be well-served to provide training and explicit rationale to all principals using these new systems about how these systems will help teachers improve their practice and ultimately benefit the students in their school.
In addition to providing principals with initial support and training on how to best use teacher evaluation systems, districts should provide principals feedback on how they are evaluating teachers, citing specifics about their evaluative process, not that they are just in compliance and completing the required paperwork. As principals become more comfortable implementing these new systems and receive constructive feedback, perhaps they will be more willing to critique teachers’ practice and provide a more accurate picture of which teachers are most effective. This feedback should include how principals are observing teachers, how principals communicate and deliver teacher evaluation information to teachers, and the overall process of evaluating teachers. Providing principals with increased and ongoing feedback on how they are implementing teacher evaluation policies and systems has the potential to move principals towards uniformity across districts and states and reduce the subjectivity of teacher evaluations. Currently, in many districts principals are forced to make sense of how to best use teacher evaluation systems on the fly and with little support and rarely receive feedback on their performance as an evaluator (beyond, “you completed the evaluation”). Additionally, school districts and states would be well-served to consult principals when creating and implementing new teacher evaluation systems. Principals are perhaps the best school-based actors who can most accurately speak to what should be included in a teacher’s evaluation and how to best navigate this process. Principals can work together to create meaningful teacher evaluation systems that still can allow for some professional judgement of principals on teacher performance.
Principal involvement in creating teacher evaluation policies has the potential to alleviate some of the concerns of lack of policy implementation as principals will have greater buy-in as they are in large part responsible for designing these policies. Finally, aspiring school leaders need to be trained in how to identify teacher quality, use data to make important decisions, and evaluate teacher performance, which begins in their principal preparation program. Principal preparation program directors should focus much of their attention on principal evaluation of teachers, as this is arguably one of the most, if not the most, important aspects of a principal’s job. Providing current and future school principals with a clear understanding of teacher quality (at least, as much as practitioners, scholars, and policymakers know about what makes a quality teacher), or at least the potential to identify quality teachers, has the potential to lead to a collection of better information on teacher quality at the school level. If school districts decide school principals are not the best people suited to effectively implement teacher evaluation policies and evaluate teachers, districts should consider the use of outside district evaluators, which is happening in some districts throughout the country (Kane et al., 2013). Using outside observers or multiple observers has the potential to remove the relational aspect of performance evaluations, which remains a concern when trying to cultivate a fair and objective teacher evaluation system. However, individuals charged with evaluating teachers will still bring with them their personal cognition, experiences, beliefs, and lens when evaluating teacher performance. Although some may argue the district can better control training, implementation, and feedback if they use this approach, I would argue in practice the use of outside evaluators will still be largely subjective based on the background and knowledge of the evaluator.
Therefore, practitioners should proceed with caution before investing heavily in the use of outside evaluators. A more cautious approach is for researchers to conduct more randomized control trials to compare the reliability of principals’ ratings of teachers with that of outside evaluators’ ratings of teachers.

Limitations and Future Research

There are two main limitations to this dissertation. First, the 12 principals who participated in this work influence the findings. If another 12 principals participated in this study, the results may look different. The small number of participants, who were not selected randomly, does not allow me to make generalizable statements about all principals with similar characteristics. However, the goal of this work was to begin to hypothesize about how principals with certain characteristics think about and enact teacher evaluation policies. Therefore, these findings begin the process of building information that can test this hypothesis. The second limitation is, although principals were observed in their natural environment implementing their teacher evaluation policy and system, I did not observe each principal multiple times, with a variety of teachers, or during every interaction the principal had attempting to implement the policy. In this way, my presence as a researcher may not have captured exactly how principals were conducting teacher evaluations in all circumstances. However, to account for this limitation I spoke with teachers when available to see if what I observed was an accurate or consistent representation of how these principals navigated the process of teacher evaluations. Additionally, I reviewed documents completed when I was not present, to compare and contrast what principals did while I was present and while I was absent.
The findings from this research answered my three research questions and future research can again examine these research questions by conducting the same type of study with different principals who have the same characteristics. This approach is one that I will take when embarking on future research. Additionally, some of the findings of this work can best be researched and tested quantitatively. For example, in Michigan data on how specific principals rate teachers are available. Therefore, future work could test whether principals in high-pressure environments (as defined by this study) do in fact distribute their teacher evaluation ratings more so than their peers who work in low-pressure environments. Additionally, researchers can quantitatively examine if principals with low-levels of experience do in fact rate teachers less critically than their more experienced peers. Examining the results of this work quantitatively will help either support these findings, or disconfirm some of this work, either way testing these hypotheses. Finally, the principals in this study used a variety of teacher evaluation systems. Future work could examine if individuals using certain teacher evaluation systems, such as the Charlotte Danielson Framework for Effective Teaching, are more likely to produce consistent and reliable ratings of teachers. I hypothesize that individual principal cognition will impact any subjective evaluative system, but future work geared towards supporting this hypothesis would provide much needed information to policymakers and practitioners alike.

Conclusions

The goal of this study was to inform both practitioners (e.g. school district leaders) and policymakers on the importance of understanding how individuals with certain characteristics implement teacher evaluation policies.
Practitioners need to know how individuals with certain characteristics make sense of evaluating teachers because if they better understand the individuals who are primarily charged with implementing new policies (in this case, school principals), they can directly address these variations and challenges by providing specific professional development, creating benchmarks and check-ins with principals throughout the implementation process, and by holding these individuals accountable for their performance as evaluators. Policymakers need to know how individuals with certain characteristics make sense of evaluating teachers because as a policy is designed and develops, understanding how the people with whom this policy interacts make sense of the policy will help policymakers address some of this variation of sensemaking while drafting future legislation. Put simply, policymakers will be better able to anticipate what challenges may occur when practitioners attempt to implement future policies. The results of this study, coupled with other emerging work (Donaldson & Papay, 2014; Goldring et al., 2015; Grissom & Loeb, 2016), may be one explanation as to why teacher evaluation policy implementation remains a challenge in Michigan and beyond. This study and other research show principals’ cognitive schemas will likely always have some impact on how policies look in practice (even if there is a consensus on the definition of teacher effectiveness and even if all principals receive the same training). The principals in this study were greatly influenced by their experiences and external context, resulting in a wide variation of teacher evaluation policy implementation. It is interesting to note that, since the reform of teacher tenure laws in 2011, of the 96,000 K-12 teachers in Michigan, only 19 have been dismissed due to poor evaluation scores (Michigan Department of Education, 2016).
Additionally, K-12 teachers in Michigan continue to be rated overwhelmingly effective or highly effective; 97% of teachers in the state met this criterion (Michigan Department of Education, 2016). According to the findings of this study, we can likely attribute these high teacher evaluation ratings and lack of dismissals to principals scoring teachers higher than would be expected – not necessarily because all of these teachers are effective or highly effective in the classroom. How principal cognition impacts their implementation of teacher evaluation systems is a double-edged sword. First, principals were able to be nimble and react to local instructional needs, such as tailoring their evaluation systems to focus on teaching attributes that these principals felt were an important measure of teacher effectiveness for their school context. Additionally, the principals were able to use teacher evaluations as a tool for focusing on larger, local priorities. Some research argues policies should be able to be adapted to meet local needs (McLaughlin & Talbert, 1993) and giving local actors more say in how policies look in practice may in fact be a net positive for promoting teacher, student, and school growth. However, the other edge of the sword is that principals in this study did not always address the goals central to the policy aims of teacher evaluation reform. In Michigan and nationally, steps are currently being taken to better standardize the teacher evaluation process, calling for more clarity, accountability, and transparency in teacher evaluation systems (Hill & Grossman, 2013; US Department of Education, 2009). However, the results of this work indicate principals use teacher evaluation systems to work towards local goals and priorities and not necessarily towards the goals envisioned by policymakers.
This finding is potentially worrisome in the sense that it shows how a policy can be co-opted and used for reasons outside of the scope of the design of the policy. Although the principals in this study were acting in good faith and doing what they believed was in the best interest of their school and students, the results indicate there is a mismatch between policymakers’ intentions and practitioners’ implementation. This dissertation contributes to our understanding of how principals make sense of and implement teacher evaluation policies by modeling the relationship between principal cognitive schemas and teacher evaluation policy implementation. Because past research shows there is considerable variability between how principals implement teacher evaluation policies (Halverson et al., 2004), a more nuanced understanding of how principals with certain characteristics implement these widely popular policies may help district and education policy leaders better support principals and thus ensure more beneficial implementation. Thus, this dissertation contributes to both theory and practice. Specifically, this dissertation contributes to the sensemaking theory literature by suggesting principals with high-levels of experience engage in individual sensemaking, while principals with low-levels of experience engage in collective sensemaking. As was previously stated, the type of sensemaking in which one engages has implications for how teacher evaluation policies look in practice. For practice, this dissertation provides school districts information on how principals with certain experience characteristics and who work in certain contexts are likely to think about evaluating teachers in their building. Practitioners can use this information to better train principals, as well as provide them support as they navigate the process of teacher evaluation policy implementation and anticipate the challenges of implementation. This study is significant for two reasons.
First, in most cases principals are primarily, and often solely, responsible for conducting and implementing teacher evaluation policies. Understanding why principals think about these high-stakes policies in certain ways and what types of thinking go into the evaluation process is important. Second, there is little evidence that even the best-designed teacher evaluation system will be implemented as intended. Therefore, it will be useful for both school leaders and policymakers to better understand how principals with certain cognitive schemas think about these systems so as to predict how they may play out with certain principals. In this way both policymakers and practitioners will be better able to anticipate challenges of policy implementation and better account for these challenges when designing policies and training principals to use evolving teacher evaluation systems.

This study brings together the bodies of literature on cognitive and sensemaking theory with those on principals and policy implementation, with the goal of generating hypotheses of how principals with certain cognitive schemas are likely to implement teacher evaluation policies. This work builds on and extends the idea that school leader sensemaking is influenced by prior knowledge and that preexisting understandings impact how leaders implement policies, particularly teacher evaluation policies (Coburn, 2005). Additionally, this study may help states outside of Michigan begin to build theory and begin to identify a predictive model of how principals may implement teacher evaluation policies in their context. Many other states are in the process of a teacher evaluation overhaul and stand to learn from the lessons of Michigan. Given the increasing amount of attention, scrutiny, and changing evaluation policies, states across the country will need to better understand what is happening with these policies as they enter their system of practice.
Although it is unlikely everyone will implement the same policy identically in all circumstances, past research shows that, as currently constructed, teacher evaluations do not reliably identify quality teachers (Weisberg et al., 2009), and part of this is because of a lack of fidelity of policy implementation. In the future, states, researchers, and school leaders should work together to provide better training and support to those charged with implementing these important policies, as well as consider allowing increased space for the professional judgement of principals when evaluating teachers in their context.

As new teacher evaluation policies continue to permeate the educational landscape, how these reforms play out in different contexts is of extreme importance and should be studied. Given how important school principals are in policy implementation, it is imperative we better understand their thinking about how and why they implement policies or parts of policies in certain ways. This study shows that outside factors such as experience and context do in fact impact principals’ implementation of teacher evaluation policies. As more sanctions, money, and overall importance are tied to these policies, researchers should continue to focus on what the people who are charged with enacting these policies deem important and how this interpretation and implementation process affects policies, schools, and students.

APPENDICES

Appendix A
Principal Questionnaire

Thank you for taking the time to complete this questionnaire. As with any part of this study, you can withdraw your consent to participate at any time and you do not have to answer any questions that you do not want to answer. Anything you say will not be connected with your name, the name of your school, or the name of your school district in any publications or presentations. Your responses to this questionnaire will be kept in a locked filing cabinet or on a secure computer.
Your identity will be kept using unique ID numbers and will never be released.

Background Information

What is your age range? (Please check one box)
Younger than 30 / 30-40 / 41-50 / 51-60 / Older than 60

What is the highest level of formal education you have completed? (Please check one box)
Bachelor’s Degree / Master’s Degree / Doctoral Degree

How many years of experience do you have working as a principal? (Please circle)
1 2 3 4 5 6 7 8 9 10 More than 10

How many years of experience do you have working as a principal at your current school? (Please circle)
1 2 3 4 5 6 7 8 9 10 More than 10

How many years did you spend as a classroom/subject teacher before you became a principal? (Please check one box)
None / 1-5 / 6-10 / More than 10

Please respond to the following questions by placing an X in the box that most aligns with your feelings/beliefs.
Scale: (1. SD: Strongly Disagree, 2. D: Disagree, 3. SOD: Somewhat Disagree, 4. U: Undecided, 5. SOA: Somewhat Agree, 6. A: Agree, 7. SA: Strongly Agree)

Response columns for each statement: Strongly Disagree | Disagree | Somewhat Disagree | Undecided | Somewhat Agree | Agree | Strongly Agree

My experience as an administrator impacts how I implement my district’s teacher evaluation policy.
My beliefs on the goals of education impact how I implement my district’s teacher evaluation policy.
My leadership style impacts how I implement my district’s teacher evaluation policy.
I am implementing my district’s teacher evaluation policy as envisioned by policymakers.
Other principals in my district are implementing the teacher evaluation policy as envisioned by policymakers.
My district’s teacher evaluation policy is an accurate reflection of teacher quality/effectiveness.
My relationship with my teaching staff impacts how I implement our district’s teacher evaluation policy.
I look at teachers’ previous evaluation data (including observation scores and student test scores) before evaluating a teacher.
I use available resources to support teachers who are struggling.
If I need clarification on an aspect of my district’s teacher evaluation policy I will seek clarification.
I communicate with my teaching staff as my district’s teacher evaluation policy requires.
I provide feedback to my teaching staff as my district’s teacher evaluation policy requires.
It is challenging to implement my district’s teacher evaluation policy.
I use my district’s teacher evaluation policy to make personnel decisions.
I think about the future of my school when implementing my district’s teacher evaluation policy.

Appendix B
Principal Interview Protocol 1

Principal ID:
Date:

Thank you for taking the time to be interviewed. As with any part of this study, you may withdraw your consent to participate at any time and you do not have to answer any questions that you do not want to answer. Anything you say will not be connected with your name, the name of your school, or the name of your school district in any publications or presentations. I will audio-record your responses for my use only. First, I’ll ask questions about your knowledge and beliefs about your district’s teacher evaluation system. Then, I will ask you about your role in implementing your district’s teacher evaluation policy. Finally, I will ask you about your experiences planning for and conducting teacher evaluations. Your responses to this interview will be kept in a locked filing cabinet or on a secure computer. Your identity will be kept using unique ID numbers and will never be released.

STATE PARTICIPANT ID NUMBERS, DATE, NAME OF INTERVIEWER, AND “START INTERVIEW” FOR RECORDING DEVICE

Principals’ Current Teacher Evaluation System
1. What teacher evaluation framework does your school use?
2. How was that framework chosen?
3. What are the strengths of your current teacher evaluation framework?
4. What are the weaknesses of your current teacher evaluation framework?
5. What would you change?
6. How do you conduct a teacher evaluation?
   • What does the process look like from start to finish?
   • How do you prepare for them?
   • How do you conduct them?
   • How do you communicate the evaluation to teachers?
7. To what extent do you believe your district’s teacher evaluation system observational protocol is a valid measurement of teacher effectiveness?
8. How were you trained in using this instrument?
9. Describe all of the factors included in a teacher’s evaluation score.
10. What percentage of a teacher’s evaluation is based on student growth data?
11. Has the addition of student test scores in the evaluation process impacted how you conduct teacher evaluation observations?
12. What sources of assessment data are used for determining student growth (e.g., state standardized tests, teacher-made assessments)?
13. What factors most impact your ability to implement teacher evaluation policies?
14. How do new teacher evaluation policies affect your relationship with your teaching staff?

Principals’ Beliefs about Teacher Evaluation Policy
1. Do current teacher evaluation measures help you identify quality teaching? If so, how? If not, why not?
2. Setting aside raising test scores for the moment, what teacher characteristics do you consider most important when evaluating teacher quality?
3. How do you use teacher evaluation scores to make decisions?
4. In your opinion, what should be included in a teacher’s evaluation to make his or her score reliable and valid?
5. What do you think is the best indicator of a teacher’s effectiveness?
6. What do you think is the best way to accurately determine a teacher’s effectiveness in the classroom?
7. Do you think using student assessment data can improve the quality of teachers in your building? The teacher workforce as a whole?
8. In your opinion, what are teaching behaviors that accurately represent a quality teacher?
9. How do teachers generally respond to the evaluation process?
10.
Are teacher evaluations “helpful” or “beneficial” to teachers (e.g., does the feedback help them improve)? If yes, how so? If no, why not?
11. Last year teachers were required to be evaluated two times. Do you think this is a fair number? Should it be more or less? Why?
12. What percentage of a teacher’s evaluation score should be related to student assessment data (e.g., growth)? Why this percentage?
13. Do you think being able to effectively judge teacher effectiveness is an important indicator of how you are doing as a principal?
14. Do you think you should be held responsible for the effectiveness of your teaching staff?
15. Do you think you should be held accountable for student achievement/growth of the students in your building?

Appendix C
Principal Interview Protocol 2

Principal ID:
Date:

Thank you for taking the time to be interviewed. As with any part of this study, you can withdraw your consent to participate at any time and you do not have to answer any questions that you do not want to answer. Anything you say will not be connected with your name, the name of your school, or the name of your school district in any publications or presentations. I will audio-record your responses for my use only. First, I’ll ask questions about how you provide feedback to your teachers. Then, I will ask you about how you use teacher evaluations to make decisions. Finally, I will ask you about how you think about teacher evaluations in the big picture. Your responses to this interview will be kept in a locked filing cabinet or on a secure computer. Your identity will be kept using unique ID numbers and will never be released.

Principals’ Experience Implementing Teacher Evaluation Policies
1. When you observe a teacher, what are some of the indicators that help you distinguish between an effective lesson and an ineffective lesson?
2. Do you take into account outside factors that the evaluation rubric may not account for (e.g.
you know the teacher has a challenging class, or maybe is just having an off day) and if so, what are some of these factors?
3. Do you consider outside factors other than teaching (e.g., a teacher who supports the school in other ways, such as coaching or after-school tutoring) when evaluating teachers?
4. Does increased teacher effort play into evaluations, either consciously or subconsciously? Is it part of the evaluation process and/or should it be?
5. As teacher evaluation policies have changed, have you noticed a change in classroom behaviors from teachers (e.g., teaching to the test)?
6. Can you reflect on a particularly challenging evaluation? What made it challenging and how was it ultimately resolved?

Changing Teacher Evaluation Policies
1. In 2018-19, 40% of a teacher’s evaluation score will be based on student growth. How fair is this percentage?
2. In this current format, standardized test scores would account for only half of the student growth measurement. The other half would be based on local measures or assessments. How fair is this approach?
3. How will the 2018-19 change under which “schools are prohibited from assigning a student to an ineffective teacher in the same subject area for two consecutive years” impact your job and/or how you conduct teacher evaluations?
4. How fair is it for you to be personally held accountable for your performance implementing teacher evaluation policies?

Using Teacher Evaluations
1. How would you describe the purpose of teacher evaluations?
   • What are they used for (hiring, firing, retention, assigning teachers to specific students/classrooms)?
   • What should they be used for?
2. Is your school in a position to successfully implement the current teacher evaluation policy?
   • If so, how?
   • If not, what is missing?
3. Has this system been a success?
   • How do you know/what is your evidence?
4. Has implementation of these policies changed over time (e.g., was implementation different in year one than it is now)?
   • If so, how?
5.
How does the current teacher evaluation system help improve student performance?
6. How does the current teacher evaluation system interact with other policy initiatives?
   • Do they conflict?
   • Do they assist?
7. What are the greatest challenges of teacher evaluation policy implementation?
8. How does your current teacher evaluation system allow you, as the school leader, to impact teacher quality and student achievement?
9. To what extent do you believe your district’s student growth measurement component is a valid measure of a teacher’s effectiveness?
10. How are teacher evaluation scores used in hiring decisions?
   • Firing decisions?
   • Retention decisions?

Appendix D
Principal Interview Protocol 3

Principal ID:
Date:

Thank you for taking the time to be interviewed. As with any part of this study, you may withdraw your consent to participate at any time and you do not have to answer any questions that you do not want to answer. Anything you say will not be connected with your name, the name of your school, or the name of your school district in any publications or presentations. I will audio-record your responses for my use only. During this interview I will ask you questions about the observation we just completed. Your responses to this interview will be kept in a locked filing cabinet or on a secure computer. Your identity will be kept using unique ID numbers and will never be released.

STATE PARTICIPANT ID NUMBERS, DATE, NAME OF INTERVIEWER, AND “START INTERVIEW” FOR RECORDING DEVICE

1. What are your initial thoughts/reflections on the lesson we observed?
2. Was that observation a standard length?
3. How do you navigate your actions during the observation (e.g., typing notes, interacting with students)?
4. What were the strengths of that lesson?
5. What were some areas of improvement?
6. How do you approach the process of notetaking?
7.
Are you thinking about the specifics of what your teacher evaluation policy asks you to do while you are observing the teacher?
8. Does how you observe a teacher change based on that teacher?
9. Are observations an accurate representation of teacher effectiveness?
10. How do you think about providing feedback to teachers that is meaningful and useful?

Appendix E
Teacher Interview Protocol

Thank you for taking the time to be interviewed. As with any part of this study, you can withdraw your consent to participate at any time and you do not have to answer any questions that you do not want to answer. Anything you say will not be connected with your name, the name of your school, or the name of your school district in any publications or presentations. I will audio-record your responses for my use only. I will ask you some questions about your thoughts and experiences with your school’s teacher evaluation system. Your responses to this interview will be kept in a locked filing cabinet or on a secure computer. Your identity will be kept using unique ID numbers and will never be released.

Teacher Knowledge and Beliefs of Current Evaluation Policies
1. What are the strengths of your current teacher evaluation framework?
2. What are the weaknesses of your current teacher evaluation framework?
3. What would you change?
4. How would you describe the purpose of teacher evaluations?
5. How should teacher evaluations be used?
6. How have you been trained with your current evaluation system?
7. What percentage of student assessment data should be used in these evaluations?
8. Do you feel your evaluation is an accurate representation of your teaching?
9. Which criterion is the most accurate representation of your teaching?
10. How has the feedback you received from evaluations helped you improve your practice?
11. To what extent do you believe your district’s student growth measurement component is a valid measure of a teacher’s effectiveness?
12.
In your opinion, what should be included in a teacher’s evaluation to make their score reliable and valid?
13. In 2018-19, 40% of a teacher’s evaluation score will be based on student growth. Is this a fair percentage?
14. In this current format, standardized test scores would account for only half of the student growth measurement. The other half would be based on local measures or assessments. Is this a fair approach?

Teachers’ Perceptions of Principal Policy Implementation
1. How well does your principal understand the current teacher evaluation system?
2. In your opinion, is your school implementing the teacher evaluation system in a way consistent with your understanding of the policy?
3. How do you think your principal thinks teacher evaluations should be used?
4. How often do you use advice/feedback given to you by your principal?
5. Does your principal dominate conversations or do they provide you a chance for ample input and a chance to contribute to the conversation?

How Teacher Practice is Impacted by Evaluations
6. How has your practice been impacted by changes to teacher evaluation policies?
7. How does your practice change when you are being observed for a formal teacher evaluation?
8. How does your practice change knowing student assessment data is used as part of your evaluation score?
9. How does your practice change knowing your evaluation scores will be compared to colleagues (both within your school and district wide)?

Appendix F
Observation Protocol

Date:
Time:
Participant(s):
School:
Observations:
Notes:
Questions/Follow Up:

WORKS CITED

Aaronson, D., Barrow, L., & Sander, W. (2007). Teachers and student achievement in the Chicago public high schools. Journal of Labor Economics, 25(1), 95-135.
Anagnostopoulos, D., & Rutledge, S. (2007). Making sense of school sanctioning policies in urban high schools. Teachers College Record, 109(5), 1261-1302.
Bartlett, F. C. (1958). Thinking: An experimental and social study. London: Allen & Unwin.
Berg, B. L. (2007). Qualitative research methods for the social sciences (6th Ed.). San Francisco, CA: Pearson Education.
Beteille, T., Kalogrides, D., & Loeb, S. (2009). Effective Schools: Managing the recruitment, development, and retention of high quality teachers. CALDER Working Paper 37. Washington, DC: The Urban Institute.
Bidwell, C. E. (2001). Analyzing schools as organizations: Long-term permanence and short-term change. Sociology of Education, 74, 100-114.
Bingham, C. B., & Kahl, S. J. (2013). The process of schema emergence: Assimilation, deconstruction, unitization and the plurality of analogies. Academy of Management Journal, 56(1), 14-34.
Blasé, R., Blasé, J., & Phillips, D. Y. (2010). Handbook of school improvement: How high-performing principals create high-performing schools. Thousand Oaks, CA: Corwin Press.
Booher-Jennings, J. (2005). Below the bubble: "Educational Triage" and the Texas accountability system. American Educational Research Journal, 42(2), 231-268.
Branch, G. F., Hanushek, E. A., & Rivkin, S. G. (2009). Estimating principal effectiveness. CALDER Working Paper 32. Washington, DC: The Urban Institute.
Chetty, R., Friedman, J. N., & Rockoff, J. E. (2014). Measuring the impacts of teachers II: Teacher value-added and student outcomes in adulthood. American Economic Review, 104(9), 2633-2679.
Cubberley, E. (1929). Public school administration (3rd ed.). Boston, MA: Houghton Mifflin.
Clark, D., Martorell, P., & Rockoff, J. E. (2009). School principals and school performance. CALDER Working Paper 38. Washington, DC: The Urban Institute.
Clotfelter, C. T., Ladd, H. F., & Vigdor, J. L. (2006). Teacher-student matching and the assessment of teacher effectiveness. The Journal of Human Resources, 41(4), 778-820.
Cohen, D. K., & Barnes, C. A. (1993). Pedagogy and policy. In D. Cohen, M. W. McLaughlin, & J. E. Talbert (Eds.), Teaching for understanding: Challenges for policy and practice (pp. 207-239). San Francisco, CA: Jossey-Bass Inc.
Coburn, C. E. (2001). Collective sensemaking about reading: How teachers mediate reading policy in their professional communities. Educational Evaluation and Policy Analysis, 23(2), 145-170.
Coburn, C. E. (2005). Shaping teacher sensemaking: School leaders and the enactment of reading policy. Educational Policy, 19(3), 476-509.
Cohen, D. K., & Hill, H. (2001). Learning policy: When state education reform works. New Haven, CT: Yale University Press.
Cohen-Vogel, L. (2011). Staffing to the test: Are today’s school personnel practices evidence based? Educational Evaluation and Policy Analysis, 33(4), 483-505.
Creswell, J. (2013). Qualitative inquiry and research design. Los Angeles, CA: Sage.
Darling-Hammond, L. (2000). Teacher quality and student achievement: A review of state policy evidence. Educational Policy Analysis Archives, 8(1), 1-44.
Darling-Hammond, L., Wise, A. E., & Pease, S. R. (1983). Teacher evaluation in the organizational context: A review of the literature. Review of Educational Research, 53(3), 285-328.
Dee, T., Jacob, B. A., & Schwartz, N. (2013). The effects of NCLB on school resources and practices. Educational Evaluation and Policy Analysis, 35(2), 252-279.
Denzin, N. K., & Lincoln, Y. S. (2003). Collecting and interpreting qualitative materials (2nd ed.). Thousand Oaks, CA: Sage Publications.
Derrington, M. L., & Campbell, J. W. (2013). The changing conditions of instructional leadership: Principals’ perceptions of teacher evaluation accountability measures. In B. Barnett, A. R. Shoho, & A. J. Bowers (Eds.), School and district leadership in an era of accountability (pp. 231-251). Charlotte, NC: Information Age Publishing.
Dewey, J. (1938). Experience and education. New York, NY: The Macmillan Company.
Diamond, J., & Spillane, J. P. (2004). High-stakes accountability in urban elementary schools: Challenging or reproducing inequality? Teachers College Record, 106(6), 1145-1176.
Donaldson, M. L., & Papay, J. (2014).
Teacher evaluation for accountability and development. In H. F. Ladd and M. E. Goertz (Eds.), Handbook of research in education finance and policy (pp. 174-193). New York, NY: Routledge.
Donaldson, M. L., & Papay, J. P. (2014). Teacher evaluation reform: Policy lessons from school principals. Principal’s Research Review, 9(5), 1-8.
Donaldson, M. L. (2013, April). How do teachers respond to being evaluated based on their students’ achievement? Evidence from New Haven, CT. Paper presented at the annual conference of the American Educational Research Association, San Francisco, CA.
Donaldson, M. L. (2009). So long, Lake Wobegon? Using teacher evaluation to raise teacher quality. Retrieved from https://www.americanprogress.org/wp-content/uploads/issues/2009/06/pdf/teacher_evaluation.pdf
Duke, D. L., & Stiggins, R. J. (1990). Beyond minimum competence: Evaluation for professional development. In J. Millman & L. Darling-Hammond (Eds.), The New Handbook of Teacher Evaluation: Assessing Elementary and Secondary School Teachers (pp. 116-132). Newbury Park, CA: Corwin Press.
Duke, D. L., & Stiggins, R. J. (1986). Teacher evaluation: Five keys to growth. Washington, D.C.: National Educational Association.
Elmore, R. F. (1980). Complexity and control: What legislators and administrators can do about implementing public policy. In L. Shulman & G. Sykes (Eds.), Handbook of teaching and policy (pp. 342-369). New York, NY: Longman.
Ganon-Shilon, S., & Schechter, C. (2016). Making sense of school leaders’ sense-making. Educational Management Administration & Leadership, Published online before print.
Gates, P. E., Blanchard, K. H., & Hersey, P. (1976). Diagnosing education leadership problems: A situational approach. Educational Leadership, 33(5), 348-354.
Goldring, E., Grissom, J. A., Ruben, M., Neumerski, C. M., Cannata, M., Drake, T., & Schuermann, P. (2015). Make room value added: Principals’ human capital decisions and the emergence of teacher observation data.
Educational Researcher, 44(2), 96-104.
Greeno, J. G. (1998). Where is teaching? Issues in Education, 4(1), 110-119.
Grider, C. (1993). Foundations of cognitive theory: A concise review. Available at: http://files.eric.ed.gov/fulltext/ED372324.pdf (accessed 15 November, 2015).
Grint, K. (2011). A history of leadership. In A. Bryman, D. Collinson, K. Grint, B. Jackson & M. Uhl-Bien (Eds.), The SAGE handbook of leadership (pp. 3-14). Thousand Oaks, CA: Sage.
Grissom, J. A. (2011). Can good principals keep teachers in disadvantaged schools? Linking principal effectiveness to teacher satisfaction and turnover in hard-to-staff environments. Teachers College Record, 113(11), 2552-2585.
Grissom, J. A., & Loeb, S. (forthcoming). Assessing principals’ assessments: Subjective evaluations of teacher effectiveness in low- and high-stakes environments. Education Finance and Policy.
Grissom, J. A., & Loeb, S. (2009). Triangulating principal effectiveness: How perspectives of parents, teachers, and assistant principals identify the central importance of managerial skills. CALDER Working Paper 35. Washington, DC: The Urban Institute.
Guba, E. G., & Lincoln, Y. S. (1994). Competing paradigms in qualitative research. In N. K. Denzin & Y. S. Lincoln (Eds.), The handbook of qualitative research (pp. 105-117). Thousand Oaks, CA: Sage Publications.
Fenstermacher, G. D., & Richardson, V. (2005). On making determinations of teacher quality. Teachers College Record, 107(1), 186-213.
Figlio, D. N., & Winicki, J. (2005). Food for thought: The effects of school accountability plans on school nutrition. Journal of Public Economics, 89(2-3), 381-394.
Firestone, W., Monfils, L., Schorr, R., Hicks, J., & Martinez, M. C. (2004). Pressure and support. In W. Firestone, L. Monfils & R. Schorr (Eds.), The ambiguities of teaching to the test. Standards, assessment and educational reform. Mahwah, NJ: Lawrence Erlbaum Associates.
Fiss, P. C., & Zajac, E. J. (2006).
The symbolic management of strategic change: Sensegiving via framing and decoupling. The Academy of Management Journal, 49(6), 1173-1193.
Hallinger, P., & Heck, R. (1996). Reassessing the principal’s role in school effectiveness: A review of empirical research, 1980-1995. Educational Administration Quarterly, 32(1), 5-44.
Halverson, R., Kelley, C., & Kimball, S. (2004). Implementing teacher evaluation systems: How principals make sense of complex artifacts to shape local instructional practice. In C. Miskel & W. Hoy (Eds.), Theory and research in educational administration (pp. 66-90). Greenwich, CT: Information Age Press.
Halverson, R., & Clifford, M. (2006). Evaluation in the wild: A distributed cognitive perspective on teacher assessment. Educational Administration Quarterly, 42(4), 578-619.
Hanushek, E. A., & Rivkin, S. G. (2010). Generalizations about using value-added measures of teacher quality. American Economic Review, 100(2), 267-271.
Harris, D. N., Ingle, W. K., & Rutledge, S. A. (2014). How teacher evaluation methods matter for accountability: A comparative analysis of teacher effectiveness ratings by principals and teacher value-added measures. American Educational Research Journal, 51(1), 73-112.
Hess, F. M. (2008). Looking for leadership: Assessing the case of mayoral control of urban school systems. American Journal of Education, 114(3), 219-245.
Hill, H. C., & Barth, M. (2004). NCLB and teacher retention: Who will turn out the lights? Education and the Law, 16(2-3), 173-181.
Honig, M. I. (2006). Complexity and policy implementation: Challenges and opportunities for the field. In M. I. Honig (Ed.), New directions in education policy implementation: Confronting complexity (pp. 1-24). Albany, NY: State University of New York Press.
Honig, M. I., & Hatch, T. C. (2004). Crafting coherence: How schools strategically manage multiple, external demands. Educational Researcher, 33(8), 16-30.
Jacob, B. A. (2011). Do principals fire the worst teachers?
Educational Evaluation and Policy Analysis, 33(4), 403-434.
Jacob, B. A., & Lefgren, L. (2008). Can principals identify effective teachers? Evidence on subjective performance evaluation in education. Journal of Labor Economics, 26(1), 101-136.
James, W. (1890). The principles of psychology. New York, NY: Henry Holt & Company.
Kane, T. J., McCaffrey, D. F., Miller, T., & Staiger, D. O. (2013). Have we identified effective teachers? Validating measures of effective teaching using random assignment. The Bill and Melinda Gates Foundation: Seattle, WA.
Keesler, V. A., & Howe, C. (2015). Teacher evaluation in Michigan. In J. A. Grissom & P. Youngs (Eds.), Improving teacher evaluation systems (pp. 156-168). New York, NY: Teachers College Press.
Kennedy, M. (2010). Attribution error and the quest for teacher quality. Educational Researcher, 39(8), 591-598.
Kimball, S. M. (2003). Analysis of feedback, enabling conditions and fairness perceptions of teachers in three school districts with new standards-based evaluation systems. Journal of Personnel Evaluation in Education, 16(4), 241-269.
Klein, G., Moon, B., & Hoffman, R. R. (2006). Making sense of sensemaking 2: A macrocognitive model. IEEE Intelligent Systems, 88-92.
Klein, G., Moon, B., & Hoffman, R. R. (2006). Making sense of sensemaking 1: Alternative perspectives. IEEE Intelligent Systems, 22-26.
Koyama, J. (2014). Principals as bricoleurs: Making sense and making do in an era of accountability. Educational Administration Quarterly, 50(2), 279-304.
Kraft, M. A., & Gilmour, A. F. Revisiting the widget effect: Teacher evaluation reforms and the distribution of teacher effectiveness. Working Paper.
Kraft, M. A., & Gilmour, A. F. (2015). Can principals promote teacher development as evaluators? A 21 case study of principals’ views and experiences. Brown University Working Paper.
Leithwood, K., Harris, A., & Hopkins, D. (2008). Seven strong claims about successful school leadership.
School Leadership & Management, 28(1), 27-42.
Leithwood, K., Seashore-Louis, K., Anderson, S., & Wahlstrom, K. (2004). How leadership influences student learning. New York: The Wallace Foundation. Retrieved from http://www.wallacefoundation.org/knowledge-center/Pages/How-Leadership-Influences-Student-Learning.aspx
Lipsky, M. (1980). Street-level bureaucracy: Dilemmas of the individual in public services. New York, NY: Russell Sage Foundation.
Maitlis, S., & Christianson, M. (2014). Sensemaking in organizations: Taking stock and moving forward. The Academy of Management Annals, 8(1), 57-125.
Marshall, C., & Rossman, G. (1999). Designing qualitative research (3rd Ed.). Thousand Oaks, CA: Sage Publications.
Matsumura, L. C., & Wang, E. (2014). Principals’ sensemaking of coaching for ambitious reading instruction in a high-stakes accountability policy environment. Educational Policy and Analysis Archives, 22(51), 1-37.
Maxwell, J. A. (2005). Qualitative research design: An interactive approach (2nd Ed.). Thousand Oaks, CA: Sage Publications.
McCleskey, J. A. (2014). Situational, transformational, and transactional leadership and leadership development. Journal of Business Studies Quarterly, 5(4), 117-130.
McLaughlin, M. W. (1987). Learning from experience: Lessons from policy implementation. Educational Evaluation and Policy Analysis, 9, 171-178.
McLaughlin, M. W., & Talbert, J. E. (2001). Professional communities and the work of high school teaching. Chicago, IL: University of Chicago Press.
MET Project. (2013). Ensuring Fair and Reliable Measures of Effective Teaching. Bill and Melinda Gates Foundation.
Michigan Department of Education. (2015). Educator evaluations. Retrieved from: http://www.michigan.gov/mde/0,4615,7-140-5683_75438---,00.html
Milanowski, A. T., & Heneman, H. G., III. (2001). Assessment of teacher reactions to a standards-based teacher evaluation system: A pilot study. Journal of Personnel Evaluation in Education, 15(3), 193-212.
Miles, M.
B., Huberman, A. M., & Saldana, J. (2014). Qualitative data analysis: A methods sourcebook (3rd Ed.). Thousand Oaks, CA: Sage Publications.

Murphy, J. T. (1971). Title I of ESEA: The politics of implementing federal education reform. Harvard Educational Review, 41(1), 35-63.

Nelson, B. S., Sassi, A., & Grant, C. M. (2001, April). The role of educational leaders in instructional reform: Striking a balance between cognitive and organizational perspectives. Paper presented at the American Educational Research Association Conference, Seattle, WA.

Odden, A. (1991). The evolution of education policy implementation. In A. Odden (Ed.), Education policy implementation (pp. 1-12). Albany, NY: State University of New York Press.

Papay, J. P., & Johnson, S. M. (2012). Is PAR a good investment? Understanding the costs and benefits of teacher peer assistance and review programs. Educational Policy, 26(5), 696-729.

Papay, J. P., & Kraft, M. A. (2014). Forthcoming: Productivity returns to experience in the teacher labor market: Methodological challenges and new evidence on long-term career improvement. Journal of Public Economics.

Patton, M. (2014). Qualitative research and evaluation methods (4th Ed.). St. Paul, MN: Sage Publishing.

Piaget, J. (1964). Cognitive development in children. Journal of Research in Science Teaching, 2(3), 176-186.

Piaget, J., & Inhelder, B. (1958). Growth of logical thinking: From childhood to adolescence. London: Routledge & Kegan Paul, PLC.

Porter, A. C., Youngs, P., & Odden, A. (2001). Advances in teacher assessment and their uses. In V. Richardson (Ed.), Handbook of research on teaching (pp. 259–297). New York, NY: Macmillan.

Rigby, J. (2015). Principals’ sensemaking and enactment of teacher evaluation. Journal of Educational Administration, 53(3), 374-392.

Rockoff, J. E. (2004). The impact of individual teachers on student achievement: Evidence from panel data. The American Economic Review, 94(2), 247-252.

Seashore-Louis, K., & Robinson, V. M.
(2012). External mandates and instructional leadership: School leaders as mediating agents. Journal of Educational Administration, 50(5), 629-655.

Seashore-Louis, K., Wahlstrom, K. L., Leithwood, K., & Anderson, S. E. (2010). Investigating the links to improved student learning. The Wallace Foundation. Retrieved on December 10, 2015, from http://www.wallacefoundation.org/knowledge-center/school-leadership/key-research/Pages/Investigating-the-Links-to-Improved-Student-Learning.aspx

Smylie, M. A. (2010). Continuous school improvement. Thousand Oaks, CA: Corwin Press.

Spillane, J. P. (2006). Distributed leadership. San Francisco, CA: Jossey-Bass.

Spillane, J. P. (2000). Cognition and policy implementation: District policymakers and the reform of mathematics education. Cognition and Instruction, 18(2), 141-179.

Spillane, J. P., Diamond, J. B., Burch, P., Hallett, T., Jita, L., & Zoltners, J. (2002). Managing in the middle: School leaders and the enactment of accountability policy. Educational Policy, 16(5), 731-762.

Spillane, J. P., & Kenney, A. (2012). School administration in a changing education sector: The U.S. experience. Journal of Educational Administration, 50(5), 541-561.

Spillane, J. P., & Lee, L. C. (2014). Novice principals’ sense of ultimate responsibility: Problems of practice transitioning to the principal’s office. Educational Administration Quarterly, 50(3), 431-465.

Spillane, J. P., Reiser, B. J., & Gomez, L. M. (2006). Policy implementation and cognition: The role of human, social, and distributed cognition in framing policy implementation. In M. I. Honig (Ed.), New directions in education policy implementation: Confronting complexity (pp. 47-63). Albany, NY: State University of New York Press.

Spillane, J. P., Halverson, R., & Diamond, J. B. (2004). Towards a theory of leadership practice: A distributed perspective. Journal of Curriculum Studies, 36(1), 3-34.

Spillane, J. P., Reiser, B. J., & Reimer, T. (2002).
Policy implementation and cognition: Reframing and refocusing implementation research. Review of Educational Research, 72(3), 387-431.

Steinberg, M. P., & Donaldson, M. L. (2016). The new educational accountability: Understanding the landscape of teacher evaluation in the post-NCLB era. Education Finance and Policy, 11(3), 340-359.

Steinberg, M. P., & Sartain, L. (2015). Does teacher evaluation improve school performance? Experimental evidence from Chicago’s Excellence in Teaching Project. Education Finance and Policy, 10(4), 535–572.

Taylor, E. S., & Tyler, J. H. (2012). The effect of evaluation on teacher performance. American Economic Review, 102(7), 3628-3651.

The Center for Public Education. (2014). Trends in teacher evaluation. Retrieved from http://www.centerforpubliceducation.org/Main-Menu/Evaluating-performance/

US Department of Education. (2009). Race to the Top program executive summary. Retrieved from http://www2.ed.gov/programs/racetothetop/executive-summary.pdf

Weatherly, R., & Lipsky, M. (1977). Street-level bureaucrats and institutional innovation: Implementing special education reform. Harvard Educational Review, 47(2), 171–197.

Weick, K. E. (1995). Sensemaking in organizations. Thousand Oaks, CA: Sage Publications.

Weick, K., & Sutcliffe, K. (2007). Managing the unexpected: Resilient performance in an age of uncertainty. San Francisco, CA: Jossey-Bass.

Weisberg, D., Sexton, S., Mulhern, J., & Keeling, D. (2009). The widget effect: Our national failure to acknowledge and act on differences in teacher effectiveness. New York, NY: The New Teacher Project.

Wise, A. E., Darling-Hammond, L., McLaughlin, M. W., & Bernstein, H. T. (1985). Teacher evaluation: A study of effective practices. The Elementary School Journal, 86(1), 60-121.

Wundt, W. M. (1902). Principles of physiological psychology. New York, NY: The Macmillan Company.

Yin, R. K. (2013). Case study research (5th Ed.). Thousand Oaks, CA: Sage Publications.