THREE CHAPTERS IN ECONOMICS OF EDUCATION By Dongming Yu A DISSERTATION Submitted to Michigan State University in partial fulfillment of the requirements for the degree of Economics – Doctor of Philosophy 2024 ABSTRACT Chapter 1: Since the launch of the ConnectED initiative in 2013, there has been a significant surge in digital learning classrooms, driven by the Obama Administration's goal to provide 99% of schools with high-speed wireless broadband. This mission was later integrated into the E-rate program, a national initiative offering discounts to schools and libraries to facilitate internet access. In spite of the anticipated advantages, the educational impact of Wi-Fi integration in schools remains unclear. Utilizing school district-level Wi-Fi deployment data and student performance data, this paper investigates the effects of school district Wi-Fi investments on student academic and disciplinary outcomes. The findings suggest that, on average, the introduction of Wi-Fi in schools widens the achievement gap between racial groups, particularly negatively affecting disadvantaged subgroups. This effect is more pronounced in economically disadvantaged regions, including those with more rural schools or higher levels of racial segregation, as well as in technologically lagging areas characterized by larger household internet access disparities. When exploring potential mechanisms, this study finds evidence that Wi-Fi-equipped school districts did not necessarily invest more in supplementary resources to effectively utilize Wi-Fi, while student disciplinary problems arising from Wi-Fi usage seemed to be of lesser concern. Chapter 2: This study investigates the impact of third-grade retention policies across the United States, focusing on how these policies influence retention rates among Kindergarten to second-grade students and adjust kindergarten entrance ages. 
Since the implementation of such policies, starting with California in 1998, there has been a notable shift towards retaining students who are not reading proficiently by the end of third grade—a critical transition point from "learning to read" to "reading to learn." Utilizing data from the October Current Population Survey, the analysis reveals that the introduction of retention policies is associated with a decrease in retention rates for boys in kindergarten and second grade and a reduction in kindergarten entrance age for girls by approximately one month. These findings underscore the importance of considering both educational outcomes and the perspectives of children and parents in assessing the effectiveness of retention policies. This study highlights the varied impacts of these policies on students of different genders and from various socioeconomic backgrounds, and can guide improvements in how schools and families address early reading challenges. Chapter 3: Educators and policymakers have been concerned that the COVID-19 pandemic has led to substantial delays in learning due to disruptions, anxiety, and remote schooling. We study student achievement patterns over the pandemic using a combination of state summative and higher-frequency benchmark assessments for middle school students in Michigan. Comparing pre-pandemic to post-pandemic cohorts, we find that math and ELA achievement growth dropped by 0.20 and 0.03 standard deviations more than expected, respectively, between 2019 and 2022. These drops were larger for Black, Latino, and economically disadvantaged students, as well as students in districts that were at least partially remote in 2021-22. Benchmark assessment results are consistent with summative assessments and show sharp drops in 2020-21 followed by a partial recovery and potential stall-out in 2021-22. 
TABLE OF CONTENTS
CHAPTER 1: THE EDUCATIONAL IMPACT OF SCHOOL DISTRICT WI-FI SPENDING UNDER THE E-RATE PROGRAM
REFERENCES
APPENDIX
CHAPTER 2: THE SPILLOVER EFFECT OF THIRD-GRADE RETENTION POLICY ON EARLIER GRADES
REFERENCES
APPENDIX
CHAPTER 3: THE PATH OF STUDENT LEARNING DELAY DURING THE COVID-19 PANDEMIC: EVIDENCE FROM MICHIGAN
REFERENCES
APPENDIX
CHAPTER 1: THE EDUCATIONAL IMPACT OF SCHOOL DISTRICT WI-FI SPENDING UNDER THE E-RATE PROGRAM 1. Introduction Over the past two decades, the U.S. K-12 educational sector has experienced a significant technological transformation. Acknowledging the digital age's potential to revolutionize education, school districts nationwide have made substantial investments in student internet access. 
This drive towards digitalization saw remarkable growth, with internet access in primary and secondary classrooms rising from 14% in 1996 to 98% by 2018. This progress met the Federal Communications Commission (FCC) benchmark of 100 kbps of internet access per student by 2018 (Education Superhighway, 2018). Despite the technological advancements and increased emphasis on digital integration, a significant portion of the student population remained underserved. The lack of Wi-Fi connectivity in schools presents a significant barrier to modern educational advancement. By 2013, only four million students had access to digital learning resources within their classrooms. Furthermore, 75% of classrooms still lacked Wi-Fi connectivity, highlighting the persistent digital divide across educational institutions (Education Superhighway, 2019). A study conducted in the same year emphasized this gap, revealing a widespread consensus among educators on the necessity of technological integration. A significant 86% of teachers and 93% of administrators concurred that educational technologies were crucial. Notably, 90% of these educators expressed enthusiasm for expanding the use of digital tools in their daily teaching (Pai, 2013). In today's digital era, consistent and dependable internet access is essential for students to utilize a broad spectrum of educational resources, including digital textbooks and interactive learning platforms. A 2020 Pew Research Center report highlighted the reliance of lower-income students on schools for online access, underscoring the pivotal role schools play in equalizing the digital playing field. The lack of Wi-Fi connectivity precludes students from accessing collaborative online opportunities, conducting research, and pursuing educational exploration within the classroom environment, thereby impeding their preparedness for the competitive demands of a progressively digital global economy. 
President Barack Obama, in 2013, turned the spotlight onto the strides made by a school district in Mooresville, North Carolina. By earmarking over a million dollars every year and ensuring individual computer access for every student, the district achieved commendable improvements in test scores and graduation rates. This success story set the stage for the ambitious ConnectED initiative endorsed by President Obama. This initiative was designed to bridge the existing digital divide with the aim of connecting 99% of schools to high-speed wireless broadband. To translate this vision into reality, the Federal Communications Commission (FCC) initiated a pivotal reform of the Schools and Libraries (E-Rate) program in December 2014. This reform resulted in the annual funding allocated for school connectivity increasing from $2.4 billion to $3.9 billion. The funding springs from the requirement in the 1996 Telecommunications Act for the FCC to provide discounts to eligible schools, libraries, and other educational institutions to help them access telecommunication and communication services, which is known as the E-rate program. The 2014 FCC E-rate reform focused on providing schools and districts with affordable high-speed internet (fiber connection) and Wi-Fi, which are key to ensuring digital classrooms and mobile learning. Wi-Fi wasn't just about internet connectivity; it was a gateway to mobility, innovative learning, and fostering a culture of initiative. The reform played an important role in closing the classroom connectivity gap. Between 2015 and 2019, the commitment to this vision was evident as public school districts in the U.S. allocated nearly $5 billion to bolster their Wi-Fi architectures (Education Superhighway, 2019). At the same time, the percent of school districts taking advantage of digital learning grew from 30% in 2013 to 99% in 2019. 
The ripple effects of these investments weren't confined to classrooms. For example, the Boulder Valley School District in Colorado partnered with a local internet service provider in 2018 to provide free internet access to roughly 6,000 low-income students (Arbela, 2021). Despite the purported importance of spending on internet access, the evidence is inconclusive as to whether having high-speed access in the classroom leads to better academic outcomes. Some studies find a positive effect (Dettling, Goodman, and Smith, 2018), others a negative effect (Belo, Ferreira, and Telang, 2014), and still others find no effect (Faber, Sanchis-Guarner, and Weinhardt, 2016). This heterogeneity in findings underscores the myriad challenges inherent in implementing sweeping educational transformations, both expected and unforeseen. The successful integration of technology in classrooms often depends on accompanying resources, such as the professional development provided to educators, and administrative regulations regarding website usage safety. These are crucial for ensuring the relevance and quality of online resources, which significantly impact the learning process. Simply offering access doesn't guarantee an improved learning experience; aligning online materials with curriculum goals, employing effective teaching methods, and regulating student behavior are all essential components. While few papers directly evaluate the impact of having Wi-Fi in the classroom due to the difficulty of separating out the investment in Wi-Fi from the general investment in internet access, this paper fills the gap by taking advantage of the 2014 E-rate reform that made Wi-Fi service available to applicants during 2015-2020. Using a modified two-way fixed-effects model, I exploit the variation in the timing of Wi-Fi deployment across school districts in the U.S. to study the causal impact of school district Wi-Fi spending on students' academic outcomes. 
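A stylized version of such a two-way fixed-effects (event-study) specification can be written as follows. The notation here is illustrative rather than taken from the paper: $d$ indexes school districts, $t$ indexes years, and event-time indicators are defined relative to each district's year of Wi-Fi deployment.

```latex
% Illustrative event-study / TWFE specification (notation is assumed, not the paper's):
%   y_{dt}    : outcome (e.g., an achievement measure or racial gap) in district d, year t
%   \alpha_d  : district fixed effect;  \lambda_t : year fixed effect
%   D_{dt}^{k}: indicator that district d is k years from its Wi-Fi deployment in year t
%   X_{dt}    : time-varying district controls
\begin{equation*}
  y_{dt} = \alpha_d + \lambda_t + \sum_{k \neq -1} \beta_k D_{dt}^{k}
           + X_{dt}'\gamma + \varepsilon_{dt}
\end{equation*}
```

The "modified" estimator referenced in the text presumably adjusts this baseline form for the well-known biases of TWFE under staggered adoption; the equation above only sketches the standard event-study structure that such estimators build on.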
I find that the introduction of classroom Wi-Fi is correlated with a wider achievement gap between racial groups, particularly in mathematics rather than ELA. The findings also suggest that disadvantaged subgroups are adversely impacted, and the effect is more pronounced in economically disadvantaged regions or in technologically lagging areas. Exploring the mechanisms, I find that schools that receive grants from the E-rate program do not necessarily invest in complementary educational resources to make efficient use of Wi-Fi, which could explain why achievement gaps widen between racial groups. On the other hand, I don't find strong evidence of an increase in severe student misbehaviors such as bullying or expulsion. My findings suggest that while Wi-Fi in schools holds the potential to enhance student outcomes, its effectiveness depends on additional educational investments and administrative regulation of student behavior after deploying Wi-Fi in classrooms. Without these measures, the potential benefits of this advanced technology may remain unrealized or even prove counterproductive, leading to varying academic results. Past research has highlighted that without proper training, the potential advantages of advanced technology can remain untapped or even be misapplied, which can result in varied academic outcomes (Rienties & Brouwer, 2013; Prestridge, 2017). The teacher's role in managing internet use in the classroom is vital. Some research suggests that clear guidelines, consistent monitoring, and the strategic integration of online resources can mitigate many behavioral concerns associated with internet use (Ertmer et al., 2012). While the data indicates that instances of severe student misbehavior may be relatively rare, it's imperative for school administrators to remain vigilant against more common issues. In the following section, I provide an overview of the E-rate program. 
Section 3 presents a conceptual framework discussing two potential mechanisms by which Wi-Fi in the classroom could affect student outcomes. Section 4 describes the data. The methodology is outlined in Section 5, and the academic outcomes are presented in Section 6. The two channels are examined in Section 7, and the paper concludes in Section 8. 2. Overview of E-rate Since 1996, the Schools and Libraries (E-rate) Program has provided discounts to assist eligible schools and libraries in obtaining internet access and telecommunications services at affordable rates. The FCC appointed the Universal Service Administrative Company (USAC) as the administrator of the program. Two categories of services are eligible. The first category includes data transmission, internet access services, and voice services. These funds are intended to bring broadband access to the school. The second category includes internal connections, internal broadband services, and basic maintenance. In 2014, the FCC expanded this category to include wireless local area network services such as Wi-Fi. Discounts range from 20 percent to 90 percent of service costs, depending on the poverty level, the percentage of students eligible for the National School Lunch Program, and the urban/rural status of the school district (USAC, 2017). The E-rate funding is provided on an annual cycle, and the process takes several steps. In Step 1, eligible schools, school districts, or libraries submit an application form describing the service requests, and USAC posts these requests for service providers' consideration. Subsequently, service providers start the competitive bidding process by offering prices to compete for these service requests. In Step 2, applicants select the service contract after evaluating the bids received, where the price of eligible products and services is the most heavily weighted factor, given the rules of the E-rate program. 
In Step 3, applicants request funding in accordance with the service costs in the contract, and USAC reviews the request and determines the funding commitment. In Step 4, applicants inform USAC that the delivery of approved services has started, and the invoicing process can begin. In Step 5, applicants or service providers receive the reimbursement of the service costs after completing the invoicing process. In 2014, the FCC approved an increase in the E-Rate funding cap from $2.4 billion to $3.9 billion and made changes in the administration and application process. Most importantly, compared to working on getting schools and libraries connected to the internet before 2014, the program now pays more attention to supporting internal connections such as Wi-Fi networks (category two services). Meanwhile, several steps were taken to make the E-Rate administration and application process easier and faster, aiming to encourage more applications and address the concern that poorer and rural school districts may be prevented from participating in the program by the complicated process. Since the inception of the reform, the E-Rate program has helped to close the digital divide in schools and libraries in the USA (Oh, 2014). 3. Conceptual Framework and Related Literature There is a large and growing literature in economics on investment in information and communication technology (ICT), but there is little agreement on payoffs to such investment. Some studies have identified positive effects from investments in high-speed internet at home or schools on academic outcomes such as improved test scores or heightened college application rates (Dettling, Goodman, and Smith, 2018; Chen, Mittal, and Sridhar, 2021). Conversely, other works have discerned no significant impact of internet or classroom technology investment on school outcomes (Rouse and Krueger, 2004; Goolsbee and Guryan, 2006; Violette, 2017). 
Still others have documented negative returns on ICT investments (Vigdor, Ladd, and Martinez, 2014; Hazlett, Schwall, and Wallsten, 2019). Previous research has predominantly explored the implications of investing in internet access at homes or schools, as well as the impact of integrating technology, such as computer-assisted instruction, within classrooms. However, the influence of Wi-Fi connectivity, a relatively novel addition to primary and secondary educational settings, remains less examined. In the current digital age, Wi-Fi has become the connectivity norm in most households, overshadowing traditional wired broadband connections. Despite this shift, many educational institutions continue to rely on wired networks, confining internet access to static locations like computer labs. This reliance on outdated technology potentially sidelines an entire array of modern educational tools, including digital classrooms and personal tablets, rendering them ineffective within the learning environment (Redway Networks, 2021). The advent of classroom Wi-Fi empowers educators to leverage online instructional materials and conduct assessments in real time, while students gain the ability to access a wealth of digital resources, from school library databases to independent research, through tablets or other devices. Despite the apparent advantages of integrating cutting-edge technology and devices into learning spaces, there remains a concern that such investments might detract from traditional educational resources, such as textbooks and teacher support. Moreover, the potential risks associated with technology use in education prompt further caution. Legislation like the Children's Internet Protection Act (2000) and The Student Digital Privacy Act (2015) play crucial roles in regulating students' interaction with digital tools in educational settings and safeguarding their online information. 
Consequently, the true impact of Wi-Fi accessibility in classrooms continues to be a subject of debate. The nuanced effects of Wi-Fi availability in classrooms extend beyond mere connectivity, potentially varying across racial groups, school district characteristics, and household internet access. For instance, schools in wealthier neighborhoods might leverage Wi-Fi more effectively, integrating sophisticated adaptive learning platforms, whereas those in less affluent areas may lack the resources or expertise for such advanced applications, thereby exacerbating disparities in technology use (Reardon & Owens, 2014). Moreover, schools situated in affluent yet racially segregated regions might witness an unequal distribution of resources, disproportionately disadvantaging minority students (Logan & Burdick-Will, 2016). The presence of robust Wi-Fi infrastructure does not inherently ensure beneficial outcomes for all students. Disparities in device availability or home internet stability can foster a digital divide, leaving some students unable to engage with assignments or educational materials beyond school premises (Warschauer & Matuchniak, 2010). This divide can be particularly detrimental for students from socioeconomically disadvantaged backgrounds, who may not acquire the necessary digital literacy skills to navigate online resources effectively (van Deursen & van Dijk, 2019). While numerous studies have delved into the academic implications of investments in Information and Communication Technologies (ICT), the majority have concentrated on expenditures related to internet access at home or within schools, or on investments in classroom technologies. The specific effects of Wi-Fi implementation in classrooms have received less attention in scholarly research, largely due to the difficulty in isolating investments aimed solely at Wi-Fi deployment from broader internet infrastructure investments within educational settings. 
This paper seeks to bridge this gap by utilizing data from the 2014 E-rate reform, which extended Wi-Fi service availability to applicants, thereby enabling an examination of how school district investments in Wi-Fi influence academic outcomes. Prior studies leveraging E-rate program data have primarily focused on the program's first service category, which has facilitated internet access since its inception in 1996 (Goolsbee & Guryan, 2006; Chen, Mittal, & Sridhar, 2021). However, post-2015 reform data remain relatively untapped and present new avenues for exploration. A recent study utilizing this dataset concentrated more on the reform's effects on funding distribution than on direct academic outcomes (Grzeslo et al., 2018). This study contributes to the ICT literature by conducting an analysis with a national scope, utilizing both E-rate application data from school districts and their corresponding academic outcomes. Using a modified TWFE model, I utilize the variation in Wi-Fi deployment timing across school districts to investigate its impact on academic and disciplinary outcomes nationwide. To explain the primary findings in this paper, which reveal that Wi-Fi deployment widened the racial achievement gap and produced heterogeneous outcomes across different school demographics and student backgrounds, this study explores two potential mechanisms by which Wi-Fi might influence academic performance: adjustments in educational spending and the rise in disciplinary issues attributable to Wi-Fi usage. 3.1 The Financial Channel The Financial Channel investigates how school districts adjust their expenditures upon receiving a Wi-Fi grant from E-rate. While investing in Wi-Fi aims to provide high-speed internet access to all students and possibly decrease disparities among subgroups, the intended benefits may not be realized if the school fails to invest in complementary resources. 
These resources include purchasing devices, providing training, and offering support to prepare teachers and students for new classroom technology. Purchasing devices such as laptops, tablets, and other electronic equipment that can connect to Wi-Fi enables students and teachers to access and utilize online educational resources effectively. Adequate training ensures that educators know how to use Wi-Fi and supporting devices efficiently in their teaching. They need training on how to integrate internet resources, online tools, and digital platforms into their lessons. Failing to invest in these areas will hinder the ability of less advantaged students to effectively adapt to and benefit from the new technology, thereby limiting the efficient utilization of Wi-Fi in schools. Additionally, ensuring that Wi-Fi and related digital tools align with and support the overarching objectives of the curriculum requires additional support. This ensures that technology serves educational goals rather than detracting from them. In an ideal scenario, investments in Wi-Fi should encompass costs that extend beyond the initial expenditure, covering aspects such as maintenance, training, and supplementary resources, such as computers or tablets. However, this is not usually the case. Goolsbee and Guryan (2006) found that federal grants in the U.S. to help schools buy technology resulted in significant increases in spending on technology beyond what was provided by the grants, indicating the presence of complementary investments or associated costs. Some studies caution about the effective use of technology funds. For example, Cuban (2001) has pointed out that while schools have heavily invested in technology, effective integration into the curriculum remains a challenge. Angrist and Lavy (2002) studied computer use in schools and found positive effects on learning outcomes, but these effects depended on how the technology was used. 
The push for 1:1 device initiatives, where each student receives a device, highlights the importance of Wi-Fi infrastructure. Studies on these initiatives have shown mixed results, with positive outcomes in some settings (Weston and Bain, 2010) and less promising results in others (Fried, 2008), often pointing to implementation challenges. With the increasing reliance on online resources and digital tools for education, the digital divide has become a significant concern. Schools in disadvantaged areas may lack the necessary infrastructure. Even when these schools obtain grants for Wi-Fi and redistribute resources to optimize its use, they may cut back on allocations for essentials like traditional textbooks. While technology can sometimes be seen as a substitute for genuine teaching and learning processes, in certain underprivileged areas or among particular student demographics, traditional textbooks may still remain crucial to the learning process. An over-reliance on digital tools without appropriate pedagogical strategies can lead to superficial learning (Cuban, 2001). 3.2 The Behavioral Channel The second channel explores how Wi-Fi usage may influence disciplinary behaviors in schools, particularly among less advantaged students or in economically disadvantaged regions where there are fewer regulations and less teacher management of internet use. With the growing dependence of educational institutions on digital tools and online resources, the connection between internet access in schools and student behavior has gained attention. This relationship between school internet access and student disciplinary behavior is complex and multifaceted. On the one hand, when technology is efficiently integrated, it can render lessons more captivating, potentially diminishing distractions and classroom disruptions. Digital classrooms offer personalized instruction tailored to each student's pace and proficiency. 
When students perceive that lessons cater to their needs, they may be less prone to engage in disruptive behaviors. Additionally, Wi-Fi-enabled classroom technologies, such as collaborative online platforms, can impart essential communication and teamwork skills to students, possibly fostering positive interpersonal conduct. On the other hand, using digital resources requires teachers to reallocate instructional time between new technology and traditional methods. When students spend excessive time on the internet, it can lead to distractions and other behavioral problems, such as cyberbullying. Unrestricted internet access and the availability of social media may divert students' attention away from their studies (Glass and Kang, 2019; Ellison and Godoi, 2018; Bjornsen and Archer, 2015; Demirbilek and Talan, 2018). With unrestricted Wi-Fi access, students might easily get sidetracked by non-educational websites, social media, and games, potentially hindering the learning process. Furthermore, the internet can serve as a platform for harmful behaviors, including cyberbullying, which can have repercussions in the physical school environment. The teacher plays a vital role in managing internet use in the classroom. According to research, implementing clear guidelines, maintaining consistent monitoring, and strategically integrating online resources can help alleviate many behavioral concerns related to internet use. For instance, a study conducted in Portugal found that students in schools that restrict access to websites such as YouTube tend to perform better (Belo, Ferreira, and Telang, 2014). Furthermore, a meta-analysis underscores the significance of both external factors, such as school policies and support, and internal factors, such as teachers' beliefs and attitudes, in the adoption of technology within classrooms (Scherer, Siddiq, and Tondeur, 2019). 4. 
Data To investigate the impact of Wi-Fi deployment in schools on the educational and disciplinary outcomes across school districts, I gathered information from various sources pertaining to public school districts in the United States. I sourced Wi-Fi deployment data from USAC, academic performance and demographic data from the Stanford Education Data Archive (SEDA), disciplinary data from the Civil Rights Data Collection (CRDC), expenditure data of school districts from the Common Core of Data School District Finance Survey (CCD), and county-level household internet access information from the American Community Survey (ACS). I will proceed to detail each of the data sources utilized in this study. 4.1 E-rate Data I obtained all E-rate requests of applicants from 2008 to 2018 from the USAC data retrieval tool and USAC E-Rate Recipient Details And Commitments dataset. Data for the funding year 2015 and before are obtained from the data retrieval tool. I use the E-rate Recipient Details And Commitments dataset as a source of data for applicants’ and recipients’ information for the funding year 2016 and onwards. Though some normalization is needed to deal with the inconsistency of naming and variable definitions between two datasets, in both datasets the request includes service type (e.g., internet access, internal connection), applicant type (i.e., school, school district, library, consortium), funding year, funding status (whether the funding request is approved by USAC), total funds requested (i.e., actual spending) and discount rate. I limit the sample to school districts because it is standard practice for the school districts, not the schools, to apply (and receive) funds. 
While Wi-Fi is not an explicit service type, I use requests for internal connections to approximate requests for Wi-Fi service for two reasons. Firstly, the 2014 E-rate reform increased the funding cap from $2.4 billion to $3.9 billion with the purpose of supporting the 2013 ConnectED initiative, which emphasized the importance of having Wi-Fi in school. Secondly, internal connection typically refers to Wi-Fi-related costs associated with internal wiring, including ‘cabling, components, routers, switches, and network servers’ that distribute connections to classrooms and other campus facilities. Since internal connection is only a rough proxy for Wi-Fi connectivity, I also merged information on Wi-Fi quality in each school from the USAC dataset “E-Rate Request for Discount on Services: Connectivity Information” to get a more precise approximation. With this information, I define a school as treated if the school receives funding from USAC and is identified as having “completely” or “mostly” Wi-Fi quality in the school, which means all or most of the buildings in the school have Wi-Fi. I then define a school district as treated in year t when the aggregate number of students in the treated schools is more than half of the students in the district. Before the 2014 reform, the program mainly funded requests for internet access and cell phone services. In 2013 and 2014, all funding was allocated to phone and internet services, leaving no money for the second funding category (internal connections) to pay for internal networking and wireless equipment. The 2014 reform added funding to provide school districts with affordable Wi-Fi. Figure 1 shows that the average number of applications for internal connection made by a school district, whether eventually funded or not, increased dramatically after the reform. There are 5,278 school districts that requested internal connection service during the sample period, with 4,930 school districts eventually being funded after the reform. 
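The two-step treatment definition just described can be sketched in code. This is a minimal illustration of the rule, not the actual processing pipeline; the field names (`funded`, `wifi_quality`, `enrollment`) are hypothetical.

```python
# Sketch of the treatment definition (field names are hypothetical).
# A school is "treated" if it received USAC funding and reports "completely"
# or "mostly" Wi-Fi coverage; a district is treated once its treated schools
# enroll more than half of the district's students.

def school_treated(school: dict) -> bool:
    """Treated school: funded by USAC with complete or mostly-complete Wi-Fi."""
    return school["funded"] and school["wifi_quality"] in {"completely", "mostly"}

def district_treated(schools: list[dict]) -> bool:
    """Treated district: treated schools enroll more than half of all students."""
    total_enrollment = sum(s["enrollment"] for s in schools)
    treated_enrollment = sum(s["enrollment"] for s in schools if school_treated(s))
    return treated_enrollment > total_enrollment / 2
```

In practice this rule would be evaluated year by year, so a district switches into treatment in the first year t in which the enrollment condition holds.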
Table 1 shows that treated and untreated school districts share similar baseline characteristics overall. A few notable differences are that more students are enrolled in treated school districts, while the percent of students in rural schools and the percent of Black students are slightly higher in untreated districts.

4.2 SEDA Data
To investigate the relationship between Wi-Fi deployment and school district academic outcomes, I linked USAC school district treatment information with SEDA data. The SEDA dataset provides data on educational contexts, district-level achievement, and achievement gaps between different subgroups in all public school districts in the United States. The data were constructed using Grade 3 to 8 math and ELA test results in school years 2008–2009 through 2017–2018. Under the No Child Left Behind Act, each state is required to administer its own tests to Grade 3 to 8 students annually in reading and math, but the results were not comparable across states before SEDA became available. SEDA remedies this issue by making average test scores available to the public, allowing researchers to directly contrast student achievement in school districts across the United States for the first time (Reardon, Ho, et al., 2021). Besides academic achievement data, I also use school district characteristics data from SEDA to conduct the heterogeneity analysis in section 6.2. For the rural-urban analysis, I split school districts into two samples, with the urban sample containing school districts with more than half of their students in urban schools, and the rural sample containing school districts with more than half of their students in rural schools. Though SEDA does not provide academic achievement data for each school, it provides school characteristics data that I use to construct the Theil index in section 6.2 for the racial segregation analysis.
4.3 Other Data for Heterogeneity Analysis
CRDC Data
To investigate the relationship between Wi-Fi deployment and school disciplinary outcomes, I linked USAC school district treatment information with the CRDC, which collects data on education and civil rights issues, including school discipline (Office for Civil Rights, 2018). I gathered disciplinary data for the 2013–2014, 2015–2016 and 2017–2018 school years from the CRDC, since the CRDC did not require data reporting on school discipline for years prior to the 2013–2014 school year. Schools reported harassment and bullying, retention, expulsions, out-of-school suspensions, and in-school suspensions (when a student is removed from classes and activities but remains in the school building). The data provide discipline counts by race for every public school. Prior to aggregating school-level counts to the district level, I excluded any school that reported a missing count for any racial group. I also excluded schools classified as alternative, vocational, special education, or other. I divided the sum of these discipline measures by total enrollment in the district to create a district-level discipline prevalence proportion.

ACS Data
I also obtained household internet access data from the ACS to examine whether household internet access and school internet access are complements or substitutes. Data on computer use by race have been collected by the ACS since 2015 and provide yearly estimates for counties with populations of 65,000 or more. Using the county-level percent of broadband internet subscriptions for each race, I obtained the county-level White-Black home internet access gap in 2015, the year before the E-rate reform started. To be clear, I measure the racial home internet access gap as the subscription rate for Whites minus the subscription rate for Blacks in each county.
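The CRDC aggregation described above (drop schools with any missing racial count or a non-regular classification, sum to the district, divide by district enrollment) can be sketched as follows. The column names and numbers are illustrative, not actual CRDC variables.

```python
import pandas as pd

# Hypothetical school-level CRDC extract; real CRDC variable names differ.
crdc = pd.DataFrame({
    "district_id": [1, 1, 1],
    "school_type": ["regular", "regular", "alternative"],
    "enrollment":  [500, 300, 100],
    "iss_white":   [10, 5, 2],
    "iss_black":   [8, None, 1],   # a missing racial count drops the school
})

count_cols = ["iss_white", "iss_black"]
keep = (
    crdc[crdc["school_type"] == "regular"]   # exclude alternative/vocational/etc.
        .dropna(subset=count_cols)           # exclude schools missing any racial count
)

# Aggregate surviving schools to the district and form prevalence proportions.
district = keep.groupby("district_id").agg(
    iss_white=("iss_white", "sum"),
    iss_black=("iss_black", "sum"),
    enrollment=("enrollment", "sum"),
)
district["iss_white_rate"] = district["iss_white"] / district["enrollment"]
district["iss_black_rate"] = district["iss_black"] / district["enrollment"]
print(district)
```

Here only the first school survives the exclusions (the second has a missing Black count, the third is alternative), so the district prevalence is computed from that school alone. Whether district enrollment is summed before or after exclusions is a design choice; this sketch uses post-exclusion enrollment.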
CCD Data
To investigate the relationship between Wi-Fi deployment and investment in traditional education resources, I linked USAC school district treatment information with the Common Core of Data (CCD) School District Finance Survey, which collects annual finance data on public school districts. I gathered school district expenditure data for K-12 from 2008–2009 to 2018–2019. The School District Finance Survey consists of data submitted annually to NCES by state education agencies (SEAs) in the 50 states and the District of Columbia. The survey provides finance data for all local education agencies (LEAs) that provide free public elementary and secondary (prekindergarten through grade 12) education in the United States. I obtain expenditure on elementary/secondary education, non-elementary/secondary education, and capital outlay.

5. Method
School districts are treated at different times after the reform. I use the variation in treatment timing to estimate the impacts of district investment in Wi-Fi on academic outcomes and subgroup outcome gaps. The baseline empirical approach compares outcomes in the treated districts to the untreated, before and after getting funds from the E-rate program, in a difference-in-differences (DD) framework:

Y_itgs = β Post_Funded_it + δ_t + γ_ig + ε_itgs    (1)

where Y_itgs is the average academic outcome of school district i in year t in grade g for subject s, and Post_Funded_it takes the value one in the first year school district i is treated and in all subsequent years. δ_t is a year fixed effect, and γ_ig is a district-grade fixed effect. Standard errors are clustered at the district level, rather than the district-grade level, to account for potential correlations and dependencies within districts over time. The regression is weighted by the number of students used to calculate the mean score for each district.
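Equation (1) can be illustrated on a small synthetic panel. Everything below (variable names, effect sizes, the simulated data) is illustrative rather than the paper's actual estimation code; the true treatment effect is set to -0.03 so the TWFE estimate can be checked against it.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Build a synthetic district-grade panel with staggered adoption.
rng = np.random.default_rng(0)
rows = []
for d in range(40):
    first_treat = 2016 + (d % 4) if d < 20 else None  # half never treated
    for g in (3, 4):
        fe = rng.normal()                              # district-grade fixed effect
        for t in range(2012, 2019):
            post = int(first_treat is not None and t >= first_treat)
            y = fe + 0.05 * (t - 2012) - 0.03 * post + rng.normal(scale=0.05)
            rows.append((d, g, t, post, y, 100))
df = pd.DataFrame(rows, columns=["district", "grade", "year",
                                 "post_funded", "score", "n_students"])
df["dg"] = df["district"].astype(str) + "-" + df["grade"].astype(str)

# WLS with year and district-grade fixed effects, weighted by student counts,
# standard errors clustered at the district level.
fit = smf.wls(
    "score ~ post_funded + C(year) + C(dg)", data=df, weights=df["n_students"]
).fit(cov_type="cluster", cov_kwds={"groups": df["district"]})
print(fit.params["post_funded"])
```

With a common effect and parallel trends built in, the TWFE coefficient recovers roughly -0.03; as the text notes next, under staggered adoption with heterogeneous effects this coefficient is no longer a clean average of treatment effects.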
Problems with staggered rollout in a DiD setting have drawn increasing attention in the recent literature. Several papers point out that the coefficient of interest (β in Equation (1)) is not easily interpretable and is not consistent for the ATT or ATE. The traditional TWFE estimator implicitly assumes homogeneous treatment effects, and the estimator is a weighted average of treatment effects. However, under staggered rollout DiD designs, restricting treatment effects to be homogeneous may lead to estimands that put negative weights on some comparisons between early and late treated groups (Borusyak, Jaravel, and Spiess, 2021; Goodman-Bacon, 2021). Modified TWFE estimators have been developed to address this issue (Borusyak, Jaravel, and Spiess, 2021; Callaway and Sant'Anna, 2020; Chaisemartin and D'Haultfoeuille, 2020). I use the efficient and robust estimator developed by Borusyak, Jaravel, and Spiess (2021) for my analysis. The imputation estimator developed in their paper is constructed so that the heterogeneity of treatment effects is not restricted. Their imputation procedure takes three steps. First, regress the outcome on unit and period fixed effects using untreated school districts only, to obtain the fitted values γ̂_i and δ̂_t. Second, use the fitted values to impute the untreated academic outcomes for treated school districts, thereby obtaining an estimated treatment effect β̂_its = Y_its − γ̂_i − δ̂_t for each treated observation. Finally, choose weights to take a weighted average of these treatment effect estimates. I derive overall ATTs for all outcomes following these steps. The underlying assumptions in BJS's setting are parallel trends and no anticipation effects, as in the traditional DiD design.
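The three imputation steps can be sketched on a synthetic panel with a known treatment effect of -0.04. This is a bare-bones illustration of the logic, not the published BJS estimator (which also chooses efficient weights and computes valid standard errors); all names and parameters are made up.

```python
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# Synthetic panel: 30 eventually-treated and 30 never-treated units.
rng = np.random.default_rng(1)
rows = []
for i in range(60):
    first = 2015 + (i % 3) if i < 30 else None
    alpha = rng.normal()                                # unit fixed effect
    for t in range(2010, 2019):
        treated = int(first is not None and t >= first)
        y = alpha + 0.1 * (t - 2010) - 0.04 * treated + rng.normal(scale=0.03)
        rows.append((i, t, treated, y))
df = pd.DataFrame(rows, columns=["unit", "year", "treated", "y"])

# Step 1: estimate unit and year fixed effects on untreated observations only.
fe = smf.ols("y ~ C(unit) + C(year)", data=df[df["treated"] == 0]).fit()

# Step 2: impute each treated observation's untreated outcome and take the
# difference as its observation-level treatment effect.
treated_obs = df[df["treated"] == 1].copy()
treated_obs["tau"] = treated_obs["y"] - fe.predict(treated_obs)

# Step 3: average the observation-level effects (equal weights here) for the ATT.
att = treated_obs["tau"].mean()
print(att)
```

Every treated unit has pre-treatment periods, so its unit fixed effect is identified in step 1; the equal-weight average in step 3 recovers roughly -0.04 here.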
The parallel trends assumption posits that the treated and control districts would have followed similar trends in student outcomes over time in the absence of the E-rate reform, while the no anticipation assumption asserts that school districts do not alter their behavior in response to expectations of the policy change before it officially takes effect. The BJS paper also imposes a parametric model of treatment effects to weaken the implicit assumption of homogeneous effects in the traditional TWFE model. In a recent working paper, Roth (2021) demonstrated that common pre-trends tests used to validate the plausibility of the parallel trends assumption can inadvertently introduce a survivor bias to estimates that pass these tests. To mitigate this issue, BJS (2020) propose an alternative parallel trends test that complements their imputation estimator: an F-test of the null hypothesis that the pre-period coefficients are jointly insignificant. I show results in Appendix Table A.1 for all of the subgroup outcomes and outcome gaps, using indicators for the four periods prior to Wi-Fi deployment. Overall, although a few variables of interest do show evidence of differential trends between the untreated and the treated units, there does not appear to be evidence of systematic violations of the parallel trends assumption. Notably, the tests for the achievement gap results between subgroups suggest the parallel trends assumption remains intact. Figure 3 and Figure 4 are selected event plots showing that the parallel trends assumption is not violated for the White-Black and White-Hispanic achievement gaps.

6. Results
6.1 Academic Outcomes
In this section, I present results of the impact of Wi-Fi on the achievement gap by race and, separately, the impact of Wi-Fi on achievement for each subgroup. Table 2 presents BJS estimates of the impact of school district investment in Wi-Fi service on the mathematics and English Language Arts (ELA) racial achievement gaps.
I begin by investigating the effect on average district achievement, where I observe a decrease of 0.015 standard deviations in the district mean math score and a decrease of 0.007 standard deviations in the ELA score. In the last three rows, I present the effects on the White-Black, White-Hispanic, and Asian-White achievement gaps in both subjects, respectively. The findings reveal an expansion of the achievement gap between advantaged and disadvantaged subgroups, with the most pronounced impact observed in the White-Black gap, particularly in math rather than ELA. Both the White-Black and White-Hispanic gaps increase, with the 0.034 standard deviation increase in the White-Black math gap being approximately 1.7 times larger than that in the White-Hispanic math gap. The final row displays a negative value of -0.024, indicating an enlargement of the Asian-White math gap. Given the potential difference in the use of digital learning tools between elementary and secondary school students, with secondary school students generally presumed to employ such tools more frequently (Gallup Inc., 2019), I partition the sample into elementary and middle school students. The results for different grade levels are detailed in Appendix Table A.2. However, due to the absence of discernible differences between elementary and middle school outcomes, I refrain from further subdividing the sample by grade in subsequent analyses. To gain deeper insight into whether the heightened achievement gap stems from benefiting or harming specific subgroups, I conduct separate regressions of achievement outcomes for each subgroup against the treatment status. The outcomes are presented in Table 3. First, focusing on the White-Black achievement gap: math scores of Black students decrease by 0.047 standard deviations, while those of White students remain unaffected.
This suggests that the introduction of Wi-Fi into the classroom adversely affects Black students in treated districts in comparison to their counterparts in untreated districts. Similarly, the panel for the White-Hispanic gap demonstrates a decrease in Hispanic students' math scores by 0.037 standard deviations and a decrease in their ELA scores by 0.021 standard deviations. Lastly, the increase in the Asian-White achievement gap is attributable to a decrease in the scores of White students, a subgroup with comparatively lower academic advantage than Asian students. Overall, the outcomes in both Table 2 and Table 3 underscore that the amplified White-Black, White-Hispanic, and Asian-White gaps primarily result from decreased academic scores within the relatively disadvantaged subgroup. In sections 7.1 and 7.2, I examine financial and behavioral factors to understand why the anticipated benefits of Wi-Fi are not fully realized. I find that the inefficient allocation of Wi-Fi funding could hinder disadvantaged students' ability to effectively adapt to and benefit from new technologies. It is also important to consider the role of home internet access when evaluating the impact of school internet, as the two can function as either substitutes or complements, potentially leading to different outcomes. As discussed in the next section, while school Wi-Fi aims to enhance educational opportunities, its benefits are not uniformly distributed. The data suggest that students who already have home internet access can leverage school Wi-Fi more effectively, using it as a supplementary resource to further their learning outside school hours. This additional advantage contributes to the negative impact observed in the achievement gaps between students with and without home internet access. Specifically, the lack of home internet access leaves disadvantaged students unable to continue learning at the same pace as their peers, thereby widening the achievement gap.
The analysis reveals that in districts with significant disparities in home internet access, school Wi-Fi exacerbates existing educational inequalities by primarily benefiting those who are already better positioned. Therefore, while school Wi-Fi can potentially support educational equity, its actual impact without complementary access at home suggests a widening rather than a narrowing of academic disparities. This underscores the need for holistic strategies that encompass not only school-based technological enhancements but also broader accessibility improvements to ensure all students benefit equally from digital educational resources. As shown in Appendix Table A.1, while individual subgroup outcomes do show evidence of differential trends between the untreated and treated units, there does not appear to be evidence of systematic violations of the parallel trends assumption for achievement gaps. Thus, I maintain the focus on the analysis of achievement gaps rather than isolating individual groups in the subsequent analysis.

6.2 Heterogeneity Analyses
It is also crucial to examine whether the achievement gap varies across districts with distinct demographic features. For example, is the racial achievement gap more pronounced in districts that are less economically or technologically advanced? I explore variations in effects based on several district demographic indicators: urban status, racial segregation index, and household internet access rate. I rely on 2015 district demographics for baseline comparisons, as that is the final year before school districts were influenced by the E-rate reform.

Rural-Urban Analysis
Previous studies suggest that while internet access like Wi-Fi is a foundational step, competency in using technology effectively is also critical. Rural schools, despite having access, might not have the requisite competencies due to broader challenges, leading to potential negative outcomes (Looker and Thiessen, 2003).
In some cases, without proper implementation and support, the introduction of Wi-Fi in rural schools can even exacerbate existing educational disparities. Table 4 presents the effects of Wi-Fi on the racial achievement gap, segmented by urban status. I divided school districts into two categories: Panel A includes districts where over half of the students attended rural schools in 2015, while Panel B comprises districts where more than half of the students were in urban schools in 2015. Comparing Panel A to Panel B in Table 4, it is evident that the widened achievement gap highlighted in Table 2 is primarily driven by disparities in rural areas. The math achievement gap between White and Black students is over twice as large in rural districts. Additionally, the English Language Arts (ELA) achievement disparity between White and Black students appears exclusively in rural areas. There is also a notable increase in the math achievement gap between White and Hispanic students, again evident only in rural districts. While there is an increase in the ELA gap between White and Hispanic students, the reliability of this result is questionable, as the p-value in Appendix Table A.1 indicates a potential violation of parallel trends for Hispanic ELA outcomes. The above findings indicate that the effectiveness of Wi-Fi usage in schools can be significantly influenced by the rural-urban status of a school district. While Wi-Fi and other educational technologies offer substantial opportunities for enhancing learning environments, rural school districts face unique challenges that may hinder their ability to fully harness these technologies' potential. For instance, rural districts often grapple with higher ongoing maintenance costs and infrastructure deficits, particularly in areas experiencing population decline and budget cuts.
These districts are pressured to implement cost-cutting measures to balance their budgets, which can further compromise the maintenance and expansion of technology infrastructure (Hawkes, Halverson, & Brockmueller, 2002). Conversely, urban school districts usually have the requisite infrastructure to support high-speed Wi-Fi and a broader array of technological resources. These districts are typically equipped with more structured procedures for teacher training and technological integration, contrasting with the more informal, and sometimes inadequate, approaches found in rural districts. Urban schools can allocate resources not only for maintaining existing technology but also for network management and the integration of new programs and hardware, which facilitates smoother and more effective integration of technologies like Wi-Fi. Urban teachers show higher levels of familiarity and confidence in using such technologies, which enhances the integration and effectiveness of Wi-Fi and other digital tools in urban classrooms (Wang, 2013). Rural areas, despite potentially having Wi-Fi in schools, might face slower internet speeds due to inadequate broadband infrastructure, which can severely limit the effective integration of online resources into teaching and frustrate both educators and students. Additionally, the lack of technical support in rural schools can complicate the resolution of technical issues, further hindering effective technology use. Moreover, the absence of consistent home internet access for rural students poses significant barriers to completing online assignments and continuing learning outside school settings. This exacerbates the digital divide between rural and urban areas, not only in terms of infrastructure but also in digital literacy skills (Wang, 2013).

Racial Segregation Analysis
Racial segregation of a school district refers to the situation where students of different racial or ethnic backgrounds attend separate schools.
Prior work has argued that segregation has harmful effects on disadvantaged individuals through various channels: reducing exposure to successful peers and role models, and decreasing funding for local public goods such as schools (Wilson 1987; Massey and Denton 1993). In this section, I evaluate the heterogeneous impact of Wi-Fi investment by the degree of racial segregation in a school district. I begin by measuring racial segregation using a Theil (1972) index, constructed using SEDA school covariates data in 2015. Let φ_r denote the fraction of individuals of race r in a given district, with four racial groups: Whites, Blacks, Hispanics, and others. I measure the level of racial diversity in the school district by an entropy index:

E = Σ_r φ_r log2(1/φ_r),

with φ_r log2(1/φ_r) = 0 when φ_r = 0. Letting j = 1, ..., N index schools in the school district, I analogously measure racial diversity within each school as

E_j = Σ_r φ_rj log2(1/φ_rj),

where φ_rj denotes the fraction of individuals of race r in school j. I define the degree of racial segregation in the district as

H = Σ_j (pop_j / pop_total) × (E − E_j) / E,

where pop_j denotes the total population of school j and pop_total denotes the total population of the school district. Intuitively, H measures the extent to which the racial distribution in each school deviates from the overall racial distribution in the school district. The segregation index H is maximized at H = 1 when there is no racial heterogeneity within schools, in which case E_j = 0 in all schools. It is minimized at H = 0 when all schools have a racial composition identical to the district as a whole, so that E_j = E. Table 5 presents the achievement gap based on levels of racial segregation. Panel A includes districts in the lowest segregation quintile, suggesting they experience less racial segregation, while Panel B comprises districts in the highest segregation quintile.
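The entropy-based Theil index defined above can be computed directly from school population counts and racial shares. The sketch below checks the two boundary cases stated in the text (H = 1 under complete within-school homogeneity, H = 0 when every school mirrors the district); the function and variable names are my own.

```python
import numpy as np

def entropy(shares):
    """Diversity E = sum_r phi_r * log2(1/phi_r), with the 0 * log2(1/0) := 0 convention."""
    s = np.asarray(shares, dtype=float)
    s = s[s > 0]                      # drop empty groups per the convention
    return float(np.sum(s * np.log2(1.0 / s)))

def theil_h(school_pops, school_shares):
    """Segregation H = sum_j (pop_j / pop_total) * (E - E_j) / E."""
    pops = np.asarray(school_pops, dtype=float)
    shares = np.asarray(school_shares, dtype=float)   # rows: schools, cols: races
    # District-level racial shares are the population-weighted school shares.
    district_shares = (shares * pops[:, None]).sum(axis=0) / pops.sum()
    E = entropy(district_shares)
    Ej = np.array([entropy(row) for row in shares])
    return float(np.sum(pops / pops.sum() * (E - Ej) / E))

# Two single-race schools of equal size: complete segregation, so H = 1.
h_max = theil_h([100, 100], [[1.0, 0.0], [0.0, 1.0]])
# Two schools that each mirror the district's 50/50 composition: H = 0.
h_min = theil_h([100, 100], [[0.5, 0.5], [0.5, 0.5]])
print(h_max, h_min)
```

The same function applies to the economic-segregation variant in section 6.2 by replacing racial shares with free or reduced-price lunch shares.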
A notable observation is the widening of the math achievement gap between White and Black students by 0.018 standard deviations in areas with higher racial segregation. Likewise, Appendix Table A.3 reveals a 0.021 standard deviation rise in the White-Black math achievement gap based on economic segregation. Economic segregation is determined in a similar manner using the Theil index, but using the percentage of students receiving free or reduced-price lunch for the calculation. I find no significant difference in student outcomes based on racial segregation. This suggests that the increase in the racial achievement gap is not strongly correlated with whether disadvantaged students are exposed to their more successful peers in terms of technology access.

Home Internet Gap
Wi-Fi in schools does not inherently ensure equitable access. For true access, students require devices like laptops or tablets. If students from less advantaged backgrounds lack these devices at home, their ability to continue learning outside school hours is compromised compared to their more privileged counterparts. Even when schools provide these devices, home environments can differ significantly. Students from more affluent backgrounds may have dedicated quiet spaces conducive to studying, while those from less advantaged backgrounds might be constrained by housing situations that do not offer such environments. It is therefore crucial to study whether disparities in home internet access contribute to the academic achievement gap.
Recognizing that traditionally disadvantaged student groups (such as Black and Hispanic students) typically have less home internet access than their advantaged counterparts (White students), and might therefore require additional time to learn new technology and navigate web pages, I propose two hypotheses: if, after the implementation of Wi-Fi in schools, White students show a relative increase in academic performance compared to Black students, school Wi-Fi is complementing home internet; conversely, if Black students exhibit comparative improvements over White students, school Wi-Fi is serving as a substitute for home internet access. In Figure 2, I show the relationship between the proportion of households with broadband in a county and the White-Black home internet access gap. In less technologically advanced areas, where a smaller proportion of households have broadband, Whites are much more likely than Blacks to have broadband at home. I thus investigate the heterogeneous impact of school Wi-Fi investment by the racial internet access gap, measured as the difference in broadband subscription rates between Whites and Blacks in a county in 2015. Using data from the ACS, which provides county-level broadband subscription rates by race, I calculated the gap in home internet access and present the racial achievement gap by home internet access gap in Table 6. Panel A includes districts in the lowest quintile, characterized by the smallest racial household internet access gap (White-Black, White-Hispanic, Asian-White). In contrast, Panel B includes districts in the highest quintile. Evidently, regions with pronounced disparities in home internet access, as observed in Panel B, exhibit larger academic achievement gaps.
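The gap measure and quintile split described above amount to a subtraction and a quantile cut at the county level. A minimal sketch on made-up ACS-style rates (the county labels, rates, and column names are all hypothetical):

```python
import pandas as pd

# Hypothetical county-level 2015 broadband subscription rates by race.
acs = pd.DataFrame({
    "county":          ["A", "B", "C", "D", "E"],
    "broadband_white": [0.85, 0.80, 0.75, 0.70, 0.60],
    "broadband_black": [0.83, 0.72, 0.60, 0.52, 0.40],
})

# White-Black home internet access gap: White rate minus Black rate.
acs["wb_gap"] = acs["broadband_white"] - acs["broadband_black"]

# Assign counties to quintiles of the gap (1 = smallest gap, 5 = largest);
# districts then inherit their county's quintile for the Panel A/B split.
acs["gap_quintile"] = pd.qcut(acs["wb_gap"], 5, labels=False) + 1
print(acs[["county", "wb_gap", "gap_quintile"]])
```

With five counties, each lands in its own quintile here; in the actual analysis the cut is taken over all counties with available ACS estimates.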
The math achievement gap between White and Black students expands by 0.032 standard deviations, and the gap between White and Hispanic students increases by 0.037 standard deviations. The results in Panel B indicate that home internet is more likely to serve as a complement to school Wi-Fi in less technologically advanced areas (those with a larger home internet access gap), where the racial achievement gaps widen the most.

7. Channels
7.1 The Financial Channel
Prior research underscores that the mere availability of Wi-Fi does not suffice. Effective integration into the educational framework, backed by adequate training, professional development, and alignment with curriculum goals, is paramount. Upon approval of a school district's application, E-rate grants the district a discount rate, which determines the percentage of the total funds requested that will be paid by E-rate and the percentage the school district will pay. This prompts an inquiry into whether districts, upon receiving this grant, augment their expenditure to bolster Wi-Fi deployment or conversely curtail investments in other educational resources. Note that the treatment definition is fuzzy in my setting: a school district is treated in year t only when the aggregate number of students in treated schools exceeds half of the students in the district. A district may therefore receive some E-rate funding yet not be defined as treated, because the number of affected students is less than half of the district's total. I employ the regression below to evaluate the differential effect of an additional E-rate grant dollar on treated versus control districts, focusing on the coefficient β_2, which measures the difference between treated and control school districts:

Y_it = β_0 + β_1 Grant_it + β_2 Grant_it × Treat_i + β_3 perfrl_it + β_4 SES_it + β_5 Discount_it + γ_i + ε_it

where Grant_it signifies the per-student E-rate Wi-Fi grant received by the district.
I also control for the percentage of free or reduced-price lunch students in the district, as this is a determining factor for how much discount a district will get from E-rate. Additionally, I control for the discount rate the school district receives and the composite socioeconomic status (see footnote 1). Table 7 presents the impact of receiving an extra E-rate dollar on other school district educational expenditures. The first column presents the three primary expenditures from the CCD data: elementary/secondary education, non-elementary/secondary education, and capital outlay. In the second column, I break down total expenditure to further investigate how the Wi-Fi grant changes the expenditure structure (see footnote 2). In comparison with control districts (which may receive some E-rate funding, but not enough to cover over half of their student population), the treated districts tend to increase their educational expenditure by $0.87. Breaking down this expenditure further, I observe that for every dollar of grant received, these districts allocate an added $0.34 towards instruction and $0.50 towards support services.

Footnote 1: SEDA uses data from the ACS to construct measures of median family income, the proportion of adults with a bachelor's degree or higher, the proportion of adults that are unemployed, the household poverty rate, the proportion of households receiving SNAP benefits, and the proportion of households with children that are headed by a single mother, which are used to construct the composite SES.
Footnote 2: In the breakdown of total elementary/secondary education expenditure, expenditure includes instruction and support services. The non-elementary/secondary education expenditure encompasses costs for food and enterprises, which are not pertinent to this context.
The capital outlay covers construction, land, and equipment, but only the categories relevant to this study are included. The $0.34 increment in "expenditure on instruction" is channeled towards teacher salaries and benefits, but excludes costs linked to instructional support activities, which are paramount for the effective deployment of Wi-Fi. The $0.50 rise in "support expenditure" covers areas like maintenance, instructional and student support, business support, and others. While these results suggest that Wi-Fi grants might encourage districts to increase their investments in resources complementing digital classroom integration, a closer look at the support services expenditure shows only a small increase in spending for both instructional staff and student support, each around $0.07. These categories, however, are crucial indicators of whether school districts are allocating resources effectively to enhance Wi-Fi utilization in classrooms. Specifically, expenditure on instructional staff encompasses costs for supervision, curriculum development, instructional staff training, and support services. These are vital elements in equipping teachers to adapt to and effectively use new technologies, which is particularly important in serving low-income students and areas. Moreover, as indicated in the last category of Table 7, there is no significant increase in capital outlay for instructional equipment like laptops, which is essential to fully leverage Wi-Fi in classrooms. In Appendix Table A.4, the results are analyzed using a logarithmic transformation of the dependent variable (other expenditures) to examine the percentage changes brought about by Wi-Fi investments, offering insights into how these investments impact various expenditure categories disproportionately. The findings from this logarithmic model are consistent with those from the original model, both underscoring a potential misallocation of Wi-Fi funding.
While the absolute dollar increases in non-educational investments, such as food services and enterprise operations, are modest, amounting to only a $0.015 increase per dollar of grant received per student, these categories exhibit the most significant relative increases, at 0.059% for non-educational expenditure compared to a mere 0.006% increase in educational expenditure. This disparity suggests a strategic reallocation of funds that may not directly enhance educational outcomes or effectively support Wi-Fi utilization. The implications of these findings are significant, as they suggest that funds are potentially being directed to areas that, despite receiving increased funding, do not contribute directly to the educational use of technology. This trend echoes the initial analysis, which revealed a restrained increase in spending on crucial educational support areas like instructional staff and pupil support, indicating that a disproportionate amount of funds is being diverted to less impactful areas. These findings suggest that, although treated districts may experience a rise in overall expenditures after obtaining an E-rate grant, the augmented investments primarily channel into infrastructure maintenance and teacher compensation, or non-educational areas, rather than fostering staff training, student assistance, or the acquisition of technologically advanced equipment. These allocation patterns could unintentionally dampen the effectiveness of Wi-Fi in educational environments, particularly given deficiencies in equipment availability, inadequate teacher training, and insufficient student support. It is important to acknowledge a caveat concerning the data source used in this study. The results shown in Table 7 represent total K-12 student expenditure across districts, going beyond the specific focus on grades 3-8 captured by the SEDA data, which forms the basis for examining student achievement outcomes in my study.
Thus, while the findings on other education expenditures offer some insights into the interpretation of racial disparities in academic achievement, it is essential to recognize the potential for disparities or inconsistencies that may emerge due to the broader scope of the data.

7.2 The Behavioral Channel
Many studies suggest that unrestricted internet access can lead to off-task behaviors, such as browsing non-educational websites, playing online games, or engaging in social media during class time (Kirschner & De Bruyckere, 2017). These distractions can interrupt the learning process and contribute to decreased academic performance. Focusing on a range of school disciplinary problems collected in the CRDC data, Table 8 presents the results of the impact of Wi-Fi investment on several school disciplinary outcomes: harassment and bullying, retention, in-school suspension (ISS), out-of-school suspension (OSS), and expulsion. I find no impact of school Wi-Fi on harassment and bullying, retention, or out-of-school suspension. The effect of school Wi-Fi on in-school suspension is statistically significant and positive. Introducing Wi-Fi at school increases in-school suspension by 0.042% for White students and 0.067% for Black students. The expulsion rate decreases by 0.067% for White students. The results are consistent with previous findings: school internet investment tends to lead to an increase in less serious disciplinary problems. A previous study (Chen, Mittal, and Sridhar, 2021) finds that school district internet access spending consistently led to an increase in less serious offense-related problems like assault, but not the most serious offense-related problems like aggravated assault. In-school suspensions are more common and less severe than out-of-school suspensions and expulsions. ISS removes students from classroom activities, but not from school premises, and usually lasts fewer than 10 days.
Out-of-school suspensions and expulsions, by contrast, are typically used as a last resort. Furthermore, although the 0.042% and 0.067% estimates are statistically significant, they are economically small: an effect size of 0.042% corresponds to 42 more ISS incidents reported per 100,000 students. Thus, from a school district's perspective, the potential disciplinary problems caused by having Wi-Fi at school should not be a major concern. Because of data constraints, this study focuses on the disciplinary issues tracked in the CRDC and does not capture minor classroom problems such as visiting unauthorized websites, getting distracted by games, or participating in online bullying; school administrators should nonetheless remain mindful of these concerns.

8. Conclusion

Taken together, the results suggest that, on average, introducing Wi-Fi in schools leads to an increase in both the White-Black and White-Hispanic achievement gaps, with a more pronounced negative impact on math than on ELA. Furthermore, the effect is larger in less advantaged subgroups and areas. The effect sizes of the decrease in math scores range from 0.01 to 0.06 of a standard deviation across subgroups and samples. To contextualize these findings, Dettling, Goodman, and Smith (2018) report a positive effect of internet access on SAT scores of 0.003 standard deviations, while Vigdor, Ladd, and Martinez (2014) find that internet access decreases math test scores by 0.027 standard deviations in North Carolina. Summarizing the heterogeneity analyses, I conclude that the increase in the achievement gap is primarily due to the decline in math test scores among disadvantaged subgroups, notably Black and Hispanic students. Additionally, less advantaged areas, such as rural regions, areas with higher racial segregation, and areas with a larger home internet access gap, are more adversely affected.
In conclusion, while Wi-Fi in schools holds the potential to enhance student outcomes, its effectiveness depends on strategic implementation, thorough teacher training, and adequate student digital literacy training. Merely deploying Wi-Fi and providing students with unrestricted access can lead to distractions and undesirable disciplinary behaviors, so it is imperative to guide students in navigating the digital realm both safely and productively. Moreover, the integration of Wi-Fi into the educational environment is vital. As the findings suggest, treated school districts did not necessarily augment their expenditures on accompanying educational resources, such as training for teachers or support for students, to ensure effective instruction. It is particularly essential to assist disadvantaged students in adapting to new classroom technologies. If the adverse effects on these students stem from the additional time they require to familiarize themselves with new technologies, one might expect these impacts to diminish in the long term as students and teachers become more adept with the new tools and methods. Because the reform spanned 2015 to 2020, I do not have outcome data for an extended period after Wi-Fi deployment. However, once more data become available, future studies might explore whether the observed negative effects can be attributed to this "learning curve" phenomenon.

Figures

Figure 1. Number of E-rate applications for internal connection services made by each district

Figure 2. Percent of households with broadband in a county and the White-Black home internet gap

Figure 3. White-Black Math Gap Estimates

Figure 4. White-Hispanic Math Gap Estimates

Tables

Table 1.
Comparison of treated and untreated school districts at baseline

Characteristics                          Treated    Untreated
% in rural schools                          22          17
% Black                                     23          15
% eligible for free or reduced lunch        53          53
% economically disadvantaged                56          55
% English language learner                   7           9
% special education                         13          13
Poverty rate                                16          15
Unemployment rate                           11          10
SNAP receipt rate                           13          12
Single mother HH rate                       21          20
Log of median income                        10.8        10.9
Average enrollment                        1287.9      1985.3
Number of districts                          348        4930

Notes: Characteristics are weighted by the number of students in the district.

Table 2. BJS Estimates of the Impact of Wi-Fi Investment on Subgroup Achievement Gaps

Dependent variable               Math                 ELA
District mean achievement    -0.015*** (0.004)    -0.006* (0.003)
  Observations                   260,222              268,097
White-Black achievement gap   0.034*** (0.005)     0.017*** (0.005)
  Observations                    58,684               60,277
White-Hispanic gap            0.020*** (0.004)     0.018*** (0.004)
  Observations                    70,283               72,268
White-Asian gap              -0.024*** (0.008)    -0.013* (0.007)
  Observations                    30,888               31,222

Notes: Regressions are weighted by the number of students used to calculate the mean score for each district. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors are clustered at the district level.

Table 3. BJS Estimates of the Impact of Wi-Fi Investment on Achievement, by Race

Dependent variable           Math                 ELA
White-Black gap           0.034*** (0.005)     0.017*** (0.005)
  White                   0.009 (0.005)        0.010** (0.004)
  Black                  -0.047*** (0.006)    -0.006 (0.006)
  Observations               58,684               60,277
White-Hispanic gap        0.020*** (0.004)     0.018*** (0.004)
  White                   0.003 (0.005)        0.004 (0.005)
  Hispanic               -0.037*** (0.007)    -0.021*** (0.006)
  Observations               70,283               72,268
White-Asian gap          -0.024*** (0.008)    -0.013* (0.007)
  White                  -0.014** (0.006)     -0.012*** (0.005)
  Asian                  -0.007 (0.012)       -0.003 (0.011)
  Observations               30,888               31,222

Notes: SEDA data are available for grades 3-8. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.
Regressions are weighted by the number of students used to calculate the mean score for each district.

Table 4. BJS Estimates of the Impact of Wi-Fi Investment on Achievement Gap, by Urban Status

Panel A: districts with more students in rural schools
Dependent variable         Math                ELA
White-Black gap         0.058*** (0.014)    0.044*** (0.017)
  Observations             12,354              12,264
White-Hispanic gap      0.025*** (0.006)    0.021 (0.019)
  Observations             10,911              11,110
White-Asian gap        -0.029 (0.041)      -0.036 (0.040)
  Observations              1,991               1,857

Panel B: districts with more students in urban schools
Dependent variable         Math                ELA
White-Black gap         0.025*** (0.008)    0.009 (0.008)
  Observations             46,330              48,013
White-Hispanic gap      0.021 (0.016)       0.021*** (0.006)
  Observations             59,372              61,158
White-Asian gap        -0.020 (0.012)      -0.015 (0.011)
  Observations             28,897              29,365

Notes: Regressions are weighted by the number of students used to calculate the mean score for each district. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.

Table 5. BJS Estimates of the Impact of Wi-Fi Investment on Achievement Gap, by Racial Segregation

Panel A: less racially segregated
Dependent variable         Math               ELA
White-Black gap        -0.035 (0.031)     -0.025 (0.032)
  Observations              3,519              3,624
White-Hispanic gap     -0.009 (0.022)      0.004 (0.026)
  Observations              4,679              4,904
White-Asian gap         0.013 (0.057)     -0.040 (0.048)
  Observations              1,635              1,630

Panel B: more racially segregated
Dependent variable         Math               ELA
White-Black gap         0.017* (0.010)     0.001 (0.010)
  Observations             17,098             17,495
White-Hispanic gap      0.010 (0.009)      0.012 (0.010)
  Observations             19,876             20,139
White-Asian gap        -0.027 (0.017)     -0.016 (0.016)
  Observations             11,207             11,151

Notes: Panel A represents districts in the bottom segregation quintile (less racially segregated); Panel B represents districts in the top segregation quintile.

Table 6.
BJS Estimates of the Impact of Wi-Fi Investment on Achievement, by Home Internet Access Gap

Panel A: small home internet access gap
Dependent variable         Math               ELA
White-Black gap         0.011 (0.016)     -0.015 (0.014)
  Observations              5,948              6,093
White-Hispanic gap      0.019 (0.013)      0.008 (0.016)
  Observations              7,304              7,230
White-Asian gap         0.002 (0.036)      0.005 (0.038)
  Observations                920                935

Panel B: large home internet access gap
Dependent variable         Math               ELA
White-Black gap         0.032** (0.016)    0.008 (0.020)
  Observations              6,557              6,726
White-Hispanic gap      0.037** (0.016)   -0.001 (0.015)
  Observations              7,861              7,894
White-Asian gap        -0.012 (0.046)     -0.046 (0.042)
  Observations              2,547              2,423

Notes: Panel A includes districts in counties with a small home internet access gap in 2015; Panel B includes districts in counties with a large home internet access gap in 2015.

Table 7. Impact of Wi-Fi Investment on Other Expenditures

Dependent variable                          Total Expenditure (1)   Expenditure Breakdown (2)
Total Elementary/Secondary Education         0.865*** (0.160)
  Expenditure on Instruction                                         0.340*** (0.079)
  Expenditure on Support                                             0.499*** (0.104)
    Maintenance                                                      0.257*** (0.079)
    Instructional Staff                                              0.073*** (0.018)
    Pupils                                                           0.069*** (0.021)
    Business Support                                                 0.046* (0.026)
    General Administration                                           0.038*** (0.012)
    Student Transportation                                          -0.013 (0.012)
Total Non-Elementary/Secondary Education     0.015* (0.009)
Total Capital Outlay                         0.394 (0.449)
  Land and Existing Structure                                        0.030 (0.045)
  Instructional Equipment                                           -0.045 (0.043)
Observations                                 4,064                   4,064

Notes: District expenditures are measured for K-12 from 2008-09 to 2018-19 in the CCD. Estimates reflect the dollar change in other expenditures between treated and control districts for each dollar of grant received by the district. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.

Table 8.
BJS Estimates of the Impact of Wi-Fi Investment on School Disciplinary Problems

Effect sizes                 White Students      Black Students
                             (N = 14,948)        (N = 13,771)
Retention                     0.007 (0.005)      -0.005 (0.006)
In-school Suspension          0.042** (0.020)     0.067** (0.028)
Out-of-school Suspension      0.001 (0.002)       0.011 (0.031)
Expulsion                    -0.067* (0.035)     -0.023 (0.017)
Harassment and Bullying       0.000 (0.002)       0.000 (0.001)

Notes: The retention rate is calculated for grades 3-8; other rates are calculated for K-12. Numbers are in percent (0.1 means 0.1%). Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.

REFERENCES

Angrist, J., & Lavy, V. (2002). New evidence on classroom computers and pupil learning. The Economic Journal, 112(482), 735-765.

Belo, R., Ferreira, P., & Telang, R. (2014). Broadband in school: Impact on student performance. Management Science, 60(2), 265-282.

Borusyak, K., Jaravel, X., & Spiess, J. (2021). Revisiting event study designs: Robust and efficient estimation. Working paper.

Callaway, B., & Sant'Anna, P. H. (2021). Difference-in-differences with multiple time periods. Journal of Econometrics, 225(2), 200-230.

Chen, Y., Mittal, V., & Sridhar, S. (2021). Investigating the academic performance and disciplinary consequences of school district internet access spending. Journal of Marketing Research, 58(1), 141-162.

Cheung, A. C., & Slavin, R. E. (2013). The effectiveness of educational technology applications for enhancing mathematics achievement in K-12 classrooms: A meta-analysis. Educational Research Review, 9, 88-113.

Cuban, L., Kirkpatrick, H., & Peck, C. (2001). High access and low use of technologies in high school classrooms: Explaining an apparent paradox. American Educational Research Journal, 38(4), 813-834.

Office for Civil Rights. (2018). 2015-16 Civil Rights Data Collection: School Climate and Safety. U.S. Department of Education, Office for Civil Rights.

De Chaisemartin, C., & d'Haultfoeuille, X. (2020).
Two-way fixed effects estimators with heterogeneous treatment effects. American Economic Review, 110(9), 2964-96.

Demirbilek, M., & Talan, T. (2018). The effect of social media multitasking on classroom performance. Active Learning in Higher Education, 19(2), 117-129.

Dettling, L. J., Goodman, S., & Smith, J. (2018). Every little bit counts: The impact of high-speed internet on the transition to college. Review of Economics and Statistics, 100(2), 260-273.

Education Superhighway (2018). We have made tremendous progress connecting our students. Retrieved from https://stateofthestates.educationsuperhighway.org/2018/#national

Education Superhighway (2019). The classroom connectivity gap is now closed. Retrieved from https://stateofthestates.educationsuperhighway.org/#national

Faber, B., Sanchis-Guarner, R., & Weinhardt, F. (2016). Faster broadband: Are there any educational benefits? (No. 480). Centre for Economic Performance, LSE.

Felisoni, D. D., & Godoi, A. S. (2018). Cell phone usage and academic performance: An experiment. Computers & Education, 117, 175-187.

Fried, C. B. (2008). In-class laptop use and its effects on student learning. Computers & Education, 50(3), 906-914.

Furenes, M. I., Kucirkova, N., & Bus, A. G. (2021). A comparison of children's reading on paper versus screen: A meta-analysis. Review of Educational Research, 91(4), 483-517.

Glass, A. L., & Kang, M. (2019). Dividing attention in the classroom reduces exam performance. Educational Psychology, 39(3), 395-408.

Gluckman, P. (2018). The digital economy and society: A preliminary commentary. Policy Quarterly, 14(1).

Goodman-Bacon, A. (2021). Difference-in-differences with variation in treatment timing. Journal of Econometrics, 225(2), 254-277.

Goolsbee, A., & Guryan, J. (2006). The impact of Internet subsidies in public schools. The Review of Economics and Statistics, 88(2), 336-347.

Grzeslo, J., Bai, Y., Min, B., & Jayakar, K. (2019). Is the 2014 E-Rate reform a game changer?
An empirical analysis of Pennsylvania data. Digital Policy, Regulation and Governance.

Hawkes, M., Halverson, P., & Brockmueller, B. (2002). Technology facilitation in the rural school: An analysis of options. Journal of Research in Rural Education, 17(3), 162-170.

Hazlett, T. W., Schwall, B., & Wallsten, S. (2019). The educational impact of broadband subsidies for schools under E-rate. Economics of Innovation and New Technology, 28(5), 483-497.

Inan, F. A., & Lowther, D. L. (2010). Factors affecting technology integration in K-12 classrooms: A path model. Educational Technology Research and Development, 58, 137-154.

Kirschner, P. A., & De Bruyckere, P. (2017). The myths of the digital native and the multitasker. Teaching and Teacher Education, 67, 135-142.

Logan, J. R., & Burdick-Will, J. (2016). School segregation, charter schools, and access to quality education. Journal of Urban Affairs, 38, 323-343.

Looker, E. D., & Thiessen, V. (2003). Beyond the digital divide in Canadian schools: From access to competency in the use of information technology. Social Science Computer Review, 21(4), 475-490.

Massey, D. S., & Denton, N. A. (2019). American apartheid: Segregation and the making of the underclass. In Social Stratification (pp. 660-670). Routledge.

Oh, S. (2014). Effects of the discount matrix on e-rate funds from 1998 to 2012. Telecommunications Policy, 38(11), 1069-1084.

Prestridge, S. (2017). Analysing the data: Data analytics in professional development. Teaching and Teacher Education, 63, 90-103.

Reardon, S. F., Kalogrides, D., & Ho, A. D. (2021). Validation methods for aggregate-level test scale linking: A case study mapping school district test score distributions to a common scale. Journal of Educational and Behavioral Statistics, 46(2), 138-167.

Reardon, S. F., & Owens, A. (2014). 60 years after Brown: Trends and consequences of school segregation. Annual Review of Sociology, 40, 199-218.

Rienties, B., & Brouwer, N. (2013).
The effects of online professional development on higher education teachers' beliefs and intentions towards learning facilitation and technology. Teaching and Teacher Education, 29, 122-131.

Rouse, C. E., & Krueger, A. B. (2004). Putting computerized instruction to the test: A randomized evaluation of a "scientifically based" reading program. Economics of Education Review, 23(4), 323-338.

Scherer, R., Siddiq, F., & Tondeur, J. (2019). The technology acceptance model (TAM): A meta-analytic structural equation modeling approach to explaining teachers' adoption of digital technology in education. Computers & Education, 128, 13-35.

Theil, H. (1992). Statistical decomposition analysis with applications in the social and administrative sciences. Journal of Econometrics, 3(3), 319-319.

Van Deursen, A. J., & Van Dijk, J. A. (2019). The first-level digital divide shifts from inequalities in physical access to inequalities in material access. New Media & Society, 21(2), 354-375.

Vigdor, J. L., Ladd, H. F., & Martinez, E. (2014). Scaling the digital divide: Home computer technology and student achievement. Economic Inquiry, 52(3), 1103-1119.

Violette, D. (2017). A study of internet spending and graduation rates: A correlational study (Doctoral dissertation, University of Central Florida).

Wang, P. Y. (2013). Examining the digital divide between rural and urban schools: Technology availability, teachers' integration level and students' perception. Journal of Curriculum and Teaching, 2(2), 127-139.

Warschauer, M., & Matuchniak, T. (2010). New technology and digital worlds: Analyzing evidence of equity in access, use, and outcomes. Review of Research in Education, 34(1), 179-225.

Weston, M. E., & Bain, A. (2010). The end of techno-critique: The naked truth about 1:1 laptop initiatives and educational change. Journal of Technology, Learning, and Assessment, 9(6).

Wilson, W. J. (2012). The truly disadvantaged: The inner city, the underclass, and public policy.
University of Chicago Press.

APPENDIX

Table A.1. BJS Estimates, Parallel Trend Tests

                          Math     ELA
White-Black Gap           0.298    0.490
White-Hispanic Gap        0.706    0.152
White-Asian Gap           0.942    0.519
White only                0.594    0.682
Black only                0.671    0.148
Hispanic only             0.892    0.003
Asian only                0.010    0.045

Notes: These are p-values for a test of parallel trends in the four periods prior to Wi-Fi deployment. The test used is described in Borusyak, Jaravel, and Spiess (2021).

Table A.2. BJS Estimates of the Impact of Wi-Fi Investment on Subgroup Achievement Gaps, by Grade

Dependent variable          Elementary           Middle
White-Black Math gap      0.035*** (0.008)    0.031*** (0.009)
White-Black ELA gap       0.017** (0.009)     0.015* (0.008)
White-Hispanic Math gap   0.024*** (0.007)    0.017** (0.007)
White-Hispanic ELA gap    0.023*** (0.007)    0.015** (0.007)
White-Asian Math gap     -0.026** (0.012)    -0.017 (0.016)
White-Asian ELA gap      -0.004 (0.012)      -0.025* (0.014)

Notes: Regressions are weighted by the number of students used to calculate the mean score for each district. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.

Table A.3. BJS Estimates of the Impact of Wi-Fi Investment on Achievement Gap, by Economic Segregation

Panel A: less economically segregated
Dependent variable        Math               ELA
White-Black gap       -0.063** (0.032)   -0.048 (0.035)
White-Hispanic gap     0.004 (0.022)      0.011 (0.025)
White-Asian gap       -0.001 (0.056)     -0.036 (0.048)

Panel B: more economically segregated
Dependent variable        Math               ELA
White-Black gap        0.021** (0.010)    0.007 (0.009)
White-Hispanic gap     0.007 (0.008)      0.005 (0.009)
White-Asian gap       -0.024 (0.015)     -0.012 (0.014)

Notes: Panel A represents districts in the bottom segregation quintile (less economically segregated); Panel B represents districts in the top segregation quintile.

Table A.4.
Impact of Wi-Fi Investment on the Percentage Change in Other Expenditures

Dependent variable                          Total Expenditure (1)   Expenditure Breakdown (2)
Total Elementary/Secondary Education         0.006*** (0.000)
  Expenditure on Instruction                                         0.005*** (0.000)
  Expenditure on Support                                             0.008*** (0.000)
    Maintenance                                                      0.011*** (0.000)
    Instructional Staff                                              0.013*** (0.000)
    Pupils                                                           0.008*** (0.000)
    Business Support                                                 0.008*** (0.000)
    General Administration                                           0.011*** (0.000)
    Student Transportation                                          -0.002 (0.000)
Total Non-Elementary/Secondary Education     0.059*** (0.000)
Total Capital Outlay                         0.035 (0.000)
  Land and Existing Structure                                        0.064 (0.000)
  Instructional Equipment                                           -0.025 (0.000)
Observations (number of districts)           4,064                   4,064

Notes: District expenditures are measured for K-12 from 2008-09 to 2018-19 in the CCD. Estimates reflect the percentage change in other expenditures between treated and control districts for each dollar of grant received by the district. The coefficient values have been multiplied by 100 to convert into percentage terms, so a coefficient value of 0.008 is interpreted as 0.008%. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Cluster-robust standard errors in parentheses.

CHAPTER 2: THE SPILLOVER EFFECT OF THIRD-GRADE RETENTION POLICY ON EARLIER GRADES

1. Introduction

The third-grade year is pivotal in elementary education, marking the shift from learning basic reading skills to using these skills for more advanced learning. By the end of third grade, it is critical that students are able to read proficiently, as this skill is foundational for their subsequent educational success. Proficiency at this stage is usually assessed through standardized tests or similar evaluations. Failure to reach this milestone places children on a trajectory towards lower educational and economic outcomes, significantly impacting national productivity and economic health (Annie E. Casey Foundation, 2010).
Research highlights that students who do not achieve reading proficiency by third grade are four times more likely to drop out of high school than their proficient counterparts, and a significant proportion of students who do not graduate high school on time (63 percent) were reading below grade level in third grade (Schimke and Rose, 2012). In response to these findings, several states have implemented legislation that mandates retention for third graders who do not meet reading proficiency standards. For instance, in 1998, California enacted a law requiring third graders to attain a specific score on a statewide reading examination or meet a set literacy benchmark in order to progress to the fourth grade. Currently, legislation in seventeen states and Washington, D.C., targets the enhancement of third-grade reading proficiency through measures that include early identification, intervention, and, if necessary, retention. Additionally, eight states have policies that permit, but do not require, retention based on reading proficiency. Figure 5 shows a map of states with varying retention requirements, marked in different colors from dark green (where retention is required) to light green (where it is not). Several factors contribute significantly to third-grade reading success, including school readiness, regular school attendance, access to summer learning programs, reduction of family stressors, and exposure to high-quality teaching; early interventions that address these areas can substantially enhance a child's ability to meet reading proficiency standards (Reading Partners, 2013). The details of these third-grade reading policies vary by state, but all aim to ensure early identification, intervention, and retention of students struggling with reading from PreK through grade 3. As of 2019, 17 states plus the District of Columbia mandated retention for third-grade students who failed to meet the expected reading levels.
Moreover, 36 states and the District of Columbia require a reading assessment at some point between PreK and third grade, primarily to identify students with reading deficiencies. These assessments are administered through a combination of state mandates and local decisions. Additionally, 33 states and the District of Columbia either require or recommend that school districts provide some form of intervention or remediation for students identified as struggling readers within these grades. Some states specify the interventions to be used, while others offer a range of suggested interventions (Workman, 2014). Understanding parents' and students' responses to these retention policies, and the proactive measures they take to avoid future retention, is crucial. In states with retention mandates, third-grade students must meet specific criteria to advance to the next grade, so the policies affect them directly. Students in grades K-2 are also affected by these policies through the mechanisms of early identification and intervention; as a result, students and teachers in the earlier grades can prepare adequately for potential challenges. Although the broader impact of grade retention on future academic success is well documented, there is limited research on how third-grade retention policies influence earlier educational decisions. State-specific studies in Florida (Schimke and Rose, 2012) and Michigan (Michigan Department of Education, 2020) have shown an increase in retention rates in kindergarten and first grade, but these studies are confined to individual states. This paper addresses this gap in the literature by using variation in the timing of the introduction of third-grade retention policies across states to estimate their effects on retention and kindergarten entrance age decisions among K-2 students.
Utilizing a difference-in-differences (DiD) framework, this study compares outcomes in states with third-grade retention policies to those without, both before and after the policies were introduced. Throughout the analysis, outcomes are presented by children's socioeconomic status (SES), because previous studies suggest that early childhood education is particularly beneficial for children from low-income families, who may be more responsive to retention policies and more likely to take proactive measures to avoid retention. To measure SES, I construct an index combining both parents' educational levels and family income and divide it into five quintiles: children in the highest quintile are categorized as high SES, and those in the remaining four quintiles as low SES. Initial analyses focus on the influence of third-grade retention policies on retention rates among K-2 students. The findings indicate that the introduction of these policies significantly reduces the likelihood of retention among kindergarten boys from low-SES backgrounds, with a decrease of 1.66 percentage points in retention rates according to the DiD estimates. Similarly, the retention likelihood for second-grade boys from high-SES backgrounds decreases by 2.63 percentage points. The impact appears insignificant at other grade levels, likely due to the rarity of retention within those groups. This paper also examines the impact of these policies on kindergarten entrance age, again segmenting the data by SES. Given the requirement across the U.S. that children be five years old by a designated cutoff date to begin kindergarten, parents face a critical decision for children born near this cutoff. The study hypothesizes that third-grade retention policies may influence parental decisions about whether to advance or delay their child's entry into kindergarten to better prepare them for potential retention.
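As a sketch of how such an SES index might be constructed, one could standardize each component, average them, and cut the result into quintiles. The variable names and values below are hypothetical illustrations, not the CPS variables used in the study:

```python
import pandas as pd

# Hypothetical household records; names and values are illustrative only.
df = pd.DataFrame({
    "mother_educ_years": [12, 16, 10, 14, 18, 12, 11, 16, 13, 15],
    "father_educ_years": [12, 18,  9, 14, 16, 10, 12, 17, 12, 14],
    "family_income":     [30_000, 120_000, 18_000, 55_000, 150_000,
                          28_000, 22_000, 110_000, 40_000, 70_000],
})

# Standardize each component and average into a single SES index.
z = (df - df.mean()) / df.std()
df["ses_index"] = z.mean(axis=1)

# Cut the index into quintiles (1 = lowest, 5 = highest);
# the top quintile is labeled high SES, the rest low SES.
df["ses_quintile"] = pd.qcut(df["ses_index"], 5, labels=False) + 1
df["high_ses"] = df["ses_quintile"] == 5
```

Equal-weighting the standardized components is one simple choice; the study's exact weighting of parental education and income may differ.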
However, the analysis indicates only a marginal shift, with girls from low-SES families starting kindergarten approximately 25 days earlier, suggesting that these policies minimally affect kindergarten entrance timing. Additionally, while isolated instances of significant changes in retention rates appear in certain grades and demographic groups, there is no consistent impact on retention rates among K-2 students overall. This suggests that third-grade retention policies may not be substantially shaping early educational decisions.

2. Background

Third grade represents a critical juncture in a student's academic trajectory. According to the Annie E. Casey Foundation (2010), up to third grade, children are primarily learning to read. From fourth grade onwards, the focus shifts to reading to learn, employing literacy skills to gather information across subjects like math and science, solve problems, and engage in critical thinking. In the United States, approximately 10 percent of public school students are retained at least once from kindergarten through eighth grade, with the highest instances of retention occurring between kindergarten and third grade. The 2017 National Assessment of Educational Progress (NAEP) highlights that only 37 percent of fourth graders nationwide, and 20 percent of Black fourth graders, achieved proficiency in reading. In high-poverty areas, the risk of reading failure is staggeringly high, affecting 70 to 80 percent of students. Legislative responses vary significantly across the country. The National Conference of State Legislatures reports that seventeen states, plus the District of Columbia, have enacted policies mandating retention for students who exhibit reading difficulties by the end of third grade, with fourteen states offering conditional promotion for good cause.3 These policies are designed not merely to retain students but to facilitate early identification and intervention.
The RAND Corporation (2009) underscores that the most effective retention strategies are those that incorporate early identification and interventions. Additionally, the Education Commission of the States (ECS) identifies 22 states and the District of Columbia that have implemented policies targeting third-grade reading proficiency that include elements of identification, intervention, and retention. Florida is noted for its stringent third-grade retention policy, which retains students only after they have been identified as having difficulties and have received targeted interventions (Schimke and Rose, 2012). The increasing adoption of these policies has sparked considerable debate. Proponents argue that retention enables students who are lagging to receive additional instruction and support, potentially enhancing their educational outcomes. Conversely, opponents raise concerns about the negative impacts on students' attitudes towards school and their self-perception, fearing that retention could exacerbate feelings of inadequacy and disengagement from the educational process (Huddleston, 2015).

3 Good-cause exemptions include limited English proficient students (typically three or fewer years in an English language acquisition program), special education students, participation in an intervention, parent, principal, or teacher recommendations, previous retention, demonstrating proficiency through a portfolio (student work demonstrating mastery of academic standards in reading), or passing an approved alternative reading assessment.

3. Conceptual Framework and Related Literature

The academic development of students, particularly at critical junctures such as the third grade, has been extensively studied. Key findings suggest that while short-term academic benefits are evident, the long-term outcomes raise concerns about the sustainability of those benefits.
Prior research has primarily concentrated on localized evaluations of third-grade retention policies in regions such as Florida (Greene and Winters, 2007; Schimke and Rose, 2012), Chicago (Jacob and Lefgren, 2004), and New York City (Mariano and Martorell, 2013), with additional insights from studies in Arizona (Miller, 2010), Georgia (Huddleston, 2015), and North Carolina (Gruendel, Abledinger, and Ruble, 2017). Most of this literature finds either negative or insignificant effects of grade retention on children's emotional adjustment and academic achievement. Even though some retention policies have been shown to improve short-term academic outcomes, the effect fades over time, and retained students are more likely to drop out of high school than their socially promoted peers (Allensworth, 2005; Jacob and Lefgren, 2009).4 While significant effort has been directed at assessing the influence of these policies on students' subsequent academic performance and emotional health, there has been less focus on their impact on earlier educational stages. Investigations specifically targeting the effects of third-grade retention policies on younger students have documented increased retention rates in kindergarten and first grade following the implementation of these policies. Noteworthy studies conducted in Florida and New York City have highlighted the effectiveness of assessment and remediation programs in boosting early literacy skills (Schimke and Rose, 2012). Furthermore, the Michigan Department of Education (2020) observed a marked increase in planned kindergarten retention rates after the introduction of third-grade retention policies, raising important questions about how various factors influence the educational decisions teachers and parents make in response to such policies.

Resource Access and Educational Outcomes

The effectiveness of interventions is often contingent on the resources that schools and communities can mobilize to support at-risk students.
Differential access to resources can influence how parents and educators make strategic decisions based on the potential academic benefits and risks associated with retention policies (Becker, 1964; Bourdieu, 1986). The RAND Corporation (2009) found that successful retention policies are those that provide substantial support mechanisms. Research also indicates that socioeconomic factors play a significant role in educational outcomes, with students from lower socioeconomic backgrounds often experiencing higher rates of academic retention. A study by the National Center for Education Statistics found that students from low-income families are more likely to be retained due to the compounded challenges of limited access to learning materials, less parental involvement, and fewer enrichment opportunities outside of school (NCES, 2019).

Footnote 4: Social promotion is the practice of promoting a child to the next grade level regardless of skill mastery, in the belief that doing so will promote self-esteem.

Additionally, previous research suggests that early interventions, which are more accessible to higher-SES families, can diminish the likelihood of retention by providing timely support to students at risk of falling behind. In their longitudinal study, Alexander, Entwisle, and Olson (2007) found that summer learning programs and early tutoring significantly benefit students from underprivileged backgrounds, potentially reducing retention rates.

Psychological Preparedness and Maturity in School Entry

Studies on school entry age indicate that parents weigh the benefits of allowing their children more time to mature against the academic demands they will face (Piaget, 1952). Piaget emphasized the stages of cognitive development in children and suggested that readiness for school is contingent upon reaching certain developmental milestones. Stipek (2002) summarizes the earlier literature on this topic.
In general, the findings reviewed in her paper provide more support for early educational experience as a way to promote academic competencies than for waiting for children to be older when they enter school. Some studies find that any positive impact of delayed school entry is short-lived, while others, such as the work by Black, Devereux, and Salvanes (2011), find that starting school later has a small negative effect on educational attainment and leads to lower earnings until about age 30. In another study, using data from Tennessee's Project STAR, Cascio and Schanzenbach (2016) find that redshirted5 children score significantly lower on achievement tests in both kindergarten and middle school and are more likely to be retained by middle school.

Gender Differences

Gender differences in developmental trajectories and responses to educational policies also necessitate tailored educational strategies. It has been documented that boys often develop literacy skills more slowly than girls and might therefore benefit from delayed school entry, which provides additional time to develop the skills necessary for academic success (Smith and Wilhelm, 2002). Interestingly, research by Black, Devereux, and Salvanes (2011) found that while delayed school entry has a generally negative effect on long-term educational attainment, boys who start school at an older age are less likely to experience poor mental health by age 18. This highlights the importance of considering gender-specific needs when formulating educational policies, ensuring that the unique developmental needs of both boys and girls are addressed effectively.
This paper extends the existing literature by undertaking a national-level analysis of the effects of third-grade retention policies on early educational outcomes, specifically K-2 retention decisions and kindergarten entrance ages.

Footnote 5: Redshirting is the practice of postponing entrance into kindergarten of age-eligible children in order to allow extra time for socioemotional, intellectual, or physical growth.

Previous research has predominantly concentrated on the ramifications of retention for later academic outcomes and has been confined to particular states or cities. By leveraging annual survey data, this study provides a comprehensive examination of the nationwide implications of these policies for early education. It contributes fresh insights into the effects on K-2 retention decisions and kindergarten entrance ages across the United States, broadening our understanding of the broader impacts of these educational strategies.

4. Data

To examine the nationwide impact of third-grade retention policies on K-2 students' kindergarten entrance age and retention decisions, the best available data source is the October Current Population Survey (CPS) School Enrollment Supplements. I utilize the annual October supplements spanning 1996 to 2017. During this period, fourteen states enacted relevant laws and are thus considered treatment states. States that do not mandate third-grade retention or that passed laws after 2017 are not included in this group.6 In the U.S., the lack of comprehensive state-reported retention rates complicates national-level studies. However, the October CPS offers a way to estimate retention rates by state annually through two questions: (a) the current grade of enrollment and (b) the grade enrolled in the previous year. Comparing these two answers identifies students who have been retained.
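The grade-on-grade comparison behind this identification can be sketched in a few lines. The data frame and variable names below are illustrative stand-ins, not actual CPS variable codes.

```python
import pandas as pd

# Toy October-CPS-style extract; "grade_now" and "grade_last" are hypothetical
# stand-ins for the two enrollment questions described in the text.
df = pd.DataFrame({
    "person_id":  [1, 2, 3, 4],
    "grade_now":  ["K", "1", "1", "2"],
    "grade_last": [None, "K", "1", "2"],   # None: not enrolled last year
})

# A student is flagged as retained when this year's grade equals last year's.
df["retained"] = (df["grade_now"] == df["grade_last"]).astype(int)
```

In this toy extract, persons 3 and 4 are flagged as retained because they report the same grade of enrollment in consecutive Octobers.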
As discussed in Sections 1 and 3, people from different socioeconomic backgrounds likely respond differently to these policies. This analysis therefore uses information on mother's education, father's education, and family income from the CPS to construct a socioeconomic status (SES) index that differentiates families by socioeconomic standing. The comprehensive household data in the October CPS supplements are pivotal for matching children with their family members, enabling the extraction of both parents' education levels and family income. Each variable is first standardized to eliminate differences in scale, and equal weights are then applied to construct the SES index. The sample is divided into two SES groups: children with SES index values in the highest quintile are categorized as high SES, while those in the remaining four quintiles are considered low SES.

Footnote 6: Three states and D.C. have third-grade retention laws but are not included in the analysis: Nevada enacted its law in 2018, which is beyond the sample period, and the laws' exact enactment years could not be determined for Delaware, D.C., and Missouri.

In Table 10, children from high-SES backgrounds consistently show lower retention rates across all grades compared to their low-SES counterparts. This disparity underscores the influence of socioeconomic factors on educational outcomes, where higher SES may correlate with better support systems and resources that prevent retention. Additionally, the study examines gender differences in retention rates and kindergarten entrance ages. Table 10 provides insights into overall national trends in retention and kindergarten entrance ages, indicating significant gender differences. Across both SES groups and all children, there is a clear trend of decreasing retention rates over time for kindergarten, first grade, and second grade.
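Stepping back to the SES index described earlier in this section, its construction (standardize each component, average with equal weights, split at the top quintile) can be sketched as follows on simulated data; the column names and distributions are hypothetical, not CPS codes.

```python
import numpy as np
import pandas as pd

# Simulated family data; column names are illustrative, not CPS variable codes.
rng = np.random.default_rng(0)
fam = pd.DataFrame({
    "mother_educ":   rng.integers(8, 20, 1000),   # years of schooling
    "father_educ":   rng.integers(8, 20, 1000),
    "family_income": rng.lognormal(10, 0.5, 1000),
})

z = (fam - fam.mean()) / fam.std()        # standardize each component
fam["ses_index"] = z.mean(axis=1)         # equal weights across the three components
cutoff = fam["ses_index"].quantile(0.8)   # top-quintile boundary
fam["high_ses"] = (fam["ses_index"] > cutoff).astype(int)
```

By construction, roughly 20 percent of children land in the high-SES group, and every high-SES child has a larger index value than every low-SES child.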
An interesting pattern is that second-grade retention rates for both boys and girls in the low-SES group peaked in 2006 before falling by 2016. While girls consistently show lower retention rates than boys across both SES groups, the gap appears to be narrowing over the years, especially in the high-SES group. For instance, the decrease in second-grade retention rates from 1996 to 2016 for boys (from 3.35 to 1.58) and girls (from 2.98 to 1.60) in the high-SES group suggests a converging trend. The kindergarten entrance age shows remarkable stability over time across both genders and SES levels, with only slight fluctuations. Furthermore, this study utilizes annual state-level data from the National Center for Education Statistics (NCES) covering 1996 to 2017. The NCES dataset includes critical variables such as state-level pupil-teacher ratios and per-pupil expenditures in public elementary and secondary schools. These variables are included as controls in the regression model to account for differences across states and over time. The expenditure data are adjusted for inflation to ensure consistency throughout the 1996-2017 analysis period.

5. Method

I first examine the impact of retention policies on boys' and girls' K-2 retention rates separately. This analysis separates the trends in K-2 retention rates in the fourteen states that passed third-grade retention policies during the sample period from those in the rest of the country. Utilizing the October CPS supplements from 1996 to 2017, I compare changes in K-2 retention rates in treatment states after the introduction of the retention policy to changes in K-2 retention rates in the rest of the United States.
The baseline difference-in-differences (DD) approach studies the retention rates of three cohorts (kindergarteners, first graders, and second graders) separately, and is captured in the following equation:

$Y_{ist} = \beta \, Post_{ist} + \delta_t + \gamma_s + \varepsilon_{ist}$   (1)

where $Y_{ist}$ is a binary variable equal to one if student i from state s in year t is retained in kindergarten, first grade, or second grade (each analyzed separately), and zero otherwise. $Post_{ist}$ is an indicator set to one in each treatment state starting from the first year the analyzed grade is affected and in all subsequent years, and zero otherwise. The value of $Post_{ist}$ depends on the law's enactment year, the law's effective year, and the grade used for analysis. For example, kindergarteners are immediately affected in the law's enactment year only if they do not reach third grade by the law's effective year. $\gamma_s$ and $\delta_t$ are vectors of state and year fixed effects, respectively. The state fixed effects account for fixed differences in K-2 retention rates across states, while the year fixed effects account for common shocks to K-2 retention rates. In some specifications, I also add a vector of controls, including race, the number of school-aged siblings in the household, the pupil-teacher ratio, and expenditure per pupil. $\varepsilon_{ist}$ is an error term representing unobserved determinants of retention rates. Standard errors are clustered at the state level to allow for correlation among individual error terms within states. Estimation of this baseline model identifies the coefficient of interest, $\beta$, only under the assumption that state third-grade retention policies are exogenous. This assumption would be violated if, for example, kindergarten retention rates in treatment states would have been on a steeper trajectory than elsewhere in the country even without the retention policies. In that case, the estimates would overstate the effect of the retention policies relative to what has actually taken place.
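As a concrete illustration of equation (1), the following sketch estimates a two-way fixed effects DD on simulated state-by-year data with a known effect. It is a stylized stand-in, not the paper's estimation: the actual analysis uses individual CPS records, staggered adoption years, controls, and state-clustered standard errors.

```python
import numpy as np

# Stylized simulation of equation (1): Y = beta*Post + year effect + state effect + noise.
# All names and numbers here are hypothetical.
rng = np.random.default_rng(1)
n_states, n_years, beta_true = 10, 8, -1.5
treated_states, adopt_year = set(range(5)), 4   # states 0-4 adopt in year 4

rows = []
for s in range(n_states):
    gamma_s = rng.normal(0, 2)                  # state fixed effect
    for t in range(n_years):
        post = int(s in treated_states and t >= adopt_year)
        y = beta_true * post + 0.5 * t + gamma_s + rng.normal(0, 0.1)
        rows.append((s, t, post, y))
S, T, P, Y = (np.array(c) for c in zip(*rows))

# Design matrix: Post plus a full set of state dummies and year dummies (year 0 omitted).
X = np.column_stack([P.astype(float)] +
                    [(S == s).astype(float) for s in range(n_states)] +
                    [(T == t).astype(float) for t in range(1, n_years)])
beta_hat = np.linalg.lstsq(X, Y, rcond=None)[0][0]
```

With the state and year fixed effects absorbing the level differences and common shocks, the coefficient on Post recovers something close to the simulated effect of -1.5.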
Therefore, I begin by estimating an event-study model, which allows me to test whether the treatment states were on different retention trajectories prior to introducing the state policies:

$Y_{ist} = \sum_{d=-4,\, d \neq -1}^{4} \beta_d \, \mathbf{1}(t - p_s = d) + \delta_t + \gamma_s + \varepsilon_{ist}$   (2)

Nothing else changes, except that I replace the $Post_{ist}$ indicator in equation (1) with a series of indicator variables for the year relative to the first year that the analyzed grade is affected. $p_s$ is the first year that the analyzed grade is affected in state s. A window from four years before the first affected year to four years after is used to estimate the model, and I omit the dummy for the year immediately prior to the first affected year. The coefficients of interest are $\beta_d$, where $-4 \leq d \leq 4$ and $d \neq -1$.7 Using the same strategy, I then explore the impact of retention policies on kindergarten entrance age for boys and girls separately. With everything else kept the same, I substitute the dependent variable in equation (1) with the age of students who are in kindergarten in year t, and estimate the following model:

$Age_{ist} = \beta \, Post_{ist} + \delta_t + \gamma_s + \varepsilon_{ist}$   (3)

Given that age is recorded as an integer in the October CPS data, the estimated coefficient $\beta$ represents a percentage of a year. In some specifications, I add a vector of controls, including race, the number of school-aged siblings in the household, the pupil-teacher ratio, and expenditure per pupil. Similarly, the corresponding event-study model is:

$Age_{ist} = \sum_{d=-4,\, d \neq -1}^{4} \beta_d \, \mathbf{1}(t - p_s = d) + \delta_t + \gamma_s + \varepsilon_{ist}$   (4)

Problems with staggered rollout in a DiD setting have drawn increasing attention in recent literature. Several papers point out that the coefficient of interest ($\beta$ in equation (1)) is not easily interpretable and is not a consistent estimator of the ATT or ATE.
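On the implementation side, the event-time indicators used in equations (2) and (4) can be constructed as below; the state, years, and value of $p_s$ are illustrative, not drawn from the data.

```python
import pandas as pd

# Event-time indicators for the event-study specifications: d = t - p_s is
# binned at -4 and +4, and d = -1 is the omitted reference period.
df = pd.DataFrame({"state": ["A"] * 8, "year": range(1998, 2006)})
p_s = 2002                                    # first year the analyzed grade is affected
df["d"] = (df["year"] - p_s).clip(-4, 4)      # endpoint bins absorb all earlier/later years
for d in range(-4, 5):
    if d == -1:
        continue                              # omitted baseline year
    df[f"ev_{d}"] = (df["d"] == d).astype(int)
```

The clipping at the window endpoints mirrors the footnoted convention that the first and last indicators collect all prior and subsequent years.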
The traditional two-way fixed effects (TWFE) estimator implicitly assumes homogeneous treatment effects, and the estimate is a weighted average of treatment effects. However, restricting treatment-effect heterogeneity in staggered-rollout DiD designs can lead to estimands that put negative weights on some comparisons between early- and late-treated groups (Borusyak, Jaravel, and Spiess, 2021; Goodman-Bacon, 2021).

Footnote 7: The first and last indicators represent all prior and subsequent years, respectively.

Modified TWFE estimators have been developed to address this issue (Borusyak, Jaravel, and Spiess, 2021; Callaway and Sant'Anna, 2020; de Chaisemartin and D'Haultfoeuille, 2020). I use the estimator developed by Borusyak, Jaravel, and Spiess (2021), presenting the BJS estimates in Tables 11 to 13 and the regression-based estimates in Appendix Tables A.5 and A.6. The imputation estimator developed in their paper is constructed so that treatment-effect heterogeneity is not restricted. The imputation process takes three steps. First, regress on untreated observations only to obtain fitted values of the unit and period fixed effects, $\hat{\gamma}_i$ and $\hat{\delta}_t$. Second, use these fitted values to impute the untreated retention outcomes for treated states, yielding an estimated treatment effect $\hat{\beta}_{its} = Y_{its} - \hat{\gamma}_i - \hat{\delta}_t$ for each treated observation. Finally, choose weights to take a weighted average of these treatment-effect estimates. The underlying assumptions in BJS's setting are parallel trends and no anticipation effects, as in the traditional DiD design. They also impose a parametric model of treatment effects to weaken the implicit assumption of homogeneous effects in the traditional TWFE model. Figures 7 to 9 are selected event plots showing that the parallel trends assumption is not violated.
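The three imputation steps can be made concrete with a small sketch on a noiseless toy panel. This illustrates the logic only; it is not the BJS package, and the units, periods, and effect size are hypothetical.

```python
import numpy as np
import pandas as pd

# Toy panel: 6 units, 6 periods, units 0-2 treated from t=3, homogeneous effect tau.
units, periods, tau = 6, 6, -1.0
rng = np.random.default_rng(2)
alpha, delta = rng.normal(0, 1, units), rng.normal(0, 1, periods)

df = pd.DataFrame([(i, t) for i in range(units) for t in range(periods)],
                  columns=["unit", "t"])
df["treated"] = ((df["unit"] < 3) & (df["t"] >= 3)).astype(int)
df["y"] = alpha[df["unit"]] + delta[df["t"]] + tau * df["treated"]

# Step 1: recover unit and period fixed effects from untreated cells only.
untr = df[df["treated"] == 0]
Xu = np.column_stack([(untr["unit"] == i).astype(float) for i in range(units)] +
                     [(untr["t"] == t).astype(float) for t in range(1, periods)])
coef = np.linalg.lstsq(Xu, untr["y"].to_numpy(), rcond=None)[0]
a_hat, d_hat = coef[:units], np.concatenate([[0.0], coef[units:]])

# Steps 2-3: impute Y(0) for treated cells, then average the implied effects.
tr = df[df["treated"] == 1]
tau_hat = (tr["y"] - a_hat[tr["unit"].to_numpy()] - d_hat[tr["t"].to_numpy()]).mean()
```

With a homogeneous effect of -1 and no noise, the imputation recovers tau exactly; with heterogeneous effects, the choice of weights in the final averaging step becomes the substantive decision.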
6. Results

6.1 Impact on K-2 Retention Rates

Main Results

Table 11 shows the BJS estimates of the impact of third-grade retention policies on K-2 retention rates, broken down by socioeconomic status. The data reveal no significant effects on retention rates for either boys or girls at either SES level, suggesting that third-grade retention policies may have little to no influence on early educational outcomes as measured by retention rates. This section proceeds to a more detailed heterogeneity analysis by individual grade level (kindergarten, first, and second grade) to determine whether the retention policy affects each grade distinctly. This deeper analysis aims to reveal effects that may be obscured in the aggregated data, providing clearer insights into the nuanced effects of the policy across different educational stages.

Heterogeneity Analysis

Table 12 provides a more detailed look into the effects of third-grade retention policies, segmented by grade and socioeconomic status. The BJS estimates show varied impacts, notably among boys across different SES levels. For example, kindergarten boys with low SES exhibit a reduction in retention rates of 1.20 percentage points, and second-grade boys with high SES experience a reduction of 1.92 percentage points. Figures 6 and 7, featuring selected event plots, confirm that the parallel trends assumption is not violated for the groups with significant results. Table A.5 aligns with these findings, presenting regression-based DiD estimates: it shows a decrease in the kindergarten retention rate for low-SES boys of 1.66 percentage points after policy enactment.
There is also a notable reduction in retention rates for second-grade boys with high SES, robust to additional controls such as the number of school-aged siblings, pupil-teacher ratios, per-pupil expenditure, and racial composition (columns 2 and 6).8 Figures A.1, A.2, and A.3 plot the event-study estimates for K-2 retention rates, respectively. To better compare results across family backgrounds, I present the coefficient estimates for children from both SES categories in the same graph. The capped lines around the coefficient estimates represent 95 percent confidence intervals. For kindergarten boys and second-grade boys at the low SES level, the introduction of the retention law decreases the retention rate by around 2 percentage points relative to expectations based on retention trends elsewhere in the country. However, the graphs suggest that the negative impact is not immediate after the introduction of the policies. Notably, there are no significant changes in the retention rates of girls at either SES level, suggesting gender-specific, or simply weaker, interactions with the policies at these early stages. For boys, especially in kindergarten and second grade, both estimation methods show a reduction in retention rates, but the lack of a consistent pattern across SES groups and grades could mean that these results either reflect true policy impacts or merely represent statistical anomalies. Additionally, the event-study estimates in Figures A.1, A.2, and A.3 do not show a clear decrease in boys' retention rates, which may reinforce the interpretation that the effects found are statistical noise. Overall, the analysis of third-grade retention policies and K-2 retention rates reveals that the effects of such policies can be complex and vary significantly by demographic group and educational stage.
While the aggregated analysis across grades K-2 shows no significant changes in retention rates (suggesting a general absence of impactful effects when these grades are viewed collectively), the grade-separated analyses occasionally indicate significant effects, particularly among boys. This discrepancy suggests that combining grades K-2 into a single model, though it enhances statistical precision by increasing the sample size and reducing variance, may mask important differences in how retention policies affect different educational stages. However, it is also plausible that the significant results observed in the disaggregated K-2 groups are due to statistical chance, implying that there may be no genuine impacts at these earlier grades, as the aggregate analysis suggests. Should the observed significant effects in specific grades truly reflect the impact of the policies, they might signal a strategic shift in early educational practices. Educators and parents might be assessing the necessity of early retention with greater scrutiny, anticipating stricter retention policies in third grade. Such foresight could render early retention increasingly redundant, supporting an educational philosophy that treats retention as a last resort, pursued only after other supportive interventions for struggling students have been explored. Moreover, this scenario may lead to increased efforts by schools and parents to bolster early literacy and prepare students more comprehensively before they reach third grade, thereby naturally decreasing the tendency to retain students in earlier grades. Conversely, if the observed decreases in retention rates for boys are primarily statistical noise, this would suggest that current retention policies have minimal influence on early retention decisions.

Footnote 8: Estimates using a probit model are similar, and significance levels do not change.
This potential lack of proactive early intervention underscores the need for educational settings to prioritize early support strategies over a reliance on student retention. By addressing learning challenges proactively before they escalate, educators can reduce the need for retention and provide foundational support that upholds high academic standards. This approach fosters a learning environment centered on preventive rather than corrective measures, ensuring every student has the best opportunity for success from the outset of their educational journey.

6.2 Impact on Kindergarten Entrance Age

Table 13 shows a modest decrease in kindergarten entrance age for girls with low SES of approximately 26 days, or about 7.16 percent of a year, a change not observed among boys. This aligns with the traditional regression-based estimates in Table A.6, which show a similar 25-day decrease for low-SES girls. Figure 8 indicates that the parallel trends assumption holds, with an immediate decrease in kindergarten entrance age for low-SES girls. The decrease in kindergarten entrance age for low-SES girls, although statistically significant, is relatively small and may not represent a meaningful shift in the timing of kindergarten entry. This minor adjustment might reflect a nuanced response from families facing greater academic challenges, possibly using the policies proactively to secure earlier educational support. This trend is not observed among families with boys, potentially indicating gender-specific differences in policy implementation or effects. The lack of a similar impact among boys, even within the same socioeconomic bracket, suggests potential gender-specific differences in developmental readiness or societal expectations about school readiness.
It appears that boys might not be perceived by parents or educators as benefiting from, or needing, as early a start in formal education as girls, or there might be differences in policy compliance and awareness among families with boys. Although the observed impact is small and limited to girls, these results suggest a need for educational policies that consider both the economic challenges families encounter and the developmental differences between boys and girls. The slight reduction in kindergarten entrance age for girls with low SES points to the potential benefits of interventions tailored to the diverse needs of students, ensuring an equitable start to their educational journey. This approach should focus on providing targeted support where necessary, without presupposing consistent effects across all demographic groups.

7. Bacon Decomposition

In this paper, I leverage the natural variation in the timing of the introduction of retention laws across the U.S. from 1996 to 2017 to estimate their effects on retention rates and kindergarten entrance ages. Goodman-Bacon (2018) shows that the DD coefficient is a weighted average of the canonical "2x2" DDs that arise when treatments are implemented at different times. For this analysis, I use the command developed by Goodman-Bacon to calculate the component DDs and their respective weights. Because the Bacon decomposition requires panel data, I use aggregated state-level data instead of individual data. The estimates using state-level data align with those obtained from individual data for equation (1). However, achieving a strongly balanced panel is challenging due to the low frequency of certain outcomes in the CPS data, which sometimes requires excluding states that lack data for certain grades in specific years.
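The role of the different 2x2 comparisons can be illustrated with a stylized, noiseless example (hypothetical groups and periods, not the paper's data): with a constant treatment effect every 2x2 DD recovers the same number, but when effects grow with exposure, the comparison that uses the already-treated early group as a control goes astray.

```python
import numpy as np

# Early group treated from t=1, late group from t=2, never-treated group,
# periods t = 0..3; no state or year effects, purely for illustration.
t = np.arange(4)

def outcome(start, dynamic):
    # treatment effect: grows by 1 per period of exposure if dynamic, else 1 flat
    exposure = np.clip(t - start + 1, 0, None)
    return np.where(exposure > 0, exposure if dynamic else 1.0, 0.0)

def dd(y_a, y_b, pre, post):
    # canonical 2x2 DD: change for group a minus change for comparison group b
    return (y_a[post].mean() - y_a[pre].mean()) - (y_b[post].mean() - y_b[pre].mean())

results = {}
for dynamic in (False, True):
    early, late, never = outcome(1, dynamic), outcome(2, dynamic), np.zeros(4)
    results[dynamic] = (
        dd(early, never, pre=[0], post=[1, 2, 3]),   # treated vs. never treated
        dd(late, early, pre=[1], post=[2, 3]),       # late vs. already-treated early
    )
```

With a constant effect both comparisons return 1.0. With dynamic effects, the late-vs-early comparison returns 0.0 even though the late group's true average effect in the post window is 1.5; distorted or negative weights on such comparisons are exactly what motivates the decomposition and the modified estimators discussed above.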
Consequently, I limit the Bacon decomposition analysis to kindergarten retention rates, for which data for every state are available during the sample period. Figure 9 offers critical insights into the heterogeneous effects of policy implementation across groups within a staggered-rollout Difference-in-Differences (DiD) framework. The comparisons include treated versus untreated, early versus late, and late versus early. The two-way fixed effects estimate of -1.66 equals the average of the y-axis values weighted by their x-axis values. The graph reveals that although treatment effects vary with timing and the groups compared, the majority of the weight resides in comparisons between states treated at any point and those never treated. This dominant weighting on "Treatment vs. Never Treated" explains why the BJS estimates are similar to the traditional regression-based Two-Way Fixed Effects (TWFE) estimates. Despite the complexity introduced by staggered policy implementation and heterogeneity in group-level effects, the primary drivers of the overall estimates remain the fundamental comparison of treated versus never-treated groups.

8. Conclusion

This study has investigated the effects of third-grade retention policies on K-2 retention rates and kindergarten entrance ages. Overall, the analysis reveals no robust changes across socioeconomic statuses and genders. Initially, it appeared that boys from low-SES backgrounds were less likely to be retained in kindergarten after policy adoption, which could indicate an early response to retention policies. However, further investigation suggests that these findings might reflect statistical noise or data anomalies rather than a genuine effect. Likewise, preliminary results indicated that girls, especially from low-SES backgrounds, were entering kindergarten at younger ages.
However, the modest scale of this change and the lack of a consistent pattern across the data suggest that these early educational decisions might not be directly influenced by third-grade retention policies. Given the findings that suggest a lack of proactive early measures in anticipation of future retention, there is significant scope for policy improvement focusing on early identification and intervention. These measures are crucial for supporting student development and preventing the need for future retention. Policies should emphasize comprehensive early screening and diagnostic protocols from kindergarten through second grade. Early identification of students at risk, particularly in critical areas such as literacy and numeracy, allows for the deployment of targeted interventions tailored to individual needs. It is essential to equip educators with the skills necessary to recognize and address early signs of learning difficulties. Additionally, engaging parents in the educational process is critical. Educating parents about developmental milestones and strategies to support educational activities at home can bolster school-based interventions and increase their effectiveness. Good early identification and intervention strategies form a crucial component of the retention policy framework, helping students meet educational benchmarks and making retention a last resort. Future research could further investigate the broader impacts of retention policies, exploring variables such as parental engagement, labor market outcomes for parents, and standardized test scores for children. Such studies could provide a deeper understanding of how these policies affect not only educational outcomes but also family dynamics and long-term academic and economic success.
Additionally, given the initial indications of gender-specific responses, further studies should aim to validate or refute these observations, using individual-level data instead of state-level data when available, to provide a more nuanced analysis of the effects of retention policies.

Figures

Figure 5. Third-Grade Retention Requirement by State
Source: National Conference of State Legislatures, 2019. Retention policies are not specified in grey-colored states.

Figure 6. Retention Rate of Low-SES Boys in Kindergarten

Figure 7. Retention Rate of High-SES Boys in 2nd Grade

Figure 8. Kindergarten Entrance Age for Girls

Figure 9. Difference-in-Differences Decomposition for Kindergarten Retention Rate
Notes: The figure plots each 2x2 DD component from the decomposition theorem (y-axis: 2x2 DD estimate) against its weight (x-axis) for the kindergarten retention rate analysis. The two-way fixed effects estimate, -1.66, equals the average of the y-axis values weighted by their x-axis values. Comparison types plotted: Earlier Group Treatment vs. Later Group Control; Later Group Treatment vs. Earlier Group Control; Treatment vs. Never Treated.

Tables

Table 9. Enactment Year of Third-Grade Retention Law

State          Year Enacted    State             Year Enacted
Arizona        2010            Michigan          2016
California     1998            Mississippi       2013
Connecticut    2012            North Carolina    2012
Florida        2002            Ohio              2012
Georgia        2001            South Carolina    2014
Indiana        2010            Tennessee         2012
Iowa           2012            Arkansas          2010

Notes: Three states and D.C. have third-grade retention laws but are not included in the analysis: Nevada enacted its law in 2018, which is beyond the sample period, and exact enactment years could not be determined for Delaware, D.C., and Missouri.

Table 10.
K-2 Retention Rates and Kindergarten Entrance Age, by Socioeconomic Status: 1996, 2006, and 2016

                              Low SES level          High SES level
Outcome                      1996   2006   2016     1996   2006   2016
Boy
  Kindergarten retention     5.56   4.92   4.60     4.42   4.64   4.02
  First-grade retention      4.93   4.22   2.98     2.89   3.95   2.13
  Second-grade retention     2.78   4.15   1.64     3.35   2.19   1.58
  Kindergarten entrance age  5.10   5.06   5.15     5.14   5.08   5.15
Girl
  Kindergarten retention     4.78   4.40   4.02     3.21   3.62   3.42
  First-grade retention      4.20   3.56   2.24     2.48   3.35   1.86
  Second-grade retention     1.96   3.74   1.61     2.98   1.58   1.60
  Kindergarten entrance age  5.10   5.03   5.12     5.06   5.06   5.18
All children
  Kindergarten retention     5.12   4.57   4.28     4.12   4.16   3.72
  First-grade retention      4.35   3.83   2.54     2.70   3.53   2.01
  Second-grade retention     2.33   4.11   1.63     3.03   1.99   1.59
  Kindergarten entrance age  5.10   5.04   5.13     5.10   5.07   5.17

Source: Author's calculations from the October CPS, 1996-97 (for 1996), 2005-07 (for 2006), and 2015-17 (for 2016).

Table 11. BJS Estimates of the Third-Grade Retention Policy on K-2 Retention Rates, Combined

Dependent variable (percent)     Effect size, K-2
Low SES level
  Girl's retention rate          -0.37 (0.25)
  Boy's retention rate            0.12 (0.14)
  Observations                   56,083
High SES level
  Girl's retention rate          -0.34 (0.28)
  Boy's retention rate           -0.19 (0.15)
  Observations                   10,573

Notes: Numbers are percentage points. The upper panel reports the estimated effects of the policies on K-2 retention rates for children in the low SES group (bottom four quintiles); the lower panel reports estimates for children in the high SES group (top quintile). Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors are clustered at the state level.

Table 12.
BJS Estimates of the Third-Grade Retention Policy on K-2 Retention Rates, Separately

Dependent variable (percent)   Kindergarten       First grade       Second grade
Low SES level
  Girl's retention rate        -0.27 (0.45)       -0.36 (0.27)      -0.44 (0.55)
  Boy's retention rate         -1.20*** (0.24)     1.10 (0.73)       0.38 (0.52)
  Observations                 18,538             19,567            18,960
High SES level
  Girl's retention rate        -0.90* (0.50)      -1.31* (0.76)      1.14 (0.93)
  Boy's retention rate          1.42 (0.98)       -0.10 (0.20)      -1.92*** (0.41)
  Observations                 4,006              3,720             3,987

Notes: Numbers are percentage points. The upper panel reports the estimated effects of the policies on retention rates for children in the low SES group (bottom four quintiles); the lower panel reports estimates for children in the high SES group (top quintile). Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors are clustered at the state level.

Table 13. BJS Estimates of the Third-Grade Retention Policy on Kindergarten Entrance Age

Dependent variable (percent of a year)   Effect size
Low SES level
  Girl's entrance age                    -7.16*** (1.22)
  Boy's entrance age                     -2.77* (1.47)
High SES level
  Girl's entrance age                     1.31 (1.12)
  Boy's entrance age                     -1.45 (1.21)

Notes: Age is recorded as an integer in the CPS data; the coefficients therefore represent percentages of one year. The upper panel reports estimates of the policies' effect on girls' and boys' kindergarten entrance ages for children with SES index values below the top quintile, and the lower panel reports estimates for children with SES index values in the top quintile. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors are clustered at the state level.

REFERENCES

Allensworth, E. M. (2005). Dropout rates after high-stakes testing in elementary school: A study of the contradictory effects of Chicago's efforts to end social promotion. Educational Evaluation and Policy Analysis, 27, 341-364.

Goodman-Bacon, A., Goldring, T., & Nichols, A. (2019).
bacondecomp: Stata module for decomposing difference-in-differences estimation with variation in treatment timing. Statistical Software Components S458676.
Black, S. E., Devereux, P. J., & Salvanes, K. G. (2011). Too young to leave the nest? The effects of school starting age. The Review of Economics and Statistics, 93(2), 455-467.
Bond, G., & Dykstra, R. (1967). Coordinating Center for First-Grade Reading Instruction Programs. U.S. Department of Health, Education & Welfare, Office of Education.
Cascio, E. U., & Schanzenbach, D. W. (2016). First in the class? Age and the education production function. Education Finance and Policy, 11(3), 225-250.
Fiester, L. (2010). Early warning! Why reading by the end of third grade matters. KIDS COUNT Special Report. Annie E. Casey Foundation.
Goodman-Bacon, A. (2018). Difference-in-differences with variation in treatment timing (No. w25018). National Bureau of Economic Research.
Greene, J. P., & Winters, M. A. (2009). The effects of exemptions to Florida's test-based promotion policy: Who is retained? Who benefits academically? Economics of Education Review, 28, 135-142.
Gruendel, J., Abledinger, M., & Ruble, K. A. (2017). What Works for Third Grade Reading. NC Pathways to Grade-Level Reading Working Paper. Formal and Informal Family Supports: Supported and Supportive Families and Communities. North Carolina Early Childhood Foundation; Institute for Child Success.
Hernandez, D. J. (2012). Double Jeopardy: How Third-Grade Reading Skills and Poverty Influence High School Graduation. Baltimore: The Annie E. Casey Foundation.
Huddleston, A. P. (2015). "Making the difficult choice": Understanding Georgia's test-based grade retention policy in reading. Education Policy Analysis Archives, 23, 51.
Jacob, B. A., & Lefgren, L. (2009). The effect of grade retention on high school completion. American Economic Journal: Applied Economics, 1(3), 33-58.
Mariano, L. T., & Martorell, P. (2013).
The academic effects of summer instruction and retention in New York City. Educational Evaluation and Policy Analysis, 35(1), 96-117.
Miller, B. (2010). Lessons from Florida's Third Grade Reading Retention Policy and Implications for Arizona. Helios Education Foundation.
Reading Partners. (2013, September 23). 5 major contributors to third grade reading proficiency. Reading Partners. https://readingpartners.org/blog/5-factors-contributing-to-third-grade-reading-proficiency/
Rose, S. (2012). Third Grade Reading Policies. Education Commission of the States.
Rose, S., & Schimke, K. (2012). Third Grade Literacy Policies: Identification, Intervention, Retention. Education Commission of the States.
Smith, M. W., & Wilhelm, J. D. (2002). Reading don't fix no chevys: Literacy in the lives of young men. Portsmouth, NH: Heinemann.
Snow, C. E., Burns, M. S., & Griffin, P. (Eds.). (1998). Preventing reading difficulties in young children. Washington, DC: National Academy Press.
Stipek, D. (2002). At What Age Should Children Enter Kindergarten? A Question for Policy Makers and Parents. Social Policy Report, 16(2). Society for Research in Child Development.
UNICEF. (2011). Late enrollment in primary education: Causes and recommendations for prevention. Technical Report. UNICEF Turkey and MEB Primary Education General Directorate.
Weyer, M. (2018). A look at third-grade reading retention policies. National Conference of State Legislatures.
Workman, E. (2014). Third-Grade Reading Policies. Reading/Literacy: Preschool to Third Grade. Education Commission of the States.
Yesil Dagli, U., & Jones, I. (2012). The effects of on-time, delayed and early kindergarten enrollment on children's mathematics achievement: Differences by gender, race, and family socio-economic status. Educational Sciences: Theory and Practice, 12(4), 3061-3074.

APPENDIX

Table A.5.
Difference-in-Difference Estimates of the Impact of Third Grade Retention Policy on K-2 Retention Rates, by Socio-economic Status

                                  Kindergarten                    First grade                     Second grade
Dependent variable (percent)      Baseline DD     Add controls    Baseline DD     Add controls    Baseline DD     Add controls
                                  (1)             (2)             (3)             (4)             (5)             (6)
Low SES level
  Girl's retention rate           0.33 (0.97)     0.50 (1.00)     -0.43 (1.43)    -0.73 (-0.49)   -0.51 (1.49)    -0.46 (0.56)
  Boy's retention rate            -1.66** (0.73)  -1.42* (0.73)   1.32 (0.95)     0.79 (0.98)     0.43 (0.64)     0.33 (0.67)
  Overall retention rate          -0.62 (0.63)    -0.44 (0.65)    0.48 (0.92)     0.06 (0.85)     -0.006 (0.3)    -0.002 (0.028)
  Observations                    18,538          18,538          19,567          19,567          18,960          18,960
High SES level
  Girl's retention rate           -0.95 (2.64)    -0.40 (2.67)    -1.36 (2.00)    -0.98 (1.92)    1.03 (1.11)     0.94 (1.19)
  Boy's retention rate            2.01 (2.36)     1.86 (2.42)     -0.30 (1.36)    -0.98 (1.58)    -2.63** (1.16)  -2.49** (1.13)
  Overall retention rate          0.61 (1.79)     0.85 (1.89)     -0.54 (1.23)    -0.67 (1.18)    -0.88 (0.79)    -0.87 (0.81)
  Observations                    4,015           4,015           3,913           3,913           4,054           4,054
Controls
  State fixed effect              Y               Y               Y               Y               Y               Y
  Year fixed effect               Y               Y               Y               Y               Y               Y
  Number of school-aged siblings  N               Y               N               Y               N               Y
  Pupil/teacher ratio             N               Y               N               Y               N               Y
  Expenditure per pupil           N               Y               N               Y               N               Y
  Nonwhite (percent)              N               Y               N               Y               N               Y

Notes: Numbers are percentage points. The upper panel coefficients are estimates of the effect of the policies on K-2 retention rates for children with an SES index below the top quintile, and the lower panel reports estimates for children with an SES index in the top quintile. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors, in parentheses, are clustered at the state level.

Table A.6.
Difference-in-Difference Estimates of the Impact of Third Grade Retention Policy on Kindergarten Entrance Age, by Socio-economic Status

Dependent variable (percent of one year)   Baseline DD (1)   Add controls (2)
Low SES level (N=18,537)
  Girl's Entrance Age                      -6.67*** (2.28)   -6.62*** (2.29)
  Boy's Entrance Age                       -1.30 (4.06)      -1.11 (3.96)
  Overall Entrance Age                     -4.05 (2.61)      -3.94 (2.58)
High SES level (N=3,950)
  Girl's Entrance Age                      1.40 (4.67)       0.92 (5.11)
  Boy's Entrance Age                       -1.55 (5.74)      -1.34 (5.83)
  Overall Entrance Age                     -1.13 (4.35)      -1.25 (4.31)
Controls
  State fixed effect                       Y                 Y
  Year fixed effect                        Y                 Y
  Number of school-aged siblings           N                 Y
  Pupil/teacher ratio                      N                 Y
  Expenditure per pupil                    N                 Y
  Race                                     N                 Y

Notes: Age is an integer in the CPS data; thus, each coefficient represents a percentage of one year. The upper panel reports estimates of the effect of the policies on girls'/boys' kindergarten entrance age for children with an SES index below the top quintile, and the lower panel reports estimates for children with an SES index in the top quintile. Asterisks denote significance: * p<0.1, ** p<0.05, *** p<0.01. Standard errors, in parentheses, are clustered at the state level.

Figure A.1. Event-Study Estimates of the Impact of Third Grade Retention Policy on Kindergarten Retention Rates, by Socio-economic Status
[Three panels: girl's retention rate, boy's retention rate, and overall kindergarten retention rate, each plotted against year relative to the law's enactment year. Blue: low SES level; red: high SES level; bands: 95% confidence interval.]
Source: October CPS school enrollment, 1996-2017. All regressions include state fixed effects and year fixed effects in addition to dummies for year relative to the year the retention law was enacted.
The coefficients plotted at –4 represent 4 years or more prior to enactment, while the coefficients plotted at 4 represent 4 or more years after; the dummy at –1, representing one year immediately prior to enactment, is omitted to identify the model.

Figure A.2. Event-Study Estimates of the Impact of Third Grade Retention Policy on First Grade Retention Rates, by Socio-economic Status
[Three panels: girl's retention rate, boy's retention rate, and overall first grade retention rate, each plotted against year relative to the year 1st graders start being affected. Blue: low SES level; red: high SES level; bands: 95% confidence interval.]
Source: October CPS school enrollment, 1996-2017. All regressions include state fixed effects and year fixed effects in addition to dummies for year relative to the first year that 1st graders start being affected. The coefficients plotted at –4 represent 4 years or more prior to the first year that 1st graders start being affected, while the coefficients plotted at 4 represent 4 or more years after; the dummy at –1, representing one year immediately before, is omitted to identify the model.

Figure A.3. Event-Study Estimates of the Impact of Third Grade Retention Policy on Second Grade Retention Rates, by Socio-economic Status
[Three panels: girl's retention rate, boy's retention rate, and overall second grade retention rate, each plotted against year relative to the year 2nd graders start being affected. Blue: low SES level; red: high SES level; bands: 95% confidence interval.]
Source: October CPS school enrollment, 1996-2017.
All regressions include state fixed effects and year fixed effects in addition to dummies for year relative to the first year that 2nd graders start being affected. The coefficients plotted at –4 represent 4 years or more prior to the first year that 2nd graders start being affected, while the coefficients plotted at 4 represent 4 or more years after; the dummy at –1, representing one year immediately before, is omitted to identify the model.

Figure A.4. Event-Study Estimates of the Impact of Third Grade Retention Policy on Kindergarten Entrance Age, by Socio-economic Status
[Three panels: girl's entrance age, boy's entrance age, and overall entrance age, each plotted against year relative to the year the law was enacted. Blue: low SES level; red: high SES level; bands: 95% confidence interval.]
Source: October CPS school enrollment, 1996-2017. All regressions include state fixed effects and year fixed effects in addition to dummies for year relative to the year the retention law was enacted. The coefficients plotted at –4 represent 4 years or more prior to enactment, while the coefficients plotted at 4 represent 4 or more years after; the dummy at –1, representing one year immediately prior to enactment, is omitted to identify the model. The y-axis represents days.

CHAPTER 3: THE PATH OF STUDENT LEARNING DELAY DURING THE COVID-19 PANDEMIC: EVIDENCE FROM MICHIGAN

1. Introduction

The COVID-19 pandemic has severely impacted student achievement across the United States. Nationally, average test scores in fall 2021 were substantially below historic averages and academic recovery since then has been slow (Goldhaber et al., 2022; Kuhfeld & Lewis, 2022).
For example, spring 2022 end-of-year testing outcomes from multiple states show that student achievement continues to trail pre-pandemic levels (e.g., Halloran et al., 2023; Kogan, 2022; Idaho State Department of Education, 2022; Tennessee Department of Education, 2022; Texas Education Agency, 2022; Sass & Ali, 2022). Similarly, National Assessment of Educational Progress (NAEP) outcomes from spring 2022 represent historically large decreases in student achievement between 2019 and 2022 (National Center for Education Statistics, 2022). Pandemic impacts have been particularly acute for certain student subgroups, including students of color and those receiving additional services, as well as students attending high-poverty schools and elementary schools, those who learned remotely, and those with lower baseline achievement (e.g., Goldhaber et al., 2022; Kilbride et al., 2022). In light of these findings, it is imperative that research continues to document achievement trends so that educators, policymakers, and the public can better understand how the pandemic and associated school disruptions affected and continue to affect students' academic development. This paper uses student achievement measures from Michigan's summative end-of-year tests (the Michigan Student Test of Educational Progress, M-STEP) and formative fall and spring NWEA MAP Growth and Curriculum Associates i-Ready benchmark assessments to assess achievement growth and trajectories during the pandemic. A key benefit of combining these two data sources is that we are able to examine both the total impact of the pandemic through spring 2022 as well as how achievement progressed during the pandemic-affected school years. We also examine heterogeneity in performance across students with different demographic characteristics and those who participated in different modes of instruction (e.g., fully in-person, fully remote, or hybrid instruction).
This paper answers three main questions: (1) How did the pandemic affect student achievement in Michigan? (2) How did these achievement trends change throughout the pandemic? (3) Did achievement vary by race/ethnicity, economic disadvantage, and/or instructional modality? To investigate M-STEP achievement growth, we compare three-year growth outcomes for a "pre-pandemic cohort" that completed either the math or ELA assessment three years apart before the school closures that occurred at the start of the pandemic (i.e., 3rd- and 4th-grade students in spring 2016 who progressed to 6th- and 7th-grade in spring 2019) and a "pandemic cohort" that completed the M-STEP before the pandemic and in the most recent test administration (i.e., 3rd- and 4th-grade students in spring 2019 who progressed to 6th- and 7th-grade in spring 2022). We also examine changes in achievement on nationally normed benchmark assessments across the fall 2020, spring 2021, fall 2021, and spring 2022 testing periods. These analyses provide additional insight into students' achievement trajectories by capturing more granular changes during the school years that were directly impacted by the pandemic. To align with the sample of students in our M-STEP analyses, we focus on middle school students (i.e., students who were in 5th through 7th grade in 2020-21 and in 6th through 8th grade in 2021-22). This allows for a more consistent comparison of students and outcomes across assessments. Given that the available literature on learning during the COVID-19 pandemic generally finds that achievement slowed more for early elementary students than older students (e.g., Amplify Education, 2021; Pier et al., 2021; Goldhaber et al., 2022), this sample choice may also provide an upper bound for unfinished learning across all grade levels. We find that middle school students in Michigan experienced far less math achievement growth over the last three years than prior cohorts of students before the pandemic.
Effects on ELA were generally small and statistically insignificant. However, this overall picture of pandemic-era achievement masks semester-by-semester trends in achievement. In particular, Michigan students were scoring much farther behind national norms in math by fall 2020 than they were in reading. Both math and reading achievement then declined substantially between the fall and spring of 2020-21, with somewhat steeper declines in reading than in math. Although students have recovered some of these losses as of spring 2022, average scores in both subjects remain below pre-pandemic norms. Across both types of assessments, we consistently find larger negative effects for students of color, students who are economically disadvantaged, and students whose districts did not offer in-person instruction in 2020-21. The remainder of the paper proceeds as follows: Section two first describes Michigan's Return to Learn legislation that laid out assessment requirements to enable districts and policymakers to track student learning during the pandemic. Section three then briefly reviews the extant literature on student achievement during and beyond the pandemic. The fourth section describes our data and methods of estimating achievement growth and trends during the pandemic. We provide our results in the fifth section and conclude with a discussion of these results and implications for policymakers in section six.

2. K-12 Student Testing in Michigan during the COVID-19 Pandemic

In March of 2020, all schools in Michigan were ordered by the state to close and move to remote learning. The expected spring 2020 administration of the M-STEP exam was canceled, and schools stayed remote for the remainder of the school year. In August of 2020, the governor signed a series of three "Return to Learn" bills intended to grant districts flexibility to safely provide instruction during the COVID-19 pandemic (Public Act 147, 2020; Public Act 148, 2020; Public Act 149, 2020).
For the 2020-21 school year only, the legislation waived many instructional requirements, including what learning activities count toward the attendance and enrollment calculations that determine state aid allocations. The state also waived the requirement that students take M-STEP exams if they were in remote schooling. Approximately 70 percent of students participated in the M-STEP assessment in spring 2021, and the tested and untested populations differed substantially across individual, school, and district characteristics. As a result, given substantial sample selection concerns, we do not consider the spring 2021 administration of the M-STEP. As a condition for receiving state aid for the year, the legislation required each district to develop an extended COVID-19 learning plan that included the administration of benchmark assessments to all K-8 students at the beginning and end of the school year to determine whether students made meaningful progress toward mastery of state standards in reading and mathematics. The legislation allowed districts to choose one of four state-approved benchmark assessments in reading or math, adopt another assessment that met the same requirements, or develop their own assessment locally. While the legislation prohibited the use of these data for accountability purposes, districts that elected to use a state-approved provider were required to report data to the state. Additional legislation renewed the benchmark assessment requirement for the 2021-22 academic year. Finally, in spring 2022, after nearly all schools in Michigan returned to full-time in-person instruction, the M-STEP exams returned to their pre-pandemic administration requirements and students were no longer given pandemic-related exemptions.

3.
Relevant Literature

Across the country, educators and students alike have reported that teaching and learning during the pandemic were challenging, requiring educators to gain new skills, districts to provide new resources, and students to learn in unfamiliar and often difficult circumstances (e.g., Chen et al., 2021; Ferren, 2021; Francom et al., 2021; Hamilton et al., 2020; Pitluck & Jacques, 2021). In Michigan, as well, teachers, principals, and district superintendents reported that pandemic instruction was difficult for them and their students (Cummings et al., 2020; Hopkins et al., 2021). Survey evidence shows that Michigan educators were concerned that many students missed critical instructional time, had inadequate access to technology, lacked support for at-home learning, and received insufficient services (e.g., meals, counseling) during the 2020-21 school year. In addition, educators indicated a need for training and guidance to help them provide adequate instruction during the pandemic. These challenges, combined with the extramural burdens of the pandemic, led to difficulties keeping students engaged in schoolwork, locating students, and maintaining student attendance (Cummings et al., 2020; Hopkins et al., 2021; for a review of the literature, see West & Lake, 2021). It is therefore no surprise that a growing body of national and state-specific research shows that there were fewer opportunities for students to learn during the pandemic than in a typical year. This has resulted in less – and sometimes far less – student growth on standardized achievement tests.

3.1. Student Achievement at the End of the 2021-22 School Year

As spring 2022 end-of-year assessment data have become available, there is growing evidence that students made progress academically during the 2021-22 school year, but many still fall below pre-pandemic achievement levels, particularly in math.
For example, in Tennessee, slightly more than a third of elementary, middle, and high school students scored proficient on the spring 2022 ELA standardized assessment, and the scores for each grade span matched or exceeded pre-pandemic achievement levels. Math proficiency levels in Tennessee have yet to recover, though proficiency gains across all grade levels closed 30 to 50% of the initial learning gaps documented at the beginning of the COVID-19 pandemic (Tennessee Department of Education, 2022). State education agencies in Florida, Idaho, Indiana, Ohio, and Texas have all reported similar results (Appleton, 2022; Greater Fort Lauderdale Alliance, 2022; Kogan, 2022; Texas Education Agency, 2022; Idaho State Department of Education, 2022). Analyses using nationally representative data from non-summative tests provide a more tepid view of pandemic recovery. A July 2022 study summarizing aggregate achievement among students who completed an NWEA assessment shows 2020-21 learning rates in math and reading were well below pre-pandemic trends (Kuhfeld & Lewis, 2022). In 2021-22, learning gains generally mirrored pre-pandemic achievement trends, and, in some cases, achievement growth exceeded that of a typical school year, recovering as much as a quarter to a third of the unfinished learning that accumulated during the school closures and remote instruction of the previous two school years. However, even if this accelerated growth continues at rates similar to those seen during the 2021-22 school year, it may be years before students experience a full recovery; Kuhfeld and Lewis (2022) estimate that students currently in grades three through five may not fully recover for three to five years, while middle school students may need five or more years to return to pre-pandemic achievement levels. Recently reported results from the spring 2022 administration of the National Assessment of Educational Progress (NAEP) paint an even bleaker picture of achievement during the pandemic.
The most recent math and reading NAEP scores fell for nearly all student subgroups and in all regions across the country. On average, NAEP reading scores for students in grades four and eight dropped by three points relative to scores from 2019, the largest decrease in reading scores in more than 30 years. The declines in math were even larger (five and eight points for 4th- and 8th-graders, respectively) – the first time math scores have fallen since the NAEP began in the late 1960s (National Center for Education Statistics, 2022). Outcomes in some states were worse than others. In Michigan, where our study is based, NAEP math declines were generally equal to the average decreases across the country (four and eight points for 4th- and 8th-graders, respectively), but declines in reading scores for 4th- (six points) and 8th-graders (four points) exceeded national averages.

3.2. Heterogeneity in the Effects of the Pandemic on Student Learning

There are myriad reasons for these declines in student achievement, ranging from the massive toll the pandemic took on many educators' and students' mental, socio-emotional, and physical health to the frequent disruptions and changes to school operations, learning environments, and modes of instruction, as well as other extramural elements of the pandemic itself. A recent report from the Center on Reinventing Public Education (CRPE) detailed the overarching findings from the most rigorous of these studies (Cohodes et al., 2022). The CRPE report highlights that many, and often the most traditionally underserved, students received less in-person instruction in the first two full school years affected by the pandemic than in a typical school year. This resulted in reduced learning time and, in some cases, lower quality instruction. This point is critical for any understanding of the effects of the pandemic on student learning.
While average measures of interrupted learning are themselves quite concerning, it is clear from the CRPE's review that the effects of COVID-19 on students varied across student populations and the pandemic has had a greater negative effect on achievement and achievement growth for specific student groups. Relevant to this study, research consistently shows that Black, Latino, and economically disadvantaged students experienced the greatest learning interruptions and fell further behind their White and more advantaged peers (Amplify Education, 2021; Dorn et al., 2021; Goldhaber et al., 2022; Jack et al., 2022; Kilbride et al., 2022; Kogan & Lavertu, 2021; Pier et al., 2021). For example, in the three metro-Atlanta districts studied by Sass and Ali (2022), differences in achievement by race and socioeconomic status have grown, more so in math than in reading. Some of the variation in student achievement is also explained by the instructional modality districts used or students selected; students who received more in-person instruction, on average, have learned more throughout the pandemic (Cohodes et al., 2022; Darling-Aduana et al., 2022; Jack et al., 2022; Kilbride et al., 2022; Kogan & Lavertu, 2021; Sass & Ali, 2022). For example, Goldhaber and colleagues (2022) leveraged NWEA assessment data from more than two million students across 49 states to understand how the provision of different instructional modalities impacted achievement gaps. Overall, math achievement gaps by race/ethnicity and school poverty status, as well as reading gaps to a lesser extent, did not widen in districts that provided students with in-person instruction. Conversely, the authors found that a district-level shift from in-person to remote instruction was a primary driver of widening racial/ethnic and socioeconomic achievement gaps.
With all of these findings in mind, it is important to note that estimates of learning growth during the pandemic likely understate the true impacts on student learning. Across the country and in Michigan, we know that fewer students enrolled in school and that absenteeism increased during the pandemic (Belsha, 2021; Cavitt, 2021; Levin, 2021; Mahnken, 2021; Pendharkar, 2021). This translates into lower-than-usual participation in assessments, especially in the 2020-21 school year, adding to the difficulty of drawing clear conclusions about student performance during the pandemic (Fensterwald, 2020; Sawchuk, 2021). In particular, students disproportionately affected by the pandemic may comprise a substantial portion of the missing student assessment data, contributing to inequitable learning experiences across the country (Barnum, 2021).

4. Data and Methods

4.1. Data

We combine several sources of data to understand student achievement in Michigan during the COVID-19 pandemic, including student performance on both the state's summative end-of-year assessment and benchmark assessments administered during the pandemic. We also use state administrative data capturing student, school, district, and county demographics as well as a measure of access to in-person instruction offered during the 2020-21 school year. We describe these data below. We use two sources of student achievement data to understand shifts in assessment performance during the pandemic. First, we use student outcomes from the M-STEP math and ELA assessments administered during the 2015-16, 2018-19, and 2021-22 school years. The M-STEP is Michigan's summative standardized assessment used to meet state and federal accountability requirements for students in grades three through seven. There are no M-STEP scores available from spring 2020, as the federal government waived testing requirements for the 2019-20 school year.
Moreover, because the federal government waived test participation requirements in spring 2021 due to continued pandemic-related disruptions to in-person learning, only 73% of Michigan students participated in M-STEP testing in spring 2021, and the tested population consisted of more White, non-economically disadvantaged students from higher-income districts with lower proportions of students of color. Given these limitations, our main M-STEP measures are generated as three-year changes in student M-STEP performance between 2016 and 2019 (for the pre-pandemic cohort) and changes between 2019 and 2022 (for the pandemic cohort). We use three-year gaps to ensure that we have a pre-pandemic testing outcome for the pandemic cohort. Prior to calculating these three-year growth outcomes, we standardize math and ELA M-STEP scores within each cohort to enable a comparison of student achievement over time. Specifically, we calculate the mean and standard deviation of math and ELA M-STEP scores separately for each grade level in the base year for each cohort (i.e., 2016 and 2019 for the pre-pandemic and pandemic cohorts, respectively). We then use these grade- and year-specific means to standardize math and ELA M-STEP scores for the same grade levels relative to the base year for each cohort. Second, we use student performance on nationally normed math and reading benchmark assessments administered to Michigan students in the fall and spring of the 2020-21 and 2021-22 school years. The vast majority of districts and students participated in either NWEA's MAP Growth or Curriculum Associates' i-Ready assessments. Due to the small sample sizes for the other two state-approved assessments (Renaissance Learning's Star 360 and Data Recognition Corp's Smarter Balanced Interim Assessments), we limit our analyses to just MAP Growth and i-Ready.
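The within-cohort standardization and three-year growth calculation described above can be sketched as follows. This is a minimal illustration in Python on invented scores; the dataframe layout, column names, and values are our own assumptions rather than the structure of the actual M-STEP files:

```python
import pandas as pd

# Hypothetical M-STEP math records: one row per student-year.
# Students 1-4 stand in for the pandemic cohort (grade 3 in 2019, grade 6 in
# 2022); students 5-8 are 2019 6th-graders, used only for base-year grade-6
# means and SDs.
scores = pd.DataFrame({
    "student": [1, 2, 3, 4, 5, 6, 7, 8, 1, 2, 3, 4],
    "year":    [2019] * 8 + [2022] * 4,
    "grade":   [3, 3, 3, 3, 6, 6, 6, 6, 6, 6, 6, 6],
    "score":   [1300.0, 1310.0, 1320.0, 1330.0,
                1500.0, 1510.0, 1520.0, 1530.0,
                1490.0, 1500.0, 1510.0, 1520.0],
})

# Grade-specific mean and SD from the cohort's base year (2019).
base = (scores[scores["year"] == 2019]
        .groupby("grade")["score"].agg(["mean", "std"]))

# z-score every observation against the base-year distribution for its grade.
scores["z"] = ((scores["score"] - scores["grade"].map(base["mean"]))
               / scores["grade"].map(base["std"]))

# Three-year growth: each student's 2022 z-score minus their 2019 z-score.
wide = scores.pivot_table(index="student", columns="year", values="z")
growth = (wide[2022] - wide[2019]).dropna()
```

Under this construction, comparing the mean of `growth` across the pre-pandemic and pandemic cohorts gives the achievement-growth contrast the paper describes; the actual analysis is run separately by subject and also uses the individual-level controls discussed below.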
Due to Michigan policies written into the Return to Learn law, we are restricted to using district-grade-subgroup level means rather than individual student data.9 Our main outcomes of interest for benchmark assessments are therefore district-level average math and reading scores for students in grades five through seven, overall and by subgroups. Similar to the M-STEP outcomes, the benchmark assessment scores are standardized relative to pre-pandemic test score distributions. However, unlike the M-STEP outcomes, we use means and standard deviations from nationally representative norming samples to standardize scores for each grade, subject, and testing period. One reason for this is that districts only provided benchmark assessment data from fall 2020 and subsequent testing periods: they were not required to administer assessments prior to fall 2020, and those that did were not required to provide them to the state. Therefore, we cannot use these data to identify pre-pandemic score distributions that are specific to our sample. Moreover, there are substantial differences between the MAP Growth and i-Ready samples in terms of demographic composition and prior achievement, and this approach also allows us to measure achievement on each benchmark assessment relative to populations of students that are more comparable to each other. Although the M-STEP and benchmark data are not directly comparable, we include spring 2019 and 2022 M-STEP scores in our analysis of benchmark assessment trends to explore outcomes across assessments over a similar timeframe. While the M-STEP is not administered outside of Michigan, its design is based closely on the Smarter Balanced assessment and both M-STEP and Smarter Balanced scores are derived from the same underlying scale (Michigan Department of Education, 2019).
This allows us to convert M-STEP scores to Smarter Balanced scores and standardize outcomes relative to national norms for the Smarter Balanced assessment (Smarter Balanced Assessment Consortium, 2020). Additionally, since the sample of students in our analysis were in grades 5 through 7 in the 2020-21 school year and 8th-graders in Michigan complete the PSAT 8/9 to satisfy annual federal testing requirements, we also standardize spring 2022 PSAT 8/9 scores for Michigan 8th graders relative to national norms. Each testing regime has benefits and drawbacks, making it valuable to investigate both. For the M-STEP, the data are recorded at the individual student level both before and after the start of the COVID-19 pandemic. This gives us the ability to control for the same characteristics included in the benchmark analysis at the individual student level rather than district-grade-subgroup averages. Moreover, nearly all 3rd- through 7th-grade students in Michigan take the M-STEP, so these data provide a more representative and consistent measure of student achievement than the data from district-selected benchmark assessments.

9 Most districts provided authorization for us to construct district-level aggregate datasets from their student-level benchmark assessment data, while some districts chose to only provide aggregate datasets they prepared themselves. Districts that chose to aggregate their own data were instructed to calculate average scale scores across all students in the same subgroup and grade level who completed an assessment from the same provider in each of the four testing periods. We apply the same sample restrictions and construct equivalent aggregate measures for the districts that provided student-level data, then compile all districts’ aggregate data into a combined dataset for the benchmark analysis.
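The district-level aggregation described in footnote 9 (average scale scores by district, provider, grade, subgroup, and testing period, restricted to students tested in all four periods) can be sketched as follows; the column names are illustrative assumptions, not the actual file layout.

```python
import pandas as pd

def aggregate_benchmark(scores: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the aggregation in footnote 9 (names illustrative)."""
    # Keep only students with a score in all four testing periods
    n_periods = scores.groupby("student_id")["period"].nunique()
    complete = n_periods[n_periods == 4].index
    scores = scores[scores["student_id"].isin(complete)]
    # District-level means by provider, grade, subgroup, and period
    keys = ["district", "provider", "grade", "subgroup", "period"]
    return scores.groupby(keys, as_index=False)["scale_score"].mean()
```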
However, since the M-STEP was not administered in spring 2020 and many students did not take the M-STEP in spring 2021, it is difficult to track student growth at different times throughout each pandemic-affected school year. As such, we use three-year periods to measure achievement growth, and the pandemic cohort includes some instruction in 2019 before the start of the pandemic. The key benefits of the benchmark exams begin with the fact that districts administered them twice each year, allowing us to examine higher frequency changes in achievement. For example, with the benchmark exams we can study how far achievement initially fell over the course of the first full pandemic year, and then how quickly students recovered. However, because these data are only available for fall 2020 and after, we cannot compare students’ performance on these assessments directly to their pre-pandemic performance, nor can we capture changes in achievement during the earliest months of the pandemic between the spring and fall of 2020. In addition, there is a national sample of students who take the NWEA MAP Growth and Curriculum Associates i-Ready assessments. This allows us to more easily compare Michigan students’ progress throughout the pandemic to that of students across the country. We merge assessment scores with several other data sources to explore heterogeneity in test score outcomes. First, we incorporate data on student demographic characteristics from the Michigan Student Data System (MSDS) to identify student subgroups based on their race/ethnicity and economically disadvantaged status.10 In analyses exploring differences by race/ethnicity, we focus on White, Black, and Latino students as these are the three largest racial/ethnic subgroups in the state and we often do not have large enough sample sizes of students in other subgroups to permit analysis. Second, we examine heterogeneity by districts’ instructional modality during the 2020-21 school year.
In that year, all Michigan school districts not already operating virtually prior to the pandemic were required to report the instructional modalities offered to students each month of the school year. In the monthly questionnaire administered through MDE, districts were asked to indicate if they planned to instruct any of their students in a fully in-person (students receive 100% of their instruction in person), fully remote (students receive 100% of their instruction remotely), or hybrid format (students attend school in person for part of the week and participate in remote instruction for part of the week). For our analysis, we assign students to each modality type based on the number of months their district offered fully in-person instruction: zero months, one to four months, five to eight months, or all nine months of the 2020-21 school year. Finally, since district modality offerings were often tied to community incidence of COVID-19, we link our achievement data with daily counts of county-level COVID-19 deaths collected and distributed by the Michigan Department of Health and Human Services in order to control for COVID-19 incidence during our sample period. We use these data to calculate seven-day average death rates per 100,000 residents for the first day of each month between July 2020 and May 2022. For our analysis of M-STEP outcomes, we average COVID-19 death rates throughout the 2020-21 school year and assign these rates to students in the pandemic cohort (COVID-19 death rates for students in the pre-pandemic cohort are set to equal zero).

10 In Michigan, students are identified as economically disadvantaged if they qualify for free or reduced-price milk or meals through the National School Lunch Program (i.e., Supplemental Nutrition Eligibility). This includes homeless-identified students who are categorically eligible for free meals.
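The modality assignment described above amounts to bucketing districts by months of fully in-person instruction offered. A minimal sketch, with category labels mirroring those in the text:

```python
def modality_group(in_person_months: int) -> str:
    """Assign a district to an in-person access category based on months
    of fully in-person instruction offered in 2020-21 (0 through 9).
    A sketch of the categorization described in the text."""
    if not 0 <= in_person_months <= 9:
        raise ValueError("expected 0-9 months")
    if in_person_months == 0:
        return "0 months"
    if in_person_months <= 4:
        return "1-4 months"
    if in_person_months <= 8:
        return "5-8 months"
    return "9 months"
```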
For the benchmark analysis, we assign death rates by averaging rates across the three months leading up to each test administration period (July, August, and September for the fall administration, and March, April, and May for the spring administration) in both 2020-21 and 2021-22.

4.2. Analytic Samples

M-STEP analysis. Our M-STEP analysis compares three-year M-STEP growth outcomes for two groups of students: our pre-pandemic and pandemic cohorts. The pre-pandemic math and ELA cohorts include approximately 198,600 students who completed the M-STEP math or ELA assessment in both spring 2016 and spring 2019. The pandemic math and ELA cohorts include approximately 180,500 students who completed one iteration of the M-STEP math or ELA assessment prior to the pandemic in spring 2019 and the most recent administration of the assessment in spring 2022. The difference in size between the pandemic and pre-pandemic cohorts is likely due to the fact that K-12 student enrollment in Michigan has decreased each year over the last decade, with particularly acute declines in 2020-21 (Center for Educational Performance and Information, 2023). Given the three-year gap in outcomes and our desire to follow individual students, our analysis sample is constrained to include students who begin the three-year period in the 3rd or 4th grade and finish in the 6th or 7th grade.11 Thus, the pre-pandemic cohort includes students who completed the 3rd- or 4th-grade assessment in 2016 and the 6th- or 7th-grade assessment in 2019. Similarly, the pandemic cohort includes students who completed the 3rd- or 4th-grade assessment in 2019 and the 6th- or 7th-grade assessment in 2022. Because we construct these measures only from students with data from both test administrations, we drop students who were not present in Michigan, did not participate in the test, or had invalid test scores in either period.
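The cohort construction just described, joining each student's base-year (grade 3 or 4) score to their follow-up (grade 6 or 7) score and differencing, can be sketched as below. DataFrame layouts and column names are illustrative assumptions; the inner join implements the restriction to students with valid scores in both periods.

```python
import pandas as pd

def three_year_growth(base: pd.DataFrame, followup: pd.DataFrame) -> pd.DataFrame:
    """Sketch of the cohort construction: join base-year (grades 3-4) and
    follow-up (grades 6-7) standardized scores by student and difference
    them. Students missing either test are dropped by the inner join."""
    base = base[base["grade"].isin([3, 4])]
    followup = followup[followup["grade"].isin([6, 7])]
    merged = base.merge(followup, on="student_id", suffixes=("_base", "_end"))
    merged["growth"] = merged["z_score_end"] - merged["z_score_base"]
    return merged[["student_id", "grade_base", "growth"]]
```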
Thus, the pre-pandemic and pandemic cohorts represent 88.0 and 85.3 percent of all Michigan 3rd- and 4th-grade students, respectively, who participated in M-STEP testing in the base year for each cohort. Table 14 provides summary statistics for students in the M-STEP sample by subject and cohort. Table 14 shows that students in the pre-pandemic and pandemic cohorts are similar demographically. More than half of the students in each cohort are female, and each cohort has similar shares of Black, Latino, special education, and English learner students. The two cohorts also started with similar base-year math and ELA achievement. However, previewing our results, we see that the pandemic cohort performed worse than the pre-pandemic cohort over the three-year period. Average math growth for students in the pre-pandemic cohort was small but positive (0.030 standard deviations [sd]), while math achievement decreased by 0.212 sd on average for students in the pandemic cohort. Students in the pre-pandemic and pandemic cohorts experienced a similar decrease in ELA achievement over their respective three-year periods.

11 In Michigan, 8th-graders take the PSAT 8/9 instead of the M-STEP, limiting us to examining students in grades three through seven.

Table 14. Summary Statistics; M-STEP Analytic Sample; Grades 3 and 4 (Base Year)

                                            Math Cohorts               ELA Cohorts
                                     Pre-Pandemic    Pandemic   Pre-Pandemic    Pandemic
Total Students                            198,580     180,573        198,559     180,499
Percent of 3rd and 4th Graders
  in Base Year                               88.0        85.3           88.0        85.3
Student Demographics (%)
  Economically Disadvantaged                 52.1        52.7           52.1        52.8
  Black                                      16.8        17.4           17.0        17.4
  Latino                                      8.3         8.5            8.2         8.5
  Special Education                          11.3        12.3           11.3        12.3
  English Learners                            8.2         9.2            8.0         9.1
In-Person Access (%)
  9 Months                                     --        32.6             --        32.6
  5-8 Months                                   --        31.9             --        32.0
  1-4 Months                                   --        13.8             --        13.8
  0 Months                                     --        21.6             --        21.5
M-STEP Scores (std. dev.)
  Base-Year Math Scores                    0.0301      0.0364             --          --
  Math Growth                              0.0003     -0.2124             --          --
  Base-Year ELA Scores                         --          --         0.0268      0.0311
  ELA Growth                                   --          --        -0.1061     -0.1376

Notes: Student demographic characteristics are measured in the comparison year for each cohort (i.e., 2019 for the pre-pandemic cohort and 2022 for the pandemic cohort). Base-year achievement summarizes outcomes in 2016 for the pre-pandemic cohort and 2019 for the pandemic cohort. “Math Growth” and “ELA Growth” represent three-year differences in achievement between 2016 and 2019 for the pre-pandemic cohort and between 2019 and 2022 for the pandemic cohort.

Benchmark Analysis. Our full sample for the benchmark analysis includes district-grade aggregated data from 141,034 students who entered the fall 2020 semester in grades five through seven and have valid math or reading scores in all four administration periods between fall 2020 and spring 2022. We focus only on students in grades five through seven to provide the closest comparison to students in our M-STEP sample. These aggregate measures only include students with test scores in each of the four semesters when benchmark assessments were administered (fall 2020, spring 2021, fall 2021, and spring 2022) to ensure that our comparisons over time reflect changes in student performance as opposed to changes in the populations of students tested. Additionally, we exclude districts that were not required to report data under Michigan’s benchmark assessment legislation. In total, this sample represents 68.8 percent of all 5th- through 7th-grade students in districts that offered a MAP Growth or i-Ready assessment in fall 2020. Table 15 provides summary statistics for students in the benchmark assessment sample.
In this table, we compare the characteristics of all 5th- through 7th-grade Michigan students (“Statewide” column) to those 5th- through 7th-grade students in the analytic sample who completed a MAP Growth or i-Ready assessment in fall 2020, spring 2021, fall 2021, and spring 2022 (“All,” “MAP Growth,” and “i-Ready”). While the demographics of students in the analytic sample generally resemble the full population of students in districts that offered a MAP Growth or i-Ready assessment in similar grade levels, they are considerably less likely to be economically disadvantaged and slightly less likely to be Black. Students who took the NWEA MAP Growth assessment represent more than 80 percent of the analytic sample and have economic disadvantage rates about 10 percentage points lower, and shares of Black students about 6.5 percentage points lower, than the statewide average. Students who took the i-Ready assessments, on the other hand, are substantially more likely to be Black (30 percent) compared to the full population of Michigan students in grades five through seven (20 percent), but largely similar otherwise. This is largely driven by the Detroit Public Schools Community District, which is the largest school district in Michigan and accounts for more than one-fifth of all students who took an i-Ready assessment.

Table 15.
Summary Statistics, Benchmark Assessment Analytic Sample, Grades 5-7 (2020-21)

                                        Statewide        All   MAP Growth    i-Ready
Total Students                            205,038    141,034      116,015     25,019
Percent of Analytic Sample                     --      100.0         82.2       17.7
Percent of Enrollment in Districts
  Offering MAP Growth or i-Ready              100       68.8         56.6       12.2
Student Demographics (%)
  Economically Disadvantaged                 54.4       46.6         45.0       54.2
  Black                                      20.0       16.6         13.5       30.2
  Latino                                      8.8        8.3          8.0        9.7
  Special Education                          14.4       12.1         12.0       12.3
  English Learner                             5.3        4.8          4.1        8.5
In-Person Access (%)
  9 Months                                   31.9       38.3         38.8       32.8
  5-8 Months                                 33.7       27.5         27.1       31.9
  1-4 Months                                 14.9       15.3         16.1        6.0
  0 Months                                   19.4       19.0         18.1       29.3
2019 M-STEP Achievement (std. dev.)
  Math                                     0.0011     0.0413       0.0745    -0.1121
  ELA                                      0.0131     0.0268       0.0599    -0.1262

Notes: The “Statewide” column includes all 5th- through 7th-grade students in Michigan districts that offered an NWEA MAP Growth or Curriculum Associates i-Ready benchmark assessment. The “All” column includes both MAP Growth and i-Ready students from the analytic sample. Average standardized 2019 M-STEP achievement represents 3rd- through 5th-grade outcomes for all students in MAP Growth and i-Ready districts (“Statewide”) as well as those in our analytic sample.

4.3. Methods

To examine disparities in three-year M-STEP achievement growth between pre-pandemic and pandemic cohorts, we estimate the following baseline model:

3YG_sgd = α + θ1 PCOHORT_sgd + θ2 BYA_sgd + θ3′ SCHAR_s + γ_g + δ_d + ε_sgd    (1)

where 3YG_sgd represents three-year standardized M-STEP math or ELA growth for each student, s, in grade, g, and district, d. PCOHORT_sgd is a binary indicator that identifies students in the pandemic cohort, and BYA_sgd is the student’s base-year M-STEP score in the same subject. SCHAR_s is a vector of student characteristics (i.e., gender and race/ethnicity, as well as economically disadvantaged, special education, English learner, homeless, and migrant status). γ_g and δ_d are grade and district fixed effects.
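A specification like model (1), OLS with grade and district fixed effects entered as dummy variables, can be sketched on simulated data as follows. This is an illustration only, assuming a built-in -0.2 sd pandemic-cohort effect; all variable names and the data-generating process are made up, not the actual analysis data.

```python
import numpy as np

# Simulate data with a known -0.2 sd "pandemic cohort" effect (illustrative)
rng = np.random.default_rng(0)
n = 400
pcohort = rng.integers(0, 2, n).astype(float)   # pandemic-cohort indicator
grade = rng.choice([3, 4], n)                   # base-year grade
district = rng.integers(0, 4, n)                # district identifier
bya = rng.normal(0.0, 1.0, n)                   # base-year achievement
growth = -0.2 * pcohort + 0.1 * bya + rng.normal(0.0, 0.1, n)

# Design matrix: intercept, PCOHORT, BYA, plus grade and district dummies
X = np.column_stack(
    [np.ones(n), pcohort, bya, (grade == 4).astype(float)]
    + [(district == d).astype(float) for d in (1, 2, 3)]
)
beta, *_ = np.linalg.lstsq(X, growth, rcond=None)
theta1 = beta[1]   # estimate of the pandemic-cohort coefficient
```

With the effect built into the simulated data, the recovered coefficient should be close to -0.2; in practice one would use a fixed-effects routine from an econometrics package rather than hand-built dummies.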
The coefficient θ1 captures any disparity in standardized M-STEP test score growth between students in the pre-pandemic and pandemic cohorts. To estimate subgroup-specific differences in achievement growth during the pandemic, we extend model (1) by interacting PCOHORT_sgd with our indicators for race/ethnicity, economically disadvantaged status, and access to in-person instruction (i.e., zero months, one to four months, five to eight months, and all nine months).

Second, to understand trends in student achievement during the pandemic school years, we use both M-STEP and benchmark assessment scores in the following baseline model:

Y_dgsvt = α + θ1 S19_t^MSTEP + θ2 S21_t + θ3 F21_t + θ4 S22_t + θ5 S22_t^MSTEP + θ6′ DCHAR_dgt + γ_g + δ_d + ε_dgsvt    (2)

where Y_dgsvt is the average standardized math or reading test score for students in district, d, grade, g, completing subject test, s, from assessment provider, v, in semester, t. S19_t^MSTEP, S21_t, F21_t, S22_t, and S22_t^MSTEP are binary indicators identifying the semester associated with the outcome of interest, Y_dgsvt (i.e., nationally standardized M-STEP or benchmark assessment scores). DCHAR_dgt is a vector of mean-centered, district-level student characteristics (i.e., student shares by gender and race/ethnicity, as well as economically disadvantaged, special education, English learner, homeless, and migrant status), and γ_g and δ_d are grade and district fixed effects, respectively. The coefficients θ1 through θ5 describe the differences in average standardized test scores between fall 2020, the omitted reference period, and spring 2019, spring 2021, fall 2021, and spring 2022 (for both benchmark and M-STEP outcomes), respectively. To examine heterogeneity across student subgroups and district instructional modality, we extend model (2) by interacting each time indicator with our indicators for race/ethnicity, economically disadvantaged status, and access to in-person instruction.12

12 Since our measure of access to in-person instruction is calculated at the district level, we do not include district fixed effects in the models examining differences across access.

5. Results

Before delving into the main sets of results, we first take a simple descriptive look at Michigan student performance over the course of the pandemic. Using linking studies available from each assessment provider (see Curriculum Associates, 2020; NWEA, 2020), we translate students’ MAP Growth and i-Ready scores into approximate M-STEP proficiency levels (i.e., not proficient, partially proficient, proficient, or advanced) to investigate how Michigan students’ benchmark assessment scores translate to M-STEP performance before and during the COVID-19 pandemic. For this analysis, we compare these performance thresholds for all 3rd- through 7th-grade students with a valid MAP Growth or i-Ready benchmark assessment score in spring 2021 or spring 2022 to the actual distribution of 2018-19 and 2021-22 M-STEP proficiency outcomes among students in the same districts that offered a MAP Growth or i-Ready assessment. This analysis allows us to understand how Michigan students might have performed on the state’s summative assessment during the first two pandemic years, when the M-STEP was either canceled or optional, and to compare these estimates to the actual M-STEP proficiency levels of students in the same districts in 2018-19 and 2021-22.

Figure 10. M-STEP Proficiency Levels and Vendor-Defined M-STEP Equivalencies, NWEA MAP Growth and Curriculum Associates i-Ready

Notes: These percentages include 3rd- through 7th-grade students with a valid benchmark assessment score in spring 2021 or spring 2022. Benchmark assessment scores are converted to an estimated M-STEP proficiency category based on linking studies from NWEA and Curriculum Associates. Proficiency rates from the 2018-19 and 2021-22 M-STEP include all students in districts that use the MAP Growth or i-Ready assessments.
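The vendor-linking step, mapping a benchmark scale score to an estimated M-STEP proficiency category, amounts to a cut-score lookup. A minimal sketch; the thresholds below are made up for illustration, while the actual cut scores come from the NWEA and Curriculum Associates linking studies and vary by grade and subject.

```python
import bisect

# Hypothetical cut scores for illustration only
CUT_SCORES = [200.0, 215.0, 230.0]
LEVELS = ["not proficient", "partially proficient", "proficient", "advanced"]

def estimated_mstep_level(scale_score: float) -> str:
    """Map a benchmark scale score to an estimated M-STEP proficiency
    category via a cut-score lookup (thresholds above are illustrative)."""
    return LEVELS[bisect.bisect_right(CUT_SCORES, scale_score)]
```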
Figure 10 shows how the estimated distribution of M-STEP proficiency levels based on benchmark outcomes for Michigan 3rd- through 7th-grade students compares to the actual distribution of M-STEP outcomes from the spring 2019 and spring 2022 administrations of the assessment. As seen in the figure, achievement declined in the first pandemic year and remained lower than pre-pandemic levels in the next two years. Specifically, compared to students in the same districts who took the M-STEP in 2018-19, more students were classified as “not proficient” and fewer were classified as “proficient” or “advanced” based on their spring 2021 or spring 2022 benchmark assessment scores across both subjects. The percentages of students in each of these proficiency levels, however, did not change much between 2020-21 and 2021-22. Importantly for our study, the estimated M-STEP proficiency rates from spring 2022 generally align with actual outcomes from the spring 2022 M-STEP administration. Indeed, the underlying correlation between individual benchmark and M-STEP scores from the spring 2022 administrations of both tests is 0.902 in math and 0.834 in reading/ELA. This suggests that we gain a similar signal from both measures of performance.

5.1. M-STEP Achievement Growth

Figures 11 through 14 provide our results from estimating model (1), examining differences in achievement growth between students in the pre-pandemic and pandemic M-STEP cohorts. Tables A.7 through A.9 provide the coefficient estimates from these models. In Figure 11, the zero-line represents the average three-year M-STEP growth for students in the pre-pandemic cohort, and we show results from models that initially control for students’ grade level and then sequentially add demographic/community characteristics and district fixed effects.
In Figures 12 through 14, the zero-line represents the average three-year M-STEP growth for pre-pandemic cohort students in the specific reference group (i.e., White or non-economically disadvantaged students). Given the similarities between models that do and do not include district fixed effects, these latter figures only provide results from our preferred models that include district fixed effects. Figure 11 shows that, overall, students in the pandemic cohort had significantly lower math achievement gains than students in the pre-pandemic cohort. Specifically, students in the pandemic cohort grew between 0.167 and 0.201 sd less in math over the three pandemic-affected years than did students in the pre-pandemic cohort. ELA growth for students in the pandemic cohort was generally similar to that of students in the pre-pandemic cohort; in our fully specified model, students who completed an ELA M-STEP assessment in 2019 and 2022 grew by approximately 0.025 standard deviations less than similar students who completed assessments in 2016 and 2019; however, this estimate is not statistically significant.

Figure 11. Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

Notes: Each model includes grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students. The second estimate in each panel also includes controls for student demographics and community characteristics. The final estimate in each panel adds district fixed effects to control for time-invariant, unobservable characteristics of each district that may influence learning trajectories.

Figure 12.
Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts by Student Demographics, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

Notes: Each model includes student demographics and community characteristics, grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students, and district fixed effects to control for time-invariant, unobservable characteristics of each district that may influence learning trajectories.

Figure 12 provides results from the district fixed effects model, this time examining heterogeneity by race/ethnicity and economically disadvantaged status. Even prior to the pandemic, disparities in achievement growth existed such that Black, Latino, and economically disadvantaged students experienced slower achievement growth than their White and higher-income peers. However, we find that growth disparities across these groups of students intensified during the pandemic, particularly in math. Specifically, in the three years prior to the pandemic, Black and Latino students experienced math achievement growth that was 0.112 and 0.018 sd lower than White students during the same period, respectively. In the three years encompassing the pandemic, Black and Latino achievement growth fell even further behind (-0.368 and -0.240 sd, respectively). Similarly, math achievement growth for economically disadvantaged students in the pre-pandemic cohort was 0.130 sd behind their more advantaged peers, and this disparity increased for students in the pandemic cohort (-0.351 sd). In ELA, achievement growth for Black, Latino, and economically disadvantaged students in the pre-pandemic cohort trailed their respective peers. However, these differences changed little for students in the pandemic cohort.
Figure 13 summarizes district fixed-effect models estimating differences in math and ELA M-STEP three-year growth by the instructional modalities provided to students in 2020-21.13 We find that students in districts that offered in-person instruction all nine months of the 2020-21 school year still had lower math achievement growth over the course of the pandemic than students in the pre-pandemic cohort (-0.147 sd). Students in districts that did not offer in-person instruction for at least some of the 2020-21 school year experienced significantly slower math achievement growth than did students in districts that offered in-person instruction for all nine months, with achievement growth trailing their in-person peers by nearly 0.05 sd. Moreover, achievement growth for these students trailed pre-pandemic students’ math achievement growth by more than 0.2 sd. However, there were no significant differences between students in districts that were remote for all of the year or only part of it (i.e., in person for 5-8 months or for 1-4 months). Again, the disparities in ELA growth across modalities were much smaller, and the disparities in growth rates were not significant compared to students in the pre-pandemic cohort.

13 It is important to note that while we are considering three-year achievement growth covering 2019-20 through 2021-22 here, we only consider modality in 2020-21 because after the pandemic began in spring 2020, all schools in the state were remote for the remainder of that school year, and by fall 2021, almost every school district in the state had returned to in-person instruction.

Figure 13.
Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts by 2020-21 Instructional Modality, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

Notes: Each model includes student demographics and community characteristics, grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students, and district fixed effects to control for time-invariant, unobservable characteristics of each district that may influence learning trajectories.

Finally, we find that learning remotely adversely affected all students regardless of their race/ethnicity or socioeconomic status. Figure 14 shows results from models estimating differences in math and ELA growth by instructional modality provided to pandemic cohort students within each student demographic group considered in Figure 12. We find that the overall modality trends did not substantially differ across racial/ethnic and economically disadvantaged student subgroups, with all groups performing substantially higher in math and slightly higher in ELA if their school was in person all year. For students experiencing remote instruction, Black and Latino students showed only slightly, and mostly insignificantly, lower math growth than White students with the same modality, as did economically disadvantaged students relative to their non-disadvantaged peers.

Figure 14. Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts by 2020-21 Instructional Modality and Student Demographics, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

Notes: Each model includes student demographics and community characteristics, grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students, and district fixed effects to control for time-invariant, unobservable characteristics of each district that may influence learning trajectories.

5.2.
Benchmark Achievement Trends

Figures 15 through 18 show adjusted trends in standardized math and reading benchmark achievement for students who started the 2020-21 school year in grades five through seven and completed a MAP Growth or i-Ready assessment in all four administration periods during the 2020-21 and 2021-22 school years. Tables A.10 through A.12 provide the coefficient estimates from these models. Table A.10 summarizes overall math and reading benchmark trends and includes specifications that sequentially add grade controls, district-level student controls and community-level COVID-19 incidence, and district fixed effects.14 Since the trends in each specification are generally similar, we only report estimates for models that include district fixed effects in Figures 15 through 18 and Tables A.11 and A.12. As noted earlier, benchmark assessment scores are standardized relative to pre-pandemic national norms for each grade, subject, and testing period. As such, we interpret the trend lines in Figures 15 through 18 as deviations from the average scores for nationally representative samples of students who took the same assessments before the pandemic. If Michigan students grew at the same rate as students in the pre-pandemic norming sample (and therefore maintained the same relative position within the norming distribution over time), we would see a straight horizontal line. If they grew at a faster rate than students in the norming sample, we would see lines that slope upward. By contrast, downward sloping lines indicate slower than expected growth between two time periods.

14 Estimates without district fixed effects are similar and available by request.

There are several important takeaways from Figure 15. First, by fall 2020, average benchmark scores in Michigan were below the pre-pandemic norms for both reading and math (0.021 and 0.233 sd below average, respectively).
Again, although the spring 2019 M-STEP and fall 2020 data points are not directly comparable, it is clear that Michigan students in our sample were performing only slightly better in reading in fall 2020 than in spring 2019 but were substantially behind in math. Second, we find that both math and reading benchmark scores dropped considerably during the 2020-21 school year, falling even farther behind the national pre-pandemic norm. Between spring and fall 2021, however, Michigan students experienced faster than expected growth, such that by fall 2021 they had almost caught up to where they had started the 2020-21 school year in math, but still trailed their fall 2020 average score in reading. Nonetheless, these scores both remained substantially behind the average standardized M-STEP score from spring 2019. Then, during the 2021-22 school year, students made slightly higher than expected progress in math relative to the pre-pandemic national norm, whereas reading achievement fell relative to the national norm once again, albeit at a much slower rate than the prior year.15 Spring 2022 benchmark and M-STEP scores were generally similar in both subjects. Thus, overall, total achievement growth trends over the three years of the pandemic as measured by the benchmarks are consistent with our findings comparing pre- and post-pandemic M-STEP cohorts – a substantial drop in math achievement and a smaller drop in reading. What the benchmark exams highlight, however, is that this path was non-linear, with severe drops in the first fully impacted pandemic school year and some recovery in the time between the spring 2021 and fall 2021 assessments. Worrisomely, however, there is an indication that recovery stalled in the 2021-22 school year, as a continued upward trend between fall 2021 and spring 2022 would be necessary to recover all of the losses from the early part of the pandemic. It is unclear at this time whether the recovery has accelerated into 2022-23.
15 To better see why a flat line indicates “normal” growth, Appendix Figures A.1.1 and A.1.2 show unadjusted scale score trends for the same sample of students. In these figures, the dashed gray lines represent pre-pandemic comparison points from each assessment provider’s norming sample, and the solid blue and green lines represent math and reading outcomes for the cohorts of Michigan students tested during the pandemic. By comparing the slopes of the solid lines to the slopes of the dashed lines, we can see whether the score changes realized by Michigan students exceeded or trailed pre-pandemic norms. It is clear that in both math and reading the slopes of the solid lines between fall 2020 and spring 2021 are flatter than the dashed lines, indicating negative relative growth. This reverses in the next segment and then reverts in the last segment, though math remains parallel.

Figure 15. Regression Adjusted Scale Score Trends, NWEA MAP Growth and Curriculum Associates’ i-Ready, Grades 5-7

Notes: These regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics. Test scores have been standardized relative to NWEA’s and Curriculum Associates’ pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms.

Figure 16 shows differences in adjusted trends in standardized math and reading benchmark achievement by race/ethnicity. We find similar patterns across subgroups, all in line with the overall results shown in Figure 15. White students had consistently higher scores in both subjects compared to Black and Latino students, with Black students scoring the lowest of the three subgroups. White, Black, and Latino students all experienced a decrease in math and reading benchmark achievement between fall 2020 and spring 2021, followed by a rebound in scores during the 2021-22 school year.
The declines in 2020-21 were largest for Black students (-0.207 and -0.227 sd in math and reading, respectively), followed by Latino students (-0.116 and -0.148 sd) and White students (-0.053 and -0.146 sd). During the 2021-22 school year, these gaps in math began to diminish, as math achievement for White students plateaued whereas Black and Latino math achievement increased slightly, by 0.041 and 0.026 sd, respectively. Reading achievement decreased across all three groups of students during the 2021-22 school year (by approximately 0.035 to 0.050 sd).

Figure 16. Regression Adjusted Scale Score Trends by Race/Ethnicity, NWEA MAP Growth and Curriculum Associates' i-Ready, Grades 5-7

Notes: These regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics. Test scores have been standardized relative to NWEA's and Curriculum Associates' pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms.

These patterns reveal that the pandemic exacerbated racial/ethnic math achievement gaps. In fall 2020, the differences in math achievement between White students and their Black and Latino peers were 0.492 and 0.240 sd, respectively. By spring 2022, the White-Black and White-Latino gaps had increased to 0.585 and 0.270 sd, respectively. In reading, the White-Black and White-Latino gaps both decreased slightly, by approximately 0.04 sd, between fall 2020 and spring 2022.

Figure 17. Regression Adjusted Scale Score Trends by Economically Disadvantaged Status, NWEA MAP Growth and Curriculum Associates' i-Ready, Grades 5-7

Notes: These regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics.
Test scores have been standardized relative to NWEA's and Curriculum Associates' pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms.

Figure 17 examines similar trends across students who were and were not economically disadvantaged. We find many of the same patterns as previously discussed. Economically disadvantaged students scored consistently lower in both math and reading across all testing periods compared to their more advantaged peers. Further, both groups of students experienced a decline in math and reading achievement between fall 2020 and spring 2021, followed by a rebound in scores during the 2021-22 school year. The decreases in 2020-21 were slightly larger for economically disadvantaged students (-0.126 and -0.179 sd in math and reading, respectively) than for more advantaged students (-0.053 and -0.139 sd). Math achievement for the more advantaged students plateaued during the 2021-22 school year, while economically disadvantaged students' math achievement increased by 0.019 sd. Reading achievement in both groups decreased during the 2021-22 school year (-0.035 and -0.056 sd for economically disadvantaged and non-economically disadvantaged students, respectively). Similar to the results for disparities by race and ethnicity, we find that the pandemic exacerbated math achievement gaps by economic disadvantage. In fall 2020, economically disadvantaged students scored 0.463 sd below their peers in math. By spring 2022, this gap had increased to 0.502 sd. The same gap in reading decreased slightly, from 0.432 to 0.419 sd, between fall 2020 and spring 2022.

Figure 18. Regression Adjusted Scale Score Trends by 2020-21 Instructional Modality, NWEA MAP Growth and Curriculum Associates' i-Ready, Grades 5-7

Notes: These regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics.
Test scores have been standardized relative to NWEA's and Curriculum Associates' pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms.

Finally, Figure 18 shows differences in adjusted scale score trends in standardized math and reading benchmark achievement by 2020-21 instructional modality. Because the confidence intervals overlap to such a great extent that they make the figure difficult to interpret, we have removed them from Figure 18. It is therefore important to note that the differences we see across modalities in Figure 18 are generally not statistically significant; we provide an additional figure in the Appendix that includes the confidence intervals (Figure A.7). As might be expected given the overlapping confidence intervals, we find few differences in achievement across districts that offered varying levels of in-person instruction conditional on having hybrid or remote modalities during 2020-21. However, students in districts that offered in-person instruction throughout all of 2020-21 performed better during that school year, consistent with our earlier results from analyses of the M-STEP scores. Initially, districts that offered in-person instruction throughout the 2020-21 school year had close to the lowest achievement levels as of the fall 2020 benchmarks, though these differences were not statistically significant. However, while schools that were in-person all year maintained math growth equivalent to pre-pandemic national norms between fall 2020 and spring 2021, those with remote schooling for any part of the year saw large declines in achievement relative to these norms, regardless of the number of months in which a remote modality was offered.
After the 2020-21 school year, as schools returned to mostly in-person learning, math achievement growth equalized across modalities. Thus, throughout the first two pandemic years, schools that remained entirely in person saw smaller overall interruptions to math learning relative to the 2019-20 M-STEP scores, consistent with the pre- vs. post-pandemic M-STEP comparisons in Figure 13. For ELA, districts that were entirely in-person initially performed better than those in other modalities between fall 2020 and spring 2021, but districts with remote instruction then caught up, such that there was little difference in ELA by modality by spring 2022, again consistent with the results in Figure 13.

6. Discussion

Our M-STEP results suggest that, while middle-school ELA achievement fell only slightly, math achievement growth dropped considerably during the pandemic relative to pre-pandemic cohorts. These decreases in achievement growth were larger for Latino and Black students than for White students, but there was no significant difference by race or ethnicity in ELA achievement growth over the same period. Similarly, economically disadvantaged students experienced larger reductions in achievement growth than their wealthier peers. In addition, students in districts that offered in-person instruction for all of the 2020-21 school year experienced significantly higher achievement growth than those in districts that did not offer in-person instruction for part or all of the year. Our benchmark results provide greater detail on student achievement trajectories during the two school years directly impacted by the pandemic. We find that, early in the pandemic, Michigan student achievement on benchmark assessments was already below national norms.
In the first full pandemic-impacted school year (2020-21), achievement trends for Michigan middle school students fell further behind national norms before partially rebounding during the 2021-22 school year, especially in math. However, although math achievement growth began to mirror pre-pandemic trends in the 2021-22 school year, this is insufficient to enable students to "catch up" to where they would have been absent the pandemic. Students would need to experience accelerated achievement growth, at rates greater than pre-pandemic expectations, to overcome the interrupted learning from the spring of 2020 and the 2020-21 school year. Whether or not we see this accelerated growth will become apparent as 2022-23 academic year data become available. The overall patterns we see are consistent across all subgroups of students (by race/ethnicity and socioeconomic status). However, disparities in math achievement between White and Black or Latino students, as well as between economically disadvantaged students and their wealthier peers, grew between fall 2020 and spring 2022. Finally, we find some evidence that students who had access to in-person instruction for the entirety of the 2020-21 school year performed better in both reading and math during that same school year, but these effects persisted only for math achievement in the 2021-22 school year. There was no discernible difference in reading achievement by spring of 2022. Together, these summative and formative assessment results paint a nuanced picture of student achievement trends and outcomes during the full school years most impacted by the COVID-19 pandemic. By the spring of 2022, we see persistent and large negative effects on math. Effects on ELA were generally small and statistically insignificant.
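The cohort comparisons summarized here are estimated from regressions with district fixed effects and standard errors clustered at the district level (see the appendix tables). A minimal numpy sketch of that estimation strategy on synthetic data, not the study's full specification (base-year achievement, student controls, and COVID-19 death rates are omitted, and the -0.20 "true" cohort gap is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: students nested in districts, pre- vs. post-pandemic cohorts.
n_districts, n_per = 50, 40
district = np.repeat(np.arange(n_districts), n_per)
cohort = rng.integers(0, 2, size=district.size)           # 1 = post-pandemic cohort
district_effect = rng.normal(0, 0.3, n_districts)[district]
y = -0.20 * cohort + district_effect + rng.normal(0, 1, district.size)

# Absorb district fixed effects by within-district demeaning.
def demean(v, g):
    means = np.bincount(g, weights=v) / np.bincount(g)
    return v - means[g]

X = demean(cohort.astype(float), district)[:, None]
Y = demean(y, district)

beta = np.linalg.lstsq(X, Y, rcond=None)[0]
resid = Y - X @ beta

# Cluster-robust (district-level) standard error: sandwich estimator with
# one "meat" block per district.
XtX_inv = np.linalg.inv(X.T @ X)
meat = np.zeros((1, 1))
for g in range(n_districts):
    idx = district == g
    Xg, ug = X[idx], resid[idx]
    meat += Xg.T @ np.outer(ug, ug) @ Xg
se = np.sqrt(np.diag(XtX_inv @ meat @ XtX_inv))
print(f"cohort gap = {beta[0]:.3f} (se {se[0]:.3f})")
```

The estimated cohort coefficient recovers the simulated -0.20 gap up to sampling noise; clustering matters because students within a district share unobserved shocks (here, the simulated district effect).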
Benchmark results make clear that during the initial phase of the pandemic, when school buildings first shuttered for in-person learning across the state, Michigan students were scoring much further below national norms in math than in reading. While both math and reading achievement were negatively impacted in the 2020-21 school year, students in our data improved at a rate higher than national norms would have predicted over the summer of 2021. During the 2021-22 school year, math achievement improved relative to national norms and students were able to recoup some of the achievement losses from 2020-21. However, given how far below national norms Michigan students' math scores had fallen by fall 2020, even these relative improvements in the second full year of the pandemic were insufficient to allow students to rebound completely. We make several recommendations for policymakers and educators based on these findings. First, results from the 2021-22 school year make clear that the road to academic recovery in Michigan will not be quick, and a return to "business as normal" will be insufficient to return student achievement to pre-pandemic levels. For example, based on benchmark outcomes from spring 2022 and typical growth measures defined by each assessment provider, 5th- through 7th-grade students in Michigan will need to achieve roughly 140 to 180% of typical fall-to-spring growth in math, and between 120 and 140% of typical growth in reading, during the 2022-23 school year to reach the 50th percentile of pre-pandemic achievement by spring 2023. Clearly, then, the tremendous effect that the COVID-19 pandemic has had, and continues to have, on student learning will not be addressed quickly or without a substantial and sustained influx of resources to support education in Michigan. These patterns in achievement and achievement growth mirror recent findings from across the U.S. (e.g., Curriculum Associates, 2022; Goldhaber et al., 2022; Kuhfeld & Lewis, 2022).
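Growth targets like those above follow from simple arithmetic: the gap between a student's current score and the pre-pandemic 50th-percentile target, divided by the provider's typical fall-to-spring growth. A small sketch with hypothetical RIT values (the actual targets and typical-growth figures come from each provider's norm tables):

```python
def required_growth_share(current, target, typical_growth):
    """Fraction of typical fall-to-spring growth needed to reach `target`."""
    return (target - current) / typical_growth

# Hypothetical example: a student starts fall 2022 at 212 RIT, the
# pre-pandemic 50th percentile for spring of that grade is 225 RIT, and
# typical fall-to-spring growth is 8 RIT points.
share = required_growth_share(current=212, target=225, typical_growth=8)
print(f"{share:.0%} of typical growth")  # → 162% of typical growth
```

A value above 100% means the student must grow faster than a typical pre-pandemic student to catch up within the year, which is the sense in which a return to "business as normal" growth is insufficient.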
It will be critical for local, state, and federal governments to prioritize both short- and longer-term investments in public education as educators and students work to recover from the trauma of the COVID-19 pandemic. Moreover, our and others' results show particularly troublesome disruptions to math achievement. However, there has been relatively little discussion of ways to improve math achievement (Kuhfeld, Soland, & Lewis, 2022; Kuhfeld, Soland, Lewis, & Morton, 2022). While it is critical to continue providing supports for literacy instruction, the pandemic has taken an even greater toll on math achievement. Policymakers and educators will need to provide increased supports for math learning and instruction in the years to come. Thus, we are not "out of the woods" yet. Educators and policymakers must continue to monitor learning outcomes for all students, and especially for groups that were disproportionately affected by the COVID-19 pandemic. The mandated use and reporting of benchmark assessments in Michigan makes it possible for state and local policymakers to understand where progress is (and is not) being made toward academic recovery. It will be critical to continue collecting data that allow policymakers, educators, and stakeholders to assess progress in the coming years. In particular, research exploring trends in academic achievement over the past two years makes clear that the COVID-19 pandemic has had a greater and more negative effect on economically disadvantaged, Black, and Latino students. While we do find that outcomes for these students increased at a faster rate compared to their respective peers between 2020-21 and 2021-22, disparities between each group and its peers persist. Any decisions to reduce monitoring of student learning progress may exacerbate longstanding achievement gaps.
In sum, our results bolster other data from around the country that make clear the road to recovery will be long, particularly for students who have been traditionally disadvantaged in K-12 public schooling. Educators and students will need continued and extensive supports in order to recover from the trauma of the COVID-19 pandemic, and governments at all levels must continue to prioritize both short- and longer-term investments in public education, in Michigan and elsewhere.

REFERENCES

Amplify Education. (2021). COVID-19 means more students not learning to read. Amplify Education Research Brief. https://readytogether.sde.ok.gov/sites/default/files/2022-04/AmplifymCLASS_MOY-COVID-Learning-Loss-Research-Brief_022421.pdf

Appleton, A. (2022, July 13). Modest gains for this year's Indiana test scores after pandemic decline. Chalkbeat Indiana. https://in.chalkbeat.org/2022/7/13/23205866/ilearn-indiana-state-testingscores-2022-pandemic-recovery

Barnum, M. (2021, February 24). This year's state test results will be tough to make sense of, experts warn. Chalkbeat. https://www.chalkbeat.org/2021/2/24/22299804/schools-testing-covidresults-accuracy

Belsha, K. (2021, August 7). Lack of in-person instruction pushed public school enrollment down, new research finds. Chalkbeat. https://www.chalkbeat.org/2021/8/7/22613546/research-remote-instructionschool-enrollment-declines

Cavitt, M. (2021, July 12). Oakland County sees historic drop in public school enrollment during pandemic. The Oakland Press. https://www.theoaklandpress.com/2021/07/12/top-l-enrollment-0711

Center for Educational Performance and Information. (2023). Student Enrollment Counts Report, Statewide Trend, All Grades K-12. https://www.mischooldata.org/student-enrollment-counts-report/

Chen, L. K., Dorn, E., Sarakatsannis, J., & Wiesinger, A. (2021). Teacher survey: Learning loss is global—and significant. McKinsey & Co. https://www.mckinsey.com/industries/public-and-social-sector/our-insights/teacher-survey-learning-loss-is-global-and-significant

Cohodes, S., Goldhaber, D., Hill, P., Ho, A., Kogan, V., Polikoff, M., ... & West, M. (2022). Student achievement gaps and the pandemic: A new review of evidence from 2021-2022. Center on Reinventing Public Education.

Cummings, A., Kilbride, T., Turner, M., Zhu, Q., & Strunk, K. (2020). How did Michigan educators respond to the suspension of face-to-face instruction due to COVID-19? Education Policy Innovation Collaborative.

Curriculum Associates. (2020, May). Scores on i-Ready Diagnostic that are equivalent to performance levels on the Michigan Student Test of Educational Progress (M-STEP). Curriculum Associates Research Report No. RR 2020-29.

Curriculum Associates. (2022, September). The state of student learning in 2022. Curriculum Associates Annual Report. https://www.curriculumassociates.com/-/media/mainsite/files/corporate/state-of-student-learning-2022.pdf

Darling-Aduana, J., Woodyard, H. T., Sass, T. R., & Barry, S. S. (2022). Learning-mode choice, student engagement, and achievement growth during the COVID-19 pandemic. AERA Open, 8, 23328584221128035.

Dorn, E., Hancock, B., Sarakatsannis, J., & Viruleg, E. (2021, July 27). COVID-19 and education: The lingering effects of unfinished learning. McKinsey & Company. https://www.mckinsey.com/industries/education/our-insights/covid-19-and-education-the-lingering-effects-of-unfinished-learning

Fensterwald, J. (2020, November 30). Early data on learning loss show big drop in math, but not reading skills. EdSource. https://edsource.org/2020/early-dataon-learning-loss-show-big-drop-in-math-but-not-reading-skills/644416

Ferren, M. (2021, July 6). Remote learning and school reopenings: What worked and what didn't. Center for American Progress. https://www.americanprogress.org/issues/education-k12/reports/2021/07/06/501221/remote-learning-school-reopenings-worked-didnt/

Francom, G. M., Lee, S. J., & Pinkney, H. (2021, June 26). Technologies, challenges and needs of K-12 teachers in the transition to distance learning during the COVID-19 pandemic. TechTrends, 65(4), 589–601. https://doi.org/10.1007/s11528-021-00625-5

Goldhaber, D., Kane, T. J., McEachin, A., Morton, E., Patterson, T., & Staiger, D. O. (2022). The consequences of remote and hybrid instruction during the pandemic (No. w30010). National Bureau of Economic Research.

Greater Fort Lauderdale Alliance. (2022, June 29). State releases 2022 Florida standards assessment and end of course exam results. The Alliance. https://www.gflalliance.org/news/2022/06/29/education-news/state-releases2022-florida-standards-assessment-and-end-of-course-examresults/

Halloran, C., Hug, C. E., Jack, R., & Oster, E. (2023). Post COVID-19 test score recovery: Initial evidence from state testing data (No. w31113). National Bureau of Economic Research.

Hamilton, L. S., Kaufman, J. H., & Diliberti, M. K. (2020). Teaching and leading through a pandemic: Key findings from the American Educator Panels spring 2020 COVID-19 surveys. RAND Corporation. https://www.rand.org/pubs/research_reports/RRA168-2.html

Hopkins, B., Turner, M., Lovitz, M., Kilbride, T., & Strunk, K. (2021). A look inside Michigan classrooms: Educators' perceptions of COVID-19 and K-12 school in the fall of 2020. Education Policy Innovation Collaborative. https://epicedpolicy.org/fall-2020-covid-19-survey_policy_brief/

Idaho State Department of Education. (2022, July 6). Early reading test shows statewide gains from previous spring and fall [Press release]. https://www.sde.idaho.gov/communications/files/news-releases/07-06-2022-Idaho-early-reading-test-shows-strong-statewide-gains-from-previous-springand-fall.pdf

Jack, R., Halloran, C., Okun, J., & Oster, E. (2022). Pandemic schooling mode and student test scores: Evidence from US school districts. American Economic Review: Insights.

Kilbride, T., Hopkins, B., Strunk, K. O., & Yu, D. (2022). Michigan's 2020-21 and 2021-22 benchmark assessments. Education Policy Innovation Collaborative. https://epicedpolicy.org/wp-content/uploads/2022/10/COVID_Benchmark_Assessments_Report_Oct2022.pdf

Kogan, V. (2022, June). Academic achievement and pandemic recovery in Ohio: An update from fall third grade English language arts assessments. https://glenn.osu.edu/academic-achievement-pandemic-recovery

Kogan, V., & Lavertu, S. (2021, January 27). The COVID-19 pandemic and student achievement on Ohio's third-grade English language arts assessment. John Glenn College of Public Affairs, Ohio State University. http://glenn.osu.edu/educational-governance/reports/reportsattributes/ODE_ThirdGradeELA_KL_1-27-2021.pdf

Kuhfeld, M., & Lewis, K. (2022). Student achievement in 2021-2022: Cause for hope and continued urgency. Collaborative for Student Growth Brief. NWEA.

Kuhfeld, M., Soland, J., & Lewis, K. (2022). Test score patterns across three COVID-19-impacted school years. Educational Researcher, 51(7), 500-506.

Kuhfeld, M., Soland, J., Lewis, K., & Morton, E. (2022, March 3). The pandemic has had devastating impacts on learning. What will it take to help students catch up? Brookings. https://www.brookings.edu/blog/brown-center-chalkboard/2022/03/03/the-pandemic-has-had-devastating-impacts-on-learning-what-will-it-take-to-help-students-catch-up/

Levin, K. (2021, March 18). Michigan lost 62,000 students this fall. Black enrollment fell 5%. Detroit Free Press. https://www.freep.com/story/news/education/2021/03/17/michigan-publicschools-enrollment-decline/4730513001/

Mahnken, K. (2021, June 28). New federal data confirms pandemic's blow to K-12 enrollment, with drop of 1.5 million students; Pre-K experiences 22 percent decline. The 74. https://www.the74million.org/article/public-school-enrollment-down3-percent-worst-century/

Michigan Department of Education. (2019). Technical report, Spring 2019 Michigan Student Test of Educational Progress (M-STEP). https://www.michigan.gov/mde/-/media/Project/Websites/mde/Year/2020/05/14/Spring_2019_M-STEP_Technical_Report_Main_report.pdf?rev=0c3bbf9629ea411395157233c92439e0&hash=7889BF8A50495DC27B1FD1AD62820C04

Michigan Public Act 147 of 2020, Mich., MCL § 388.1621f (2020). http://www.legislature.mi.gov/(S(rhqyboiv2ivksh4xj0wbiqpl))/mileg.aspx?page=GetObject&objectname=mcl-388-1621f

Michigan Public Act 148 of 2020, Mich., MCL § 388.1701 (2020). http://www.legislature.mi.gov/(S(hotihihl5cde5dkkjzjvx4g3))/mileg.aspx?page=getObject&objectName=mcl-388-1701

Michigan Public Act 149 of 2020, Mich., MCL § 388.1606 (2020). http://legislature.mi.gov/doc.aspx?mcl-388-1606

National Center for Education Statistics. (2022). NAEP long-term trend assessment results: Reading and mathematics. The Nation's Report Card. https://www.nationsreportcard.gov/highlights/ltt/2022/

NWEA. (2020, December). Linking study report: Predicting performance on the Michigan state assessment system in grades 3-8 ELA and mathematics based on NWEA MAP Growth scores. NWEA Psychometric Solutions. https://www.nwea.org/content/uploads/2016/12/MI-MAP-Growth-LinkingStudy-Report_NWEA_2020-12-22.pdf

Pendharkar, E. (2021, July 17). More than 1 million students didn't enroll during the pandemic. Will they come back? Education Week. https://www.edweek.org/leadership/more-than-1-million-students-didntenroll-during-the-pandemic-will-they-come-back/2021/06

Pier, L., Hough, H. J., Christian, M., Bookman, N., Wilkenfeld, B., & Miller, R. (2021, January 25). COVID-19 and the educational equity crisis: Evidence on learning loss from the CORE Data Collaborative. PACE. https://edpolicyinca.org/newsroom/covid-19-and-educational-equity-crisis

Pitluck, C., & Jacques, C. (2021, July). Persistent challenges and promising practices: District leader reflections on schooling during COVID-19. Research Brief. AIR, National Survey of Public Education's Response to COVID-19. https://www.air.org/sites/default/files/2021-08/research-brief-covid-surveypersistent-challenges-july-2021rev.pdf

Sass, T., & Ali, S. M. (2022). Student achievement growth during the COVID-19 pandemic: Spring 2022 update.

Sawchuk, S. (2021, July 14). "Extreme" chronic absenteeism? Pandemic school attendance data is bleak, but incomplete. Education Week. https://www.edweek.org/technology/extreme-chronic-absenteeismpandemic-school-attendance-data-is-bleak-but-incomplete/2021/07

Smarter Balanced Assessment Consortium. (2020). 2018-19 summative technical report. https://technicalreports.smarterbalanced.org/2018-19_summative-report/_book/

Tennessee Department of Education. (2022, June 14). Tennessee releases 2021-22 TCAP state-level results highlighting significant learning acceleration. https://www.tn.gov/education/news/2022/6/14/tennessee-releases-2021-22-tcap-state-level-results-highlighting-significant-learning-acceleration-.html

Texas Education Agency. (2022, July 1). TEA releases 2022 grades 3-8 STAAR results. https://tea.texas.gov/about-tea/news-andmultimedia/news-releases/news-2022/tea-releases-2022-grades-3-8-staarresults

West, M. R., & Lake, R. (2021, July). How much have students missed academically because of the pandemic? A review of evidence to date. Center on Reinventing Public Education. https://www.crpe.org/publications/how-much-havestudents-missed-academically-because-pandemic-review-evidence-date

Figure A.5.
Trends in Average Scale Scores, NWEA MAP Growth, Grades 5-7, Fall 2020 to Spring 2022

APPENDIX

Notes: These averages include only students with benchmark assessment scores for every possible testing period. The comparison points in the figure represent the 50th percentile of NWEA's conditional growth distribution. RIT stands for Rasch unit scale.

Figure A.6. Trends in Average Scale Scores, Curriculum Associates' i-Ready, Grades 5-7, Fall 2020 to Spring 2022

Notes: These averages include only students with benchmark assessment scores for every possible testing period. The comparison points in the figure represent median scores for Michigan students in 2018-19.

Figure A.7. Regression Adjusted Scale Score Trends by 2020-21 Instructional Modality, NWEA MAP Growth and Curriculum Associates' i-Ready, Grades 5-7

Notes: These regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics. Test scores have been standardized relative to NWEA's and Curriculum Associates' pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms.

Table A.7.
Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

                                       Mathematics                          ELA
                              (1)        (2)        (3)        (4)        (5)        (6)
Cohort                     -0.212***  -0.167***  -0.201***  -0.030***   0.030+    -0.025
                           (0.009)    (0.017)    (0.017)    (0.008)    (0.016)    (0.017)
Black                                 -0.188***  -0.150***             -0.071***  -0.101***
                                      (0.010)    (0.007)               (0.010)    (0.007)
Latino                                -0.069***  -0.037***             -0.034**   -0.019*
                                      (0.010)    (0.007)               (0.011)    (0.008)
Economically Disadvantaged            -0.199***  -0.147***             -0.176***  -0.137***
                                      (0.006)    (0.004)               (0.007)    (0.004)
Base-Year Achievement      -0.158***  -0.241***  -0.247***  -0.221***  -0.280***  -0.285***
                           (0.006)    (0.006)    (0.006)    (0.004)    (0.003)    (0.003)
Grade Controls                 Y          Y          Y          Y          Y          Y
Student Controls               N          Y          Y          N          Y          Y
COVID-19 Death Rates           N          Y          Y          N          Y          Y
District Fixed Effects         N          N          Y          N          N          Y
R2                           0.190      0.177      0.105      0.140      0.145      0.091

Notes: Each model controls for student demographics and includes grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students. Robust standard errors clustered at the district level in parentheses. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001

Table A.8.
Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts by Student Demographics or 2020-21 Instructional Modality, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

                                  Mathematics                          ELA
                         (1)        (2)        (3)        (4)        (5)        (6)
Cohort                -0.182***  -0.186***  -0.158***  -0.029     -0.020     -0.014
                      (0.015)    (0.016)    (0.016)    (0.018)    (0.019)    (0.020)
Black*Cohort          -0.070***                         0.019
                      (0.017)                          (0.012)
Latino*Cohort         -0.041*                          -0.010
                      (0.019)                          (0.013)
Black                 -0.116***                        -0.110***
                      (0.012)                          (0.009)
Latino                -0.018+                          -0.014+
                      (0.009)                          (0.008)
ED*Cohort                        -0.035***                        -0.013
                                 (0.010)                          (0.008)
ED                               -0.130***                        -0.131***
                                 (0.005)                          (0.005)
IP 5-8 Months*Cohort                        -0.064**                         -0.016
                                            (0.020)                          (0.017)
IP 1-4 Months*Cohort                        -0.068**                         -0.018
                                            (0.022)                          (0.020)
IP 0 Months*Cohort                          -0.055***                        -0.013
                                            (0.017)                          (0.024)
Base-Year Achievement -0.247***  -0.247***  -0.246***  -0.285***  -0.285***  -0.285***
                      (0.006)    (0.006)    (0.006)    (0.003)    (0.003)    (0.003)
Student Controls         Y          Y          Y          Y          Y          Y
Grade Controls           Y          Y          Y          Y          Y          Y
COVID-19 Death Rates     Y          Y          Y          Y          Y          Y
District Fixed Effects   Y          Y          Y          Y          Y          Y
R2                     0.177      0.190      0.190      0.177      0.189      0.177

Notes: Each model controls for student demographics and includes grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students. Robust standard errors clustered at the district level in parentheses. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001

Table A.9.
Differences in Learning Trajectories between Pre-Pandemic and Pandemic M-STEP Cohorts by 2020-21 Instructional Modality and Student Demographics, 2016-2019 and 2019-2022 M-STEP Mathematics and ELA Assessments

Math
                         White      Black      Latino     Non-ED     ED
Cohort                -0.147***  -0.204***  -0.183***  -0.149***  -0.171***
                      (0.017)    (0.034)    (0.033)    (0.022)    (0.017)
IP 5-8 Months*Cohort  -0.047**   -0.082*    -0.084+    -0.039*    -0.086***
                      (0.016)    (0.036)    (0.045)    (0.019)    (0.024)
IP 1-4 Months*Cohort  -0.046+    -0.084*    -0.072+    -0.038     -0.096***
                      (0.026)    (0.036)    (0.044)    (0.027)    (0.023)
IP 0 Months*Cohort    -0.049*    -0.027     -0.064*    -0.040     -0.063***
                      (0.022)    (0.032)    (0.032)    (0.026)    (0.017)
Base-Year Achievement -0.233***  -0.311***  -0.244***  -0.223***  -0.268***
                      (0.003)    (0.017)    (0.006)    (0.003)    (0.008)
R2                     0.190      0.170      0.196      0.190      0.181

ELA
                         White      Black      Latino     Non-ED     ED
Cohort                -0.018      0.044     -0.047     -0.019     -0.010
                      (0.021)    (0.044)    (0.038)    (0.026)    (0.020)
IP 5-8 Months*Cohort  -0.024     -0.014     -0.005     -0.010     -0.023
                      (0.020)    (0.033)    (0.033)    (0.022)    (0.018)
IP 1-4 Months*Cohort  -0.016     -0.058      0.019      0.007     -0.043*
                      (0.023)    (0.035)    (0.035)    (0.024)    (0.021)
IP 0 Months*Cohort    -0.025     -0.008     -0.014     -0.009     -0.017
                      (0.032)    (0.035)    (0.037)    (0.035)    (0.024)
Base-Year Achievement -0.273***  -0.336***  -0.289***  -0.260***  -0.308***
                      (0.003)    (0.005)    (0.005)    (0.003)    (0.003)
R2                     0.167      0.189      0.237      0.209      0.189

All models include student controls, grade controls, COVID-19 death rates, and district fixed effects.

Notes: Each model controls for student demographics and includes grade-level indicators for each sub-cohort to control for differences in learning trajectories between younger and older students. Robust standard errors clustered at the district level in parentheses. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001

Fall 2021 Spring 2021 Spring 2022 Spring 2019 (M-STEP) (1) 0.165*** (0.010) -0.082*** (0.009) -0.045*** (0.009) -0.038*** (0.011) 0.029** (0.010) -0.205 (0.214) Table A.10.
Regression Adjusted Scale Score Trends, NWEA MAP Growth and Curriculum Associates’ i-Ready, Grades 5-7 Mathematics (2) 0.179*** (0.012) -0.082*** (0.013) 0.048*** (0.013) 0.055*** (0.014) 0.122*** (0.014) -0.032 (0.024) -0.439*** (0.041) 0.119 (0.127) -1.065*** (0.053) -0.395*** (0.018) Reading (5) -0.033** (0.011) -0.151*** (0.011) -0.026* (0.010) -0.073*** (0.014) -0.058*** (0.012) -0.046+ (0.025) -0.291*** (0.044) 0.220* (0.112) -0.976*** (0.048) -0.176*** (0.016) (3) 0.171*** (0.010) -0.087*** (0.012) -0.019 (0.013) -0.011 (0.012) 0.056*** (0.012) 0.073 (0.133) -0.506** (0.173) 0.016 (0.171) -0.145* (0.072) -0.233*** (0.028) (4) -0.046*** (0.009) -0.154*** (0.006) -0.106*** (0.005) -0.152*** (0.011) -0.137*** (0.008) -0.198 (0.194) Spring 2022 (M-STEP) Latino, District Percent Black, District Percent ED, District Percent 0.0676* (0.0290) -0.116** (0.035) Constant i-Ready Grade Controls District-Level Student Controls COVID-19 Death Rates District Fixed Effects Y N N N Y Y Y N Y Y Y Y Y N N N Y Y Y N (6) -0.038*** (0.010) -0.158*** (0.008) -0.079*** (0.008) -0.125*** (0.010) -0.110*** (0.011) -0.005 (0.080) -0.247+ (0.150) 0.186 (0.144) -0.082 (0.067) -0.021 (0.017) Y Y Y Y R2 Notes: Regression estimates include only students with benchmark assessment scores for every possible testing period. Each model controls for student demographics. Test scores have been standardized relative to NWEA’s and Curriculum Associates’ pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001 0.759 0.054 0.888 0.790 0.051 0.906 103 Table A.11. 
Regression Adjusted Scale Score Trends by Race/Ethnicity or Economically Disadvantaged Status, NWEA MAP Growth and Curriculum Associates’ i-Ready, Grades 5-7

                                  Math                                    Reading
                                  (1)                 (2)                 (3)                 (4)
Spring 2019 (M-STEP)              0.196*** (0.010)    0.156*** (0.014)    -0.016+ (0.009)     -0.033** (0.011)
Spring 2021                       -0.053*** (0.012)   -0.053*** (0.016)   -0.146*** (0.008)   -0.139*** (0.009)
Fall 2021                         0.015 (0.011)       0.008 (0.013)       -0.084*** (0.008)   -0.079*** (0.008)
Spring 2022                       0.012 (0.013)       0.004 (0.016)       -0.133*** (0.010)   -0.135*** (0.009)
Spring 2022 (M-STEP)              0.083*** (0.013)    0.064*** (0.015)    -0.100*** (0.010)   -0.088*** (0.011)
Black*Spring 2019 (M-STEP)        -0.568*** (0.033)                       -0.530*** (0.034)
Black*Fall 2020                   -0.492*** (0.033)                       -0.428*** (0.034)
Black*Spring 2021                 -0.646*** (0.033)                       -0.509*** (0.039)
Black*Fall 2021                   -0.629*** (0.033)                       -0.410*** (0.035)
Black*Spring 2022                 -0.585*** (0.038)                       -0.396*** (0.045)
Black*Spring 2022 (M-STEP)        -0.605*** (0.033)                       -0.509*** (0.037)
Latino*Spring 2019 (M-STEP)       -0.257*** (0.023)                       -0.261*** (0.022)
Latino*Fall 2020                  -0.240*** (0.021)                       -0.246*** (0.023)
Latino*Spring 2021                -0.303*** (0.023)                       -0.248*** (0.023)
Latino*Fall 2021                  -0.298*** (0.022)                       -0.215*** (0.022)
Latino*Spring 2022                -0.270*** (0.029)                       -0.206*** (0.029)
Latino*Spring 2022 (M-STEP)       -0.273*** (0.021)                       -0.243*** (0.023)
ED*Spring 2019 (M-STEP)                               -0.433*** (0.018)                       -0.445*** (0.016)
ED*Fall 2020                                          -0.463*** (0.022)                       -0.432*** (0.017)
ED*Spring 2021                                        -0.536*** (0.017)                       -0.472*** (0.017)
ED*Fall 2021                                          -0.526*** (0.017)                       -0.440*** (0.017)
ED*Spring 2022                                        -0.502*** (0.020)                       -0.419*** (0.019)
ED*Spring 2022 (M-STEP)                               -0.492*** (0.018)                       -0.488*** (0.019)
i-Ready                           -0.0285 (0.0541)    0.068 (0.127)       -0.048 (0.122)      -0.006 (0.077)
Constant                          -0.1232*** (0.0173) 0.029 (0.029)       0.070** (0.024)     0.221*** (0.019)
District-Level Student Controls   Y                   Y                   Y                   Y
Grade Controls                    Y                   Y                   Y                   Y
COVID-19 Death Rates              Y                   Y                   Y                   Y
District Fixed Effects            Y                   Y                   Y                   Y
R2                                0.885               0.864               0.823               0.861

Notes: Test scores have been standardized relative to NWEA’s and Curriculum Associates’ pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001

Table A.12.
Regression Adjusted Scale Score Trends by 2020-21 Instructional Modality, NWEA MAP Growth and Curriculum Associates’ i-Ready, Grades 5-7

                                     Math               Reading
Spring 2019 (M-STEP)                 0.202*** (0.022)   -0.011 (0.018)
Spring 2021                          0.007 (0.024)      -0.109*** (0.014)
Fall 2021                            0.116*** (0.014)   -0.032** (0.012)
Spring 2022                          0.114*** (0.022)   -0.070*** (0.014)
Spring 2022 (M-STEP)                 0.191*** (0.018)   -0.029+ (0.016)
IP 5-8 Months*Spring 2019 (M-STEP)   -0.035 (0.026)     -0.047+ (0.028)
IP 5-8 Months*Fall 2020              -0.007 (0.023)     -0.016 (0.021)
IP 5-8 Months*Spring 2021            -0.094*** (0.027)  -0.047+ (0.025)
IP 5-8 Months*Fall 2021              -0.074* (0.029)    -0.001 (0.024)
IP 5-8 Months*Spring 2022            -0.068* (0.031)    -0.000 (0.030)
IP 5-8 Months*Spring 2022 (M-STEP)   -0.083** (0.027)   -0.048+ (0.026)
IP 1-4 Months*Spring 2019 (M-STEP)   0.017 (0.031)      0.005 (0.028)
IP 1-4 Months*Fall 2020              0.059* (0.025)     0.033 (0.024)
IP 1-4 Months*Spring 2021            -0.104*** (0.028)  -0.061* (0.025)
IP 1-4 Months*Fall 2021              -0.059* (0.025)    0.050* (0.024)
IP 1-4 Months*Spring 2022            -0.050 (0.032)     0.013 (0.025)
IP 1-4 Months*Spring 2022 (M-STEP)   -0.068* (0.034)    -0.026 (0.033)
IP 0 Months*Spring 2019 (M-STEP)     -0.007 (0.043)     -0.029 (0.037)
IP 0 Months*Fall 2020                0.040 (0.051)      0.006 (0.039)
IP 0 Months*Spring 2021              -0.123** (0.046)   -0.088* (0.042)
IP 0 Months*Fall 2021                -0.105* (0.042)    0.003 (0.045)
IP 0 Months*Spring 2022              -0.072+ (0.041)    -0.010 (0.042)
IP 0 Months*Spring 2022 (M-STEP)     -0.084* (0.039)    -0.034 (0.047)
i-Ready                              -0.022 (0.025)     -0.038 (0.025)
Constant                             -0.407*** (0.022)  -0.180*** (0.021)
District-Level Student Controls      Y                  Y
Grade Controls                       Y                  Y
COVID-19 Death Rates                 Y                  Y
District Fixed Effects               Y                  Y
R2                                   0.799              0.766

Notes: Test scores have been standardized relative to NWEA’s and Curriculum Associates’ pre-pandemic national norms. Spring 2019 and 2022 M-STEP estimates have been standardized relative to national norms. + p < 0.10, * p < 0.05, ** p < 0.01, *** p < 0.001
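The table notes above report robust standard errors clustered at the district level. As an illustrative sketch of what that estimator computes (a minimal CR0 "sandwich" implementation in NumPy; the function name and interface are mine, not the code used to produce these tables, and it omits the finite-sample degrees-of-freedom corrections that statistical packages apply):

```python
import numpy as np

def ols_cluster_se(X, y, clusters):
    """OLS point estimates with CR0 cluster-robust standard errors.

    X: (n, k) design matrix including a constant column.
    y: (n,) outcome (e.g. standardized test scores).
    clusters: (n,) cluster labels (e.g. district IDs).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    clusters = np.asarray(clusters)
    bread = np.linalg.inv(X.T @ X)          # (X'X)^{-1}
    beta = bread @ X.T @ y                  # OLS coefficients
    resid = y - X @ beta
    # "Meat": sum of per-cluster score outer products, allowing arbitrary
    # error correlation within a cluster but independence across clusters.
    k = X.shape[1]
    meat = np.zeros((k, k))
    for g in np.unique(clusters):
        mask = clusters == g
        score = X[mask].T @ resid[mask]
        meat += np.outer(score, score)
    cov = bread @ meat @ bread              # sandwich covariance estimate
    return beta, np.sqrt(np.diag(cov))
```

Summing the scores within each district before taking outer products is what distinguishes this from heteroskedasticity-robust standard errors: students in the same district may share correlated shocks (e.g. a common instructional modality), and clustering keeps the standard errors honest about that.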
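The notes also state that benchmark scores are standardized relative to each vendor's pre-pandemic national norms rather than the analysis sample. A minimal sketch of that transformation (the function name is mine, and the norm mean and SD below are illustrative placeholders, not actual NWEA or Curriculum Associates norm values):

```python
def standardize_to_norms(scores, norm_mean, norm_sd):
    """Z-score raw scale scores against an external (pre-pandemic
    national) norm, so 0 means "at the pre-pandemic national average"."""
    if norm_sd <= 0:
        raise ValueError("norm_sd must be positive")
    return [(s - norm_mean) / norm_sd for s in scores]

# Hypothetical grade-6 math norm: mean 220, SD 15 (illustrative only).
z_scores = standardize_to_norms([205, 220, 235], norm_mean=220, norm_sd=15)
# z_scores == [-1.0, 0.0, 1.0]
```

Anchoring to a fixed pre-pandemic norm matters for the interpretation of the trends: if scores were instead standardized within each post-pandemic testing wave, pandemic-wide achievement drops would be normalized away and invisible in the coefficients.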