ESSAYS ON HUMAN CAPITAL AND LABOR MARKETS

By

Raghav Rakesh

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Economics - Doctor of Philosophy

2024

ABSTRACT

Chapter 1: International Peers in Higher Education and Domestic Students’ Outcomes

Recent decades witnessed a rapid increase in foreign post-secondary student enrollment in the US, substantially altering the college landscape. While evidence suggests that foreign students contribute significantly to university revenues and the host economy, there remains much debate around their impact on domestic students’ outcomes. Using rich administrative and survey data from a large US public university, this paper explores the effects of exposure to foreign peers in college courses on domestic students’ academic outcomes. I focus on first-term introductory math courses and leverage plausibly exogenous variation in the share of foreign peers across terms but within a course-instructor pair. I find that exposure to foreign peers in lower-level (non-calculus) courses has a sizable negative effect on the graduation rate of domestic students; students in higher-level (calculus-based) courses are unaffected by their foreign peers. The decline in graduation comes through a drop in students graduating with non-STEM degrees, with no effect on the number of STEM graduates. Further, the negative effects are incurred by domestic students of all races except Asians; domestic Asian students incur positive effects. Exploring potential mechanisms, I find suggestive evidence of limited interaction, lack of shared interests or culture, and language barriers between domestic and foreign students. Additionally, evidence points to the potential role of domestic students’ lower academic rank in their peer group.
At the same time, I do not find evidence of negative social preferences associated with races or immigrants among domestic students, nor do I find evidence linking the effect to differences in abilities between domestic and foreign students.

Chapter 2: The Local Economic Impacts of Foreign Students

Do foreign students affect the economic outcomes of the natives in places with post-secondary institutions? I address this question by examining the impacts of demand shocks induced by expansions in foreign post-secondary student enrollment in the US between 2004 and 2016. Using an instrumental variables strategy that exploits spatial variation in foreign student enrollment expansion over this period, I estimate the causal effects on a vector of local economic outcomes. On average, the demand shocks substantially increased local employment and wages while having no significant effect on housing rent. At the same time, I find no evidence of adverse spillover effects on neighboring areas without post-secondary institutions. Further, the effect on employment increases with population density. However, the effect on housing rent also increases, likely due to limited supply in densely populated areas. The results suggest welfare gains for natives, especially in less densely populated areas that depend heavily on the education sector. While the effect of changes in foreign student enrollment on the local economy is sizable, the effect of changes in domestic student enrollment is small during the same period.

Chapter 3: Science Education and Labor Market Outcomes in a Developing Economy (with Abhiroop Mukhopadhyay, Nishith Prakash, and Tarun Jain)

We examine the association between studying science in higher secondary school and labor market earnings in India. Studying science in high school is associated with 22% greater earnings than studying business or humanities. Earnings for science students are further enhanced with some fluency in English.
Science education is also associated with more years of education, completing a professional degree, returns to entrepreneurship, and working in public sector positions. A primary survey of high school students shows no discernible differences in behavioral characteristics of science students compared to others.

Copyright by
RAGHAV RAKESH
2024

ACKNOWLEDGEMENTS

Completing this dissertation has been a long and memorable journey, and it would not have been possible without the support of many people. I would like to acknowledge and express heartfelt gratitude to all of them for playing essential roles in this journey.

First, I would like to express my deepest and sincere gratitude to my supervisor, Dr. Todd Elder, for his invaluable guidance, insightful feedback, and unwavering support throughout graduate school. He always pushed me to think harder about the econometric analysis and the big-picture story, which constantly improved my research. His constant motivation made me persevere even during the most challenging times in this journey.

I am also hugely indebted to Dr. Ben Zou, whose thoughtful feedback and constructive criticism on even the most minor components of my research have been instrumental in refining my ideas and shaping my research. He was always extremely generous with his time when brainstorming ideas, reviewing my work, giving feedback, and discussing my numerous queries.

I would also like to thank my dissertation committee members for their time and support. Dr. Stacy Dickert-Conlin’s detailed comments and feedback on my drafts helped me significantly improve their quality. Dr. Kristen Renn’s insights on higher education students’ success and her detailed background knowledge of the University’s administration and academic programs helped me refine my research design substantially and procure the novel granular dataset.

Further, I would like to thank other professors in the Economics department at Michigan State, particularly Dr. Soren Anderson, Dr.
Scott Imberman, and Dr. Justin Kirkpatrick, who provided helpful comments at different stages of my research. In addition, I would like to thank all the past and present economics department admin staff, particularly Lori Jean Nichols and Jay Feight, who have taken care of all the administrative work during my time in graduate school.

I will always be indebted to Dr. Nishith Prakash and Dr. Abhiroop Mukhopadhyay, with whom I started my research career when I worked with them before starting my PhD. Since then, they have mentored me at every stage of my professional life. They have always supported me unconditionally and believed in me even when I doubted myself, for which I am forever grateful.

Further, I am incredibly grateful to have the support of my wonderful friends throughout this journey. Yogeshwar has always been present to support me since my first day in graduate school, and my dissertation could not have been completed without our daily chai-time random discussions about research and life. Also, it would be difficult to imagine life in East Lansing during these six years without the company of Priyankar Datta, Sambojyoti Biswas, Siddhartha Shukla, Chia-Hung Kuo, Graham Gardner, Andrew Earle, David Hong, Steven Wu-Chaves, and Giacomo Romanini. Many thanks to my childhood friends, Ankit Vatsa, Ravi Prakash, and Vivek Sharma, for always supporting me and keeping me sane. Thanks to my friends, Akshay Sahu, Animesh Panwar, Anjali Verma, Anurag Kumar, Anurag Pratik, Anushka Mitra, Ashutosh Baghel, Ayush Gupta, Ishan, Kiran Pasupuleti, Nilay Kant, Orville Mondal, Rajarshi Bhowal, Ravi Shankar, Saket Kumar, Salil Sharma, Sanchit Agrawal, and Shashank Joshi, who were always a call away. Anjali and Anushka helped me a lot at some of the most difficult and crucial times of the journey, for which I am grateful. Special thanks to Roshni Gandhi, who has been my constant pillar of support all these years.

Finally, I want to thank my parents, Dr. Usha Kumari and Dr.
Sushil Rakesh, my brother, Rahul Rakesh, my sister-in-law, Samiksha Kumar, and my niece, Maira Rakesh, for their unconditional love.

TABLE OF CONTENTS

CHAPTER 1 INTERNATIONAL PEERS IN HIGHER EDUCATION AND DOMESTIC STUDENTS’ OUTCOMES . . . 1
    1.1 Introduction . . . 1
    1.2 Setting . . . 7
    1.3 Data . . . 9
    1.4 Econometric Framework . . . 11
    1.5 Empirical Results . . . 17
    1.6 Heterogeneity . . . 23
    1.7 Potential Mechanisms . . . 24
    1.8 Conclusion . . . 31
    BIBLIOGRAPHY . . . 34
    APPENDIX 1A MAIN TABLES AND FIGURES . . . 38
    APPENDIX 1B ADDITIONAL TABLES AND FIGURES . . . 56
    APPENDIX 1C ADDITIONAL ANALYSIS . . . 64

CHAPTER 2 THE LOCAL ECONOMIC IMPACTS OF FOREIGN STUDENTS . . . 70
    2.1 Introduction . . . 70
    2.2 Foreign Students in US Post-Secondary Institutions . . . 76
    2.3 Econometric Framework . . . 79
    2.4 Data . . . 83
    2.5 Empirical Results . . . 84
    2.6 Robustness . . . 92
    2.7 Conclusion . . . 94
    BIBLIOGRAPHY . . . 96
    APPENDIX 2A MAIN TABLES AND FIGURES . . . 99
    APPENDIX 2B ADDITIONAL TABLES AND FIGURES . . . 109
    APPENDIX 2C DESCRIPTION OF VARIABLES . . . 116

CHAPTER 3 SCIENCE EDUCATION AND LABOR MARKET OUTCOMES IN A DEVELOPING ECONOMY . . . 118
    3.1 Introduction . . . 118
    3.2 Context . . . 122
    3.3 Data . . . 123
    3.4 Empirical Analysis . . . 126
    3.5 Role of Behavioral Characteristics . . . 137
    3.6 Conclusion . . . 139
    BIBLIOGRAPHY . . . 141
    APPENDIX 3A MAIN TABLES AND FIGURES . . . 145
    APPENDIX 3B ADDITIONAL TABLES AND FIGURES . . . 160
    APPENDIX 3C ADDITIONAL ANALYSIS AND DETAILS . . . 168

CHAPTER 1

INTERNATIONAL PEERS IN HIGHER EDUCATION AND DOMESTIC STUDENTS’ OUTCOMES

1.1 Introduction

Recent decades saw a rapid expansion of students from foreign countries in post-secondary education (henceforth, foreign students). In the US, foreign student enrollment increased from 0.5 million in 2004 to close to 1 million in 2016, accounting for roughly 5% of total post-secondary enrollment (Institute of International Education, 2022). Existing research suggests that this influx significantly increased revenue for host universities amid declining state funding and also benefited the local economy (Bound et al., 2020; Rakesh, 2023).
According to NAFSA (2020), foreign students contributed $41 billion to the US economy in the Academic Year 2018-19.¹

¹ To put that in perspective, the financial incentives provided by all tiers of the US government under place-based job policies were around $60 billion in 2015 (Bartik, 2020).

Notwithstanding the sizable economic contributions, the rapid growth in foreign student presence has generated considerable debate around their impact on domestic students’ outcomes. Universities and proponents argue that foreign students provide an international and cross-cultural perspective that benefits all students.² In contrast, opponents claim that foreign students negatively affect domestic students’ academic outcomes. For instance, one argument posits that increased competition from foreign students affects domestic students’ enrollment.³ As a consequence, there have been calls for restrictions on foreign student enrollment.⁴ Despite the extensive literature exploring the influence of peers on a wide range of outcomes across different contexts (Sacerdote, 2011, 2014), there is not much evidence on the peer effects of foreign students.

² See Groot (2023). This argument relates to the psychology literature on intergroup contact theory following Allport (1954), which generally documents a negative correlation between intergroup contact and prejudice (Pettigrew and Tropp, 2006; Paluck et al., 2019).
³ See http://graphics.wsj.com/international-students/the-debate and Anderson (2016).
⁴ In July 2021, the California Legislature even proposed a bill to reduce the number of nonresident University of California students (Kovach, 2021).

In this paper, I study the effects of exposure to foreign peers in college courses on domestic students’ graduation and related outcomes. For the analysis, I use novel administrative student-level data from a large US public university that saw a drastic increase in foreign student enrollment in recent decades.
Another key feature of the university that makes it apt for this study is that it is not “highly selective.” This is important because, in a very highly selective university, all students are likely to be of very high quality: they are likely to enter college with high academic ability and good study habits, which may greatly reduce the potential influence of their peers (Stinebrickner and Stinebrickner, 2006).

I focus on the exposure to foreign students in first-term introductory mathematics courses, which students have to take to meet the University Mathematics Requirement to graduate, irrespective of their major. Further, a range of introductory math courses allows me to measure peer effects in lower-level (non-calculus) and higher-level (calculus-based) courses separately. I look at the effects in the two groups separately as the level of skills that affect individual performance and factors affecting collaboration or classroom dynamics could vary across the two groups, leading to distinct peer effects.

The administrative data have detailed student-level information on all students who took one of the introductory math courses between the Fall 2005 and Spring 2015 semesters. The data include demographic information, background information, and academic records of the students until the end of their time at the university. My main sample includes all freshman domestic students who were admitted in one of the fall semesters between 2005 and 2014 (10 years) and enrolled in an introductory math course in their first term. I proxy for the exposure to foreign peers by the share of students who are international non-residents in the students’ first-term introductory math course-instructor pair.

However, a major challenge in estimating the causal impact is that the share of foreign peers one is exposed to might not be random.
There could be sorting of students into or out of courses, instructors, or terms due to reasons that are correlated with the foreign peer exposure and the outcome of interest. For instance, students may have a preference for particular instructors within a course, which may lead to the potential selection of students across instructors within a course.

My empirical strategy leverages the idiosyncratic variation in the share of foreign peers at the course-instructor-term (peer group) level after controlling for course-instructor and course-term fixed effects. Employing fixed effects, I leverage variation in foreign peer exposure across terms but within the same course-instructor pair.⁵ This allows me to effectively compare students who enrolled in the same introductory math course with the same instructor in their first term in college, and to identify the effect using over-time variation in foreign peer exposure. This approach absorbs potentially confounding time-invariant course-instructor factors and time-varying course-level factors.

The causal interpretation of peer effects in my setting relies on a conditional independence assumption: after controlling for the fixed effects, the residual variation in the foreign peer exposure across students is as good as random. I assess the validity of this assumption by conducting balance tests between students’ exposure to foreign peers and their predetermined characteristics and ability measures after controlling for the fixed effects. I also show that the residual variation is uncorrelated with the peer group characteristics, ability, and peer group size. In addition to the apt setting and the fixed effects, I include the student’s predetermined characteristics and ability measures, peer group size, and the peer group characteristics and ability measures to alleviate further endogeneity concerns.

I find a sizable negative effect of exposure to foreign peers in lower-level courses on domestic students’ six-year graduation rate.
On average, a 10 percentage point increase in the share of foreign peers causes the graduation rate of domestic students to decrease by 6.1 percentage points, a drop equal to 7.8% of the mean. In contrast, the graduation rate of domestic students in higher-level courses is unaffected by their foreign peers. This is consistent with the hypothesis that the potential influence of peers on academic outcomes weakens with the increasing quality of students.

Further, I find that the entire effect on the graduation rate comes from a negative effect on the likelihood of students graduating with non-STEM majors, thereby reducing the number of non-STEM graduates. At the same time, there is no effect on the number of domestic STEM graduates. In fact, further analysis shows that, conditional on graduating, the likelihood of domestic students with a STEM major preference graduating with a STEM major increases with increased exposure to foreign peers in higher-level courses. These results also address the concern related to foreign students potentially displacing domestic students out of STEM majors.

⁵ This approach is similar to other studies in the peer effects literature using cohort-to-cohort variation in peer composition within a school-grade pair (Hoxby, 2000; Bifulco et al., 2011; Gould et al., 2009; Anelli and Peri, 2019; Carrell and Hoekstra, 2010).

In addition to the balance tests, I conduct several tests to confirm the robustness of the findings. The results are consistent across a series of specification tests, including the addition of more controls and fixed effects, the use of an alternate sample, and the use of alternate variation in foreign peer exposure.

I explore several mechanisms through which foreign peers in lower-level courses negatively affect domestic students’ graduation. First, I consider whether exposure to more foreign peers negatively affects short-term outcomes, which may lead to lower graduation rates.
Looking at first-year retention, I find that over 70% of students adversely affected by exposure to foreign peers drop out in their first year. I also find negative effects on math course GPA and first-term GPA. Further, as students might perceive short-run achievement as a measure of relative performance within their peer group, I explore if the effect on graduation through this channel is linked to relative performance. I include short-run achievement variables in the main equation and re-estimate the effect on graduation, further restricting the comparison to students having the same relative performance within their peer group. The point estimate drops by one-third compared to the baseline, suggesting one-third of the total effect is due to an effect on relative performance.

Second, the peer effects may be due to differences in the ability of domestic and foreign students in the lower-level courses: the presence of more higher-ability foreign students may negatively affect the graduation of relatively lower-ability domestic students, for instance, through increased competition. If ability differences are a potential mechanism, one would expect to see a stronger effect on lower-ability students in the peer group. However, I find no heterogeneous effect by the domestic student’s ability, as measured by ACT Math score. Further, looking at heterogeneity across within-peer-group ability bins, I find that students in the lowest quintile within a peer group are not differentially affected compared to students in the middle or highest quintiles. These results suggest that peer effects are likely not operating through ability-based mechanisms.

Third, I explore several non-ability factors as potential mechanisms. These factors may alter collaboration or classroom dynamics, among other things, potentially affecting graduation.
In addition to administrative data, I use unique panel data from a survey of domestic students at the university to shed light on non-ability factors. The survey includes questions about domestic students’ experiences with foreign students in their first year at the university.

Examining the heterogeneous effect by the race of domestic students, I find a negative effect of exposure on domestic students of all races except Asians; domestic Asian students, in contrast, incur a positive peer effect. Given that over 90% of foreign students are Asians, this result suggests that race-related factors could be operating here. Thus, I further explore two non-ability factors that could be driving both the heterogeneous effects by race and the main effect: domestic students’ social preferences over race and immigrants, and a lack of common interests or culture between domestic and foreign students.

Exploring the survey data, I do not find evidence of social preference against interacting with foreign students: most domestic students are “looking forward” or “excited” to interact with foreign students at the beginning of their college life. However, in the 1-year follow-up survey, I find that the actual interaction between domestic and foreign students is very limited, and one of the major reasons domestic students give for this limited interaction is that they have different interests than foreign students.

The primary obstacle to interaction that domestic students mention is difficulty communicating due to the language barrier. Many foreign students may have limited English proficiency as most of them are non-native speakers of English. To explore this further, I look at the effect of exposure to foreign peers with high and low English proficiency on domestic students’ graduation using the main data and find that the effect is driven mainly by exposure to low English proficiency foreign students.
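As a back-of-the-envelope reading of the headline estimate reported earlier in this introduction, the two stated magnitudes (a 6.1 percentage point decline per 10 percentage point increase in foreign peer share, equal to 7.8% of the mean) jointly imply an approximate mean graduation rate for the lower-level sample. The symbols below are notation introduced here for illustration only and are not the dissertation's own, the per-point figure assumes the effect is linear in the share, and both inputs are rounded, so the results are approximate.

```latex
% \bar{G}: implied mean six-year graduation rate (percentage points)
% \hat{\beta}: implied change in graduation rate (pp) per 1 pp increase in foreign peer share
\[
\bar{G} \approx \frac{6.1}{0.078} \approx 78.2,
\qquad
\hat{\beta} \approx \frac{-6.1}{10} = -0.61 .
\]
```

An implied mean of roughly 78% is in line with the approximately 80% six-year graduation rate that the Setting section reports for undergraduates overall.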
This paper contributes to three bodies of literature in economics. First, it contributes to the literature on the effect of immigrants, particularly foreign students, on the educational outcomes of natives. Much of the past literature looks at primary and secondary education levels (Ballatore et al., 2018; Hunt, 2017; Figlio and Özek, 2019; Diette and Oyelere, 2014; Ohinata and Van Ours, 2013; Gould et al., 2009; Betts and Fairlie, 2003). Studies that focus on higher education have mostly exploited university-level variation (Shih, 2017; Borjas, 2004; Hoxby, 1998). In contrast, I exploit finer, peer group-level variation to provide evidence on the effects of foreign peers. Looking at the effects on the number of graduates by major, I further contribute to the literature exploring the effect on intensive margin outcomes (Anelli et al., 2023; Orrenius and Zavodny, 2015).

In a related paper, Anelli et al. (2023), using a similar methodology and focusing only on calculus courses in a “highly selective” university, find a negative impact of exposure to foreign peers on the STEM graduation of domestic students. The negatively affected students move to high-earning non-STEM majors, leading to no effect on the overall graduation rate or future expected earnings of domestic students. In contrast, my paper focuses on all students enrolled in a university that is not “highly selective” and finds a negative effect on domestic students’ graduation rate in non-calculus courses only; there is no effect on the overall supply of STEM graduates.

Second, this paper contributes to an extensive literature on peer effects in education. Many papers look at the peer effects of exposure to particular dimensions of diversity, for instance, race, gender, and economic status (Rao, 2019; Anelli and Peri, 2019; Hoxby, 2000).
However, foreign students are different: they usually embody multiple dimensions of diversity, such as race, language, and culture, making it difficult to extrapolate findings from other contexts. Therefore, the peer effects of foreign students need to be studied separately, and my paper fills this gap in the literature. I also contribute to the literature on peer effects in higher education (Sacerdote, 2001; Zimmerman, 2003; Foster, 2006; Stinebrickner and Stinebrickner, 2006; Martins and Walker, 2006; Parker et al., 2010). While many papers in this literature find no or modest effects on academic outcomes, my paper finds a sizable negative peer effect on graduation. Further, my paper also demonstrates how the potential influence of peers may decrease as the ability of the peer group increases.

Finally, this paper contributes to the literature on diversity and desegregation. Studies find that contact substantially reduces inter-group prejudice (Boisjoly et al., 2006; Carrell et al., 2019; Finseraas et al., 2019). Although my paper does not directly look at the impact on the social outcomes of domestic students, I find evidence of a lack of interaction between students of the two groups. I also find that the interaction may be happening within races only. These findings suggest that contact within peer groups may not necessarily lead to interaction. Thus, limited interaction may not reduce inter-group prejudice, while at the same time, it may have a negative impact on academic outcomes, as shown in the paper.

1.2 Setting

The study is based at a large US public university that hosts domestic students from all 50 US states and foreign students from over 135 countries.
The institution is not “highly selective,” with an average acceptance rate of around 70% in the last 10 years, which makes it an apt choice for this study.⁶ Students in “highly selective” universities are likely to be of very high quality, making them far less susceptible to peer effects. Those students are likely to enter college with high academic ability, good study habits, and a firm belief in the importance of college, which may greatly reduce the potential influence of their peers. This could be one of the potential reasons why many previous studies that focused on more selective universities found little evidence of peer effects on academic outcomes (Stinebrickner and Stinebrickner, 2006).⁷

⁶ In recent years, the acceptance rate has been around 80%. The 25th to 75th percentile ACT and SAT ranges of admitted applicants in 2022 were 23-29 and 1100-1320, respectively.
⁷ See Sacerdote (2001); Zimmerman (2003); Foster (2006); Martins and Walker (2006); Parker et al. (2010).

Figure 1A.1 plots the foreign and domestic student enrollment trend at the university. The total undergraduate enrollment increased steadily from 35,000 to 38,000 between 2005 and 2014. A steady decrease in domestic undergraduate enrollment and a rapid increase in foreign undergraduate enrollment led to a rapid increase in the share of foreign undergraduates from 3.5% to 14% during the period. In Fall 2014, the domestic undergraduate student body was around 78% White, 8% Black, 5% Asian, and 4% Hispanic. In the same term, over 90% of the foreign students were from Asian countries, primarily from China (75%), Korea (6%), Taiwan (2%), India (2%), and Saudi Arabia (2.5%).

An undergraduate student may choose from more than 200 majors that the university has to offer. Freshman and sophomore students can opt for an Exploratory Preference major, which allows them to explore options and choose an appropriate major that fits their abilities and interests. However, all students have to formally declare a major once they reach Junior standing (56 credits).⁸ About 80% of undergraduate students graduate within 6 years of starting at the university.

⁸ A student needs to complete 120 credits to get a degree in the majority of majors.

To identify the peer effects of foreign students, I focus on introductory math courses. This is because all students must meet the University Mathematics Requirement to graduate, irrespective of their major, by earning credits in one or more of these courses. Once admitted, students work with their university-assigned academic advisor and are placed into one of the introductory math courses based on the ACT/SAT Math score, Math Placement Service (MPS) Assessment score,⁹ and some additional factors.¹⁰ The idea is to enroll students in an introductory math course that is well suited to their math ability and preparedness.¹¹ This rough mapping of students to their first introductory math course based on their ability at the time of admission reduces the potential endogeneity issue that might arise due to choosing a particular course.

⁹ The Math Placement Service Assessment is conducted by the university for entering freshmen (and some transfer students) who have been accepted to the university. The students have to take it prior to the orientation day, and the results of this assessment do not affect the status of a student’s admission to the university.
¹⁰ Some additional factors that are sometimes considered to determine math course placement are Advanced Placement (AP) math credits, International Baccalaureate (IB) math credits, or college-level math credits earned before admission.
¹¹ It is especially intended to warn students away from a course that is well beyond their present capabilities.

Most students enroll in the introductory math courses they are placed into in their first term, as the general recommendation from university advisors is to finish “general requirement” courses before enrolling in major-specific courses in later years. Also, introductory math courses are usually prerequisites for many other courses the students might want to take in subsequent semesters and years in the program. Because most students enroll in their first introductory math course in their first term, each course has entering students with very similar math abilities, allowing me to measure peer effects in a setting where students do not differ in ability. A course may, however, have older students with lower initial math ability, but all students would have roughly a similar level of math preparedness.¹²

¹² All the higher-level introductory math courses have certain lower-level introductory math courses as prerequisites for students who are not placed directly into them at admission. Thus, even the older students in these courses have successfully completed lower-level courses and acquired the math preparedness required to take a higher-level course, leading to all students within a particular course having roughly a similar level of math preparedness.

The set of introductory math courses a student can be placed into ranges, in terms of level of difficulty, from algebra to advanced calculus-based courses. I divide the set of courses into lower-level and higher-level introductory math courses, which include non-calculus and calculus-based courses, respectively.¹³ I look at peer effects in the two groups separately as the level of skills that affect individual performance and factors affecting collaboration or classroom dynamics could be very different across the two groups, leading to different mechanisms and, therefore, distinct peer effects.
13 Appendix Table 1B.1 lists all the introductory math courses.

1.3 Data

1.3.1 Student Administrative Data

This paper uses novel administrative student-level data from the above-mentioned university for the analysis.14 The dataset has information on all the introductory math courses offered between the Fall 2005 and Spring 2015 semesters and on the students who enrolled in those courses. For each introductory math course a student enrolls in, I observe the course name, enrollment term, and instructor, which I use to construct a roster of students for each time a course was offered by an instructor during the 10 years. The dataset includes student demographic information, background information, and detailed academic records. I follow students until the end of their time at the university. In particular, student academic records in the data include ACT scores, admit term, applicant type (first-time freshman, transfer, non-degree), GPA (term by term and overall), credits completed (term by term and overall), major preference at freshman standing, last term at the university, graduation term, and major at graduation. Student demographic and background information in the data includes sex, race/ethnicity, country of residence, US citizenship status, tuition residency, and first-generation status.

My main sample consists of first-time freshman (FTF) domestic students who enrolled in an introductory math course in their first term at the university. An FTF is a student at the university who graduated from high school but has not previously enrolled at a college, university, or any other school after high school.15 These restrictions ensure that domestic students have not had foreign peers at the post-secondary level that influence their instructor choices. Further, I restrict the main sample to fall semester admits because the fall semester is the first term for almost all the students at this university. This leaves me with a final sample of 32,115 FTF domestic students who were admitted in one of the fall semesters between 2005 and 2014 and took an introductory math course in their first term.16

14 Procuring the novel, detailed student-level administrative data from the university’s Office of the Registrar (RO) was a long process, and I would like to thank the RO staff for reviewing my proposal, providing feedback, sharing access to the granular data, sharing knowledge and resources to understand the data and the setting, and answering my numerous queries, which made possible the empirical analysis I conducted in this chapter.
15 An FTF student may have completed college credits while enrolled in high school.

I proxy the foreign peer exposure using the share of foreign students in the introductory math course a student is enrolled in. In particular, for student i, I measure foreign peer exposure as the share of all other students (excluding student i) in student i’s first-term math course-instructor pair who are international non-residents. A student is categorized as an international non-resident if they are not a US citizen or permanent resident and require a visa to study in the US. Figure 1A.2a shows the variation in foreign peer exposure for every student in the final sample. Figures 1A.2b and 1A.2c show the variation in foreign peer exposure for every student in lower-level and higher-level courses, respectively. The mean exposure to foreign peers for students in the main sample in lower and higher-level courses is 0.04 and 0.15, respectively. Table 1A.1, Panels A and B present the summary statistics for main sample students and their foreign peers, respectively.
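The leave-one-out exposure measure described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the roster layout and the key names (`course`, `instructor`, `term`, `is_foreign`) are hypothetical and are not taken from the administrative data.

```python
from collections import Counter


def leave_one_out_foreign_share(roster):
    """Return each student's share of foreign peers, excluding the student.

    `roster` is a list of dicts, one per enrollment, with hypothetical keys
    'course', 'instructor', 'term', and 'is_foreign' (bool).
    """
    group_size = Counter()
    group_foreign = Counter()
    for row in roster:
        key = (row["course"], row["instructor"], row["term"])
        group_size[key] += 1
        group_foreign[key] += int(row["is_foreign"])

    shares = []
    for row in roster:
        key = (row["course"], row["instructor"], row["term"])
        n = group_size[key]
        if n < 2:
            shares.append(None)  # no peers, so the share is undefined
            continue
        # Drop the student's own status from the numerator and denominator.
        foreign_peers = group_foreign[key] - int(row["is_foreign"])
        shares.append(foreign_peers / (n - 1))
    return shares
```

In a peer group of five students with two foreign students, a domestic student's exposure is 2/4 and a foreign student's exposure is 1/4, which is why the measure differs slightly across students within the same peer group.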
In Table 1A.1, column 1 shows the mean characteristics of students in lower-level courses, column 2 shows the mean characteristics of students in higher-level courses, and column 3 shows the mean characteristics of students in all the courses. Panel A shows that the main sample students are predominantly White in both types of courses, where their average share is 84%. Students in lower-level courses are more likely to be Black, Hispanic, female, and first-generation students and less likely to be Asian compared to students in higher-level courses. The mean math and English ability of students, proxied by their ACT Math and English scores, respectively, are lower in lower-level courses than in higher-level courses, which is expected.17

Panel B of Table 1A.1 shows that a majority of foreign peers are from Asian countries, with a majority of them coming from China. The mean share of foreign peers from China is 47% and 71.4% in lower and higher-level courses, respectively. Other major sending countries of foreign students are Korea, Taiwan, India, and Saudi Arabia. As with their domestic peers, the mean math and English ability of foreign students is lower in lower-level courses than in higher-level courses. Also, on average, the foreign students are marginally better in math than the main sample students.

16 All the main sample students during the 10 years are enrolled in only one introductory math course in their first term. So, there are no multiple observations for the same student.
17 There are a few cases with missing ACT scores in the data. In such cases, the score is imputed from the SAT score or MPS Assessment score, if available.

1.3.2 Survey Data

In addition to the administrative data, I use data from a unique panel survey of domestic students at the university about their first-year experiences related to interaction with foreign students.18 The sample for the survey was randomly chosen from all the incoming domestic freshman students in Fall 2018.
The baseline survey was conducted at the beginning of Fall 2018, and the follow-up survey was conducted in Fall 2019. The baseline survey included questions on students’ beliefs and expectations about interacting with foreign students at the university, among other things. The 1-year follow-up survey included questions about their experiences with foreign students in their first year. A total of 305 students responded to both rounds of the survey, of whom 74% are White, 9% are Black, and 8% are Asian. The percentage of foreign students among the total undergraduates at the university in Fall 2018 was 10% (3,862 out of 38,701 students), whereas the share of foreign students among the entering freshmen was 8.6% (737 out of 8,500 students). Although the survey period does not overlap with the period of administrative data used in this paper for the main analysis, the two samples are fairly comparable, and it is reasonable to use the survey data to shed light on the underlying mechanisms and complement the main analysis.

18 I would like to thank Prof. Dongbin Kim and Prof. Kristen Renn for sharing this unique dataset, the knowledge and resources to use it, and their insights into its analysis.

1.4 Econometric Framework

1.4.1 Empirical Strategy

To identify the peer effects, I leverage the idiosyncratic variation in the share of foreign peers at the course-instructor-term (peer group) level after controlling for course-instructor and course-term fixed effects. This means that I compare students who enrolled in the same introductory math course with the same instructor in their first term in college and identify the effect from over-time variation in foreign peer exposure. This approach absorbs potentially confounding time-invariant course-instructor factors and time-varying course-level factors.
Essentially, I try to replicate an ideal experiment where students are exposed to a random share of foreign peers in their peer group while everything else about their peer group is the same. Formally, I estimate the impact of exposure to foreign peers using the following empirical specification:

$$Y_{icjt} = \alpha + \beta \times \mathit{Foreign\ Share}_{icjt} + \theta_{cj} + \lambda_{ct} + \gamma X_i + \delta G_{cjt} + \epsilon_{icjt} \tag{1.1}$$

$Y_{icjt}$ denotes an outcome of student $i$, in introductory math course $c$, with instructor $j$, in term $t$. $\mathit{Foreign\ Share}_{icjt} = \frac{\sum_{k \neq i} FS_{kcjt}}{n_{cjt} - 1}$ is the proportion of students in student $i$'s peer group (except student $i$) who are foreign. $FS_{kcjt}$ is a dummy variable that takes a value of 1 if the student $k$ in student $i$'s peer group is foreign, and 0 otherwise. $n_{cjt}$ is the peer group size of the student $i$. I measure the foreign peer exposure at the course-instructor-term (peer group) level as opposed to the course-instructor-term-section (classroom) level due to the potential nonrandom selection of students into sections within a course and instructor.19 $\theta_{cj}$ is course-instructor fixed effects, and it controls for fixed differences across course-instructor combinations that may lead to endogenous sorting of students. $\lambda_{ct}$ is course-term fixed effects, and it accounts for time-varying course-level factors. $X_i$ is a set of student $i$'s pre-determined characteristics, including race, gender, first-generation indicator, and math and English ability. $G_{cjt}$ controls for peer group level characteristics, including average peer math ability, the share of females, and the share of first-generation students. $\epsilon_{icjt}$ is the error term. I cluster the standard errors at the instructor level.

1.4.2 Identification

There are three major identification concerns that are well documented in the peer effects literature: reflection, selection, and common shocks.
The reflection problem arises when the simultaneous determination of student and peer outcomes makes it difficult to disentangle the effect that the peers have on the student from the effect the student has on the peers (Manski, 1993). This is usually an issue when contemporaneous outcomes of peers are used as a main explanatory variable. However, I use the share of foreign peers in the peer group as the main explanatory variable, and being foreign is determined exogenously before the students enroll in college, so reflection is not an issue here. My approach is similar to studies where variation in predetermined measures of peers is used as a proxy for their contemporaneous outcomes to resolve the reflection problem (Hoxby and Weingarth, 2005; Hoxby, 2000; Lavy and Schlosser, 2011; Imberman et al., 2012; Figlio, 2007; Lavy et al., 2012).

19 Equation 1.1 is essentially a reduced-form instrumental variables equation in which exposure to foreign students at the peer group level instruments for exposure to foreign students at the classroom level. In a robustness test, I estimate the structural IV estimate and find similar results.

Selection can be an issue if students sort themselves into peer groups for reasons that may be correlated with the outcome of interest. Such endogenous sorting may bias the result and make it difficult to determine the causal effect of the peers. In this particular context, there may be a problem if students select into or out of courses, instructors, or terms for reasons that are correlated with foreign peer exposure and the outcome of interest. However, I take several measures that make it unlikely for the estimates to be biased due to selection issues in my setting.
First, I focus only on FTF students and their first-term course-taking, which ensures that the main sample students have not been exposed to foreign peers at the college level before and are unlikely to have much knowledge of how foreign peer exposure might affect them at the college level. Moreover, the students, with guidance from a university-assigned academic advisor, enroll in their first-term courses even before they are physically present on the college campus. This further suggests that the students are likely to have very limited knowledge about instructors, courses, and the student composition in each of the course-instructor combinations while enrolling in their first-term courses. Second, the students are placed into their first math course based on the ACT Math score, MPS Assessment score, and the recommendation from the academic advisor. This makes it difficult for the students to select into and out of the first math course. Third, there may be sorting of students across instructors within a course, which may lead to selection bias. For instance, some domestic students might prefer to enroll in classes with American instructors, which may be correlated with their foreign peer exposure and their subsequent outcomes. However, my identification strategy controls for course-instructor fixed effects and relies on the variation in the share of students who are foreign within the instructor and course, but across terms. This resolves the concerns related to selection into and out of instructors as well as courses. Finally, of critical importance to my identification strategy is that there is no endogenous sorting of students across terms within a course-instructor depending on the foreign peer exposure. Recall that the students, after getting admission to college, get placed into first math courses based on their pre-college math ability and recommendations from the academic advisor.
And most students enroll in the first math course in their first term only. Further, for there to be endogenous sorting of students over terms within a course-instructor in my setting, the students would have to change their year of enrollment, because I focus only on fall-enrolled students’ first-term course-taking, and changing the year of enrollment seems very difficult. Moreover, the schedule of courses offered and information on instructors are not available one year in advance, which makes it even more difficult for students to delay their enrollment if they want to base their decision on the possibility of the same course being offered by the same instructor in the following year.

Common shocks or correlated effects can be a problem for identification when students and their peers share common treatments; it is often difficult to disentangle the peer effects from other shared treatment effects. They are more likely to be a problem if one uses contemporaneous peer achievement, as both student and peer achievement may be affected by the common shocks (Lyle, 2007). So, common shocks are less likely to confound the estimates in this paper because I use a pre-determined measure of peers to identify the peer effects. Moreover, controlling for course-instructor and course-term fixed effects should absorb most of the common shocks. For common shocks to be a problem in my particular setting, they would have to vary within the course-instructor and be correlated with the share of foreign students, which seems unlikely. Nonetheless, I control for individual characteristics of students and peer group-level characteristics to alleviate further concerns.

The causal interpretation of peer effects in my setting relies on a conditional independence assumption: after controlling for the fixed effects, the residual variation in the foreign peer exposure across students is as good as random.
All the measures I take should alleviate the causal identification concerns; nevertheless, I conduct balance tests to examine the plausibility of this assumption. Before I look at the formal tests, I look at the raw correlation between student characteristics and foreign peer exposure without any fixed effects in Table 1A.2, Panel A. Each column shows a coefficient from a separate regression corresponding to a different student characteristic, including demographics and academic ability. In Table 1A.2, Column 1, I regress an indicator for whether the student is White on the share of foreign peers, and the estimated coefficient is -0.007, implying that a 10 percentage point increase in the share of foreign students in the peer group is associated with a 0.7 percentage point decrease in the probability of being White. All the coefficients in Panel A similarly suggest that, without accounting for the systematic differences across students through the inclusion of fixed effects, certain types of students are exposed to a higher share of foreign peers. For instance, the results show that domestic students who are exposed to a higher share of foreign peers are less likely to be White and more likely to be Asian. This could happen if students are more likely to enroll with instructors of their race, i.e., if White students are more likely to enroll with White instructors, Asian students are more likely to enroll with Asian instructors, and foreign students, a large percentage of whom are Asian, are more likely to enroll with Asian instructors. So, assuming that students who choose to enroll with an instructor of their race do better academically than when enrolled with an instructor of a different race, one may incorrectly attribute the lower academic achievement to foreign peer exposure without controlling for the fixed effects.
Next, I conduct a balance test for selection by examining whether predetermined characteristics of the main sample students are correlated to the share of foreign students after including course-instructor and course-term fixed effects. Specifically, I estimate the following equation:

$$X_i = \alpha + \beta \times \mathit{Foreign\ Share}_{icjt} + \theta_{cj} + \lambda_{ct} + \epsilon_{icjt}, \tag{1.2}$$

where $X_i$ is a pre-determined characteristic of student $i$, $\mathit{Foreign\ Share}_{icjt}$ is the foreign peer exposure of student $i$, and $\theta_{cj}$ and $\lambda_{ct}$ are the two fixed effects from the main specification. Table 1A.2, Panel B reports the estimates of $\beta$, where each column is for a separate regression corresponding to a different student characteristic mentioned in the column head. I find that once I account for systematic differences through the inclusion of fixed effects, foreign peer exposure is uncorrelated to the observed pre-determined characteristics of the students: the estimates of $\beta$ are very close to zero and are not statistically significant at any conventional level of significance. Hence, it is reasonable to assume that the residual variation in foreign peer exposure is also orthogonal to the unobserved student characteristics.

I conduct a similar balance test for common shocks as well by examining whether the average characteristics of students at the peer group level are correlated to the share of foreign students. Specifically, I collapse the data to the peer group level and estimate the following equation:

$$PG_{cjt} = \alpha + \beta \times \mathit{Foreign\ Share}_{cjt} + \theta_{cj} + \lambda_{ct} + \epsilon_{cjt}, \tag{1.3}$$

where $PG_{cjt}$ is the average characteristics of the peer group, including the peer group size. $\mathit{Foreign\ Share}_{cjt}$ is the share of students who are foreign in the peer group. I again include the two fixed effects from the main specification. Appendix Table 1B.2 reports the results of this test. The estimates of $\beta$ for each column, except the average math ability, are not statistically significant, and the magnitudes are small.
Although the estimate for average math ability is significant, the magnitude is too small to make any meaningful difference in the peer group.

Finally, I also conduct a balance test of selection for foreign peers in the peer group. The composition of foreign peers could be changing over time, in which case what I estimate is not just the effect of the increased exposure to foreign peers but also the effect of their changing composition. Thus, to ensure that I am not picking up the effect of the changing composition of foreign peers, it is important that the composition of foreign peers is not systematically correlated with the share of foreign peers after controlling for the fixed effects. I estimate equation 1.2 using the sample of foreign peers instead of the main sample students and examine whether the pre-determined characteristics of foreign peers in the peer group are correlated with the share of foreign students in the peer group. I divide the sample of foreign peers into two groups for this exercise. The first group is the ‘main sample equivalents,’ which includes foreign peers who are FTF and are in the peer group in their first term at the university. All foreign peers not in the first group belong to the ‘other foreign peers’ group.20 I divide the sample this way because students in each group are likely to have their network of peers within their respective groups, leading to potential selection by these groups. For the balance test, I primarily focus on the country of origin and the math ability of the foreign peers, in addition to other demographic characteristics. The results from this exercise are reported in Table 1A.3, and the estimates are balanced, suggesting that the estimated peer effects in this paper are not biased due to the potential compositional change of foreign students over time.
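The balance regressions with two sets of fixed effects can be illustrated with a small point-estimate sketch. It is not the estimation code used for the tables: it assumes the Frisch-Waugh-Lovell approach of removing both sets of group means by alternating demeaning and then regressing the residualized characteristic on the residualized foreign share. All function and argument names are hypothetical, and the sketch does not compute the instructor-clustered standard errors.

```python
def demean_two_way(values, groups_a, groups_b, tol=1e-10, max_iter=1000):
    """Remove two sets of group means by alternating demeaning."""
    resid = list(values)
    for _ in range(max_iter):
        max_shift = 0.0
        for groups in (groups_a, groups_b):
            sums, counts = {}, {}
            for v, g in zip(resid, groups):
                sums[g] = sums.get(g, 0.0) + v
                counts[g] = counts.get(g, 0) + 1
            means = {g: sums[g] / counts[g] for g in sums}
            resid = [v - means[g] for v, g in zip(resid, groups)]
            max_shift = max(max_shift, max(abs(m) for m in means.values()))
        if max_shift < tol:  # both sets of group means are numerically zero
            break
    return resid


def balance_coefficient(characteristic, foreign_share,
                        course_instructor, course_term):
    """Slope of a characteristic on foreign share, net of both fixed effects."""
    y = demean_two_way(characteristic, course_instructor, course_term)
    x = demean_two_way(foreign_share, course_instructor, course_term)
    sxx = sum(v * v for v in x)
    if sxx < 1e-12:
        raise ValueError("no residual variation in foreign share")
    return sum(a * b for a, b in zip(x, y)) / sxx
```

A coefficient near zero for each pre-determined characteristic is what the balance test looks for. In practice one would use an established fixed-effects routine, which also delivers clustered standard errors.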
The balance tests increase my confidence that the residual variation in foreign peer exposure is as good as random.21,22 However, despite everything I have done and the convincing balance test results, the unobservables may be correlated with the residual variation, leading to a bias. To further limit the scope of potential bias due to selection and common shocks, I include individual characteristics of students and peer group-level characteristics as controls in the preferred specification. Therefore, the estimates in this paper can be interpreted as causal peer effects.

20 ‘Other foreign peers’ includes, for example, an FTF foreign peer who is in the peer group in their second year at the university. Another example would be a foreign peer in the peer group who is a transfer student.
21 I also conduct a balance test on domestic students not in the main sample using equation 1.2 and report the results in Appendix Table 1B.3. Further, I conduct a balance test on main sample students by course type separately and report the results in Appendix Table 1B.4. All the estimates look balanced.
22 Appendix Figure 1B.1 shows the residualized variation in foreign peer exposure for main sample students.

1.5 Empirical Results

1.5.1 Impact on Graduation

I first estimate models relating domestic students’ graduation to exposure to foreign peers, using a variety of control sets and fixed effects. Table 1A.4 reports the results using various specifications of Equation 1.1, where the columns in Table 1A.4, Panel A do not include the fixed effects, whereas the columns in Table 1A.4, Panel B include the course-instructor and course-term fixed effects. In both panels, moving from column 1 to column 6, I sequentially add the following controls: individual characteristics, individual math ability, peer group size, average peer group characteristics, and average peer group math ability.
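The column structure described above can be written down explicitly. The sketch below is illustrative only: it assumes that column 1 contains no controls and that each later column adds one block in the order listed, and the helper names are hypothetical.

```python
# Control blocks in the order they are added across columns 2 to 6.
CONTROL_BLOCKS = [
    "individual characteristics",
    "individual math ability",
    "peer group size",
    "average peer group characteristics",
    "average peer group math ability",
]


def cumulative_specifications(blocks):
    """Column 1 has no controls; column k includes the first k-1 blocks."""
    specs = [[]]
    for block in blocks:
        specs.append(specs[-1] + [block])
    return specs


def max_coefficient_drift(estimates):
    """Largest absolute change of any column's estimate from column 1."""
    baseline = estimates[0]
    return max(abs(b - baseline) for b in estimates)
```

Applied to the estimated coefficients of one panel, a small drift is consistent with the added controls being unrelated to the residual variation in foreign peer exposure, while a large drift signals remaining endogeneity.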
I posit that if the residual variation in the share of foreign peers after including the fixed effects is exogenous to individual achievement, then the magnitude of estimated coefficients should remain relatively unchanged as I sequentially add more controls that are known to impact individual achievement. In contrast, if the magnitude of estimated coefficients changes with the addition of individual and peer group-level controls, then one might be concerned that the proposed identification strategy does not fully address the endogeneity issues.

In Column 1 of Table 1A.4, Panel A, I simply regress an indicator for whether the student graduated on the share of foreign peers, and the estimated coefficient is 0.13, suggesting that the likelihood of the student graduating increases with increased exposure to foreign peers. However, as I sequentially add individual and peer group-level controls, the estimated coefficient drops substantially and changes sign. The movement in the point estimates across specifications in Panel A demonstrates the extent of the endogeneity problems discussed earlier and how, without addressing them, the estimated coefficients can be biased and misleading.

Each column in Panel B, Table 1A.4 reruns the regression from the corresponding column of Panel A, controlling for course-instructor and course-term fixed effects. The estimated coefficients remain very stable at around -0.11 as I sequentially add individual and peer group-level controls between column 1 and column 6. This is strong evidence that the variation in exposure to foreign peers after controlling for fixed effects is exogenous to individual academic achievement, and the estimates can be interpreted as causal rather than being driven by selection or common shocks.23 The preferred specification is Column 6 in Table 1A.4, Panel B, which includes both the fixed effects and a long list of individual and peer group-level controls that are known to affect individual academic achievement.
The result from the preferred specification implies that a 10 percentage point increase in the share of foreign students in the peer group causes the graduation rate of domestic students to decrease by 1.1 percentage points, and the estimate is statistically significant at the 10% level.

23 The results for the same exercise using the sample by course type are reported in Table 1B.5.

Now, I look at the causal effects of exposure to foreign peers on the six-year graduation rate of domestic students enrolled in lower and higher-level introductory math courses in their first term at the university using the preferred specification, and the estimates are reported in Table 1A.5. Column 1 in Table 1A.5 replicates the estimate from column 6 of Panel B, Table 1A.4. In Column 2, I find a large negative effect on the graduation rate of domestic students enrolled in lower-level courses: a 10 percentage point increase in the share of foreign peers decreases the graduation rate by 6.1 percentage points, which is statistically significant at the 1% level. Given that the average six-year graduation rate is 78% in lower-level courses, this is a drop of 7.8% of the mean. In the higher-level courses, there is no effect of exposure to foreign peers on the graduation rate of domestic students; the point estimate is very close to zero and not statistically significant at any conventional level.

1.5.2 Robustness

Alternative Specifications. I conduct a battery of robustness tests to confirm the tenor of the results in the previous section.
The first potential concern is that classroom-level characteristics that are known to impact student academic achievement could be correlated with the main treatment variable, the share of foreign peers at the peer group level, and not controlling for them could bias the estimates.24 To test this, I re-estimate the peer effects, additionally controlling for classroom-level shares of female students and first-generation students, average math ability in class, and class size. Results are robust to this inclusion and are reported in Table 1A.6, Column 2. Second, unlike in schools where all students take the same set of courses, at the college level, students enrolled in the same math course could be enrolled in different first-term courses. If the share of students in a math course enrolled in a particular set of other first-term courses is changing over time, it could bias the estimates. To address this concern, I include Freshman Major fixed effects, which essentially restricts the comparison to students pursuing the same major and who are likely to be enrolled in the same set of other first-term courses. The results are consistent and are reported in Table 1A.6, Column 3. Third, I restrict the sample to the most basic math course (College Algebra), where a large majority of students are FTF who are taking the course in their first term at the college, so endogenous sorting of other students in the peer group who are not in their first year is less of a concern in this course.25 The results are consistent with earlier findings; in fact, the point estimates are marginally higher than the baseline results (Table 1A.6, Panel A, Column 4).

24 Recall that an instructor might be teaching multiple sections of the same course in a term, and I only control for course-instructor-term level characteristics in my preferred specification.
Fourth, another concern could be that the foreign students’ math ability could be changing over time, and just controlling for the average ability of the peer group might not fully capture it. Although I did not find a sizable correlation between the share of foreign students and the math ability of foreign students in the balance test earlier, it is still a potential concern: I might be picking up the effect of the changing ability of foreign students over time through the exposure to foreign peers, thereby overestimating the true impact of foreign peers on domestic students. To address this, I separately control for the average math ability of domestic students and foreign students within the peer group and re-estimate the peer effects. The results are very similar to my baseline specification (Table 1A.6, column 5), suggesting that I am not picking up the effects of potentially changing foreign students’ math abilities. Finally, there could be sorting of foreign and domestic students by class-session timings, which could bias the estimates. So, I additionally control for a class-session time dummy and re-estimate the peer effects. Specifically, I include a dummy variable that takes a value of 1 if the student was enrolled in an introductory math class with sessions starting before noon, and 0 otherwise. Results are robust to this inclusion and are reported in Table 1A.6, column 6.

Class-level Variation and Instrumental Variable Analysis. In the main specification, I use variation in exposure to foreign peers at the course-instructor-term (peer group) level instead of at the course-instructor-term-section (class) level, which one might argue is the actual peer group of a student. I did this primarily to avoid the endogeneity concerns due to selection into and out of a section within a course-instructor during a term. To address this concern, I re-estimate a specification where the treatment variable is the share of foreign peers at the class level.
However, this could be endogenous, so I instrument it with the share of foreign peers at the course-instructor-term level, the main treatment variable in my preferred specification, which is plausibly exogenous after controlling for the fixed effects. Results from the IV estimation are reported in Column 7 of Table 1A.6, and they are consistent with my previous findings, showing the robustness of the results.

25 This exercise is only relevant for the lower-level courses sample.

1.5.3 Impact on Major Choice and Major Switching

To further understand what is driving the decrease in graduation rates for domestic students exposed to foreign peers, I consider the likelihood of domestic students graduating with certain majors. Science, Technology, Engineering, and Math (STEM) education has been linked to key drivers of growth and innovation (Griliches, 1992; Peri et al., 2015); however, recent decades have seen a drop in the share of students graduating in STEM fields. Evidence suggests that students with a STEM major preference either drop out or end up switching to non-STEM majors (Chen, 2013). In this regard, I examine whether increased exposure to foreign peers negatively affects the likelihood that domestic students graduate with a STEM or non-STEM degree, thereby reducing the overall graduation rate. I look at two major choice outcomes: STEM graduation and Non-STEM graduation. The first outcome, STEM graduation, takes a value of 1 if the student graduates with a STEM degree within 6 years of being admitted to the university, and 0 otherwise. Similarly, the second outcome, Non-STEM graduation, takes a value of 1 if the student graduates with a Non-STEM degree within 6 years of being admitted to the university, and 0 otherwise. Using the main equation 1.1, I estimate the effect on both outcomes and report the results in Table 1A.7.
I find that there is no effect of increased exposure to foreign peers on the likelihood that domestic students graduate with STEM degrees in either lower or higher-level courses, so the supply of domestic STEM graduates is unaffected. This result addresses the concern that foreign students crowd out domestic students from STEM majors. However, increased exposure to foreign peers in lower-level courses leads to a significant decline in the likelihood that domestic students graduate with a Non-STEM degree. A 10 percentage point increase in the share of foreign peers leads to a 7.8 percentage point decline in the supply of domestic Non-STEM graduates, a decrease of 15% that is statistically significant at the 5% level. There is no effect on the supply of domestic non-STEM graduates in the higher-level courses.

So far, I do not find evidence of peer effects in higher-level courses. It could be that the students in those courses are incurring peer effects, but not on graduation as an outcome, because it is less likely that students placed in a higher-level introductory math course fail to graduate. Perhaps there is an effect on the major choice of these students: a student might switch from a major with a large share of foreign students to one with a lower share of foreign students, which might increase the likelihood of graduation. To explore this further, I consider whether domestic students switch their previously declared major preference and graduate with another major when exposed to foreign peers. Using the sample of students who graduated in 6 years, I look at the following outcomes: STEM to Non-STEM major switch, Non-STEM to STEM major switch, and Exploratory to STEM major switch. The first outcome, STEM to Non-STEM major switch, takes a value of 1 if the student had a STEM freshman major preference but graduated with a Non-STEM major, and 0 otherwise. Similarly, I construct the other two outcomes.
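The three switching indicators can be sketched as follows. The first definition follows the text directly; the other two are written by analogy, which is an assumption about how "similarly" should be read. The category labels and the function name are hypothetical.

```python
def major_switch_outcomes(freshman_major, graduation_major):
    """Switching indicators for a student who graduated within 6 years.

    `freshman_major` is one of 'STEM', 'Non-STEM', 'Exploratory'
    (hypothetical labels); `graduation_major` is 'STEM' or 'Non-STEM'.
    """
    return {
        # Defined in the text: STEM preference, Non-STEM degree.
        "stem_to_nonstem": int(
            freshman_major == "STEM" and graduation_major == "Non-STEM"
        ),
        # Assumed analogous definitions for the other two outcomes.
        "nonstem_to_stem": int(
            freshman_major == "Non-STEM" and graduation_major == "STEM"
        ),
        "exploratory_to_stem": int(
            freshman_major == "Exploratory" and graduation_major == "STEM"
        ),
    }
```

A student who keeps the same broad field, for example a STEM preference and a STEM degree, receives 0 on all three indicators.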
The results from this analysis are reported in Table 1A.8. I do not find a significant effect on major switching in the lower-level courses. However, there is substantial major-switching activity in the higher-level courses. I find that the likelihood of domestic students with a STEM freshman major preference graduating with a STEM major increases with increased exposure to foreign peers (Table 1A.8, Column 4). A 10 percentage point increase in the share of foreign peers reduces the switching away of domestic students from STEM majors by 1.4 percentage points, which is significant at the 5% level. This result further addresses the concern regarding foreign students displacing domestic students from STEM majors. At the same time, there is reduced switching of domestic students with a Non-STEM freshman major preference to STEM majors when exposed to an increased share of foreign students. The results suggest that reduced switching out of STEM majors, combined with reduced switching into STEM majors, may explain why exposure to foreign peers in higher-level courses has no effect on the supply of domestic STEM and Non-STEM graduates.

1.6 Heterogeneity

The findings so far show that exposure to foreign peers negatively affects domestic students in lower-level courses, on average. To learn whether these negative effects are incurred by certain sub-groups of domestic students more than others, I look at heterogeneity by some of the students’ predetermined characteristics. Doing so may provide insights into the potential mechanisms driving the results and into how to counter these negative peer effects.

Differential Effects by Math Ability.— Although the impact of exposure to foreign students in lower-level courses is substantial, it could be even more pronounced among students with relatively lower abilities within these lower-level courses. To explore this possibility, I look at the heterogeneity by the math ability of domestic students.
For this analysis, I standardize students’ math ability within the type of course and include the standardized math ability and its interaction term with the foreign share in the main specification. Table 1A.9, Panel A, Columns 1 and 2 present the results from this exercise for lower and higher-level courses, respectively. I find no differential effects by the math ability of domestic students within lower or higher-level courses.

Differential Effects by Race.— To examine the heterogeneity of peer effects by race, I include the interaction of race dummies with the foreign share. The base group is White, and the results are reported in Table 1A.9, Panel B. I find that there is no differential effect on domestic students of any race except Asians in lower-level courses. Interestingly, for Asian students, increased exposure to foreign peers leads to a positive effect on their graduation rate. Combining the coefficients on the main treatment variable and its interaction term with the Asian dummy implies that a 10 percentage point increase in the share of foreign peers causes the graduation rate of Asian domestic students to increase by 12 percentage points. Given that over 90% of foreign peers are Asians, the results suggest that race-related factors might be potential channels through which peer effects are operating: domestic students of a different race than their foreign peers are negatively affected, while domestic students of the same race as their foreign peers incur positive peer effects.

Differential Effects by Freshman Major Preference.— To consider whether students respond differentially to foreign students based on their major preference, I examine the heterogeneity of peer effects by the major preference declared by the student at the beginning of college.
Students with a STEM/Non-STEM preference might differ from those with an Exploratory preference in ways that could affect their response to increased exposure to foreign peers in introductory math courses. For instance, students with a major preference might have stronger beliefs about completing the degree they like compared to students with no major preference, leading to differential responses to exposure to foreign peers and differential effects on outcomes. For the analysis, I construct the following three variables: STEM preference, Non-STEM preference, and Exploratory preference. The first variable, STEM preference, takes a value of 1 if the student had a STEM freshman major preference, and 0 otherwise. Similarly, I construct the other two variables based on the student’s freshman major preference. Keeping the Non-STEM preference as the base group, I include the remaining two variables and their interaction terms with the foreign share in the main specification. Table 1A.9, Panel C reports the results of this analysis. I find that there is no differential effect across students who declared a major preference. At the same time, the negative effect is stronger for students who did not have a major preference and opted for the Exploratory option. The result implies that a 10 percentage point increase in the share of foreign peers causes the graduation rate of domestic students without any major preference to drop by 11.9 percentage points.

1.7 Potential Mechanisms

Why do foreign peers in lower-level courses negatively affect domestic students’ graduation? Figure 1A.3 presents a stylized diagram to describe various possible channels through which exposure to foreign peers could affect domestic students’ outcomes. Channels could arise from differences in ability between foreign and domestic students.
For instance, having more foreign peers of higher ability in a peer group may affect the relative performance of domestic students, which in turn may affect their graduation outcomes. Alternatively, instructors may teach at a higher level in the presence of higher-ability foreign students, affecting the domestic students’ engagement, learning, or grit, which in turn may affect their graduation. Various channels could stem from non-ability factors as well. For instance, factors like cultural distance between domestic and foreign students, social preference over race, and limited English communication skills of foreign students may hinder collaboration among students or affect the classroom environment, which in turn may affect the achievement and graduation of domestic students. Disentangling each mechanism is difficult, given the empirical setting and the available data. Thus, I try to provide suggestive evidence on the potential mechanisms through which the peer effects might operate.

1.7.1 Impact on Short-Run Outcomes: Achievement and Retention

The effect of foreign peers in first-term math courses on domestic students’ graduation rate, which is an eventual outcome, may be due to an effect on their short-run outcomes. For instance, lower grades in introductory math courses or in the first term may discourage students or put them on a path where they are less likely to graduate. Thus, I explore the effect on short-run outcomes that may affect the eventual outcomes. Specifically, I look at three outcomes: first-term math course GPA, first-term GPA, and retention rate. Both GPA variables range from 0 to 4. Retention rate is a binary variable that takes a value of 1 if the student returned in the Fall semester following the first year and 0 otherwise.
It is a standard measure of student success that university administration uses, and it helps identify whether the large negative effect on the graduation rate is driven by students dropping out soon after the exposure or later. Using the main equation 1.1, I estimate the effect on the short-run outcomes and report the results in Table 1A.10. Increased exposure to foreign students in lower-level courses negatively affects short-run grades. A 10 percentage point increase in the share of foreign students in the introductory math course peer group decreases domestic students’ GPA in that course by 0.16, a 6.1% drop, which is significant at the 10% level (Column 1). Also, the first-semester GPA drops by 2.8%, which is significant at the 5% level (Column 2). There is a strong negative effect on retention as well in lower-level courses. A 10 percentage point increase in the share of foreign students in introductory math courses leads to a 4.4 percentage point drop in the retention of domestic students, which is significant at the 5% level (Column 3). The effect size on retention is more than 70% of the effect size on graduation, suggesting that among the domestic students who do not graduate due to exposure to foreign students, a majority drop out in the first year. In higher-level courses, there is no effect on short-term achievement (Columns 4-5). On retention, although the effect is negative and significant at the 10% level, the effect size is small. Students might perceive short-run achievement as a measure of their relative performance within their respective peer groups, and a negative effect on relative performance might lead to an adverse effect on their graduation, as the literature has shown that academic rank affects academic choices and outcomes (Cicala et al., 2018; Elsner and Isphording, 2017; Murphy and Weinhardt, 2020; Elsner et al., 2021).
Alternatively, it could be because of an effect on absolute learning or other reasons, such as domestic students’ dislike for the changed classroom environment with an increased share of foreign students. To investigate whether graduation effects through this channel are linked to relative performance, I include short-term achievement variables in the main equation 1.1 and re-estimate it. This approach examines the effect of foreign peer exposure on domestic students’ graduation when they have similar relative rankings within their peer groups based on short-term achievement. Table 1A.11 presents the findings, revealing that after accounting for relative performance, the effect on graduation amounts to two-thirds of the main effect (Column 1). This result suggests that one-third of the total effect can be attributed to an effect on relative performance.

In summary, the results in this sub-section provide three interesting findings. First, a large share of students who do not graduate due to exposure to foreign students in lower-level courses drop out in the first year. Second, a negative effect on short-run achievement is likely to be a mechanism leading to a negative effect on graduation. Third, one-third of the total effect can be attributed to an effect on the relative performance of students; the rest is due to an effect on absolute learning or to other potential mechanisms, for instance, an effect on the classroom environment that may directly affect students’ graduation.

Although grades in introductory math courses and the first term could be driving effects on domestic students’ graduation, they could also be interpreted as intermediate outcomes of exposure to foreign peers. Therefore, I further explore mechanisms that may be driving the main results through an effect on short-run outcomes or through other potential channels.
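The one-third figure follows from simple arithmetic on the two estimates. As a sketch, let $\hat{\beta}^{\text{main}}$ denote the graduation effect from equation 1.1 and $\hat{\beta}^{\text{ctrl}}$ the effect after adding the short-term achievement variables. These symbols are introduced here only for illustration and do not appear in the original specification; the block assumes the `amsmath` package for `\text`.

```latex
\[
\hat{\beta}^{\text{ctrl}} \approx \tfrac{2}{3}\,\hat{\beta}^{\text{main}}
\quad\Longrightarrow\quad
\frac{\hat{\beta}^{\text{main}} - \hat{\beta}^{\text{ctrl}}}{\hat{\beta}^{\text{main}}}
\approx 1 - \tfrac{2}{3} = \tfrac{1}{3}.
\]
```

The left-hand ratio is the share of the main effect that disappears once relative performance is held fixed, which is the share attributed to the relative-performance channel.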
1.7.2 Ability-Based Mechanisms

Potential mechanisms may originate from ability differences between domestic and foreign students. Within lower-level courses, the average math ability of foreign peers is marginally higher than that of the main sample students (Table 1A.1). Thus, it is possible that higher-ability foreign peers negatively affect lower-ability domestic students in these courses, for instance, through increased competition. In that case, domestic students with weaker math abilities than their domestic peers should be impacted more within lower-level courses. While I do not find a differential effect by domestic students’ math ability in lower-level courses in earlier estimates, it is worth considering whether the estimate at the mean masks effects at the lower end of the ability distribution. To further examine ability-based mechanisms, I create quintiles of math ability within each course-instructor-term (peer group) and construct dummy variables (Q1-Q5), one for each quintile. For example, Q1 gets a value of 1 if a student lies in the lowest quintile of the math ability distribution of their peer group and 0 otherwise. Similarly, Q5 gets a value of 1 if a student lies in the highest quintile of the math ability distribution of their peer group and 0 otherwise. First, I control for the quintile groups and keep the highest quintile group (Q5) as the omitted group. Table 1A.12, Column 1 shows that after controlling for the quintile groups, the point estimates are virtually unchanged. I then look at the heterogeneous effects by quintile groups, keeping the highest quintile group (Q5) as the omitted group, and report the results in Table 1A.12, Column 2. I find that there is no differential effect on domestic students belonging to the lowest quintile (Q1) of the math ability distribution of their peer group. In fact, students in none of the quintile groups are differentially affected.
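The within-peer-group quintile dummies described above could be constructed along the following lines. This is a minimal sketch under stated assumptions: the input layout, the function name, and the rank-based rule for assigning quintiles (including how ties and small groups are handled) are illustrative choices, not the procedure documented in the source.

```python
from collections import defaultdict
from typing import Dict, List, Tuple


def quintile_dummies(students: List[Tuple[str, str, float]]) -> Dict[str, Dict[str, int]]:
    """Assign Q1-Q5 dummies by math ability within each peer group.

    students: list of (student_id, peer_group_id, math_ability) tuples,
              where peer_group_id stands for a course-instructor-term cell.
    Returns a mapping student_id -> {"Q1": 0/1, ..., "Q5": 0/1}.
    """
    groups = defaultdict(list)
    for sid, gid, ability in students:
        groups[gid].append((ability, sid))

    out: Dict[str, Dict[str, int]] = {}
    for members in groups.values():
        members.sort()  # ascending ability; ties broken by student id (assumption)
        n = len(members)
        for pos, (_, sid) in enumerate(members):
            q = 5 * pos // n + 1  # rank-based quintile in 1..5
            out[sid] = {f"Q{k}": int(k == q) for k in range(1, 6)}
    return out
```

Each student receives exactly one dummy equal to 1, so omitting Q5 in a regression leaves the top quintile as the reference group, as in the text.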
Overall, the results in this subsection provide suggestive evidence that the peer effects are likely not operating through ability-based mechanisms.

1.7.3 Non-Ability-Based Mechanisms and Survey Evidence

I do not find evidence that the peer effects are operating through ability-based mechanisms, which suggests the importance of non-ability factors. The earlier result on heterogeneous effects by race further emphasizes the importance of mechanisms stemming from non-ability factors. In addition to administrative data, I draw on unique panel data from a survey of domestic students at the university (described in Section 1.3.2) to shed light on the role of non-ability-based mechanisms. Given that most foreign students are Asians, negative effects on domestic students of all races except Asians and a positive effect on domestic Asian students suggest that race-related factors might be playing a role. These observed effects could be because of homophily, i.e., the tendency of people to associate with similar others. Evidence suggests that homophily in race creates the strongest divides in personal environments, leading to significant social segregation and shaping individuals’ social interactions and networks (McPherson et al., 2001). Two major factors that may induce homophily in race and potentially lead to heterogeneous effects by race in this context are 1) social preference over race/foreign students and 2) common interests/culture between domestic and foreign students. Domestic Asian students may have more common interests/culture with foreign students of the same race. They are also unlikely to have a social preference against interacting with other Asians. Therefore, having more foreign students may lead to increased collaboration or a better classroom experience for domestic Asian students, potentially leading to a positive effect on their academic outcomes.
In contrast, domestic White students may not have much in common with foreign students, or they might have a social preference against interacting with Asians, which may hinder their collaboration with foreign students or worsen their classroom experience, potentially leading to a negative effect on their academic outcomes. Foreign students may also feel more comfortable interacting with other foreign or domestic students of the same race.

Using the survey data, I first explore domestic students’ social preferences at the beginning of their first year at the university. Figure 1A.4 plots histograms of responses of newly admitted domestic students to two social preference questions. Figure 1A.4a illustrates the expectations of domestic students regarding interacting with foreign students. Eighty percent of the domestic students are looking forward to or very excited about meeting foreign students, and 10% have not thought about it. Figure 1A.4b summarizes their opinions on whether immigrants contribute to cultural enrichment in a country. Over 90% of the students agree that immigrants enrich a country culturally. Both these statistics show no signs of social preference against interacting with foreign students: most students feel excited about the possibility of interacting with foreign students and hold a positive belief about immigrants.

I further explore the social preference channel using the administrative data. I estimate the effect of domestic Asian students on domestic students of all other races. Domestic Asian students are likely to have more shared interests (for instance, American football) with other domestic students but are similar in appearance to foreign students, who are primarily Asians. Also, communication in English would not be a barrier for them. If social preference over race is a potential mechanism, one should expect domestic Asian students to have a negative impact on other domestic students too.
For this analysis, I focus on the main sample students, excluding the Asian students, and use a specification similar to the main specification 1.1, replacing the main treatment variable with the share of domestic Asian students in the peer group. Appendix Table 1B.6 reports the results of this analysis. The impact of domestic Asian students on other domestic students is not statistically significant at any conventional level (p-value = 0.4). Overall, I do not find any evidence that social preferences of domestic students over race, foreign students, or immigrants are likely mechanisms through which peer effects are operating.

Next, I look at the actual interactions of domestic students with foreign students during their first year. Using data from the follow-up survey conducted at the beginning of students’ second year at the university, Figure 1A.5 plots histograms of the number of interactions domestic students had with foreign students in their first year in different settings and the number of friends they met who are foreigners. Close to 50% of the domestic students interacted fewer than three times with any foreign student in their first year in formal settings (Figure 1A.5a). Another 30% of domestic students interacted only sometimes in an entire academic year in formal settings. The histogram on the number of interactions in social settings tells the same story (Figure 1A.5b). Figure 1A.5c provides additional evidence on the interaction of domestic with foreign students. Roughly 70% of domestic students had no friends from foreign countries among the five closest they met in the first year at the university, and an additional 17% had one foreign friend among the five closest ones.
These statistics show that even though most domestic students are looking forward to or excited about interacting with foreign students, their actual interaction is very limited, which may well affect collaboration or the classroom environment, in turn leading to negative peer effects.26

Why do domestic students have limited interaction with foreign students? The follow-up survey provides two primary reasons why domestic students fail to form friendships with foreign students. The first reason is the lack of common interests; 44% of them think that "international students have different interests that are not the same as mine." While the survey data suggests the potential role of a lack of common interests between domestic and foreign students, additional research is required to provide rigorous evidence. The second reason is the language barrier; 60% of domestic students think "communication is difficult because of language." Foreign students may have limited English communication skills as most of them are non-native English speakers, which may lead to limited interaction between domestic and foreign students. In this case, the negative peer effects should be stronger in peer groups with a higher share of foreign peers with lower English ability. To explore the communication mechanism, I look at the effect of the share of foreign students with high and low English proficiencies within the peer group. In particular, I split the main explanatory variable into two: the share of foreign students with high English proficiency and the share of foreign students with low English proficiency within a peer group. To measure English proficiency, I use scores from different English language proficiency tests that foreign students take for admission into the university. The scores come from seven different tests: TOEFL internet-based, TOEFL paper-based, TOEFL computer-based, IELTS, SAT, ACT, and University English Language Test.
The minimum score required for regular admission is 6.5 on IELTS (or a 79 on TOEFL internet-based). There is a corresponding score for each of the other tests to get regular admission. Since IELTS scores are the crudest, I use them to create a cutoff for the high and low English proficiencies. I take the cutoff to be 7 to create the two categories. Given the cutoff, 23% of foreign peers in lower-level courses and 16.3% of foreign peers in higher-level courses have high English proficiency. I rerun the main equation 1.1, replacing the treatment variable with the two share variables, and report the results in Table 1A.13, Panel A.27 I find that the negative effect on graduation is largely due to the exposure to low English proficiency foreign students in lower-level courses. The estimates imply that a 10 percentage point increase in the share of low English proficiency foreign students in the peer group reduces the domestic students’ graduation rate by 6.8 percentage points, which is statistically significant at the 1% level. The effect of exposure to high English proficiency foreign students is not significant at any conventional level in lower-level courses. In a sensitivity analysis of the cutoff, I conduct the same exercise with the cutoff for high English proficiency being 7.5 and report the results in Table 1A.13, Panel B. Results tell the same story.

26 Due to small sample size, I cannot explore interaction patterns by race, but papers in the literature find strong-race attraction in the determination of social networks among university students (Marmaros and Sacerdote, 2006; Mayer and Puller, 2008).

One point to note here is that the presence of low English proficiency foreign students in peer groups can lead to negative peer effects not only through limited interaction between foreign and domestic students but also through how instructors respond to this.
For instance, instructors may adjust their pace or style of instruction due to the presence of low English proficiency students, which may lead to negative effects on domestic students.

In summary, the results in this subsection provide suggestive evidence of the presence of non-ability-based mechanisms that may drive the main results. While I find evidence of limited interaction, lack of common interests, and language barriers between domestic and foreign students that may affect collaboration or classroom dynamics, among other factors contributing to peer effects, further research is required to provide more conclusive evidence.

27 Appendix Figure 1B.2 shows the variation in exposure to foreign peers with high/low English proficiency for every student in the main sample.

1.8 Conclusion

In this paper, I estimate how exposure to foreign peers in college courses affects domestic students’ academic outcomes. I use rich administrative and survey data from a large US public university to provide evidence on the peer effects of foreign students. On average, exposure to foreign peers leads to a sizable negative effect on domestic students’ graduation rate in lower-level courses; there is no effect on domestic students in higher-level courses. Of the students who do not graduate due to exposure to foreign peers, roughly 70% drop out in the first year. Further, students of all races except Asians incur negative peer effects; Asian students incur positive peer effects. The decline in graduation comes through a drop in students graduating with non-STEM degrees, with no effect on the number of STEM graduates. In fact, exposure to a higher share of foreign students in higher-level courses reduces the likelihood that domestic students move out of STEM majors.

I test several mechanisms to explore why foreign peers negatively affect domestic students’ graduation rate in lower-level courses.
First, I find a negative effect on the short-term achievement of domestic students, which may negatively affect graduation. Also, one-third of the effect on graduation may be due to an effect on the academic rank of students. Second, I do not find evidence in support of the ability channel: differences in the abilities of foreign and domestic students may not be driving the results. Third, while I do not find evidence of domestic students’ social preferences against interacting with foreign students, there is very limited interaction between students of the two groups. The two major reasons domestic students mention as hindering their interaction with foreign students are a lack of common interests and difficulty communicating due to the language barrier. Further, I find that negative peer effects are largely driven by the presence of foreign peers with low English proficiency, providing additional evidence on the potential role of the communication mechanism.

My findings are also of relevance to policymakers and university administrators. As noted earlier, the number of foreign students in post-secondary education has grown drastically in the last few decades around the world and in the US. US colleges and universities also saw this as an opportunity to recruit global talent in addition to generating higher revenue. The number of foreign students is expected to grow further, especially from emerging economies (such as China and India) where average earnings are increasing and more families are able to afford high-quality education in developed countries. At a time when the enrollment of domestic students is declining, US universities may become even more dependent on tuition revenue from foreign students. While we must be cautious in extrapolating from one university to other universities in the US and around the world, my findings show that there are negative effects on academic outcomes, which may not be driven by differences in abilities.
At the same time, there may not be benefits of desegregation on social preferences due to limited interaction between the two groups. Universities may consider taking more proactive measures to encourage interaction, engagement, and collaboration between the two groups to harness the potential benefits of diversity on academic and social outcomes.

BIBLIOGRAPHY

Allport, G. W. (1954). The nature of prejudice.
Anderson, N. (2016). Surge in foreign students may be crowding Americans out of elite colleges. The Washington Post.
Anelli, M. and Peri, G. (2019). The effects of high school peers’ gender on college major, college performance and income. The Economic Journal, 129(618):553–602.
Anelli, M., Shih, K., and Williams, K. (2023). Foreign students in college and the supply of STEM graduates. Journal of Labor Economics, 41(2):511–563.
Ballatore, R. M., Fort, M., and Ichino, A. (2018). Tower of Babel in the classroom: Immigrants and natives in Italian schools. Journal of Labor Economics, 36(4):885–921.
Bartik, T. J. (2020). Using Place-Based Jobs Policies to Help Distressed Communities. The Journal of Economic Perspectives, 34(3 (Summer)):99–127.
Betts, J. R. and Fairlie, R. W. (2003). Does immigration induce ‘native flight’ from public schools into private schools? Journal of Public Economics, 87(5-6):987–1012.
Bifulco, R., Fletcher, J. M., and Ross, S. L. (2011). The effect of classmate characteristics on post-secondary outcomes: Evidence from the Add Health. American Economic Journal: Economic Policy, 3(1):25–53.
Boisjoly, J., Duncan, G. J., Kremer, M., Levy, D. M., and Eccles, J. (2006). Empathy or antipathy? The impact of diversity. American Economic Review, 96(5):1890–1905.
Borjas, G. (2004). Do foreign students crowd out native students from graduate programs? NBER Working Paper No. 10349.
Bound, J., Braga, B., Khanna, G., and Turner, S. (2020). A Passage to America: University Funding and International Students.
American Economic Journal: Economic Policy, 12(1):97–126.
Carrell, S. E., Hoekstra, M., and West, J. E. (2019). The impact of college diversity on behavior toward minorities. American Economic Journal: Economic Policy, 11(4):159–182.
Carrell, S. E. and Hoekstra, M. L. (2010). Externalities in the classroom: How children exposed to domestic violence affect everyone’s kids. American Economic Journal: Applied Economics, 2(1):211–228.
Chen, X. (2013). STEM attrition: College students’ paths into and out of STEM fields. Statistical Analysis Report. NCES 2014-001. National Center for Education Statistics.
Cicala, S., Fryer, R. G., and Spenkuch, J. L. (2018). Self-selection and comparative advantage in social interactions. Journal of the European Economic Association, 16(4):983–1020.
Diette, T. M. and Oyelere, R. U. (2014). Gender and race heterogeneity: The impact of students with limited English on native students’ performance. American Economic Review: Papers and Proceedings, 104(5):412–417.
Elsner, B. and Isphording, I. E. (2017). A big fish in a small pond: Ability rank and human capital investment. Journal of Labor Economics, 35(3):787–828.
Elsner, B., Isphording, I. E., and Zölitz, U. (2021). Achievement rank affects performance and major choices in college. The Economic Journal, 131(640):3182–3206.
Figlio, D. and Özek, U. (2019). Unwelcome guests? The effects of refugees on the educational outcomes of incumbent students. Journal of Labor Economics, 37(4):1061–1096.
Figlio, D. N. (2007). Boys named Sue: Disruptive children and their peers. Education Finance and Policy, 2(4):376–394.
Finseraas, H., Hanson, T., Johnsen, Å. A., Kotsadam, A., and Torsvik, G. (2019). Trust, ethnic diversity, and personal contact: A field experiment. Journal of Public Economics, 173:72–84.
Foster, G. (2006). It’s not your peers, and it’s not your friends: Some progress toward understanding the educational peer effect mechanism. Journal of Public Economics, 90:1455–1475.
Gould, E.
D., Lavy, V., and Paserman, D. (2009). Does immigration affect the long-term educational outcomes of natives? Quasi-experimental evidence. The Economic Journal, 119(540):1243–1269.
Griliches, Z. (1992). The search for R&D spillovers. Scand. J. of Economics, 94:29–47.
Groot, K. d. (2023). International students offer ‘rich and diverse’ perspectives. Penn Today.
Hoxby, C. M. (1998). Do immigrants crowd disadvantaged American natives out of higher education? In Hamermesh, D. S. and Bean, F. D., editors, Help Or Hindrance?: The Economic Implications of Immigration for African Americans, pages 282–321. New York: Russell Sage Foundation.
Hoxby, C. M. (2000). Peer effects in the classroom: Learning from gender and race variation. NBER Working Paper No. 7867.
Hoxby, C. M. and Weingarth, G. (2005). Taking race out of the equation: School reassignment and the structure of peer effects. Working Paper.
Hunt, J. (2017). The impact of immigration on the educational attainment of natives. Journal of Human Resources, 52(4):1060–1118.
Imberman, S. A., Kugler, A. D., and Sacerdote, B. I. (2012). Katrina’s children: Evidence on the structure of peer effects from hurricane evacuees. American Economic Review, 102(5):2048–2082.
Institute of International Education (2022). International Student Enrollment Trends, 1948/49-2021/22. Open Doors Report on International Educational Exchange.
Kovach, S. (2021). California Legislature proposes bill to reduce number of nonresident UC students. Daily Bruin.
Lavy, V., Paserman, D., and Schlosser, A. (2012). Inside the black box of ability peer effects: Evidence from variation in the proportion of low achievers in the classroom. The Economic Journal, 122(559):208–237.
Lavy, V. and Schlosser, A. (2011). Mechanisms and impacts of gender peer effects at school. American Economic Journal: Applied Economics, 3(2):1–33.
Lyle, D. S. (2007). Estimating and interpreting peer and role model effects from randomly assigned social groups at West Point.
The Review of Economics and Statistics, 89(2):289–299.
Manski, C. F. (1993). Identification of endogenous social effects: The reflection problem. The Review of Economic Studies, 60(3):531–542.
Marmaros, D. and Sacerdote, B. (2006). How do friendships form? The Quarterly Journal of Economics, 121(1):79–119.
Martins, P. S. and Walker, I. (2006). Student achievement and university classes: Effects of attendance, size, peers, and teachers. IZA Discussion Paper No. 2490.
Mayer, A. and Puller, S. L. (2008). The old boy (and girl) network: Social network formation on university campuses. Journal of Public Economics, 92(1-2):329–347.
McPherson, M., Smith-Lovin, L., and Cook, J. M. (2001). Birds of a feather: Homophily in social networks. Annual Review of Sociology, 27(1):415–444.
Murphy, R. and Weinhardt, F. (2020). Top of the class: The importance of ordinal rank. The Review of Economic Studies, 87(6):2777–2826.
NAFSA (2020). Losing Talent 2020: An Economic and Foreign Policy Risk America Can’t Ignore. Policy Resources.
Ohinata, A. and Van Ours, J. C. (2013). How immigrant children affect the academic achievement of native Dutch children. The Economic Journal, 123(570):F308–F331.
Orrenius, P. M. and Zavodny, M. (2015). Does immigration affect whether US natives major in science and engineering? Journal of Labor Economics, 33(S1):S79–S108.
Paluck, E. L., Green, S. A., and Green, D. P. (2019). The contact hypothesis re-evaluated. Behavioural Public Policy, 3(2):129–158.
Parker, J., Grant, J., Crouter, J., and Rivenburg, J. (2010). Classmate Peer Effects: Evidence from Core Courses at Three Colleges. Working Paper.
Peri, G., Shih, K., and Sparber, C. (2015). STEM workers, H-1B visas, and productivity in US cities. Journal of Labor Economics, 33(S1):S225–S255.
Pettigrew, T. F. and Tropp, L. R. (2006). A meta-analytic test of intergroup contact theory. Journal of Personality and Social Psychology, 90(5):751.
Rakesh, R. (2023). The Local Economic Impacts of Foreign Students.
Available at SSRN 4449614.
Rao, G. (2019). Familiarity does not breed contempt: Generosity, discrimination, and diversity in Delhi schools. American Economic Review, 109(3):774–809.
Sacerdote, B. (2001). Peer Effects with Random Assignment: Results for Dartmouth Roommates. The Quarterly Journal of Economics, 116(2):681–704.
Sacerdote, B. (2011). Peer effects in education: How might they work, how big are they and how much do we know thus far? In Handbook of the Economics of Education, volume 3, pages 249–277. Elsevier.
Sacerdote, B. (2014). Experimental and quasi-experimental analysis of peer effects: Two steps forward? Annual Review of Economics, 6(1):253–272.
Shih, K. (2017). Do international students crowd-out or cross-subsidize Americans in higher education? Journal of Public Economics, 156:170–184.
Stinebrickner, R. and Stinebrickner, T. R. (2006). What can be learned about peer effects using college roommates? Evidence from new survey data and students from disadvantaged backgrounds. Journal of Public Economics, 90:1435–1454.
Zimmerman, D. J. (2003). Peer effects in academic outcomes: Evidence from a natural experiment. Review of Economics and Statistics, 85(1):9–23.

APPENDIX 1A MAIN TABLES AND FIGURES

Figure 1A.1 Foreign Student Enrollment Trend at the University

Notes: The figure shows the student enrollment trend at the university between 2005 and 2014. The blue line indicates the enrollment trend for total undergraduate student enrollment. The red and green lines indicate the foreign and domestic undergraduate student enrollment trends, respectively. The yellow line indicates the trend of the share of foreign students among the total undergraduate student enrollment. Source: Authors’ calculation using data from the university.

Figure 1A.2 Distribution of Share of Foreign Peers
(a) All Courses
(b) Lower-Level Courses
(c) Higher-Level Courses

Notes: This figure shows the distribution of exposure to foreign peers (proxied by the share of foreign students in the peer group) for students in the main sample. Source: Authors’ calculation using university administrative data.

Figure 1A.3 Stylized Overview of Possible Channels

Notes: Overview of the potential channels through which the peer effects might operate.

Figure 1A.4 Social Preference of Domestic Students
(a) Expectations to Meet Foreign Students
(b) Immigrants Enrich a Country Culturally

Notes: This figure displays students’ responses to social preference questions in the baseline survey. Figure 1A.4a displays the summary of the responses to the question on students’ expectations to interact with foreign students during their time at the university. Figure 1A.4b displays the summary of the responses to the question about whether the student believes that immigrants enrich a country culturally. Sample: First-time freshman domestic students in Fall 2018 at the university. Source: Authors’ calculation using survey data.

Figure 1A.5 Actual Interaction of Domestic Students in the First Year
(a) Interaction in Formal Settings
(b) Interaction in Social Settings
(c) Number of Foreign Students Among Five Closest Friends You Met in the First Year

Notes: This figure displays domestic students’ responses to questions on actual interactions with foreign students in their first year. Figure 1A.5a displays the summary of the domestic student’s number of interactions with foreign students in formal settings during their first year at the university. Figure 1A.5b displays the summary of the domestic student’s number of interactions with foreign students in social settings during their first year at the university.
Figure 1A.5c displays the summary of the domestic student’s number of friends who are foreign students among the five closest friends they met in the first year at the university. Sample: First-time freshman domestic students in Fall 2018 at the university. Source: Authors’ calculation using survey data.

Table 1A.1 Summary Statistics

                    Lower Level       Higher Level      Total
PANEL A: MAIN SAMPLE
White               0.836 (0.370)     0.848 (0.360)     0.841 (0.366)
Black               0.061 (0.239)     0.020 (0.139)     0.043 (0.203)
Asian               0.029 (0.167)     0.071 (0.257)     0.047 (0.211)
Hispanic            0.039 (0.194)     0.024 (0.153)     0.033 (0.178)
Female              0.563 (0.496)     0.338 (0.473)     0.468 (0.499)
First Gen           0.184 (0.387)     0.135 (0.342)     0.163 (0.369)
Math                23.856 (2.494)    27.970 (3.041)    25.588 (3.409)
English             24.371 (3.747)    26.326 (4.002)    25.206 (3.977)
Foreign Share       0.039 (0.036)     0.150 (0.127)     0.086 (0.103)
Observations        18495             13620             32115
PANEL B: FOREIGN PEERS
China               0.470 (0.499)     0.714 (0.452)     0.675 (0.468)
Korea               0.141 (0.348)     0.096 (0.294)     0.103 (0.304)
Taiwan              0.043 (0.203)     0.027 (0.161)     0.029 (0.169)
India               0.049 (0.215)     0.024 (0.152)     0.028 (0.164)
Saudi Arabia        0.052 (0.223)     0.026 (0.158)     0.030 (0.170)
Female              0.352 (0.478)     0.352 (0.478)     0.352 (0.478)
First Gen           0.205 (0.404)     0.200 (0.400)     0.201 (0.401)
Math                25.922 (3.235)    28.803 (2.336)    28.327 (2.725)
Observations        1072              5683              6755

Notes: Panel A in the table shows the summary statistics (mean) for demographic characteristics, academic ability, and share of foreign peers for the main student sample. Panel B shows the summary statistics (mean) for demographic characteristics (including home country) and academic ability of foreign peers of the main sample students. Each column corresponds to the sample used for the analysis and is denoted in the column header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Sample: First-time freshman domestic students and their foreign peers enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data.

Table 1A.2 Balance Test: Main Sample

                White      Black      Asian      Hispanic   Other      Female     First Gen  Math       English
                                                            Minority
                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
PANEL A: WITHOUT FIXED EFFECTS
Foreign Share   -0.007***  -0.009***  0.014***   -0.001     0.004***   -0.058***  0.012***   1.124***   0.711***
                (0.002)    (0.001)    (0.001)    (0.001)    (0.001)    (0.003)    (0.002)    (0.019)    (0.023)
Mean Dep. Var.  0.84       0.04       0.05       0.03       0.04       0.47       0.16       25.59      25.21
N               32115      32115      32115      32115      32115      32115      32115      31919      29263
PANEL B: WITH FIXED EFFECTS
Foreign Share   -0.002     -0.003     0.005      -0.002     0.002      -0.001     -0.002     0.053      0.046
                (0.006)    (0.003)    (0.004)    (0.003)    (0.003)    (0.007)    (0.006)    (0.044)    (0.060)
Mean Dep. Var.  0.84       0.04       0.05       0.03       0.04       0.47       0.16       25.59      25.21
N               32099      32099      32099      32099      32099      32099      32099      31903      29246

Notes: This table shows the results from the balance test using the main sample. Each column corresponds to a separate regression of students’ pre-determined demographic and academic characteristics on exposure to foreign peers (proxied by the share of foreign students in the peer group), with outcome variables denoted by the column headers. In Panel A, the regressions do not control for any additional variables. In Panel B, all regressions control for course-instructor FEs and course-term FEs. Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.
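
The contrast between Panels A and B of the balance test can be illustrated with a small sketch. It uses hypothetical toy data, not the university data, and the helper names `ols_slope` and `demean_by_group` are assumptions introduced here for illustration. It absorbs only one set of fixed effects (the course-instructor pair) by demeaning within group, whereas the regressions in the table also include course-term fixed effects and cluster standard errors by instructor. The toy numbers are built so that the foreign share and a pre-determined characteristic are correlated across pairs but unrelated within a pair.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope from a bivariate OLS regression of y on x (with an intercept)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx


def demean_by_group(values, groups):
    """Subtract each group's mean, which absorbs one set of fixed effects."""
    totals, counts = defaultdict(float), defaultdict(int)
    for v, g in zip(values, groups):
        totals[g] += v
        counts[g] += 1
    return [v - totals[g] / counts[g] for v, g in zip(values, groups)]


# Hypothetical toy data: two course-instructor pairs, three terms each.
pair = ["A", "A", "A", "B", "B", "B"]
share = [0.02, 0.04, 0.06, 0.10, 0.15, 0.20]        # foreign share in the peer group
math_score = [24.0, 23.0, 24.0, 28.0, 27.0, 28.0]   # a pre-determined characteristic

# Without fixed effects: picks up sorting across course-instructor pairs.
pooled = ols_slope(share, math_score)

# With pair fixed effects: uses only term-to-term variation within a pair.
within = ols_slope(demean_by_group(share, pair), demean_by_group(math_score, pair))

print(f"pooled slope: {pooled:.2f}")
print(f"within-pair slope: {within:.2f}")
```

In this constructed example the pooled slope is positive while the within-pair slope is numerically zero, which mirrors the qualitative pattern of a balance test that fails without fixed effects and passes with them.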

Table 1A.3 Balance Test: Foreign Peers

                China      Korea      Taiwan     India      Saudi      Female     First Gen  Math
                                                            Arabia
                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)
PANEL A: MAIN SAMPLE EQUIVALENT
Foreign Share   0.009      0.001      0.004      -0.001     -0.003     -0.011     -0.000     0.022
                (0.009)    (0.005)    (0.003)    (0.002)    (0.003)    (0.012)    (0.012)    (0.049)
Mean Dep. Var.  0.71       0.09       0.03       0.03       0.02       0.40       0.22       28.57
N               3622       3622       3622       3622       3622       3622       3622       3370
PANEL B: OTHER FOREIGN STUDENTS
Foreign Share   0.035      0.026      -0.008     -0.012     -0.002     0.039*     0.020      0.246*
                (0.026)    (0.018)    (0.006)    (0.010)    (0.006)    (0.021)    (0.016)    (0.127)
Mean Dep. Var.  0.63       0.11       0.02       0.03       0.04       0.30       0.18       28.03
N               2918       2918       2918       2918       2918       2918       2918       2719

Notes: This table shows the results from the balance test using the sample of foreign peers. Panel A reports the results for the sample of first-time freshman foreign peers who enrolled in introductory math courses in their first term between 2005-2014. Panel B reports the results for all the other foreign peers who enrolled in introductory math courses between 2005-2014. Each column corresponds to a separate regression of students’ pre-determined demographic and academic characteristics on exposure to foreign peers (proxied by the share of foreign students in the peer group), with outcome variables denoted by the column headers. All regressions control for course-instructor FEs and course-term FEs. Robust standard errors clustered at the instructor level in parentheses. Sample: Foreign peers of first-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.4 Comparing Specifications

                  Graduation in 6 Years
                  (1)        (2)        (3)        (4)        (5)        (6)
PANEL A: WITHOUT FIXED EFFECTS
Foreign Share     0.013***   0.014***   0.008***   0.008***   0.002      -0.000
                  (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)
Mean Dep. Var.    0.79       0.79       0.80       0.80       0.80       0.80
N                 32115      32115      29263      29263      29263      29263
R-squared         0.00       0.02       0.02       0.02       0.02       0.02
PANEL B: WITH FIXED EFFECTS
Foreign Share     -0.011*    -0.011*    -0.010     -0.009     -0.010     -0.011*
                  (0.006)    (0.006)    (0.006)    (0.006)    (0.006)    (0.006)
Mean Dep. Var.    0.79       0.79       0.80       0.80       0.80       0.80
N                 32099      32099      29246      29246      29246      29246
R-squared         0.03       0.04       0.05       0.05       0.05       0.05
Ind. Char                    ✓          ✓          ✓          ✓          ✓
Ind. Ability                            ✓          ✓          ✓          ✓
Peer Group Size                                    ✓          ✓          ✓
Peer Char.                                                    ✓          ✓
Peer Ability                                                             ✓

Notes: This table reports the results of regression using various versions of the main equation. Each column corresponds to a separate regression of “six-year graduation” on exposure to foreign peers (proxied by the share of foreign students in the peer group). In Panel A, the regressions do not include fixed effects. In Panel B, all regressions include course-instructor FEs and course-term FEs. Moving from column 1 to column 6, controls are sequentially included. Individual characteristics controls include race dummies, a female indicator, and a first-generation indicator. Individual ability controls include math and English ability. Peer characteristics controls include the shares of female students and first-generation students in the peer group. Peer ability control includes the mean math ability of the students in the peer group. Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.5 Main Result: Impact on Graduation

                Graduation in 6 Years
                (1)           (2)           (3)
                Full Sample   Lower Level   Higher Level
Foreign Share   -0.011*       -0.061***     -0.007
                (0.006)       (0.020)       (0.007)
Mean Dep. Var.  0.80          0.78          0.83
N               29246         16771         12475
R-squared       0.05          0.04          0.06

Notes: This table reports the effect of exposure to foreign peers (proxied by share of foreign students in the peer group) on domestic students’ six-year graduation rate. Each column corresponds to the sample used for the analysis and is denoted in the column header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.6 Robustness

                    Baseline      Class-Level   Freshman      Only College  Foreign       AM/PM         Class-Level
                                  Controls      Major FEs     Algebra       Math          Control       Variation
                    (1)           (2)           (3)           (4)           (5)           (6)           (7)
PANEL A: LOWER LEVEL
Foreign Share       -0.061***     -0.062***     -0.064***     -0.073***     -0.057**      -0.058***
                    (0.020)       (0.020)       (0.020)       (0.021)       (0.029)       (0.021)
Class Foreign Share                                                                                     -0.065***
                                                                                                        (0.021)
Mean Dep. Var.      0.78          0.78          0.78          0.79          0.78          0.78          0.78
First Stage F-stat                                                                                      797
N                   16771         16771         16681         10757         14334         16771         16771
R-squared           0.04          0.04          0.07          0.05          0.04          0.04          0.02
PANEL B: HIGHER LEVEL
Foreign Share       -0.007        -0.007        -0.009                      -0.008        -0.007
                    (0.007)       (0.007)       (0.007)                     (0.007)       (0.007)
Class Foreign Share                                                                                     -0.007
                                                                                                        (0.007)
Mean Dep. Var.      0.83          0.83          0.83                        0.83          0.83          0.83
First Stage F-stat                                                                                      2211
N                   12475         12475         11987                       10817         12475         12475
R-squared           0.06          0.06          0.08                        0.06          0.06          0.01

Notes: This table reports the effect of exposure to foreign peers on domestic students’ six-year graduation rate. Each panel corresponds to the sample used for the analysis. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate robustness test of the main result, denoted in the column header. Column 1 replicates the main result using the main equation 1.1. Column 2 additionally controls for the class-level characteristics (class size, share of female students, share of first-generation students, and average math ability). Column 3 additionally controls for freshman major preference fixed effects. Column 4 restricts the sample to the most basic course, College Algebra. Column 5 separately controls for the average ability of foreign and domestic students in the peer group instead of the average ability of all students in the peer group. Column 6 controls for the morning (before noon) class session dummy. Column 7 estimates the effect of exposure to foreign peers at the class level instead of at the peer group level using equation 1.1. I instrument for the share of foreign peers at the class level with the share of foreign peers at the peer group level. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.7 Impact on STEM/Non-STEM Major Graduation

                Lower Level                 Higher Level
                (1)           (2)           (3)           (4)
                STEM          Non-STEM      STEM          Non-STEM
                Graduation    Graduation    Graduation    Graduation
Foreign Share   0.015         -0.078**      0.002         -0.009
                (0.035)       (0.039)       (0.007)       (0.008)
Mean Dep. Var.  0.29          0.50          0.47          0.37
N               16771         16771         12475         12475
R-squared       0.11          0.11          0.18          0.20

Notes: This table reports the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on domestic students’ major choice outcomes - six-year graduation with a STEM major and six-year graduation with a Non-STEM major. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate regression of students’ outcomes on exposure to foreign peers, with outcome variables denoted by the column headers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.8 Impact on STEM/Non-STEM Major Switching

                Lower Level                               Higher Level
                (1)           (2)           (3)           (4)           (5)           (6)
                STEM to       Non-STEM      Exploratory   STEM to       Non-STEM      Exploratory
                Non-STEM      to STEM       to STEM       Non-STEM      to STEM       to STEM
Foreign Share   -0.026        0.011         0.163         -0.014**      -0.018***     0.017
                (0.076)       (0.020)       (0.138)       (0.007)       (0.007)       (0.034)
Mean Dep. Var.  0.21          0.08          0.28          0.11          0.09          0.53
N               5058          7369          950           5624          3784          981
R-squared       0.07          0.04          0.18          0.11          0.12          0.46

Notes: This table reports the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on domestic students’ major switching outcomes - starting with a STEM preference but graduating with a Non-STEM major in six years, starting with a Non-STEM preference but graduating with a STEM major in six years, and starting with an Exploratory preference but graduating with a STEM major in six years. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate regression of students’ outcomes on exposure to foreign peers, with outcome variables denoted by the column headers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014 who graduated within six years. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.
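
The star notation in the table notes can be related to a reported coefficient and standard error with a short sketch. This is a rough reading aid under stated assumptions, not the procedure used to produce the tables: it relies on a standard normal approximation to the ratio of the coefficient to its standard error, whereas the reported stars come from the underlying clustered estimates. The function names `two_sided_p` and `stars` are hypothetical. Because the printed values are rounded to three decimals, entries close to a threshold may not reproduce the printed stars.

```python
import math


def two_sided_p(coef, se):
    """Two-sided p-value from a standard normal approximation to coef / se."""
    z = abs(coef / se)
    # For a standard normal Z, P(|Z| > z) equals erfc(z / sqrt(2)).
    return math.erfc(z / math.sqrt(2))


def stars(p):
    """Map a p-value to the notation *p < 0.10, **p < 0.05, ***p < 0.01."""
    if p < 0.01:
        return "***"
    if p < 0.05:
        return "**"
    if p < 0.10:
        return "*"
    return ""


# Rounded (coefficient, standard error) pairs as printed in Table 1A.8.
reported = {
    "Lower Level, STEM to Non-STEM": (-0.026, 0.076),
    "Lower Level, Non-STEM to STEM": (0.011, 0.020),
    "Higher Level, STEM to Non-STEM": (-0.014, 0.007),
}

for label, (coef, se) in reported.items():
    p = two_sided_p(coef, se)
    print(f"{label}: z = {coef / se:.2f}, p = {p:.3f} {stars(p)}")
```

Under this approximation the first two entries carry no stars and the third carries two, in line with the table. The approximation is coarse when the number of clusters is small, so it should not be used to second-guess the reported significance levels.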

Table 1A.9 Heterogeneity With Pre-determined Characteristics

                                          Graduation in 6 years
                                          (1)           (2)
                                          Lower Level   Higher Level
PANEL A: MATH ABILITY
Foreign Share                             -0.061***     -0.007
                                          (0.020)       (0.007)
Foreign Share * Math                      0.005         0.000
                                          (0.009)       (0.003)
PANEL B: RACE
Foreign Share                             -0.064***     -0.007
                                          (0.021)       (0.007)
Foreign Share * Black                     0.003         -0.027
                                          (0.055)       (0.033)
Foreign Share * Asian                     0.184***      0.002
                                          (0.056)       (0.011)
Foreign Share * Hispanic                  0.040         0.027
                                          (0.050)       (0.019)
Foreign Share * Other Minority            -0.024        -0.015
                                          (0.046)       (0.015)
PANEL C: FRESHMAN MAJOR PREFERENCE
Foreign Share                             -0.065***     0.000
                                          (0.021)       (0.007)
Foreign Share * STEM Preference           0.021         -0.011
                                          (0.020)       (0.007)
Foreign Share * Exploratory Preference    -0.053*       -0.016
                                          (0.028)       (0.011)
Mean Dep. Var.                            0.78          0.83
N                                         16771         12475
R-squared                                 0.04          0.06

Notes: This table reports the heterogeneous effect of exposure to foreign peers by domestic students’ pre-determined characteristics on their six-year graduation rate. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. In Panel A, the student’s own math ability is standardized among the sample of students within a course type, and regressions include the interaction term of own math ability and the foreign share. In Panel B, regressions include interaction terms of race dummies and the foreign share, keeping White as the omitted group. In Panel C, the regressions include major preference dummies and their interaction with the foreign share, keeping Non-STEM preference as the omitted group. All regressions further control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.10 Short-Run Outcomes

                Lower Level                                     Higher Level
                (1)             (2)             (3)             (4)             (5)             (6)
                Math Course     First Semester  Retention       Math Course     First Semester  Retention
                GPA             GPA                             GPA             GPA
Foreign Share   -0.159*         -0.085**        -0.044**        0.018           0.007           -0.006*
                (0.094)         (0.033)         (0.018)         (0.019)         (0.011)         (0.004)
Mean Dep. Var.  2.64            3.03            0.89            2.93            3.27            0.93
N               16769           16767           16771           12468           12474           12475
R-squared       0.14            0.09            0.02            0.16            0.11            0.05

Notes: This table reports the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on domestic students’ short-run academic outcomes - introductory math course GPA, first semester GPA, and retention. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate regression of students’ outcomes on exposure to foreign peers, with outcome variables denoted by the column headers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.11 Effect on Graduation and Retention Controlling for Short-Run Grades

                    Lower Level                 Higher Level
                    (1)           (2)           (3)           (4)
                    Graduation    Retention     Graduation    Retention
                    in 6 years                  in 6 years
Foreign Share       -0.040*       -0.030*       -0.009        -0.007*
                    (0.023)       (0.016)       (0.006)       (0.004)
Short-Term Grades   ✓             ✓             ✓             ✓
Mean Dep. Var.      0.78          0.89          0.83          0.93
N                   16765         16765         12468         12468
R-squared           0.17          0.14          0.17          0.13

Notes: This table reports the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on domestic students’ six-year graduation rate and retention. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate regression of students’ outcomes on exposure to foreign peers, with outcome variables denoted by the column headers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Regressions also control for students’ own introductory math course GPA and first-semester GPA. Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.12 Ability-Based Mechanism

                                              Lower Level               Higher Level
                                              (1)          (2)          (3)          (4)
                                              Graduation   Graduation   Graduation   Graduation
                                              in 6 Years   in 6 Years   in 6 Years   in 6 Years
Foreign Share                                 -0.061***    -0.054*      -0.006       -0.003
                                              (0.020)      (0.028)      (0.007)      (0.009)
Foreign Share * Ability Quintile 4                         -0.012                    0.005
                                                           (0.032)                   (0.008)
Foreign Share * Ability Quintile 3                         -0.003                    -0.009
                                                           (0.036)                   (0.009)
Foreign Share * Ability Quintile 2                         -0.010                    -0.009
                                                           (0.033)                   (0.009)
Foreign Share * Ability Quintile 1 (Lowest)                -0.010                    -0.001
                                                           (0.031)                   (0.009)
Mean Dep. Var.                                0.78         0.78         0.83         0.83
N                                             16769        16769        12468        12468
R-squared                                     0.04         0.04         0.06         0.06

Notes: This table reports the results from tests of ability-based mechanism. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. The estimates are from the regression of domestic students’ six-year graduation on the shares of foreign peers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Regressions in columns 1 and 3 include ability-quintile dummies, where quintile 5 (highest) is the omitted group. Regressions in columns 2 and 4 include ability-quintile dummies and their interaction terms with the share of foreign peers, where quintile 5 (highest) is the omitted group. Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1A.13 Communication Mechanism

                                                  Graduation in 6 years
                                                  (1)           (2)
                                                  Lower Level   Higher Level
PANEL A: CUTOFF SCORE = 7
Foreign Share Low English Proficiency (< 7)       -0.068***     -0.007
                                                  (0.023)       (0.007)
Foreign Share High English Proficiency (>= 7)     -0.030        -0.005
                                                  (0.061)       (0.021)
Mean Dep. Var.                                    0.78          0.83
N                                                 16771         12475
R-squared                                         0.04          0.06
PANEL B: CUTOFF SCORE = 7.5
Foreign Share Low English Proficiency (< 7.5)     -0.066***     -0.007
                                                  (0.021)       (0.007)
Foreign Share High English Proficiency (>= 7.5)   -0.023        -0.011
                                                  (0.078)       (0.029)
Mean Dep. Var.                                    0.78          0.83
N                                                 16771         12475
R-squared                                         0.04          0.06

Notes: This table reports the results from tests of communication mechanisms. Each column corresponds to the sample used for the analysis and is denoted in the column header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. The estimates are from the regression of domestic students’ six-year graduation on the shares of foreign peers with low English proficiency and high English proficiency. In Panel A, the cutoff score for high English proficiency is 7 in IELTS, and in Panel B, the cutoff score for high English proficiency is 7.5 in IELTS. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at the instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.
55 APPENDIX 1B ADDITIONAL TABLES AND FIGURES Table 1B.1 List of Introductory Math Courses Introductory Math Courses Average Number of Students in Total Number of Course-Instructor-Term Combinations Course-Instructor-Term Combinations PANEL A: LOWER-LEVEL College Algebra Finite Math and Elements of College Algebra Trigonometry College Algebra and Trigonometry 72.68 171.81 84.89 173.41 PANEL B: HIGHER-LEVEL Survey of Calculus Calculus 1 Calculus 2 Multivariable Calculus Differential Equations 41.89 32.60 53.88 73.43 76.92 225 16 19 34 336 270 75 61 26 Notes: This table shows the list of introductory math courses and their summary statistics. Panel A lists Lower- Level courses, which include introductory non-calculus courses. Panel B lists Higher-Level courses, which include introductory calculus-based courses. Sample: Introductory math courses between 2005-2014. Source: Authors’ calculation using university administrative data. 56 Table 1B.2 Balance Test: Peer Group Female First Gen Math Peer Group Size (1) (2) (3) (4) PANEL A: WITHOUT FIXED EFFECTS Average Foreign Share -0.021*** (0.004) 0.027*** (0.003) 0.672*** (0.057) -5.060** (2.127) Mean Dep. Var. N 0.39 1062 0.15 1062 25.94 1062 PANEL B: WITH FIXED EFFECTS Average Foreign Share -0.002 (0.005) -0.007 (0.005) 0.228*** (0.042) Mean Dep. Var. N 0.39 611 0.15 611 25.94 611 56.50 1062 0.242 (0.761) 56.50 611 Notes: This table shows the results from the balance test of the peer group. Each column corresponds to a separate regression of peer group level average pre-determined demographic and academic characteristics of students on the share of foreign students in the peer group, with outcome variables denoted by the column headers. In Panel A, the regressions do not control for any additional variables. In Panel B, all regressions control for course-instructor FEs and course-term FEs. Robust standard errors clustered at instructor level in parenthesis. Sample: Introductory math courses between 2005-2014. 
Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1B.3 Balance Test: Domestic Non-Main Sample

                White      Black      Asian      Hispanic   Other      Female     First Gen  Math       English
                                                            Minority
                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
PANEL A: WITHOUT FIXED EFFECTS
Foreign Share   0.005      -0.011***  0.003*     0.001      0.003**    -0.042***  0.046***   0.743***   0.583***
                (0.004)    (0.003)    (0.001)    (0.002)    (0.001)    (0.009)    (0.008)    (0.152)    (0.083)
Mean Dep. Var.  0.80       0.08       0.04       0.04       0.04       0.36       0.18       24.23      23.56
N               21138      21138      21138      21138      21138      21138      21138      19932      16786
PANEL B: WITH FIXED EFFECTS
Foreign Share   -0.002     0.002      -0.001     -0.003     0.004      -0.018**   -0.005     -0.112     -0.074
                (0.007)    (0.004)    (0.004)    (0.004)    (0.003)    (0.008)    (0.007)    (0.070)    (0.089)
Mean Dep. Var.  0.80       0.08       0.04       0.04       0.04       0.36       0.18       24.23      23.56
N               21126      21126      21126      21126      21126      21126      21126      19920      16763

Notes: This table shows the results from the balance test using the domestic peers who are not in the main sample. Each column corresponds to a separate regression of students’ pre-determined demographic and academic characteristics on exposure to foreign peers (proxied by the share of foreign students in the peer group), with outcome variables denoted by the column headers. In Panel A, the regressions do not control for any additional variables. In Panel B, all regressions control for course-instructor FEs and course-term FEs. Robust standard errors clustered at instructor level in parentheses. Sample: Domestic students who are not in the main sample and are enrolled in introductory math courses between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1B.4 Balance Test: Main Sample (By Course-Type)

                White      Black      Asian      Hispanic   Other      Female     First Gen  Math       English
                                                            Minority
                (1)        (2)        (3)        (4)        (5)        (6)        (7)        (8)        (9)
PANEL A: LOWER LEVEL
Foreign Share   0.001      -0.010     -0.005     0.003      0.012      -0.025     -0.009     -0.085     0.149
                (0.028)    (0.020)    (0.009)    (0.010)    (0.015)    (0.026)    (0.026)    (0.173)    (0.266)
Mean Dep. Var.  0.84       0.04       0.05       0.03       0.04       0.47       0.16       25.59      25.21
N               18493      18493      18493      18493      18493      18493      18493      18477      16771
PANEL B: HIGHER LEVEL
Foreign Share   -0.002     -0.003     0.006      -0.002     0.001      0.000      -0.002     0.062      0.039
                (0.007)    (0.002)    (0.004)    (0.003)    (0.003)    (0.007)    (0.006)    (0.045)    (0.062)
Mean Dep. Var.  0.84       0.04       0.05       0.03       0.04       0.47       0.16       25.59      25.21
N               13606      13606      13606      13606      13606      13606      13606      13426      12475

Notes: This table shows the results from the balance test using the main sample by course type. Each column corresponds to a separate regression of students’ pre-determined demographic and academic characteristics on exposure to foreign peers (proxied by the share of foreign students in the peer group), with outcome variables denoted by the column headers. All regressions control for course-instructor FEs and course-term FEs. Each panel corresponds to the sample used for the analysis. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Robust standard errors clustered at instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Table 1B.5 Comparing Specifications (By Course-Type)

                  Graduation in 6 Years
                  (1)        (2)        (3)        (4)        (5)        (6)
PANEL A: LOWER LEVEL (WITHOUT FIXED EFFECTS)
Foreign Share     0.009      0.019*     0.006      -0.001     -0.015*    -0.015*
                  (0.009)    (0.010)    (0.009)    (0.009)    (0.008)    (0.008)
Mean Dep. Var.    0.77       0.77       0.78       0.78       0.78       0.78
N                 18495      18495      16772      16772      16772      16772
R-squared         0.00       0.02       0.02       0.02       0.02       0.02
PANEL B: LOWER LEVEL (WITH FIXED EFFECTS)
Foreign Share     -0.055**   -0.056***  -0.062***  -0.059***  -0.064***  -0.061***
                  (0.022)    (0.019)    (0.020)    (0.020)    (0.020)    (0.020)
Mean Dep. Var.    0.77       0.77       0.78       0.78       0.78       0.78
N                 18493      18493      16771      16771      16771      16771
R-squared         0.02       0.03       0.04       0.04       0.04       0.04
PANEL C: HIGHER LEVEL (WITHOUT FIXED EFFECTS)
Foreign Share     -0.002     0.000      0.001      0.000      -0.005     -0.005
                  (0.003)    (0.003)    (0.003)    (0.003)    (0.003)    (0.003)
Mean Dep. Var.    0.82       0.82       0.83       0.83       0.83       0.83
N                 13620      13620      12491      12491      12491      12491
R-squared         0.00       0.01       0.01       0.01       0.01       0.01
PANEL D: HIGHER LEVEL (WITH FIXED EFFECTS)
Foreign Share     -0.008     -0.008     -0.006     -0.006     -0.006     -0.007
                  (0.007)    (0.007)    (0.006)    (0.006)    (0.006)    (0.007)
Mean Dep. Var.    0.82       0.82       0.83       0.83       0.83       0.83
N                 13606      13606      12475      12475      12475      12475
R-squared         0.04       0.05       0.06       0.06       0.06       0.06
Ind. Char                    ✓          ✓          ✓          ✓          ✓
Ind. Ability                            ✓          ✓          ✓          ✓
Peer Group Size                                    ✓          ✓          ✓
Peer Char.                                                    ✓          ✓
Peer Ability                                                             ✓

Notes: This table reports the results of regressions using various versions of the main equation. Each column corresponds to a separate regression of “six-year graduation” on exposure to foreign peers (proxied by the share of foreign students in the peer group). Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Individual characteristics controls include race dummies, a female indicator, and a first-generation indicator. Individual ability controls include math and English ability. Peer characteristics controls include the shares of female students and first-generation students in the peer group. Peer ability control includes the mean math ability of the students in the peer group. Robust standard errors clustered at instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.
Table 1B.6 Effect of Exposure to Domestic Asian Students on Other Domestic Students

                      Graduation in 6 Years
                      (1)           (2)
                      Lower Level   Higher Level
Domestic Asian Share  -0.030        -0.007
                      (0.035)       (0.012)
Mean Dep. Var.        0.78          0.83
N                     16310         11653
R-squared             0.04          0.06

Notes: This table reports the effect of exposure to domestic Asian students (proxied by the share of domestic Asian students in the peer group) on Non-Asian domestic students’ six-year graduation rate. Each column corresponds to the sample used for the analysis and is denoted in the column header. Lower Level denotes the sample of Non-Asian domestic students enrolled in introductory non-calculus courses. Higher Level denotes the sample of Non-Asian domestic students enrolled in introductory calculus-based courses. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at instructor level in parentheses. Sample: First-time freshman Non-Asian domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

Figure 1B.1 Distribution of Share of Foreign Peers (Residualized)
Panels: (a) All Courses; (b) Lower-Level Courses; (c) Higher-Level Courses
Notes: This figure shows the distribution of residualized exposure to foreign peers (proxied by the share of foreign students in the peer group) for students in the main sample. Source: Authors’ calculation using university administrative data.
Figure 1B.2 Distribution of Share of Foreign Peers with High/Low English Proficiency (By Course-Type)
Panels: (a) Lower-Level Courses: Low English Proficiency Foreign Peers; (b) Lower-Level Courses: High English Proficiency Foreign Peers; (c) Higher-Level Courses: Low English Proficiency Foreign Peers; (d) Higher-Level Courses: High English Proficiency Foreign Peers
Notes: This figure shows the distribution of exposure to foreign peers with high/low English proficiency (proxied by the share of foreign students with high/low English proficiency in the peer group) for students in the main sample. Source: Authors’ calculation using university administrative data.

APPENDIX 1C ADDITIONAL ANALYSIS

1C.1 Non-Linear Effects

There is substantial variation in the share of foreign peers. Figure 1C.1 shows the mean exposure to foreign peers in each of the five quintile categories of the distribution of the share of foreign peers. The highest quintile, Q5, corresponds to the highest share of foreign peers, and the lowest quintile, Q1, corresponds to the lowest share. The average share of foreign peers in the first quintile (Q1) is 0.4%, whereas the average share in the fifth quintile (Q5) is 25.9%. Given this large variation, constraining the effects of exposure to foreign peers to be linear may be too restrictive; the effects might manifest only when the share of foreign peers is above a certain threshold. Further, the distribution of the share of foreign peers differs between lower-level and higher-level courses (Figure 1A.2), with much higher shares in higher-level courses. It might be that exposure to foreign peers influences graduation non-linearly, and that this is partly why we observe no effect in higher-level courses.
Thus, to better understand the effects of exposure to foreign peers, I explore the non-linearities in the effect across the quintile categories of the distribution of the share of foreign peers. Specifically, I estimate the following equation:

\[
Y_{icjt} = \alpha + \sum_{q} \beta_q \, \mathbb{1}[Q_i = q,\ q \neq 1] + \theta_{cj} + \lambda_{ct} + \gamma X_i + \delta G_{cjt} + \epsilon_{icjt} \tag{1C.1}
\]

where $Q_i$ denotes the quintile category, in the distribution of the share of foreign peers in the main sample, into which student $i$'s share of foreign peers falls. The lowest quintile, Q1, is the omitted group. Thus, $\beta_q$ estimates the impact of foreign peers on domestic students in each quintile (Q2 to Q5) relative to those with foreign peers in Q1. All other terms are the same as in Equation 1.1.

Figures 1C.2 a and b plot the estimated effects on graduation using the main sample students enrolled in lower-level and higher-level courses, respectively. In each figure, the x-axis denotes the quintile measure of the share of foreign peers, where Q1 corresponds to the lowest quintile, and Q5 corresponds to the highest quintile. The y-axis denotes the six-year graduation rate. Compared to domestic students in Q1, domestic students in Q2 have a graduation rate that is 4 percentage points lower due to exposure to foreign peers in lower-level courses. This result suggests that exposure to foreign peers has a negative effect even at very low shares of foreign peers. The negative effect stays roughly the same in Q3 and Q4 before getting much stronger for students in Q5, where the graduation rate is lower by 8.8 percentage points compared to students in Q1. The standard errors for Q5 are large, which is expected because fewer students in Q5 are enrolled in lower-level courses; still, the p-value is 0.11, very close to the 10% significance level. At the same time, there is no effect on students’ graduation in higher-level courses across the entire distribution of the share of foreign peers.
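As a rough illustration of the quintile specification in Equation 1C.1, the sketch below assigns observations to quintiles and computes each quintile's effect relative to Q1. It is a minimal, assumption-laden example rather than the estimation code behind Figure 1C.2: the function names and the ten data points are hypothetical, and the fixed effects and controls ($\theta_{cj}$, $\lambda_{ct}$, $X_i$, $G_{cjt}$) are deliberately omitted. Without those terms, the OLS coefficient on each quintile indicator reduces to a difference in group means.

```python
from statistics import mean


def assign_quintiles(shares):
    """Label each observation 1..5 by its rank in the sample distribution.

    Ties are broken by input order, which is a simplification, and the
    sample is assumed to contain at least five observations.
    """
    n = len(shares)
    order = sorted(range(n), key=lambda i: shares[i])
    labels = [0] * n
    for rank, i in enumerate(order):
        labels[i] = rank * 5 // n + 1
    return labels


def quintile_effects(shares, outcomes):
    """Return {q: beta_q} for q = 2..5, with Q1 as the omitted group.

    With only an intercept and quintile indicators on the right-hand side,
    the OLS coefficient beta_q equals mean(Y | Q = q) - mean(Y | Q = 1).
    Fixed effects and controls from Equation 1C.1 are left out here.
    """
    labels = assign_quintiles(shares)
    group_mean = {
        q: mean(y for y, lab in zip(outcomes, labels) if lab == q)
        for q in range(1, 6)
    }
    return {q: group_mean[q] - group_mean[1] for q in range(2, 6)}


# Made-up data: ten students' foreign-peer shares and graduation indicators.
shares = [0.00, 0.01, 0.03, 0.04, 0.08, 0.09, 0.15, 0.16, 0.25, 0.30]
graduated = [1.0, 1.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
effects = quintile_effects(shares, graduated)
print(effects)  # {2: -0.5, 3: -0.5, 4: -0.5, 5: -1.0}
```

In an actual application, the indicators would enter a regression together with the course-instructor and course-term fixed effects, so the coefficients would no longer be simple differences in means.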
The quintile estimates also indicate that the different estimates of peer effects in lower-level and higher-level courses are not driven by differences in the foreign share distribution across the two groups. Lastly, the results for the effect on retention tell the same story (Figures 1C.2 c and d).

Figure 1C.1 Average Share of Foreign Peers, By Quintile
Notes: This figure shows the average exposure to foreign peers (proxied by the share of foreign students in the peer group) in each quintile of the distribution of the share of foreign peers. Source: Authors’ calculation using university administrative data.

Figure 1C.2 Impact on Graduation and Retention: Quintile Measure of Exposure
Panels: (a) Lower-Level Courses; (b) Higher-Level Courses; (c) Lower-Level Courses; (d) Higher-Level Courses
Notes: This figure shows the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on domestic students’ six-year graduation rate and retention. The x-axis denotes the quintile measure of the share of foreign students, where Q1 corresponds to the lowest quintile, and Q5 corresponds to the highest quintile. The y-axis denotes the students’ six-year graduation rate in panels a and b and the students’ retention rate in panels c and d. Each quintile shows the impact of exposure to foreign peers relative to the omitted quintile (Q1). The sub-heading denotes the sample used for the analysis. Lower-Level Courses denote the sample of students enrolled in introductory non-calculus courses. Higher-Level Courses denote the sample of students enrolled in introductory calculus-based courses. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at instructor level. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data.

1C.2 Additional Heterogeneity Analysis

Table 1C.1 Heterogeneity with Gender

                        Graduation in 6 years
                        (1)           (2)
                        Lower Level   Higher Level
Foreign Share           -0.073***     -0.007
                        (0.023)       (0.007)
Foreign Share * Female  0.022         0.000
                        (0.014)       (0.006)
Mean Dep. Var.          0.78          0.83
N                       16771         12475
R-squared               0.04          0.06

Notes: This table reports the heterogeneous effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) by domestic students’ gender. Each column corresponds to the sample used for the analysis and is denoted in the column header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. The regressions include the interaction term of the female dummy and the foreign share, keeping male as the omitted group. All regressions further control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ own math and English ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at instructor level in parentheses. Sample: First-time freshman domestic students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.
1C.3 Effect of Exposure to Foreign Peers on Foreign Students

Table 1C.2 Effect of Exposure to Foreign Peers on Foreign Students’ Graduation

                Lower Level                         Higher Level
                (1)                    (2)          (3)                    (4)
                Graduation in 6 years  Retention    Graduation in 6 years  Retention
Foreign Share   0.157                  0.098*       0.002                  0.007
                (0.106)                (0.057)      (0.013)                (0.008)
Mean Dep. Var.  0.66                   0.82         0.77                   0.91
N               390                    390          2980                   2980
R-squared       0.20                   0.20         0.16                   0.16

Notes: This table reports the effect of exposure to foreign peers (proxied by the share of foreign students in the peer group) on foreign students’ graduation. Each column group corresponds to the sample used for the analysis and is denoted in the column group header. Lower Level denotes the sample of students enrolled in introductory non-calculus courses. Higher Level denotes the sample of students enrolled in introductory calculus-based courses. Each column corresponds to a separate regression of students’ outcomes on exposure to foreign peers, with outcome variables denoted by the column headers. All regressions control for course-instructor FEs, course-term FEs, students’ own characteristics (race, gender, first-generation flag), students’ math ability, peer group size, peer characteristics (share of female students, share of first-generation students), and peer math ability (average math ability). Robust standard errors clustered at instructor level in parentheses. Sample: First-time freshman foreign students enrolled in introductory math courses in their first term between 2005-2014. Source: Authors’ calculation using university administrative data. Significance: *p < 0.10, **p < 0.05, ***p < 0.01.

CHAPTER 2 THE LOCAL ECONOMIC IMPACTS OF FOREIGN STUDENTS

2.1 Introduction

Recent decades have seen a massive increase in the number of foreign students in post-secondary education (henceforth, foreign students) in the US.
Following an almost two-fold increase since the beginning of the twenty-first century, total foreign student enrollment stood at 1 million in 2016, roughly 5% of total post-secondary enrollment in the US.¹ Foreign students bring billions of dollars in revenue to US post-secondary institutions and the economy, in addition to global talent and diverse cultural values; however, a rapid increase in their population may adversely affect the economic outcomes of natives in places with host post-secondary institutions. This paper investigates this concern by examining the impacts of the expansion of foreign student enrollment on the local economic outcomes of natives.

¹ The number of newly enrolled foreign students on the most commonly issued student visa for the US, the F-1 visa, increased dramatically from 138,500 in 2004 to 364,000 in 2016 (Ruiz and Budiman, 2018). There is no official yearly limit on the number of F-1 visas that can be issued, unlike most other visa types issued by the government of the United States.

An influx of foreign students creates positive local demand shocks in places with host institutions. Because local economic activities are interconnected, demand shocks propagate, creating a multiplier effect that reaches different aspects of the local economy. Because of this externality, many policies aim to promote local demand.² However, the impact on the local economy could eventually dissipate as labor and firms move across locations to arbitrage the benefits of the increased local demand, putting upward pressure on land rents. Economists have long debated the distortions in economic behavior and the eventual effects of local demand shocks (Glaeser and Gottlieb, 2008; Kline and Moretti, 2014b). Overall, the incidence and efficiency of these shocks are both empirical questions, depending on the mobility of workers and firms, housing supply elasticity, and changes in factor prices. The effect of local demand shocks created by foreign students could be particularly important for local economies that depend heavily on the education sector and lack growth opportunities in other sectors. A positive effect may foster economic growth in such areas.

² See Kline and Moretti (2014b), Neumark and Simpson (2015), Bartik (2020).

While there has been a long-standing debate on the impact of immigrants on native outcomes and the host economy (Abramitzky and Boustan, 2017; Kerr and Kerr, 2011), foreign students are notably different. Usually, immigrants live, consume, and work in the host area, thereby affecting both demand and supply in the labor market. Dustmann et al. (2017) study an unusual case where the immigrants were only allowed to work in the host area but were denied residency rights, which led to a labor supply shock only.³ In contrast, a distinctive and key feature of foreign students in the US is that they cannot work on a student visa until they have finished their education;⁴ hence the shock is arguably a “pure” demand shock. In addition to contributing to the debate on whether foreign students are good for the local economy, this paper fills a gap in the literature by exploring the effects of a unique case of a “pure” demand shock.

³ Dustmann et al. (2017) evaluate a policy implemented 14 months after the fall of the Berlin Wall. The policy allowed Czech workers to seek employment in German border municipalities but denied residency rights, leading to an exogenous labor supply shock.
⁴ An exception to this is working part-time on campus or full-time on Curricular Practical Training (CPT). CPT is a temporary employment authorization for students on an F-1 visa while enrolled in a college-level degree program. Also, the work on CPT must be related to the student’s degree program and necessary to complete the degree.

In this paper, I study the local economic impacts of foreign student enrollment expansions between 2004 and 2016, when foreign student enrollment doubled in the US. Focusing on counties with post-secondary institutions where students were a large share of the county population in the base year (henceforth, sample counties⁵), I estimate the causal effects on the local economic outcomes. I also look at the local economic effects of domestic post-secondary student (henceforth, domestic student) enrollment and discuss the potential welfare impacts on different agents in the local economy. My primary data sources are the publicly available Integrated Postsecondary Education Data System (IPEDS), Bureau of Economic Analysis (BEA), County Business Patterns (CBP) series, and National Historical Geographic Information System (NHGIS).

⁵ To be precise, I set the student-to-population ratio cutoff to be 5% in the year 2004.

I use a long difference specification and exploit the cross-county variation in the change in enrollment of foreign and domestic students. However, a major challenge in estimating the causal impact is that student enrollment could be correlated with the unobserved county-specific secular trend or with unobserved contemporaneous shocks affecting the local economic outcomes of the county. For instance, a worsening state economy could reduce state appropriations to public universities and increase universities’ reliance on foreign students, leading to a problem of reverse causality and downward bias in the OLS estimates. First, to address the potential endogeneity between enrollment and the secular trend, I control for a county-specific pre-trend in the outcome.
Second, to address the potential endogeneity between foreign student enrollment and unobserved contemporaneous shocks, I construct a shift-share instrument (henceforth, the foreign IV) based on the historical share of foreign students in a county in the US. Counties with higher initial shares of foreign students are more likely to substantially increase foreign student enrollment during a period when foreign enrollment increases at the national level. One potential reason for this is the network effect: foreign students provide information and assistance to compatriots planning to study abroad. In particular, the foreign IV, which is the predicted change in the actual foreign enrollment, is the interaction of the historical presence of foreign students in a county (“share”) and the contemporaneous national-level expansion in foreign student enrollment (“shift”). While the instrument is uncorrelated with contemporaneous shocks as long as the “shift” part of the instrument is not driven by idiosyncratic local shocks, the “share” part of the instrument could be correlated with the secular trend. Therefore, it is crucial to credibly partial out the secular trend of the local economic outcomes, without which the instrument can be invalid.

Third, I construct a shift-share instrument (henceforth, the domestic IV) to address the endogeneity concern with domestic student enrollment as well. However, unlike the foreign IV, the domestic IV uses the variation in the historical share of domestic students in a county from different states in the US, rather than the county's share of total domestic students in the US, as the former better explains the variation in actual domestic enrollment. In particular, the instrument is constructed by summing the interaction terms between the historical presence of domestic students in a county from a particular state (“share”) and the contemporaneous change in the number of post-secondary students who are residents of the corresponding state (“shift”). Since a large share of domestic students attend post-secondary institutions within their state of residence, the domestic IV could be reflecting the overall state economy; however, including state fixed effects addresses this concern. By arguments similar to those for the foreign IV, the domestic IV is plausibly exogenous to the local economy.

I examine the plausibility of the identifying assumptions, including the validity of exclusion restrictions in the case of shift-share instruments. I conduct a test recently suggested by Goldsmith-Pinkham et al. (2020) and show that the instruments are unlikely to be correlated with the unobservables.

I find sizable effects of the increase in foreign student enrollment on the level of local economic activities between 2004 and 2016. The estimates imply that one additional foreign student created 2.73 jobs in the same county over the 12 years. Demographic-adjusted wages also increased steeply, by 3.32% for each one-percentage-point increase in the foreign student enrollment-to-population ratio. A potential reason that the effects are stronger than in other immigration contexts is that foreign students have very restricted work opportunities, thereby reducing possible supply-side effects. Further, foreign student enrollment led to a large increase in the county population, which is reasonable as the creation of new employment opportunities might attract more workers from other places. In the housing market, I do not find a statistically significant effect on gross housing rent. The rapid increase in housing units might have eased upward pressure on housing rent, as I find that housing units increased by 1.1 for every additional foreign student. While the marginal effect of foreign student enrollment is sizable, domestic student enrollment had little or no effect on the levels of local outcomes over the 12 years.

Overall, the results suggest potential welfare gains for native workers, as employment opportunities and wages improved while there is no significant effect on housing rent. In theory, the movement of firms and workers into a particular geographical area puts upward pressure on rent. If the housing supply is inelastic, welfare gains that would otherwise accrue to resident workers are instead capitalized in land rents. However, in this paper, I find no significant effect on housing rent.

While foreign student enrollment increased rapidly between 2004 and 2016, domestic student enrollment increased significantly until 2010 and declined rapidly after that. A 12-year long difference specification masks this sharp change in trend and could conflate the effects of the two sub-periods. However, a split-period analysis addresses this concern and validates the main results.

Several robustness tests further strengthen the results presented in this paper. The findings are robust to additional controls that partial out secular trends more flexibly and to analysis on alternate samples. Moreover, I do not find any adverse impacts on the counties without post-secondary institutions that neighbor sample counties. As workers and firms are mobile, the overall impact of the expansion of foreign student enrollment could be misrepresented without looking at the effect on neighboring counties.
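To make the construction of the two shift-share instruments described earlier in this introduction more concrete, the following sketch computes a predicted enrollment change for a handful of counties. It is only an illustration under stated assumptions: the function names, county and state labels, and all numbers are hypothetical, and the exact definition of the “share” (here, a county's fraction of the relevant base-year enrollment total) is an assumption of this sketch rather than the specification used in the chapter.

```python
def foreign_iv(base_foreign, national_change):
    """Predicted change in foreign enrollment for each county.

    base_foreign: {county: foreign enrollment in the base year}
    national_change: change in national foreign enrollment (the "shift")
    The "share" is taken to be the county's fraction of national base-year
    foreign enrollment, which is an assumption of this sketch.
    """
    national_base = sum(base_foreign.values())
    return {c: n / national_base * national_change for c, n in base_foreign.items()}


def domestic_iv(base_by_state, state_change):
    """Sum over origin states of (county share of state's students) x (state change).

    base_by_state: {county: {state: base-year students from that state}}
    state_change: {state: change in post-secondary students residing in it}
    Every state is assumed to have a positive base-year total.
    """
    state_base = {}
    for by_state in base_by_state.values():
        for s, n in by_state.items():
            state_base[s] = state_base.get(s, 0) + n
    return {
        c: sum(n / state_base[s] * state_change[s] for s, n in by_state.items())
        for c, by_state in base_by_state.items()
    }


# Hypothetical counties (A, B, C) and origin states (S1, S2); numbers invented.
z_foreign = foreign_iv({"A": 300, "B": 100, "C": 0}, national_change=400)
z_domestic = domestic_iv(
    {"A": {"S1": 750, "S2": 500}, "B": {"S1": 250, "S2": 500}},
    {"S1": 100, "S2": -50},
)
print(z_foreign)   # {'A': 300.0, 'B': 100.0, 'C': 0.0}
print(z_domestic)  # {'A': 50.0, 'B': 0.0}
```

The example shows why the “share” component matters for identification: county C, with no foreign students in the base year, receives a predicted change of zero regardless of the national expansion, so any correlation between initial shares and local trends feeds directly into the instrument.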
The findings suggest that foreign student enrollment expansions lead to welfare gains, on average, for natives; however, the effects could be highly heterogeneous, as the adjustment of the local economy depends on various local characteristics. To further unfold how local demand shocks evolve and affect the local economy, I look at the heterogeneity by county population density. I find that the effect on employment increases with population density, but so does the effect on housing rent. Although the welfare impacts on resident workers would depend on the relative magnitude of the increase in wages and housing rent, the results provide suggestive evidence of greater benefits for natives in sparsely populated counties than in densely populated counties in the longer run.

This paper makes three broad contributions to the literature. First, it contributes to the literature on the effects of local demand shocks on the local economy. To the best of my knowledge, my paper is the first to look at the effects of local demand shocks created by foreign students on the local economy. While the literature on local demand shocks includes papers that focus on place-based policies (Neumark and Kolko, 2010; Chodorow-Reich et al., 2012; Busso et al., 2013; Kline and Moretti, 2014a; Chaurey, 2017), shocks to amenities and infrastructure (Chirakijja, 2022), or other specific shocks (Black et al., 2005; Zou, 2018), the expansion of foreign students provides a suitable and unique setup to study the effects of “pure” demand shocks. Many studies in this literature focus on the local labor markets and look at the local job multiplier, which is the number of additional jobs created by exogenously generating one more job (Black et al., 2005; Moretti, 2010; Chodorow-Reich et al., 2012).
However, looking at only the job multiplier may misrepresent the true welfare impacts since the various aspects of the economy are connected, and factors move across locations (Zou, 2018).6 So, I look at a vector of outcomes and provide a more complete picture of the local economic impacts. My paper further examines the heterogeneous effects by county’s population density, a relatively understudied area within this literature. This aspect is essential as the potential welfare gains or losses to native workers would depend on how prices adjust in different markets in the local economy, which may vary substantially by the local characteristics. Second, this paper contributes to the extensive literature on the impact of immigration on the host economy. Most papers in this literature look at the immigrant population that can provide labor (Card, 1990; Altonji and Card, 1991; Card, 2001; Ottaviano and Peri, 2012; Doran et al., 2014). While it is still an unresolved debate whether immigration negatively affects the local economic outcomes, my paper finds sizable positive effects of foreign students on the natives and the local economy (Abramitzky and Boustan, 2017), potentially due to their distinctive feature of not being able to supply labor. Finally, this paper contributes to relatively new and growing economics literature on foreign post-secondary students, an immigrant type that is expanding rapidly around the world and is expected to grow further in the future with the globalization of education. The existing literature on foreign students focuses on domestic students’ educational outcomes (Borjas, 2007; Shih, 2017; Anelli et al., 2020), universities’ reliance on foreign students to generate revenue (Bound et al., 2020) or future labor market effects on natives (Demirci, 2020). Another study looks at the impact of the international student boom between 2005 and 2015 on the housing markets at the college-town level (Mocanu and Tremacoldi-Rossi, 2019). 
My paper, in contrast, looks at the local economic effects of foreign students at the county level, which arguably constitutes a local economy.7

6 Zou (2018) looks at the local economic impact of the US military contractions between 1988 and 2000. It shows that even though the local job multiplier was sizable, the welfare costs to workers were small because the local population adjusted quickly to the shock, mainly through reduced in-migration, which led to small changes in wages but large declines in rental prices.

7 A concurrent working paper, Dang (2022), also studies the effects of the increase in foreign students on the labor market but with a different empirical specification and shift-share instrument. The author shows that an increase in foreign students in period \(t\) is associated with an increase in local employment and wages in period \(t+1\). Unlike that paper, I use a long-difference equation that estimates the adjustment of the local economy in the “long run” and takes into account the macro trends over a long period. Looking at the labor market, housing market, and population, I comment on welfare implications for different agents in the local economy to provide a broad picture. Further, I include domestic student enrollment in the main specification, without which the shift-share-style foreign instrument could violate the exclusion restriction, as the two are likely to be correlated.

2.2 Foreign Students in US Post-Secondary Institutions

The number of foreign students enrolled in degree programs in post-secondary institutions in the United States increased dramatically between 2004 and 2016. Figure 2A.1a shows that foreign student degree enrollment increased by 70% in this period, from around 565,000 students to 950,000 students. This includes total degree enrollment at post-secondary institutions of all level8 and control9 types that are eligible for the federal financial aid program. The increase in undergraduate enrollment accounts for 60% of this increase, and the number of new foreign students enrolled grew faster at public institutions than at private institutions (Ruiz and Budiman, 2017). The average increase in foreign student degree enrollment was 517 per county over the 12 years among the sample counties. Over the same period, the share of foreign students in total post-secondary degree enrollment increased from 3.5% to over 5%. Foreign student enrollment has increased not only in absolute numbers but also as a share of the population. The average foreign student-to-population ratio almost doubled in counties with post-secondary institutions (Figure 2A.1b).

8 A classification of whether an institution’s programs are 4-year or higher (4-year), 2-but-less-than 4-year (2-year), or less than 2-year.

9 A classification of whether an institution is operated by publicly elected or appointed officials or by privately elected or appointed officials and derives its major source of funds from private sources.

Foreign students come to the US from around the world, but the countries that send the most students are China, India, South Korea, and Saudi Arabia. In fact, China, India, and South Korea accounted for 54% of all the new foreign students in the US in 2016 (Ruiz and Budiman, 2017). Various push and pull factors may have contributed to the significant increase in foreign students in the US. First, due to the rapid economic growth of the sending countries, the number of families who can afford their child’s post-secondary education in a foreign country increased in the last two decades.10 Second, to generate higher revenue, universities are admitting more foreign students who pay higher out-of-state tuition. Many new programs have also sprung up, particularly in the STEM fields, where foreign students are heavily represented. The increase in foreign student enrollment is closely related to the decrease in state appropriations to public universities. Bound et al. (2020) estimate that a 10% reduction in state appropriations is associated with a 16% increase in foreign enrollment at public research universities, which partially compensates for the lost funding. Third, the Optional Practical Training (OPT) period was extended from one year to 29 months in 2008 for STEM graduates to retain foreign STEM students as workers.11 OPT is a program that allows full-time foreign students to temporarily work on their student visas after completing their post-secondary education. The extension addressed concerns of losing students due to a limit on H-1B visas, a primary work visa for the US. Unlike the H-1B work visa, which has an annual cap of 85,000 visas, the number of approvals under OPT has no cap. So, an increased OPT period meant that foreign students in the US would have two additional chances (once every year) of getting approved for the H-1B work visa and entering the US labor market, which encouraged more foreign students to enroll in a post-secondary STEM course in the US (Amuedo-Dorantes et al., 2019).12 Finally, the number of students completing high school or an undergraduate degree increased in the sending countries (UNESCO, Institute for Statistics, 2017).

10 Bound et al. (2020) document that with the fourfold increase in China’s GDP per capita between 1996 and 2012 and the appreciation of the yuan since 2005, the percentage of Chinese families with average income greater than the average out-of-state tuition plus boarding expense increased exponentially from 0.005% in 2000 to over 2% in 2013.

During the same period, the number of domestic students enrolled in degree programs increased from around 15 million to around 17.5 million but not monotonically.
Figure 2A.1c shows that domestic student enrollment increased to 19.3 million by 2010 and decreased after that. Most of the decrease since 2010 is due to a decrease in enrollment at 2-year and less-than-2-year post-secondary institutions.

11 This period was further extended to 36 months in 2016.

12 Also, 20,000 of the total H-1B visas are set aside for those who hold advanced degrees (master’s, professional, or doctorate) in any subject from a US higher educational institution. This provides an added advantage to foreign students enrolled in a US post-secondary institution.

Students contribute to the host economy by paying for their education and through the expenditure needed to support themselves while enrolled in post-secondary institutions. An increase in the student population would lead to additional demand for goods and services, creating additional local labor demand. The foreign student influx likely created strong local demand shocks, primarily because of the students’ strong financial background. They usually pay higher out-of-state tuition than domestic students. So, only families abroad who can afford out-of-state tuition, boarding expense, and travel costs can send their child for post-secondary education, suggesting that foreign students have greater resources.13 Compared to domestic students, foreign students from different countries are also likely to create demand for diverse goods and services, creating opportunities for a wide variety of new businesses. Moreover, the market for goods and services “traditionally” demanded by domestic students might already exist to a large extent. This suggests that foreign students are likely to induce stronger demand shocks than domestic students.
Foreign students contributed nearly $41 billion to the US economy in the academic year 2018-19 (NAFSA, 2020).14 To put that in perspective, the financial incentives provided by all tiers of the US government under place-based job policies were around $60 billion in 2015 (Bartik, 2020).15

13 As mentioned previously, there has been rapid improvement in the financial conditions of families from the primary sending countries. Further, using administrative data on the F-1 student visa, Bound et al. (2020) document that for the 2010-15 period, only 6% of undergraduate students from China at research universities received funding from the universities they attended, which again suggests the strong financial background of foreign students in the US.

14 The NAFSA (2020) estimate of the economic value contributed by foreign students is the overall imported dollars from foreign students without any multiplier effect.

15 Some other estimates of incentives are provided by Thomas (2011) and Story (2012). Thomas (2011) calculates $73 billion, and Story (2012) calculates $101 billion (in 2019 dollars) in incentives.

The labor demand shocks may evolve through various channels and create a multiplier effect, affecting different aspects of the local economy. First, existing businesses may expand, and new businesses may open up, generating more employment, which in turn creates additional jobs mainly through increased demand for goods and services (Moretti, 2010). With labor supply fixed, increased demand for labor raises wages in the short term. Second, new employment opportunities may lead to population adjustments, mainly through increased in-migration of workers and their families, which may partially offset the wage increase over time. Third, the population adjustment may affect the demand for housing units, with an increase in population putting upward pressure on housing rent. Finally, the housing market may respond with the supply of new housing units. Depending on the housing supply elasticity of the area, this might partially offset the increase in housing rent over time. Because the local economic activities are so interconnected, we must look at various outcomes in the local economy to get a broad picture of the effects, which depend on the mobility of workers and firms, the local housing market conditions, and other local characteristics.

2.3 Econometric Framework

I estimate the impact of changes in foreign and domestic student enrollment on local economic outcomes during the phase of the dramatic increase in foreign post-secondary student enrollment in the US over the period 2004-16 using the following long-difference specification:

\[
\Delta y^k_c = \alpha^k + \beta^k_1 \, \Delta \mathit{foreign}_c + \beta^k_2 \, \Delta \mathit{domestic}_c + X_c \cdot \Theta^k + \lambda_s + \Delta\epsilon^k_c \tag{2.1}
\]

The unit of observation is the county and is denoted by the \(c\) subscript in the regression. \(\Delta\) denotes the 12-year difference between the years 2004 and 2016. \(y^k\) denotes a local outcome; the outcomes include (a) employment, (b) log average demographic-adjusted wage, (c) the number of business establishments, (d) non-student population, (e) housing units, and (f) log median gross housing rent. Wages and rents are in constant 2010 dollars and are used in the logarithmic form as local outcomes. Outcome variables \(\Delta y^k_c\) are changes in local outcomes \(y^k\) of county \(c\), which are scaled by the county’s 2004 population for non-logarithmic local outcomes (a), (c), (d), and (e). \(\Delta \mathit{foreign}_c = (\mathit{Foreign}_{c,2016} - \mathit{Foreign}_{c,2004}) / \mathit{Pop}_{c,2004}\) is the change in the number of foreign students in county \(c\) scaled by the county’s 2004 population. Similarly, \(\Delta \mathit{domestic}_c = (\mathit{Domestic}_{c,2016} - \mathit{Domestic}_{c,2004}) / \mathit{Pop}_{c,2004}\) is the change in the number of domestic students in county \(c\) scaled by the county’s 2004 population. \(X_c\) is a vector of observable county characteristics. \(\lambda_s\) denotes state fixed effects.
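As an illustration of how the two enrollment regressors in equation (2.1) are defined, the following minimal sketch computes the population-scaled long differences for a single county. The class name, field names, and numbers are hypothetical and chosen only for illustration; they are not taken from the data used in this chapter.

```python
from dataclasses import dataclass


@dataclass
class County:
    # All field names are hypothetical; values are illustrative only.
    pop_2004: float
    foreign_2004: float
    foreign_2016: float
    domestic_2004: float
    domestic_2016: float


def scaled_change(end: float, start: float, base_pop: float) -> float:
    """Long difference (end - start) scaled by the baseline population."""
    if base_pop <= 0:
        raise ValueError("baseline population must be positive")
    return (end - start) / base_pop


def enrollment_regressors(c: County) -> tuple[float, float]:
    """Return (delta_foreign, delta_domestic) as defined for equation (2.1)."""
    d_foreign = scaled_change(c.foreign_2016, c.foreign_2004, c.pop_2004)
    d_domestic = scaled_change(c.domestic_2016, c.domestic_2004, c.pop_2004)
    return d_foreign, d_domestic


# Illustrative county: 260 more foreign and 500 more domestic students
# relative to a 2004 population of 100,000.
example = County(pop_2004=100_000, foreign_2004=500, foreign_2016=760,
                 domestic_2004=8_000, domestic_2016=8_500)
d_for, d_dom = enrollment_regressors(example)
print(d_for, d_dom)
```

Because both regressors share the same 2004 population denominator as the non-logarithmic outcomes, the coefficients on them can be read as per-student effects.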
The primary coefficients of interest are \(\beta^k_1\) and \(\beta^k_2\), which are the changes in the local outcome associated with a net increase of one foreign student and one domestic student, respectively, for non-logarithmic local outcomes. For logarithmic local outcomes, the coefficients of interest are the percentage changes in the local outcome associated with a percentage point increase in the foreign student enrollment-to-population ratio and the domestic student enrollment-to-population ratio, respectively. Lastly, \(\Delta\epsilon^k_c\) is the error term that includes the unobserved factors that might influence the outcome variables.

There are a few challenges to causally estimating the impact of changes in foreign and domestic enrollment on the local economy using an ordinary least squares regression. First, foreign and domestic student enrollment changes could be endogenous to county-specific secular trends. For instance, a fast-growing county economy could lead to higher housing rent and discourage students from enrolling in an institution in that county. This could bias the OLS estimates downward. Second, foreign and domestic student enrollment changes could be endogenous to unobserved contemporaneous shocks. For instance, a negative shock to the state economy between 2004 and 2016 could reduce state appropriations to universities, inducing universities to admit more full-tuition-paying foreign students to cushion the lost revenue. This could also bias the OLS estimates downward.

To address the endogeneity issues, it is important to control for the secular trend of the county. Following the conventional approach, I control for the secular trend in outcome \(y^k\) driven by the observable characteristics \(X_c\). Specifically, I control for the growth rate of all the outcomes from the year 1996 to 2001.16 For wages and rents, the control is the change in the log of the outcome in the pre-period.
Moreover, I include state fixed effects to control for the state-specific secular trend in the outcomes.

16 Housing market outcomes require decennial census data, so the change is between 1990 and 2000.

To address the potential endogeneity issue of correlation between foreign student enrollment and the unobserved contemporaneous shocks, I construct a shift-share instrument using the initial distribution of the number of foreign students by county. Network effects are one of the primary determinants of the location choice of foreign students (Beine et al., 2014).17 A foreign student is likely to provide information and assistance to a compatriot planning to study abroad. So, counties with a higher initial share of foreign students are more likely to substantially increase foreign student enrollment during a period when foreign student enrollment increases at the national level. Figure 2A.2 presents the fitted line of the county-level regression of the change in the foreign student-to-population ratio between 2004 and 2016 on the ratio in the year 2001.18 The slope of the fitted line is 0.57, and it is significant at the 1% level. It shows that foreign student enrollment increased more in counties with a higher initial foreign student enrollment-to-population ratio.

17 Beine et al. (2014) study the location choice determinants of foreign students and find the network effect to be a primary determinant. They define the network to include the stock of all migrants from the origin country living at the destination. Although they look at the determinants of location choices at the country level, similar factors should determine location choices at the city or county level within a particular destination country.

Based on this idea, I construct the foreign IV, which can be interpreted as the predicted change in foreign student enrollment in a county.
Specifically, I construct it as follows:19

\[
\Delta \mathit{foreign}^{IV}_c = \frac{1}{\mathit{Pop}_{c,2004}} \cdot \frac{\mathit{Foreign}_{c,2001}}{\mathit{Foreign}_{US,2001}} \cdot \left( \mathit{Foreign}_{US,2016} - \mathit{Foreign}_{US,2004} \right) \tag{2.2}
\]

In equation (2.2), \(\mathit{Foreign}_{US,t}\) denotes the total foreign student enrollment in the US in year \(t\). The second term is the “share” part of the instrument, which is the ratio of foreign students in county \(c\) to foreign students in the US in the year 2001. The third term is the “shift” part of the instrument, which is the change in the number of foreign students in the US between 2004 and 2016. Similar to the main explanatory variables, the product of terms is scaled by the county’s 2004 population. As long as the “shift” part of the instrument is not driven by idiosyncratic local shocks, the instrument is uncorrelated with contemporaneous shocks. The variation in the foreign IV across the sample counties is presented in Figure 2B.2.

18 I use 2001 as the base year because the US government imposed restrictive immigration policies in the immediate aftermath of 9/11 due to security concerns, which could have affected the natural distribution of foreign students across locations in the US in the couple of years following 2001.

19 This is similar to the one used in Altonji and Card (1991).

I again use a shift-share instrument to address the potential endogeneity issue of correlation between domestic student enrollment and unobserved contemporaneous shocks. The instrument uses the same idea of calculating the predicted change in enrollment, which in this case is that of domestic students. For this, I use the information on the total number of first-time degree-seeking domestic students in an institution by the state of residence.20 The instrument I construct for the change in the number of domestic students enrolled in a county is the weighted average of the change in the number of domestic students by state of residence, with weights being the county-specific domestic student enrollment share in those resident states in the year 2004. Specifically, I construct the domestic IV using the following equation:

\[
\Delta \mathit{domestic}^{IV}_c = \frac{1}{\mathit{Pop}_{c,2004}} \cdot \sum_{s \in S} \frac{\mathit{Domestic}_{c,s,2004}}{\mathit{Domestic}_{s,2004}} \cdot \left( \mathit{Domestic}_{s,2016} - \mathit{Domestic}_{s,2004} \right) \tag{2.3}
\]

In equation (2.3), \(\mathit{Domestic}_{s,t}\) denotes the total first-time degree-seeking domestic students coming from a resident state \(s\) in the year \(t\). \(S\) is the set of all states in the US. The second term is the share of first-time degree-seeking domestic students from the resident state \(s\) in county \(c\) in the year 2004. The third term is the total change in the number of first-time degree-seeking domestic students from the resident state \(s\) between 2004 and 2016. Finally, the summation of the product of the two terms over all the resident states \(s \in S\) is scaled by the baseline population of the county. Consider, for example, two counties where the total domestic enrollment is the same, but the share of domestic enrollment from different states is different. If the total number of post-secondary students from a state increases (decreases), the county with a higher share of students from that state receives more (fewer) domestic students from that resident state.

20 The state of residence information is only available for first-year freshmen students. Since most undergraduate programs are four years long, I multiply it by four to calculate the total number of students attending an institution from a particular state of residence. A couple of other factors could affect this ratio of domestic enrollment to domestic freshmen enrollment. First, first-year students dropping out of college would decrease this ratio. Second, considering the domestic graduate enrollment would increase this ratio. So, on average, it is reasonable to argue that the domestic enrollment would be approximately four times the domestic freshmen enrollment.
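To make the “share” and “shift” components of the two instruments concrete, the sketch below evaluates equations (2.2) and (2.3) for one hypothetical county. The function names, state labels, and all county-level and state-level numbers are assumptions made for illustration, including the 2001 national total; only the approximate national foreign enrollment totals of 565,000 and 950,000 come from the text. The sketch also omits the scaling of freshmen counts by four described in footnote 20.

```python
def foreign_iv(foreign_c_2001: float, foreign_us_2001: float,
               foreign_us_2004: float, foreign_us_2016: float,
               pop_c_2004: float) -> float:
    """Equation (2.2): 2001 county share times national shift, per 2004 capita."""
    share = foreign_c_2001 / foreign_us_2001      # "share" part
    shift = foreign_us_2016 - foreign_us_2004     # "shift" part
    return share * shift / pop_c_2004


def domestic_iv(domestic_c_by_state_2004: dict[str, float],
                domestic_by_state_2004: dict[str, float],
                domestic_by_state_2016: dict[str, float],
                pop_c_2004: float) -> float:
    """Equation (2.3): sum over resident states of share times shift."""
    predicted = 0.0
    for state, enrolled_in_county in domestic_c_by_state_2004.items():
        share = enrolled_in_county / domestic_by_state_2004[state]
        shift = domestic_by_state_2016[state] - domestic_by_state_2004[state]
        predicted += share * shift
    return predicted / pop_c_2004


# Hypothetical county hosting 0.2% of US foreign students in 2001.
f_iv = foreign_iv(foreign_c_2001=1_000, foreign_us_2001=500_000,
                  foreign_us_2004=565_000, foreign_us_2016=950_000,
                  pop_c_2004=200_000)

# Hypothetical county drawing students from two resident states,
# one with growing and one with shrinking enrollment.
d_iv = domestic_iv({"A": 3_000, "B": 500},
                   {"A": 60_000, "B": 80_000},
                   {"A": 66_000, "B": 76_000},
                   pop_c_2004=200_000)
print(f_iv, d_iv)
```

In this example the foreign instrument predicts roughly 770 additional foreign students, and the domestic instrument predicts 300 additional students from state A and 25 fewer from state B, each expressed per 2004 resident.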
The variation in the domestic IV across the sample counties is presented in Figure 2B.3.

A potential concern with the domestic IV is that it could be correlated with state-level contemporaneous shocks. Since a large share of domestic students attend a post-secondary institution within their state of residence, the domestic IV could be reflecting the overall state economy. However, including state fixed effects in the main specification addresses this concern.

Note that the “shift” part of the foreign IV is the same for all counties. The variation comes from the “share” part of the instrument, which might be correlated with the secular trend of the county. For instance, an economic downturn in a county in the 1990s could lead to both a large share of foreign student enrollment in the base year and lower economic growth between 2004 and 2016. So, it is imperative to control for the secular trend of the county, without which the instruments could violate the exclusion restriction. A similar argument goes for the domestic IV as well. As mentioned previously, I partial out the secular trend by controlling for the pre-period growth rate of the outcome variable, but there could still be concerns about whether this term adequately captures the secular trend. So, as a robustness exercise, I control for a long list of non-linear functions of controls to capture the secular trend more flexibly, and the results are similar. In an additional exercise recently suggested by Goldsmith-Pinkham et al. (2020), I show that the instruments are unlikely to be correlated with the secular trend.

2.4 Data

Annual institution-level enrollment numbers of domestic and foreign students and institutional characteristics, including county address, are available from the Integrated Postsecondary Education Data System (IPEDS). The IPEDS universe includes institutions of all levels, sectors, and degree-granting and non-degree-granting status.
IPEDS also includes institution-level state of residence data for first-time degree-seeking first-year students (this includes students who enrolled in the fall term and the last summer term), collected in even-numbered years. It gathers information for every institution participating in the federal student financial aid program (henceforth, Title IV institution). The Higher Education Act of 1965 requires all Title IV institutions to report to IPEDS annually.21 For the analysis, I consider those institutions that were Title IV eligible in at least one of the years from 1996 to 2017.

I use the Fall Enrollment component22 of IPEDS to calculate the annual enrollment in an institution among degree/certificate-seeking students. Non-degree/certificate-seeking students are more likely to be enrolled in an online or distance program and not directly influence the county’s local economy. However, these students might affect the local economy indirectly as they are paying tuition to the institution. Next, I aggregate the institution-level annual enrollment to get county-level annual enrollment. For some institutions, the county address was entered manually,23 particularly for those that operated only between 2000 and 2008, as IPEDS does not provide county information from 2000 to 2008.

21 A non-Title IV institution must request to be part of IPEDS, but IPEDS does not identify what percentage of those institutions are represented in its universe. Aggregate annual enrollment in non-Title IV institutions accounts for less than 0.05% of aggregate annual enrollment in all institutions in the IPEDS universe in 2004 and 2016.

22 It collects data on the number of foreign and domestic students enrolled in an institution in the fall.

23 The county addresses come from the official websites of the institutions.
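The aggregation from institutions to counties described above can be sketched as follows. This is only an illustration under assumed inputs: the record layout and the keys `county`, `year`, `foreign`, and `domestic` are hypothetical and do not correspond to actual IPEDS variable names.

```python
from collections import defaultdict


def county_enrollment(institution_rows: list[dict]) -> dict:
    """Sum institution-level enrollment to (county, year) totals.

    Each row is assumed to be a dict with hypothetical keys
    'county', 'year', 'foreign', and 'domestic'.
    """
    totals = defaultdict(lambda: {"foreign": 0, "domestic": 0})
    for row in institution_rows:
        key = (row["county"], row["year"])
        totals[key]["foreign"] += row["foreign"]
        totals[key]["domestic"] += row["domestic"]
    return dict(totals)


# Two hypothetical institutions in county "C1" and one in county "C2".
rows = [
    {"county": "C1", "year": 2004, "foreign": 100, "domestic": 1_500},
    {"county": "C1", "year": 2004, "foreign": 50, "domestic": 400},
    {"county": "C2", "year": 2004, "foreign": 10, "domestic": 900},
]
by_county = county_enrollment(rows)
print(by_county[("C1", 2004)])
```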
Annual county-level population, employment, and earnings by industry come from the Regional Economic Accounts (REA) available on the Bureau of Economic Analysis (BEA) website. The County Business Patterns (CBP) series provides annual county-level business establishment numbers. Data on housing units and rents come from county-level tabulations of Census and American Community Survey (ACS) data on the National Historical Geographic Information System (NHGIS). I use the county-level tabulations of 5% ACS 2009 as a proxy for the year 2004, as no dataset provides data on these variables for all the counties for the year 2004. I construct a county density variable using the area information from county shapefiles available from NHGIS. Lastly, the county adjacency files come from the NBER public use data archive.

There are 1534 counties with at least one Title IV institution and 1591 counties without any Title IV institution. I restrict the sample to counties with a high student-to-population ratio in 2004, where it is more likely that shocks to student composition would create substantial demand shocks and subsequent adjustments of the local economy. I set the student-to-population ratio threshold to be 5% in 2004, leaving a sample of 655 counties; these counties hosted over 80% of foreign students in the US in 2004. Further, I remove three counties because of missing values in one or more variables, leaving a final sample of 652 counties (Figure 2B.1). Table 2B.1 presents the summary statistics of the variables for the sample counties. Appendix 2C provides more details on the construction of key variables.

2.5 Empirical Results

2.5.1 Effects of Foreign and Domestic Enrollment on Employment

I first look at the first-stage results of the 2SLS estimator for both foreign and domestic student enrollment, with employment as an outcome, in Table 2A.1. Column 1 reports results for foreign student enrollment, and column 2 reports results for domestic student enrollment.
The coefficient for the foreign IV in column 1 is 0.963, and the coefficient for the domestic IV in column 2 is 1.396, which means the instruments quite accurately predict the actual change in student enrollment between 2004 and 2016. Next, the positive and significant coefficient for the foreign IV in the second column suggests a cross-subsidization of domestic enrollment fees by high tuition payments from foreign students, leading to an increase in domestic enrollment (Shih, 2017). Moreover, the positive correlation implies the need to control for domestic student enrollment, without which the foreign IV will violate the exclusion restriction.24 The last row reports the Angrist-Pischke (AP) first-stage F-statistics of 53.10 and 17.59 for foreign and domestic student enrollment, respectively, which suggest strong predictive power of the instruments.

Table 2A.2 reports the estimation results from various versions of equation 2.1 using OLS and 2SLS estimators with employment as an outcome. The coefficients can be interpreted as a local job multiplier, which is the increase in the number of jobs due to an additional student enrolled. Column 1 is the OLS estimation using just foreign student enrollment, and the estimated coefficient is 1.280, which is statistically significant at the 1% level. Column 2 controls for domestic student enrollment, and the coefficient drops to 0.825. This is expected as domestic student enrollment is likely to be positively correlated with both employment and foreign student enrollment. Column 3 further controls for the secular trend and state fixed effects, and the coefficient increases to 1.081. Columns 4 to 7 report the estimation results using the 2SLS estimation method. The AP first-stage F-statistics are reported in the last two rows of the table, depending on the version of specification 2.1 used in that column.
Column 4 instruments for foreign student enrollment but does not control for domestic student enrollment, the secular trend, or state fixed effects. Column 5 adds domestic student enrollment as a control, and column 6 instruments for both enrollment variables. The estimated coefficient in column 6 is 2.748, which is statistically significant at the 1% level. Column 7 further controls for the secular trend and the state fixed effects, and the estimate is 2.725. The estimates in columns 6 and 7 are very similar, somewhat addressing the concern that the “share” part of the foreign IV might be correlated with the secular trend. Moving from column 1 to column 7, the movement in the estimated coefficient for foreign student enrollment shows how the estimates could be biased if the endogeneity issues are not addressed.

24 Further, it is essential to instrument for domestic student enrollment, without which there may be an induced “spillover bias” on the estimate of the coefficient for foreign student enrollment.

From column 7, which is the preferred specification, the local job multiplier of foreign enrollment over the 12 years is 2.73, and the estimate is significant at the 1% level. At the same time, a net increase of one domestic student enrolled in a county created 0.24 jobs in the same county, although the estimate is not significant at any conventional level. Given that the average initial employment-to-population ratio is 0.574 and the average increase in the foreign student enrollment-to-population ratio in the sample is 0.26 percentage points, employment in the sample counties increased by 1.24% due to the foreign student boom over the 12 years.25

Comparing the estimates with other local job multiplier estimates in the literature suggests that the effect of foreign student enrollment is sizable.
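The 1.24% employment figure reported above follows from the column 7 multiplier and the two sample averages quoted in the text. The short sketch below reproduces that arithmetic; the function and argument names are hypothetical and used only for illustration.

```python
def percent_change_in_outcome(multiplier: float,
                              delta_enrollment_ratio: float,
                              initial_outcome_ratio: float) -> float:
    """Percentage change in a per-capita-scaled outcome.

    multiplier: estimated coefficient (units of outcome per student).
    delta_enrollment_ratio: change in enrollment per 2004 resident.
    initial_outcome_ratio: 2004 outcome per 2004 resident.
    """
    return 100.0 * multiplier * delta_enrollment_ratio / initial_outcome_ratio


# Values quoted in the text: multiplier 2.73, an increase of 0.26
# percentage points (0.0026) in the foreign student-to-population ratio,
# and an initial employment-to-population ratio of 0.574.
employment_effect = percent_change_in_outcome(2.73, 0.0026, 0.574)
print(round(employment_effect, 2))  # prints 1.24
```

The same function applies to the other non-logarithmic outcomes once the corresponding coefficient and initial per-capita ratio are substituted.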
Moretti (2010) finds that an additional job in the tradable sector26 in a given city creates 1.6 jobs in the nontradable sector in the same city over a decade, whereas an additional skilled job in the tradable sector generates 2.5 jobs in the nontradable sector. The effect is significantly larger for skilled jobs because they command higher earnings, leading to stronger local demand shocks. The estimate associated with foreign student enrollment is similar to the one for the skilled job in the tradable sector, as foreign students are also likely to induce strong local demand shocks due to their strong financial background.

25 \(\Delta \mathit{employment} = \frac{\mathit{employment}_{2016} - \mathit{employment}_{2004}}{\mathit{population}_{2004}} = \frac{\mathit{employment}_{2016} - \mathit{employment}_{2004}}{\mathit{employment}_{2004}} \times \frac{\mathit{employment}_{2004}}{\mathit{population}_{2004}}\). So, the percentage change in employment due to foreign student enrollment expansions is \(\beta \, \Delta \mathit{foreign} \times \frac{\mathit{population}_{2004}}{\mathit{employment}_{2004}} \times 100 = \frac{2.73 \times 0.0026}{0.574} \times 100 = 1.24\). Similarly, I calculate the percentage change in other non-logarithmic local outcomes.

26 The tradable sector includes industries whose products could be primarily traded nationally or internationally, whereas the nontradable sector includes industries whose products are primarily traded locally.

2.5.2 Effects of Foreign and Domestic Enrollment on Other Outcomes

Table 2A.3 reports estimates for other outcomes in local labor markets and local businesses. All columns in this table present results for the specification in Table 2A.2, column 7. Column 1 repeats the result for employment. Next, I look at the effect on employment in tradable and nontradable sectors (Black et al., 2005; Zou, 2018).27 Columns 2 and 3 report that a net increase of one foreign student enrolled did not affect employment in the tradable sector but created 2.3 jobs in the nontradable sector. At the same time, a net increase of one domestic student enrolled created 0.08 jobs in the tradable sector and did not affect the nontradable sector. Consistent with the literature, the effects are primarily concentrated in the nontradable sector. As one would expect, the production of goods and services sold locally is likely to be impacted more.

27 Following Black et al. (2005), the tradable sector here includes manufacturing. The nontradable sector includes all private nonfarm employment sectors excluding manufacturing, mining, forestry, fishing, and related activities.

Column 4 reports the effect on the log demographic-adjusted average wage28 in the county. The adjusted wage increased by 3.32% for a percentage point increase in the foreign student enrollment-to-population ratio. Column 5 reports that a net increase of 13 foreign students led to an increase of one business establishment.29 There was no impact of the change in domestic student enrollment on wages or business establishments. As expected, the effect of foreign students on local labor market outcomes and business establishments is much stronger than that of domestic students, as foreign students are likely to induce stronger demand shocks. Moreover, the effects differ from other immigration contexts, possibly due to foreign students’ restricted access to work, which reduces possible supply-side effects.

Table 2A.4 reports effects on county population and outcomes in the housing market. Column 1 shows that with a net increase of one foreign student enrolled, the non-student population in the county increased by 3.17, which is significant at the 1% level. A sizable increase in the non-student population is consistent with the large positive effect on employment, as the creation of new job opportunities may have attracted more workers and their dependents to the host counties. If the dependents of new workers are accounted for among the migrating population, the difference between the number of new jobs and new workers is large, suggesting increased employment among the residents.
Further, an increase in the employed population may partially offset the wage increase; still, there was a substantial wage increase. Thus, the results strongly suggest that the resident and newly migrated workers benefited in the labor market due to an increase in foreign student enrollment.

28 Appendix 2C provides details on the construction of this variable.

29 Given that the initial average ratio of business establishments-to-population is 0.025, the number of business establishments expanded by 0.8% due to foreign student enrollment expansions.

At the same time, an increase in domestic student enrollment led to a decrease in the non-student population. One potential explanation is that the increase in domestic student enrollment did not create more job opportunities but might have led to the development of amenities geared to the student demographic, which the resident population might not like, resulting in out-migration.

Column 2 reports that the total housing units30 increased by 1.1 with each additional foreign student enrolled, and the coefficient is statistically significant at the 1% level. The number of housing units increased by 0.6% due to foreign student enrollment expansions over 12 years. In column 3, the estimated coefficient shows that median gross housing rent31 increased by 0.6% with a percentage point increase in the foreign student enrollment-to-population ratio, but the effect is not statistically significant. The rapid increase in the supply of housing units may have eased the upward pressure on housing rent from the increasing student and non-student population. On average, wages increased more than housing rent, suggesting increased welfare for the natives.
Lastly, the coefficient associated with domestic student enrollment for housing market outcomes in Table 2A.4 is small and not statistically significant at any conventional level, suggesting no effect on the housing market due to the change in domestic student enrollment over the 12 years.

30 A housing unit is a house, an apartment, a mobile home, a group of rooms, or a single room that is occupied (or if vacant, is intended for occupancy) as separate living quarters.

31 Gross housing rent is the monthly contract rent plus the estimated average monthly cost of utilities and fuels.

I find sizable effects of the local demand shocks created by the increase in foreign student enrollment on the level of local economic activities. The results suggest potential welfare gains for native workers of the county, as employment opportunities and wages improved, but there was no significant effect on housing rent. In theory, the movement of firms and workers into a particular geographical area puts upward pressure on rent. And if the housing supply is inelastic, it leads to welfare gains being capitalized in land rents that would otherwise accrue to resident workers. However, I find no significant effect on housing rent. A potential reason could be the rapid increase in the housing supply. During the same time, the change in domestic enrollment had little to no effect on the levels of local economic outcomes over the 12 years.

2.5.3 Effects of Foreign and Domestic Enrollment on Local Outcomes using Split Long Difference

While there was a net increase in domestic student enrollment between 2004 and 2016, the long difference masks the substantial increase in domestic enrollment between 2004 and 2010 and the equally rapid decline between 2010 and 2016 (Figure 2A.1c). This may lead to conflated effects in the 12-year long difference estimation.
In this subsection, I address this concern by splitting the one long period (2004-16) into two periods and using them to estimate the effect on the local outcomes. The two split periods are 2004-10 (henceforth, first period) and 2010-16 (henceforth, second period). Specifically, I estimate the following equation:

$$\Delta y^k_{ct} = \alpha^k + \beta^k_1 \, \Delta \mathit{foreign}_{ct} + \beta^k_2 \, \Delta \mathit{domestic}_{ct} + X_c \cdot \Theta^k + \lambda_s + \tau_t + \Delta \epsilon^k_{ct} \tag{2.4}$$

This equation is a modified version of equation 2.1 in which I add the subscript $t$ to the outcome and the enrollment variables to denote the two time periods. Here the unit of observation is a county-period pair, denoted by the subscript $ct$. $\Delta y^k_{ct}$ denotes either $y^k_{c,2010} - y^k_{c,2004}$ or $y^k_{c,2016} - y^k_{c,2010}$, depending on the time period $t$, scaled by the county's 2004 population, where $y^k$ is a local outcome of county $c$. The outcomes include employment, business establishments, and non-student population. Housing market variables are not included in this analysis because the data are not available for the two split periods. $\Delta \mathit{foreign}_{ct} = (\mathit{Foreign}_{c,t_2} - \mathit{Foreign}_{c,t_1}) / \mathit{Pop}_{c,2004}$ is the change in the number of foreign students in county $c$ scaled by the county's 2004 population, where $t_2 = 2010$, $t_1 = 2004$ for the first period and $t_2 = 2016$, $t_1 = 2010$ for the second period. The domestic student enrollment variable is constructed analogously. I also introduce the time period dummy $\tau_t$, which takes the value 0 for the first period and 1 for the second, to absorb the time period effect. As before, $X_c \cdot \Theta^k$ controls for the secular trend, and $\lambda_s$ denotes state fixed effects. $\Delta \epsilon^k_{ct}$ is the error term.

The instruments are modified accordingly as well. The “share” part of the foreign and domestic IVs is the same as before for both periods, but the “shift” part depends on the time period.
Specifically, the modified foreign and domestic IVs are as follows:

$$\Delta \mathit{foreign}^{IV}_{ct} = \frac{1}{\mathit{Pop}_{c,2004}} \cdot \frac{\mathit{Foreign}_{c,2001}}{\mathit{Foreign}_{US,2001}} \cdot \left(\mathit{Foreign}_{US,t_2} - \mathit{Foreign}_{US,t_1}\right) \tag{2.5}$$

$$\Delta \mathit{domestic}^{IV}_{ct} = \frac{1}{\mathit{Pop}_{c,2004}} \cdot \sum_{s \in S} \frac{\mathit{Domestic}_{c,s,2004}}{\mathit{Domestic}_{s,2004}} \cdot \left(\mathit{Domestic}_{s,t_2} - \mathit{Domestic}_{s,t_1}\right) \tag{2.6}$$

where, as before, $t_2 = 2010$, $t_1 = 2004$ for the first period and $t_2 = 2016$, $t_1 = 2010$ for the second period.

The 2SLS estimates using specification (2.4) are reported in Table 2A.5. The standard errors are clustered at the county level. The last two rows of the table, which show the AP F-statistics, indicate strong first-stage relevance. The point estimate of the local job multiplier of foreign student enrollment is slightly higher than the earlier estimate, but the two estimates lie within one standard error of each other. Note that the results in this analysis capture the adjustment of the local economy over 6-year periods. The effects on employment, business establishments, and non-student population tell a similar story to the main results. Overall, they address the concerns associated with a sharp change in the domestic student enrollment trend.

2.5.4 Heterogeneity with Population Density

The results so far suggest that an increase in foreign student enrollment leads to potential welfare gains, on average, for the natives. However, the adjustment of the local economy to shocks may depend on various local characteristics, and it is interesting to explore the extent to which there are heterogeneous effects. So, to further unfold these adjustments, I investigate the heterogeneous effects across the area’s population density. Densely populated areas are likely to have better urban amenities, lower transportation costs of goods between different stages of production, or other agglomeration benefits, which could contribute to a stronger effect on the local labor market (Ciccone and Hall, 1993).
For instance, better road facilities improve accessibility to businesses, leading to a stronger demand shock. At the same time, these areas are likely to have congestion effects or other agglomeration costs. For instance, the housing market could be tight due to lower vacancy rates, or the housing supply could be inelastic due to scarcity of land, which could put upward pressure on rent when firms and workers move into the area to arbitrage the benefits of local demand shocks. The eventual effects of the same population shock in different local economies could vary substantially depending on these local factors.

To study the heterogeneous effects, I include an interaction term of the foreign student explanatory variable and the population density of the county in the main equation (2.1). In particular, I estimate the following equation:

$$\Delta y^k_c = \alpha^k + \beta^k_1 \, \Delta \mathit{foreign}_c + \beta^k_2 \, \Delta \mathit{domestic}_c + \beta^k_3 \left(\Delta \mathit{foreign}_c \times D_c\right) + \gamma^k D_c + X_c \cdot \Theta^k + \lambda_s + \Delta \epsilon^k_c \tag{2.7}$$

where $D_c$ is the demeaned log of population density of county $c$. All the other terms are the same as before. In addition to the earlier two instruments, I construct a third one, analogous to the interaction term, by interacting $\Delta \mathit{foreign}^{IV}_c$ and $D_c$.³²

Table 2A.6 reports the 2SLS estimates using specification (2.7). The AP F-statistics show that all endogenous variables have a strong first stage. As before, I cluster the standard errors at the county level. Column 1 in Table 2A.6 shows that the local job multiplier increases with population density. The estimated coefficient on the interaction term is 2.2, which is statistically significant at the 1% level. This means that with every 10% increase in population density (roughly 0.1 log points), the job multiplier increases by about 0.22. The effects on wages, business establishments, and the non-student population exhibit similar patterns, although the estimates are not statistically significant.
I do not find that the effects on housing units differ by the area’s population density; the estimated coefficient on the interaction term in column 5 is small and not significant at any conventional level. Given a larger positive effect on employment but no effect on housing units in more densely populated areas, it is not surprising that I find a large positive effect (statistically significant at the 1% level) on housing rent with increasing population density of the area. There could be a stronger positive effect on housing rent in the future because of possible housing supply saturation in more densely populated areas due to the relative scarcity of land. In contrast, sparsely populated areas could have more slack in the local housing market to absorb the increasing population without upward pressure on rent. Although the welfare impact on a resident worker would depend on the relative magnitude of the increases in wages and housing rent, the results provide suggestive evidence that, because of rising housing rent, the welfare benefits might be smaller in more densely populated areas than in sparsely populated areas.

³² There is no correlation between foreign student enrollment and population density of the county in the sample.

2.6 Robustness

2.6.1 Alternative Specifications

Several alternative specifications confirm the tenor of the results in previous sections and, in the interest of space, are presented in Appendix 2B. First, I include the quadratic and cubic terms of the pre-period growth rate of all the local economic outcomes. While the AP F-statistic is low for domestic student enrollment after the inclusion of a long list of controls, the results are consistent with the main specification (Table 2B.2). Second, I expand the sample by sequentially including counties with a lower student-to-population ratio in the base year, as a check on how I define a “high” student-to-population ratio and restrict the sample.
For the main sample, the ratio threshold was set at 5%. Table 2B.3 reports the results when I estimate the main specification on samples of varying sizes. The results tell a similar story.

Finally, I look at the impact on the local outcomes of the neighboring counties without post-secondary institutions. As workers and firms are mobile, the demand shocks could affect the local outcomes of the neighboring counties, so without looking at them, the true overall effects of the foreign student enrollment boom might be misrepresented. In particular, one would be interested to know whether the positive impact in counties that host post-secondary institutions comes at the expense of a negative impact on the neighboring counties that do not. For this analysis, I use a sample of the counties without post-secondary institutions that neighbor a county with post-secondary institutions (henceforth, neighboring counties). I use the 12-year long difference in the local outcome of the neighboring county as the outcome variable, where all non-logarithmic outcomes are scaled by the neighboring county’s 2004 population. The two main explanatory variables for each neighboring county are the 12-year enrollment changes in domestic and foreign students summed over all the adjacent counties with post-secondary institutions. Further, I control for the secular trend and the state fixed effects. I construct the instruments as before, as the predicted change in enrollment in a county; however, they are now summed over all the adjacent counties with post-secondary institutions for each neighboring county. I further scale the enrollment variables and the instruments by the neighboring county’s 2004 population. I find that the increase in foreign student enrollment has no effect on the local outcomes of the neighboring counties except for a very small positive effect on wages (Table 2A.7).
The results address the concerns related to adverse spillover effects on the neighboring counties.

2.6.2 Plausibility of Identifying Assumptions

The first identifying assumption is that the instrument is not correlated with the unobserved part of the secular trend (exclusion restriction). As mentioned previously, the “shift” part of the foreign IV is the national-level change in foreign enrollment over the years; the variation comes from the “share” part of the instrument, which could be correlated with the unobserved part of the secular trend. In other words, the initial share of foreign students in a county could be correlated with unobserved county-specific factors that affect the outcome variable. A similar argument applies to the domestic IV as well. The robustness of the results to the long list of controls in the previous subsection somewhat addresses this concern. In addition, I conduct a standard test suggested by Goldsmith-Pinkham et al. (2020) to look at how balanced the instruments are across observable potential confounders, which is informative about the likely importance of unobservable confounders. So, I regress the foreign IV and the domestic IV on the list of covariates used in the earlier regressions and report the results in Table 2A.8. I use the logarithmic transformation of the non-logarithmic variables so that the coefficient interpretation is straightforward.³³ In columns 1 and 3 of Table 2A.8, the instrument is regressed on all the pre-period growth rates of the outcome variables. I find that the $R^2$ is very low in both regressions; the covariates explain only 3% and 7% of the variation in the foreign IV and the domestic IV, respectively. Even after adding the quadratic and cubic terms of the covariates in columns 2 and 4, the $R^2$ increases only to 9% and 13%, respectively. As a point of reference, this is low compared to the $R^2$ of 43% in the canonical model in Goldsmith-Pinkham et al. (2020).
Moreover, the magnitude of all the coefficients, including the statistically significant ones, is very small. This test, along with the robustness of the estimates to a long list of controls, suggests that the instruments are unlikely to be correlated with the unobserved part of the secular trend, and it is reasonable to assume that the instruments satisfy the exclusion restrictions.

³³ Because the non-logarithmic variables can take a minimum value of -1, I add 1.1 to all of these variables before taking the logarithm.

Finally, the second identifying assumption is that the instrument is not correlated with unobserved contemporaneous factors (exclusion restriction). By construction, the shift-share IV should not be correlated with the contemporaneous factors. However, one concern in the literature is that if a local economy is particularly big in a particular “industry” (foreign enrollment in this case), the national shock could be correlated with the local shock. In other words, the national-level shocks and the main effects could be driven by only a few influential counties, which might violate the exclusion restriction. To check this, I remove the counties with the highest absolute foreign student enrollment in 2004 and re-estimate the main specification. In particular, I remove counties in the top percentile of total foreign student enrollment in 2004. The results tell a similar story (Table 2B.4).

2.7 Conclusion

This paper looks at the local economic impacts of the rapid increase in foreign student enrollment in the US between 2004 and 2016. Focusing on counties with post-secondary institutions where students were a large share of the county population, I look at several outcomes and provide a broad picture of the effects on these local economies. On average, expansions in foreign student enrollment led to a substantial increase in local employment, business establishments, and wages.
A potential reason why the labor market effects differ from those in other immigration contexts is that foreign students are notably different: they have a strong financial background and cannot work on a student visa until they finish their education. In the housing market, the housing supply increased rapidly, and there was no significant effect on gross housing rent. Overall, the results suggest potential welfare gains for native workers. Further, I find that native workers may benefit more in sparsely populated counties in the long run than in densely populated counties, where housing rent could rise steeply, shifting welfare gains from native workers to landlords. Finally, while foreign students have a sizable marginal effect, domestic students have little to no marginal effect on the local economy over the 12 years.

An influx of foreign students creates local demand shocks similar to various place-based policies that are usually implemented in underperforming locations to reduce economic disparity. Many argue that place-based policies are inefficient and that they simply reallocate economic activity across locations. Often, the equity argument is made in support of these policies. Whether the policy leads to welfare gains for the intended recipients is largely an empirical question. In this paper, I find potential welfare gains for natives due to foreign student enrollment expansions in the host counties. At the same time, there is no evidence of adverse effects on the neighboring counties without post-secondary institutions. Further, unlike place-based policies, which are usually funded by diverting resources from other regions and might not be cost-effective, the local demand shocks created by foreign students are funded primarily by money from abroad.
While informing about the overall effects of foreign student enrollment on the local economy, the results in this paper highlight the potential advantages of policies that promote foreign student enrollment: they can lead to economic growth in targeted locations. In the long run, they might be especially beneficial for less densely populated locations that depend heavily on the education sector and lack growth opportunities in other sectors.

BIBLIOGRAPHY

Abramitzky, R. and Boustan, L. (2017). Immigration in American Economic History. Journal of Economic Literature, 55(4):1311–1345.

Altonji, J. G. and Card, D. (1991). The Effects of Immigration on the Labor Market Outcomes of Less-Skilled Natives. In Abowd, J. M. and Freeman, R. B., editors, Immigration, Trade, and the Labor Market, pages 201–234. University of Chicago Press, Chicago and London.

Amuedo-Dorantes, C., Furtado, D., and Xu, H. (2019). OPT Policy Changes and Foreign Born STEM Talent in the US. Labour Economics, 61.

Anelli, M., Shih, K., and Williams, K. (2020). Foreign Students in College and STEM Graduates.

Bartik, T. J. (2020). Using Place-Based Jobs Policies to Help Distressed Communities. The Journal of Economic Perspectives, 34(3):99–127.

Beine, M., Noel, R., and Ragot, L. (2014). Determinants of the International Mobility of Students. Economics of Education Review, 41:40–54.

Black, D., McKinnish, T., and Sanders, S. (2005). The Economic Impact of the Coal Boom and Bust. Economic Journal, 115(503):449–76.

Borjas, G. J. (2007). Do Foreign Students Crowd Out Native Students from Graduate Programs? In Ehrenberg, R. G. and Stephan, P. E., editors, Science and the University.

Bound, J., Braga, B., Khanna, G., and Turner, S. (2020). A Passage to America: University Funding and International Students. American Economic Journal: Economic Policy, 12(1):97–126.

Busso, M., Gregory, J., and Kline, P. (2013). Assessing the Incidence and Efficiency of a Prominent Place Based Policy. American Economic Review, 103(2):897–947.

Card, D. (1990). The Impact of the Mariel Boatlift on the Miami Labor Market. Industrial and Labor Relations Review, 43(2):245–257.

Card, D. (2001). Immigrant Inflows, Native Outflows, and the Local Labor Market Impacts of Higher Immigration. Journal of Labor Economics, 19(1):22–64.

Chaurey, R. (2017). Location-Based Tax Incentives: Evidence from India. Journal of Public Economics, 156:101–120.

Chirakijja, J. (2022). The Local Economic Impacts of Prisons. Working Paper.

Chodorow-Reich, G., Feiveson, L., Liscow, Z., and Woolston, W. G. (2012). Does State Fiscal Relief During Recessions Increase Employment? Evidence From the American Recovery and Reinvestment Act. American Economic Journal: Economic Policy, 4(3):118–45.

Ciccone, A. and Hall, R. E. (1993). Productivity and the Density of Economic Activity. NBER Working Paper 4313.

Dang, T. (2022). The Local Economic Impact of International Students: Evidence from US Commuting Zones. Working Paper.

Demirci, M. (2020). International Students and Labour Market Outcomes of US-born Workers. Canadian Journal of Economics, 53(4):1495–1522.

Doran, K., Gelber, A., and Isen, A. (2014). The Effects of High-Skilled Immigration Policy on Firms: Evidence from H-1B Visa Lotteries. NBER Working Paper 20668.

Dustmann, C., Schönberg, U., and Stuhler, J. (2017). Labor Supply Shocks, Native Wages, and the Adjustment of Local Employment. The Quarterly Journal of Economics, 132(1):435–483.

Glaeser, E. L. and Gottlieb, J. D. (2008). The Economics of Place-Making Policies. Brookings Papers on Economic Activity, 39(1):155–253.

Goldsmith-Pinkham, P., Sorkin, I., and Swift, H. (2020). Bartik Instruments: What, When, Why, and How. American Economic Review, 110(8):2586–2624.

Kerr, S. P. and Kerr, W. R. (2011). Economic Impacts of Immigration: A Survey. NBER Working Paper 16736.

Kline, P. and Moretti, E. (2014a). Local Economic Development, Agglomeration Economies, and the Big Push: 100 Years of Evidence from the Tennessee Valley Authority. The Quarterly Journal of Economics, 129(1):275–331.

Kline, P. and Moretti, E. (2014b). People, Places, and Public Policy: Some Simple Welfare Economics of Local Economic Development Programs. Annual Review of Economics, 6:629–662.

Mocanu, T. and Tremacoldi-Rossi, P. (2019). International Student Migration and Local Housing Markets. Mimeo.

Moretti, E. (2010). Local Multipliers. American Economic Review: Papers and Proceedings, 100(2):373–77.

NAFSA (2020). Losing Talent 2020: An Economic and Foreign Policy Risk America Can’t Ignore. Policy Resources.

Neumark, D. and Kolko, J. (2010). Do Enterprise Zones Create Jobs? Evidence from California’s Enterprise Zone Program. Journal of Urban Economics, 68(1):1–19.

Neumark, D. and Simpson, H. (2015). Chapter 18 - Place-Based Policies. In Duranton, G., Henderson, J. V., and Strange, W. C., editors, Handbook of Regional and Urban Economics, volume 5, pages 1197–1287. Elsevier.

Ottaviano, G. I. P. and Peri, G. (2012). Rethinking the Effect of Immigration on Wages. Journal of the European Economic Association, 10(1):152–97.

Ruiz, N. G. and Budiman, A. (2017). New Foreign Enrollment at U.S. Colleges and Universities Doubled Since Great Recession. Pew Research Center.

Ruiz, N. G. and Budiman, A. (2018). Number of Foreign College Students Staying and Working in U.S. After Graduation Surges. Pew Research Center.

Shih, K. (2017). Do International Students Crowd-Out or Cross-Subsidize Americans in Higher Education. Journal of Public Economics, 156:170–84.

Story, L. (2012). As Companies Seek Tax Deals, Governments Pay High Price. The New York Times.

Thomas, K. (2011). The Effects of Immigration on the Labor Market Outcomes of Less-Skilled Natives. In Investment Incentives and the Global Competition for Capital. Palgrave Macmillan, London and New York.

UNESCO, Institute for Statistics (2017). Enrollment by Level of Education.

Zou, B. (2018). The Local Economic Impacts of Military Personnel. Journal of Labor Economics, 36(3):589–621.

APPENDIX 2A MAIN TABLES AND FIGURES

Figure 2A.1 Degree Enrollment in US Post-Secondary Institutions Over Time
Panels: (a) Foreign Student Enrollment; (b) Foreign Student by Population; (c) Domestic Student Enrollment; (d) Total Student Enrollment
Notes: The figures show student enrollment in degree programs in the US over the years, starting from 1996. Three vertical light green lines indicate the years 2001, 2004, and 2016 in all the panels. Only post-secondary institutions eligible for the federal financial aid program are included in calculating the enrollment numbers. Source: Author’s calculation using IPEDS and BEA data.

Figure 2A.2 Initial Foreign Student Share and Future Increase
Notes: This figure shows the fitted line of the regression of the future change in the foreign student enrollment-to-population ratio on the initial ratio at the county level. The regression is weighted by the initial population of the county. Each bubble is a county, and the size of the bubble is proportional to the initial population of the county. The slope of the fitted line is 0.57, and the robust standard error is 0.07. An outlier is dropped here, which does not affect the overall takeaway from the graph. Source: Author’s calculation using IPEDS and BEA data.

Table 2A.1 Employment: First Stage Regression for Both Endogenous Explanatory Variables

| | ∆ foreign (1) | ∆ domestic (2) |
|---|---|---|
| ∆ foreign IV | 0.963*** (0.160) | 2.460*** (0.862) |
| ∆ domestic IV | -0.004 (0.028) | 1.396*** (0.344) |
| Secular Trend | × | × |
| State Fixed Effects | × | × |
| N | 652 | 652 |
| AP Fstat | 53.10 | 17.59 |

Notes: This table reports the first stage results for employment as an outcome. Column 1 reports the results for ∆ foreign and column 2 reports the results for ∆ domestic.
In both columns, the endogenous explanatory variable is regressed on both excluded instruments, the secular trend controls, and the state fixed effects. The “AP Fstat” row reports the Angrist-Pischke first-stage F-statistic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.

Table 2A.2 Effect of Change in Student Enrollment on Employment
Dependent Variable: ∆ employment
(1) (2) (3) (4) (5) (6) (7) 1.280*** (0.343) 0.825** (0.327) 1.081*** (0.295) 4.168*** 3.859*** (1.094) (1.082) 2.748*** 2.725*** (0.961) (1.019) 0.221*** (0.075) 0.074* (0.045) 0.135* (0.075) 0.621*** (0.193) 0.237 (0.209) × × OLS 652 OLS 652 OLS 652 Foreign 2SLS 652 39.53 Foreign 2SLS 652 39.84 Both 2SLS 652 51.15 23.85 × × Both 2SLS 652 53.10 17.59
Rows: ∆ foreign; ∆ domestic; Secular Trend; State Fixed Effects; Instrument; Estimation Method; N; AP Fstat Foreign; AP Fstat Domestic
Notes: This table reports regression results for employment as an outcome using various versions of the empirical specification. The dependent variable is the change in the employment of the county between 2004 and 2016 scaled by the population of the county in 2004. The main explanatory variables are changes in enrollment between 2004 and 2016 scaled by the population of the county in 2004. The “Secular Trend” row denotes whether the secular trend control has been included. The secular trend control is the growth rate of the outcome between 1996 and 2001. The “State Fixed Effects” row denotes whether state fixed effects have been included. The “Instrument” row denotes which instruments have been used: Foreign is for $\Delta \mathit{foreign}^{IV}$ and Both is for both $\Delta \mathit{foreign}^{IV}$ and $\Delta \mathit{domestic}^{IV}$. The “Estimation Method” row denotes whether OLS or 2SLS is used for estimation. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign.
The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.

Table 2A.3 Effect of Change in Student Enrollment on Local Labor Market and Local Business Outcomes

| | ∆ employment (1) | ∆ tradable employment (2) | ∆ nontradable employment (3) | ∆ log adjusted wage (4) | ∆ business establishment (5) |
|---|---|---|---|---|---|
| ∆ foreign | 2.725*** (0.961) | 0.043 (0.216) | 2.314** (0.908) | 3.319*** (1.137) | 0.077** (0.032) |
| ∆ domestic | 0.237 (0.209) | 0.079* (0.046) | -0.035 (0.203) | -0.122 (0.367) | 0.009 (0.008) |
| N | 652 | 652 | 652 | 652 | 652 |
| AP Fstat Foreign | 53.10 | 53.10 | 53.10 | 53.10 | 53.10 |
| AP Fstat Domestic | 17.59 | 17.59 | 17.59 | 17.59 | 17.59 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All the columns are estimated using the specification in column 7 of Table 2A.2. Outcome variables in columns 1, 2, 3, and 5 are scaled by the 2004 population. Wages are denominated in 2010 dollars. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.

Table 2A.4 Effect of Change in Student Enrollment on Population and Housing Market Outcomes

| | ∆ non-student population (1) | ∆ house units (2) | ∆ log median rent (3) |
|---|---|---|---|
| ∆ foreign | 3.170*** (1.082) | 1.136*** (0.345) | 0.618 (0.877) |
| ∆ domestic | -1.012*** (0.268) | 0.029 (0.099) | 0.034 (0.251) |
| N | 652 | 652 | 652 |
| AP Fstat Foreign | 53.10 | 53.10 | 53.10 |
| AP Fstat Domestic | 17.59 | 17.59 | 17.59 |

Notes: This table reports regression results for various outcomes.
The outcome variable is shown in the column head. All the columns are estimated using the specification in column 7 of Table 2A.2. Outcome variables in columns 1 and 2 are scaled by the 2004 population. Rents are denominated in 2010 dollars. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.

Table 2A.5 Effect of Change in Student Enrollment on County Outcomes Using Split Periods

| | ∆ employment (1) | ∆ business establishment (2) | ∆ non-student population (3) |
|---|---|---|---|
| ∆ foreign | 3.217*** (0.937) | 0.067** (0.027) | 2.110** (1.073) |
| ∆ domestic | 0.045 (0.133) | 0.010** (0.005) | -0.463** (0.193) |
| Secular Trend | × | × | × |
| State Fixed Effects | × | × | × |
| Time Period Dummy | × | × | × |
| N | 1304 | 1304 | 1304 |
| AP Fstat Foreign | 46.65 | 46.65 | 46.65 |
| AP Fstat Domestic | 25.73 | 25.73 | 25.73 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All the columns are estimated using specification 2.4. All outcome variables are scaled by the 2004 population. The “Secular Trend” row denotes whether the secular trend control has been included. The “State Fixed Effects” row denotes whether state fixed effects have been included. The “Time Period Dummy” row denotes whether the time period dummy has been included. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.
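The stacked two-period data behind specification 2.4 can be illustrated with a minimal sketch. It assumes a hypothetical `County` record and toy enrollment numbers (none of these names or values come from the paper's data); it only shows how each county contributes one row per period, with the scaled enrollment change, the shift-share instrument of equation 2.5, and the period dummy.

```python
from dataclasses import dataclass

# (t1, t2) pairs: the first period is 2004-10, the second is 2010-16.
PERIODS = [(2004, 2010), (2010, 2016)]


@dataclass
class County:
    """Hypothetical county record; the field names are illustrative only."""
    name: str
    pop_2004: float
    foreign: dict  # year -> foreign enrollment in the county


def delta_foreign(county, t1, t2):
    """Change in county foreign enrollment from t1 to t2, scaled by 2004 population."""
    return (county.foreign[t2] - county.foreign[t1]) / county.pop_2004


def foreign_iv(county, us_foreign, t1, t2):
    """Shift-share instrument: 2001 county share times the national change."""
    share = county.foreign[2001] / us_foreign[2001]
    shift = us_foreign[t2] - us_foreign[t1]
    return share * shift / county.pop_2004


def stack_periods(counties, us_foreign):
    """Return one row per county and period, with the period dummy tau (0 or 1)."""
    rows = []
    for county in counties:
        for tau, (t1, t2) in enumerate(PERIODS):
            rows.append({
                "county": county.name,
                "tau": tau,
                "d_foreign": delta_foreign(county, t1, t2),
                "d_foreign_iv": foreign_iv(county, us_foreign, t1, t2),
            })
    return rows


if __name__ == "__main__":
    # Toy numbers, invented for illustration.
    toy = County("A", 100_000.0, {2001: 100, 2004: 120, 2010: 150, 2016: 300})
    us = {2001: 10_000, 2004: 11_000, 2010: 13_000, 2016: 20_000}
    for row in stack_periods([toy], us):
        print(row)
```

Stacking two rows per county is consistent with the sample size doubling from 652 in the long-difference tables to 1304 here. The domestic instrument of equation 2.6 would follow the same pattern with an additional sum over $s$.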
Table 2A.6 Heterogeneity With Population Density

| | ∆ employment (1) | ∆ log adjusted wage (2) | ∆ business establishment (3) | ∆ non-student population (4) | ∆ house units (5) | ∆ log median rent (6) |
|---|---|---|---|---|---|---|
| ∆ foreign | 1.609** (0.797) | 3.337*** (1.133) | 0.048* (0.028) | 2.055** (0.874) | 1.052*** (0.328) | 0.272 (0.884) |
| ∆ domestic | 0.168 (0.203) | -0.149 (0.359) | 0.008 (0.008) | -1.023*** (0.210) | 0.026 (0.096) | -0.010 (0.247) |
| ∆ foreign × PD | 2.212*** (0.528) | 0.782 (0.730) | 0.030 (0.028) | 0.515 (0.741) | 0.112 (0.213) | 1.348*** (0.515) |
| N | 652 | 652 | 652 | 652 | 652 | 652 |
| AP Fstat Foreign | 63.81 | 63.81 | 63.81 | 63.81 | 63.81 | 63.81 |
| AP Fstat Domestic | 19.32 | 19.32 | 19.32 | 19.32 | 19.32 | 19.32 |
| AP Fstat Interaction | 82.30 | 82.30 | 82.30 | 82.30 | 82.30 | 82.30 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All the columns are estimated using specification 2.7. Outcome variables in columns 1, 3, 4, and 5 are scaled by the 2004 population. Wages and rents are denominated in 2010 dollars. “PD” is the demeaned log of the population density of the county ($D_c$ in equation 2.7). The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. The “AP Fstat Interaction” row reports the Angrist-Pischke first-stage F-statistic for the interaction term. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP data. *** p<0.01, ** p<0.05, * p<0.1.
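As a rough reading aid for column 1, the sketch below evaluates the implied job multiplier at a given demeaned log density and the change in the multiplier for a given percentage rise in density. The function names are illustrative; the two coefficients are the column 1 point estimates, and the calculation ignores their standard errors.

```python
import math

BETA_FOREIGN = 1.609      # Table 2A.6, column 1: coefficient on ∆ foreign
BETA_INTERACTION = 2.212  # Table 2A.6, column 1: coefficient on ∆ foreign × PD


def job_multiplier(d):
    """Implied jobs per additional foreign student at demeaned log density d."""
    return BETA_FOREIGN + BETA_INTERACTION * d


def multiplier_change(pct_increase):
    """Change in the multiplier when density rises by pct_increase percent."""
    return BETA_INTERACTION * math.log(1 + pct_increase / 100)


if __name__ == "__main__":
    print(job_multiplier(0.0))      # multiplier at average log density
    print(multiplier_change(10.0))  # effect of a 10% denser county
```

Using the exact log change, a 10% rise in density raises the multiplier by about 0.21; approximating the log change by 0.10 gives 2.212 × 0.10 ≈ 0.22, the figure quoted in Section 2.5.4.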
Table 2A.7 Neighboring County Outcomes

| | ∆ employment (1) | ∆ log adjusted wage (2) | ∆ business establishment (3) | ∆ non-student population (4) | ∆ house units (5) | ∆ log median rent (6) |
|---|---|---|---|---|---|---|
| ∆ foreign | -0.046 (0.087) | 0.135* (0.070) | -0.000 (0.005) | -0.044 (0.068) | 0.036 (0.043) | -0.026 (0.129) |
| ∆ domestic | 0.036 (0.022) | 0.009 (0.015) | 0.002 (0.001) | 0.014 (0.012) | -0.003 (0.008) | 0.019 (0.021) |
| N | 1429 | 1429 | 1429 | 1429 | 1429 | 1429 |
| AP Fstat Foreign | 100.14 | 100.14 | 100.14 | 100.14 | 100.14 | 100.14 |
| AP Fstat Domestic | 13.03 | 13.03 | 13.03 | 13.03 | 13.03 | 13.03 |

Notes: This table reports the effects of foreign and domestic student enrollment on various outcomes of neighboring counties without institutions. The outcome variable is shown in the column head. The sample includes all counties without institutions that neighbor a county with an institution (neighboring counties). The dependent variable is the change in the outcome of the neighboring county between 2004 and 2016. Dependent variables in columns 1, 3, 4, and 5 are scaled by the 2004 population. Wages and rents are denominated in 2010 dollars. The two main explanatory variables (∆ foreign and ∆ domestic) are the 12-year enrollment changes of foreign and domestic students summed over all the adjacent counties with institutions. All the explanatory variables except wages and rents are further scaled by the population of the neighboring county in 2004. All regressions include the secular trend control, i.e., the growth rate of the outcome between 1996 and 2001. State fixed effects are also included in every regression. The estimation method used is 2SLS. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F-statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F-statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level are in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, CBP, and NBER Public Use Data.
*** p<0.01, ** p<0.05, * p<0.1.

Table 2A.8 Correlation Between Student Enrollment IV And Controls

Log(f(Employment Growth (1996-01))) Log(f(Nontradable Employment Growth(1996-01))) Log(f(Tradable Employment Growth(1996-01))) ∆ Log Wage(1996-01) Log(f(Business Establishment Growth(1996-01))) Log(f(Non Student Population Growth(1996-01))) Log(f(Houseunits Growth (1990-00))) ∆ Log Median Rent(1990-00) More Controls N R2
log(f(∆ foreign IV) log(f(∆ domestic IV) (1) (2) (3) (4)
0.009*** 0.015*** (0.003) (0.002) -0.001 (0.002) 0.000 (0.000) -0.001 (0.003) -0.001 (0.001) -0.007** (0.003) -0.008** (0.004) -0.002 (0.003) -0.002 (0.003) 0.001 (0.002) -0.000 (0.001) 652 0.03 -0.000 (0.003) -0.002 (0.003) 0.004 (0.004) -0.002 (0.002) × 652 0.09 -0.004 (0.016) 0.011 (0.011) -0.002 (0.003) 0.009 (0.014) -0.013 (0.014) -0.004 (0.021) 0.044*** (0.010) -0.010 (0.007) 652 0.07 -0.007 (0.019) 0.005 (0.013) -0.006 (0.005) 0.011 (0.018) -0.022 (0.015) 0.028 (0.017) -0.003 (0.023) -0.004 (0.010) × 652 0.13

Notes: This table reports results of regressions of the instrumental variables on the variables controlling for secular trend in the earlier regressions. Logarithmic transformation of the variables has been used for straightforward interpretation. Before applying the logarithmic transformation to non-logarithmic variables, I add 1.1 to the variables, which is denoted by function f in the table. Columns 2 and 4 include the quadratic and cubic terms of the controls as well, which is indicated in the “More Controls” row. N denotes the number of observations. Robust standard errors clustered at the county level in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP Data. *** p<0.01, ** p<0.05, * p<0.1.
APPENDIX 2B ADDITIONAL TABLES AND FIGURES

Table 2B.1 Summary Statistics

| Variable | Mean | SD |
|---|---|---|
| ∆ foreign | 0.00263 | 0.00770 |
| ∆ domestic | 0.00592 | 0.0658 |
| Foreign Enrollment in 1000s (2004) | 0.641 | 1.822 |
| Foreign Enrollment in 1000s (2016) | 1.158 | 3.514 |
| Domestic Enrollment in 1000s (2004) | 16.77 | 31.68 |
| Domestic Enrollment in 1000s (2016) | 19.01 | 39.85 |
| County Population in 1000s (2004) | 200.3 | 529.0 |
| County Population in 1000s (2016) | 220.4 | 562.0 |
| Non-student Population in 1000s (2004) | 182.9 | 497.8 |
| Non-student Population in 1000s (2016) | 200.2 | 521.6 |
| Employment in 1000s (2004) | 127.7 | 330.3 |
| Employment in 1000s (2016) | 146.3 | 387.6 |
| Tradable Employment in 1000s (2004) | 10.26 | 27.52 |
| Tradable Employment in 1000s (2016) | 8.891 | 22.50 |
| Nontradable Employment in 1000s (2004) | 97.33 | 265.1 |
| Nontradable Employment in 1000s (2016) | 116.5 | 327.3 |
| Average Wage in 1000s (in 2010 dollars, 2004) | 35.83 | 7.628 |
| Average Wage in 1000s (in 2010 dollars, 2016) | 38.36 | 8.382 |
| Business Establishments in 1000s (2004) | 5.338 | 14.11 |
| Business Establishments in 1000s (2016) | 5.604 | 15.44 |
| Housing Units in 1000s (2004) | 85.63 | 201.3 |
| Housing Units in 1000s (2016) | 89.65 | 208.6 |
| Median Monthly Gross Rent (in 2010 dollars, 2004) | 660.6 | 158.1 |
| Median Monthly Gross Rent (in 2010 dollars, 2016) | 686.8 | 166.0 |
| Employment-to-Population Ratio (2004) | 0.574 | 0.130 |
| Tradable Employment-to-Population Ratio (2004) | 0.0567 | 0.0359 |
| Nontradable Employment-to-Population Ratio (2004) | 0.392 | 0.125 |
| Business Establishments-to-Population Ratio (2004) | 0.0252 | 0.00687 |
| Housing Units-to-Population Ratio (2004) | 0.451 | 0.0539 |
| Population Density in 1000s (2000) | 0.189 | 1.097 |
| Observations | 652 | |

Notes: This table shows the summary statistics of the variables for the sample counties. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP Data.
Table 2B.2 County Outcomes: Sensitivity To Additional Controls

| | (1) ∆ employment | (2) ∆ log adjusted wage | (3) ∆ business establishment | (4) ∆ non-student population | (5) ∆ house units | (6) ∆ log median rent |
|---|---|---|---|---|---|---|
| ∆ foreign | 2.519** (1.052) | 3.568*** (1.246) | 0.079** (0.038) | 2.849*** (1.100) | 0.971*** (0.372) | 0.400 (0.975) |
| ∆ domestic | 0.382* (0.226) | -0.061 (0.340) | 0.016 (0.010) | -0.817*** (0.295) | 0.066 (0.119) | 0.141 (0.279) |
| N | 652 | 652 | 652 | 652 | 652 | 652 |
| AP Fstat Foreign | 61.05 | 61.05 | 61.05 | 61.05 | 61.05 | 61.05 |
| AP Fstat Domestic | 11.40 | 11.40 | 11.40 | 11.40 | 11.40 | 11.40 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All columns are estimated using the specification in column 7 of Table 2A.2. Outcome variables in columns 1, 3, 4, and 5 are scaled by 2004 population. Wages and rents are denominated in 2010 dollars. Secular trend control includes the growth rate of all the outcomes between 1996 and 2001 as well as their quadratic and cubic terms. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP Data. *** p<0.01, ** p<0.05, * p<0.1.
Table 2B.3 County Outcomes: Alternate Sample Analysis

| | (1) ∆ employment | (2) ∆ log adjusted wage | (3) ∆ business establishment | (4) ∆ non-student population | (5) ∆ house units | (6) ∆ log median rent |
|---|---|---|---|---|---|---|
| PANEL A: AT LEAST 1% STUDENT POPULATION | | | | | | |
| ∆ foreign | 2.878*** (0.928) | 2.970*** (1.098) | 0.068** (0.032) | 3.649*** (1.071) | 0.964*** (0.328) | -0.047 (0.815) |
| ∆ domestic | 0.287 (0.229) | -0.163 (0.354) | 0.014 (0.009) | -0.837*** (0.279) | 0.025 (0.097) | 0.221 (0.249) |
| N | 1235 | 1235 | 1235 | 1235 | 1235 | 1235 |
| AP Fstat Foreign | 73.69 | 73.69 | 73.69 | 73.69 | 73.69 | 73.69 |
| AP Fstat Domestic | 16.24 | 16.24 | 16.24 | 16.24 | 16.24 | 16.24 |
| PANEL B: AT LEAST 2% STUDENT POPULATION | | | | | | |
| ∆ foreign | 2.551*** (0.909) | 2.821** (1.105) | 0.061* (0.032) | 3.216*** (1.065) | 0.826** (0.335) | -0.131 (0.821) |
| ∆ domestic | 0.319 (0.227) | -0.170 (0.366) | 0.014 (0.009) | -0.754** (0.304) | 0.068 (0.105) | 0.236 (0.252) |
| N | 1121 | 1121 | 1121 | 1121 | 1121 | 1121 |
| AP Fstat Foreign | 73.95 | 73.95 | 73.95 | 73.95 | 73.95 | 73.95 |
| AP Fstat Domestic | 15.58 | 15.58 | 15.58 | 15.58 | 15.58 | 15.58 |
| PANEL C: AT LEAST 3% STUDENT POPULATION | | | | | | |
| ∆ foreign | 2.575*** (0.920) | 2.759** (1.098) | 0.057* (0.032) | 3.129*** (1.059) | 0.827** (0.344) | 0.081 (0.810) |
| ∆ domestic | 0.286 (0.233) | -0.090 (0.359) | 0.014 (0.010) | -0.817*** (0.307) | 0.066 (0.110) | 0.136 (0.247) |
| N | 973 | 973 | 973 | 973 | 973 | 973 |
| AP Fstat Foreign | 70.11 | 70.11 | 70.11 | 70.11 | 70.11 | 70.11 |
| AP Fstat Domestic | 14.95 | 14.95 | 14.95 | 14.95 | 14.95 | 14.95 |
| PANEL D: AT LEAST 4% STUDENT POPULATION | | | | | | |
| ∆ foreign | 2.594*** (0.847) | 2.662** (1.101) | 0.070** (0.028) | 3.154*** (0.982) | 0.957*** (0.314) | 0.397 (0.807) |
| ∆ domestic | 0.183 (0.207) | -0.144 (0.361) | 0.009 (0.008) | -0.980*** (0.273) | 0.034 (0.098) | -0.001 (0.247) |
| N | 802 | 802 | 802 | 802 | 802 | 802 |
| AP Fstat Foreign | 65.79 | 65.79 | 65.79 | 65.79 | 65.79 | 65.79 |
| AP Fstat Domestic | 16.02 | 16.02 | 16.02 | 16.02 | 16.02 | 16.02 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All columns are estimated using the specification in column 7 of Table 2A.2. Outcome variables in columns 1, 3, 4, and 5 are scaled by 2004 population.
Wages and rents are denominated in 2010 dollars. Different panels report results of regressions run on samples of counties with different shares of student population in the base year, as shown in the panel head. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP Data. *** p<0.01, ** p<0.05, * p<0.1.

Table 2B.4 County Outcomes: Excluding Influential Counties

| | (1) ∆ employment | (2) ∆ log adjusted wage | (3) ∆ business establishment | (4) ∆ non-student population | (5) ∆ house units | (6) ∆ log median rent |
|---|---|---|---|---|---|---|
| ∆ foreign | 2.187** (0.859) | 3.372*** (1.173) | 0.076** (0.033) | 3.075*** (1.111) | 1.148*** (0.359) | 0.341 (0.890) |
| ∆ domestic | 0.212 (0.208) | -0.108 (0.363) | 0.009 (0.008) | -1.020*** (0.267) | 0.030 (0.099) | 0.032 (0.249) |
| N | 645 | 645 | 645 | 645 | 645 | 645 |
| AP Fstat Foreign | 50.83 | 50.83 | 50.83 | 50.83 | 50.83 | 50.83 |
| AP Fstat Domestic | 17.86 | 17.86 | 17.86 | 17.86 | 17.86 | 17.86 |

Notes: This table reports regression results for various outcomes. The outcome variable is shown in the column head. All columns are estimated using the specification in column 7 of Table 2A.2. Outcome variables in columns 1, 3, 4, and 5 are scaled by 2004 population. Wages and rents are denominated in 2010 dollars. The sample includes all main sample counties except those in the top 1 percentile of total foreign student enrollment in 2004. The “AP Fstat Foreign” row reports the Angrist-Pischke first-stage F statistic for ∆ foreign. The “AP Fstat Domestic” row reports the Angrist-Pischke first-stage F statistic for ∆ domestic. N denotes the number of observations. Robust standard errors clustered at the county level in parentheses. Source: Author’s calculation using IPEDS, BEA, NHGIS, and CBP Data. *** p<0.01, ** p<0.05, * p<0.1.
Figure 2B.1 Main Sample Counties
Notes: This figure shows main sample counties highlighted on the US map. Source: Author’s calculation using IPEDS, BEA, and NHGIS Data.

Figure 2B.2 Variation in ∆ foreign IV across sample counties
Notes: This figure shows the variation in the ∆ foreign IV across the main sample counties on the US map. Source: Author’s calculation using IPEDS, BEA, and NHGIS Data.

Figure 2B.3 Variation in ∆ domestic IV across sample counties
Notes: This figure shows the variation in the ∆ domestic IV across the main sample counties on the US map. Source: Author’s calculation using IPEDS, BEA, and NHGIS Data.

APPENDIX 2C DESCRIPTION OF VARIABLES

2C.1 Explanatory Variables

$\mathit{Foreign}_{ct}$ is the number of foreign post-secondary students in county $c$ in year $t$. $\mathit{Domestic}_{ct}$ is the number of domestic post-secondary students in county $c$ in year $t$. $\mathit{Pop}_{ct}$ is the population of county $c$ in year $t$.

$\Delta\mathit{foreign}_c = (\mathit{Foreign}_{c,2016} - \mathit{Foreign}_{c,2004})/\mathit{Pop}_{c,2004}$ is the change in the number of foreign students in county $c$ between 2004 and 2016, scaled by the county’s 2004 population.

$\Delta\mathit{domestic}_c = (\mathit{Domestic}_{c,2016} - \mathit{Domestic}_{c,2004})/\mathit{Pop}_{c,2004}$ is the change in the number of domestic students in county $c$ between 2004 and 2016, scaled by the county’s 2004 population.

For Split Period Analysis (Table 2A.5): $\Delta\mathit{foreign}_{ct} = (\mathit{Foreign}_{c,t_2} - \mathit{Foreign}_{c,t_1})/\mathit{Pop}_{c,2004}$ and $\Delta\mathit{domestic}_{ct} = (\mathit{Domestic}_{c,t_2} - \mathit{Domestic}_{c,t_1})/\mathit{Pop}_{c,2004}$, where $t_2 = 2010$, $t_1 = 2004$ for the first period and $t_2 = 2016$, $t_1 = 2010$ for the second period.

For Neighboring County Sample Analysis (Table 2A.7): $\Delta\mathit{foreign}_c = \sum_{s \in \mathit{AdjC}} (\mathit{Foreign}_{s,2016} - \mathit{Foreign}_{s,2004})/\mathit{Pop}_{c,2004}$ and $\Delta\mathit{domestic}_c = \sum_{s \in \mathit{AdjC}} (\mathit{Domestic}_{s,2016} - \mathit{Domestic}_{s,2004})/\mathit{Pop}_{c,2004}$, where $\mathit{AdjC}$ is the set of counties with post-secondary institutions that share a border with county $c$, which has no post-secondary institutions.

2C.2 Outcome Variables

$y^k$ is a local outcome. $y^k_{ct}$ is the measure of local outcome $y^k$ of county $c$ in year $t$. $\mathit{Pop}_{ct}$ is the population of county $c$ in year $t$. $\Delta y^k_c$ are changes in local outcomes $y^k$ of county $c$, which are scaled by the county’s 2004 population for non-logarithmic local outcomes.

$\Delta\mathit{employment}_c = (\mathit{employment}_{c,2016} - \mathit{employment}_{c,2004})/\mathit{pop}_{c,2004}$, where $\mathit{employment}_{ct}$ is the total employment in county $c$ in year $t$.

$\Delta\mathit{tradable\ employment}_c = (\mathit{trade\ employment}_{c,2016} - \mathit{trade\ employment}_{c,2004})/\mathit{pop}_{c,2004}$ and $\Delta\mathit{nontradable\ employment}_c = (\mathit{nontrade\ employment}_{c,2016} - \mathit{nontrade\ employment}_{c,2004})/\mathit{pop}_{c,2004}$, where $\mathit{trade\ employment}_{ct}$ is the total employment in the tradable sector in county $c$ in year $t$ and $\mathit{nontrade\ employment}_{ct}$ is the total employment in the nontradable sector in county $c$ in year $t$. The tradable sector includes industries whose products could be primarily traded nationally or internationally, whereas the nontradable sector includes industries whose products are primarily traded locally. Following Black et al. (2005), the tradable sector here includes manufacturing. The nontradable sector includes all private nonfarm employment sectors excluding manufacturing, mining, forestry, fishing, and related activities.

$\Delta\mathit{log\ adjusted\ wage}_c = \mathit{log\ adjusted\ wage}_{c,2016} - \mathit{log\ adjusted\ wage}_{c,2004}$, where $\mathit{log\ adjusted\ wage}_{ct}$ is the log demographic-adjusted average wage of county $c$ in year $t$. Following Zou (2018), I calculate it as follows. First, I calculate the average wage for each county by dividing total county wage and salary earnings by total county wage and salary employment. Second, I regress log average wages on county demographic characteristics, which include racial composition (white, black), the share of the population with a college degree, and the quadratic terms of these variables. The log demographic-adjusted average wage is the residual from this regression.

$\Delta\mathit{business\ establishment}_c = (\mathit{bus\ estab}_{c,2016} - \mathit{bus\ estab}_{c,2004})/\mathit{pop}_{c,2004}$, where $\mathit{bus\ estab}_{ct}$ is the total number of business establishments in county $c$ in year $t$.
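As a rough illustration of the definitions above, the following Python sketch computes a population-scaled change and a demographic-adjusted log wage taken as the residual from a regression on demographic shares and their squares. The function names, the use of NumPy least squares, and the simulated inputs are assumptions made for illustration only; this is not the code used to produce the estimates in this chapter.

```python
import numpy as np


def scaled_change(end, start, base_pop):
    """Change in a county-level count between two years, per base-year resident."""
    end, start, base_pop = (np.asarray(a, dtype=float) for a in (end, start, base_pop))
    return (end - start) / base_pop


def adjusted_log_wage(total_wages, wage_employment, demographics):
    """Residual of log average wage after regressing on demographics and their squares.

    demographics: array of shape (n_counties, n_characteristics), e.g. share white,
    share black, share with a college degree (hypothetical column order).
    """
    log_avg_wage = np.log(np.asarray(total_wages, dtype=float)
                          / np.asarray(wage_employment, dtype=float))
    X = np.asarray(demographics, dtype=float)
    # Intercept, levels, and quadratic terms of each characteristic.
    design = np.column_stack([np.ones(len(X)), X, X ** 2])
    coef, *_ = np.linalg.lstsq(design, log_avg_wage, rcond=None)
    return log_avg_wage - design @ coef


# Hypothetical county: 1,000 foreign students in 2004, 1,200 in 2016, population 10,000.
print(scaled_change([1200], [1000], [10000]))

# Simulated counties (not real data) to exercise the wage adjustment.
rng = np.random.default_rng(0)
demo = rng.uniform(0.05, 0.9, size=(50, 3))
employment = rng.uniform(1e3, 1e5, size=50)
wages = employment * rng.uniform(3e4, 6e4, size=50)
residual_2004 = adjusted_log_wage(wages, employment, demo)
print(residual_2004[:3])
```

Applying `adjusted_log_wage` separately to each year and differencing the residuals would give a quantity analogous to the wage outcome defined above, under the stated assumptions.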
$\Delta\mathit{non\ student\ population}_c = (\mathit{non\ stu\ pop}_{c,2016} - \mathit{non\ stu\ pop}_{c,2004})/\mathit{pop}_{c,2004}$, where $\mathit{non\ stu\ pop}_{ct}$ is the non-student population of county $c$ in year $t$.

$\Delta\mathit{house\ units}_c = (\mathit{house\ units}_{c,2016} - \mathit{house\ units}_{c,2004})/\mathit{pop}_{c,2004}$, where $\mathit{house\ units}_{ct}$ is the total number of housing units in county $c$ in year $t$. A housing unit is a house, an apartment, a mobile home, a group of rooms, or a single room that is occupied (or if vacant, is intended for occupancy) as separate living quarters. Separate living quarters are those in which the occupants live and eat separately from any other persons in the building and which have direct access from the outside of the building or through a common hall.

$\Delta\mathit{log\ median\ rent}_c = \mathit{log\ median\ rent}_{c,2016} - \mathit{log\ median\ rent}_{c,2004}$, where $\mathit{log\ median\ rent}_{ct}$ is the log median gross rent of county $c$ in year $t$. Gross rent is the monthly housing cost expenses for renters. It is the contract rent plus the estimated average monthly cost of utilities (electricity, gas, and water and sewer) and fuels (oil, coal, kerosene, wood, etc.) if these are paid by the renter (or paid for the renter by someone else).

CHAPTER 3 SCIENCE EDUCATION AND LABOR MARKET OUTCOMES IN A DEVELOPING ECONOMY

3.1 Introduction

Science education early in life can influence long-term education and professional outcomes in multiple ways. Science education can directly train and qualify individuals for positions in specialized STEM (Science, Technology, Engineering and Mathematics) occupations which are often associated with higher wages. Science education could develop analytical and quantitative skills that can be applied in a variety of managerial and leadership roles. In developing countries, the shortage of workers with science backgrounds might yield significant wage premiums in the labor market. Conversely, science education at a critical stage might lead students into STEM careers they are not interested in or suited for.
If inflexible education systems track students into careers early, science training might crowd out other subjects in humanities, social sciences or vocational studies that might match students’ long-term interests better. In developing countries with fewer opportunities for relevant post-secondary education and lower industrial development, science students might not be able to translate their analytical skills into higher earnings.

Set in the context of India, this paper investigates the labor market consequences of science education at the senior secondary level, including earnings, the likelihood of graduating from college or a professional course, and the sector of employment. We uncover the role of additional factors, including academic ability, English education, social status, parental education and computer skills, that might mediate the relationship between science education and labor market outcomes. Understanding the influence of STEM on eventual career choices and earnings, as well as the pathways that enable these outcomes, could help design policies that encourage students and administrators to pursue the most productive educational paths.

India is second only to China in educating college graduates specializing in STEM. Among graduates in India in 2012, 35% were STEM majors while 53% were humanities majors (OECD (2015), national statistics websites for China and India). Despite the success of several countries in producing STEM graduates, and the attempt of others to follow, the labor market consequences of STEM education are poorly understood. An attractive feature of the Indian setting is that students must specialize in either science, business or humanities at the higher secondary stage, in contrast to education systems in North America and the UK that do not require such specific focus.

Estimating the labor market consequences of science education faces a number of challenges.
From an empirical perspective, estimating the causal impact of high-school major choices on earnings is not straightforward due to endogeneity of the major choice variable. Omitted variables such as ability, language and communication skills or labor market conditions could bias estimates of the relationship between the choice of major and earnings. Rich data from the India Human Development Survey (IHDS) allow us to control for ability using performance in the high-stakes tenth grade exam, as well as English-language fluency, which could be correlated with the ability to do well in science, admission tests, and job interviews. After including a rich set of controls, we report precise partial correlations between major choice and labor market earnings. We also take the route suggested by Altonji et al. (2005) and Oster (2017) and conduct bounds analysis on the estimated coefficients, as well as check the robustness of our main coefficients using the heteroscedasticity-based estimator of Lewbel (2012).

Our main empirical finding is that in urban India, mean annual earnings are 22% higher for men who study science in high school relative to men who study business and humanities. Quantile regressions show that the earnings difference is similar at all points of the wage distribution, though it is as high as 37% for income earners in the 99th percentile.

We explore channels through which science in high school influences human capital outcomes, with subsequent effects on income and productivity. The literature provides evidence that incremental years of education, especially at the tertiary level, are important for income and economic growth (Colglough et al., 2010; Castello-Climent et al., 2018). We find that science students have an additional 0.22 years of post higher secondary education, and are 5 percentage points more likely to complete a bachelor’s degree.
We find that studying science increases the probability of completing a professional course (engineering, management, and accounting, which are important drivers of economic growth (Griliches, 1992; Peri et al., 2015)) by 6 percentage points. While science students are more likely to be subsequently employed as engineering professionals, we do not find discernible effects on the probability of being a manager. This suggests that the link between studying science and greater incomes is mediated through engineering jobs.

Joensen and Nielsen (2009) and Arcidiacono (2004) show that mathematics education yields higher incomes. We find some suggestive evidence that mathematical abilities drive a large part of the return to science: controlling for studying mathematics in high school attenuates the point estimate for earnings with science by 23%.

A persistent question in the literature is what drives the strong association between entrepreneurship and incomes. For example, Hartog et al. (2010) find that mathematical ability among businessmen drives income, and Kremer et al. (2016) find math scores are correlated with entrepreneurial success, particularly profits, among small retailers. While we do not find a correlation between studying science and the probability of being an entrepreneur, income is 18% greater for entrepreneurs with a high school science background. Entrepreneurs with high scholastic ability have even higher earnings from studying science (42%), pointing to a strong complementarity between ability and science background.

We report on factors that facilitate or hinder the translation of science education into higher earnings. Differential earnings from studying science relative to business and humanities are present at all levels of ability.
We find higher relative marginal earnings to studying science only when individuals have at least a moderate level of English-language fluency and computer skills, suggesting strong complementarity between English-language fluency and science earnings. Heterogeneity analysis suggests that greater earnings associated with science are concentrated among those without professional degrees.[1] Greater earnings do not accrue, on average, to disadvantaged Scheduled Caste (SC) and Scheduled Tribe (ST) communities.

[1] Professional courses refer to the study of engineering, medicine, management, accounting and law.

We explore the role of additional behavioral factors that potentially influence students’ decision to take science and bias our results.[2] A primary survey of 524 students in grade 12 (the last year of high-school) across 44 schools in Andhra Pradesh and Bihar reveals the extent to which commonly measured cognitive and non-cognitive skills differ across students who study science and non-science majors. We find a weak correlation between grit and the likelihood of studying science. Additionally, we find that ambiguity aversion, positive personality traits and Cognitive Reflection Test (CRT) score do not differ by stream choice. Thus, behavioral traits and the decision to study science do not seem to be systematically correlated.

[2] A line of research, summarized by Heckman and Kautz (2012), shows the importance of cognitive and non-cognitive skills factors in the labor market.

This paper contributes to the literature on returns to STEM majors by examining the case of a large developing country, where the nature of education as well as the structure of the labor market might qualitatively change the returns to STEM education. In the United States, several papers focus on understanding why a large share of college students drop out of STEM majors, with a handful of papers estimating the returns to studying STEM majors in college (Altonji et al., 2012; Black et al., 2015). Beyond STEM, a large literature documents differences in earnings across majors for college graduates using quasi-experimental variation in student assignment to different majors.[3] Arcidiacono (2004) finds that mathematics ability is important for labor market returns and for sorting into particular majors, and that after controlling for this selection, students who select natural science and business majors receive large financial returns.

[3] See Altonji et al. (2012), Altonji et al. (2016), Daymont and Andrisani (1984), Grogger and Eide (1995), Hastings et al. (2013), Kirkeboen et al. (2016), James et al. (1989), Loury (1997), Loury and Garman (1995) and Gemici and Wiswall (2014).

We also add to the literature on returns to human capital by estimating earnings associated with major choices in a developing country context. The literature unpacking the impact of education content includes Munshi and Rosenzweig (2006) and Chakraborty and Kapur (2016) on English-medium instruction, Jain (2017) on mother-tongue instruction, Azam et al. (2013) on English fluency, and Cantoni et al. (2017) and Dhar et al. (2020) on curriculum.

Finally, our paper relates to the literature on determinants of stream and occupational choices, a classic research question in the social sciences. This literature focuses on two sets of relationships: occupational choices and future expected earnings, and college major choices and occupational choices. Several papers, including Grogger and Eide (1995), Brown and Corcoron (1997), Weinberger (1998) and Gemici and Wiswall (2014), have documented that post-secondary human capital investment is an important determinant of future expected earnings and, most importantly, that college major choices can provide insights into long-term changes in inequality and earnings differences by gender and race.
3.2 Context

In the Indian education system, students receive ten years of primary and secondary education, followed by two years of senior secondary education and three to five years of higher education (MHRD, Government of India, 1998). The objective of the first ten years is to provide a non-selective general education to all students. The two years of senior secondary education allow students to specialize, while preparing them for higher education. At this stage, students must enroll in one of three standard majors or specializations: Science, Business and Humanities.[4] Anecdotal evidence and reviews of the popular media suggest that students, especially in urban areas, have information on subsequent education and career choices (India Today, 2017; Sharma, 2019). At the end of schooling, students pursue higher education or enter the labor force. The duration of higher education varies from three to five years depending on the course. Bachelor of Arts and Science programs last three years, technical courses last four years, and medicine and architecture last five years.

[4] The state or all-India boards of secondary education determine curriculum at the higher secondary level. The curriculum that a particular school follows is determined by the state or national board to which it is affiliated. For instance, schools affiliated with the New Delhi-based Central Board of Secondary Education will follow its curriculum and offer the Board’s All India Senior School Certificate Examinations.

A unique feature of the Indian education system is the deterministic role of high school major choices on college majors, where pre-college major choices are largely irreversible. Students who study science as a high school major are the only ones eligible to study STEM courses in college, while remaining eligible to pursue various non-STEM courses. Conversely, students who study business or humanities in high school are eligible to pursue only non-STEM courses in college. Therefore, the high school major choice directly controls the set of courses students can pursue after high school, and is considered to be a critical first step in long-term career paths.[5]

[5] This restriction is more fluid for recent age cohorts who are yet to hit the labor market.

In theory, any student who obtains a passing grade, i.e., 30-35% out of a maximum of 100% (depending on the examination board), in the Secondary School Certificate (SSC) examination (grade 10) can be admitted to the 11th grade.[6] However, in practice, the eligibility thresholds are higher for the science stream than for business or humanities at any given school.

There are multiple reasons why students take science in high school in India, apart from heterogeneous tastes. One factor is that career paths associated with science are more prestigious. Another possible factor is that an undergraduate degree in science or a professional undergraduate course following science are terminal qualifications for high quality jobs, whereas non-science undergraduate programs require further education for similar positions. Moreover, science students can pursue various non-STEM undergraduate courses and careers but not vice-versa, so the number and types of jobs available after studying science in high school are much wider than otherwise.

On the costs side, students studying science might supplement school education with private coaching to prepare for entrance examinations to elite engineering and medical colleges.[7] Finally, students might migrate for better schools or private coaching, with associated financial and non-financial costs.

3.3 Data

The main empirical analysis uses the India Human Development Survey (IHDS) data collected in 2011-12.
IHDS is a nationally representative, multi-topic household survey covering 42,152 households across India.[8] Uniquely among nationally representative household surveys, the IHDS collects data on individuals’ major choices in high school and their current earnings. In addition, the survey also reports variables associated with demographic characteristics, ability, English-language fluency, computer skills, educational achievement and occupational outcomes.

[6] In India, students must pass the SSC examination to be eligible for further schooling. Better scores in this exam enable students to attend better schools.

[7] Azam (2016) reports that the average cost of private tutoring in 2007-08 is about 42.7% of total private education expenditure, which is about 16.5% of household per capita expenditure. This jumps to approximately 40% of total private expenditure on education at the secondary and senior secondary level.

[8] The survey covered all the contiguous states and union territories of India. For data analysis, we use IHDS design weights to obtain nationally representative statistics.

The population of interest in this paper is urban males aged 25 to 65 years who have completed at least secondary schooling (grade 10) and made a stream choice at the higher secondary stage (grades 11 and 12). Our sample therefore consists of such males who report information on both major choices and earnings. We do not include women in the analysis because women in India have a low labor force participation rate (Klasen and Pieters, 2015; Pande and Moore, 2015). We do not consider rural residents because measuring agricultural and in-kind income is difficult.[9] These restrictions yield 4,763 men in the sample, most of whom report being in wage or salaried employment, in business, or being self-employed. Only 4% report being unemployed.
Since they earn zero income, we drop them as we use log of earnings as our primary dependent variable.[10]

[9] One concern with this dataset is the omission of emigrants who potentially self-select on the basis of science training. The National Sample Survey 64th migration survey (2007-2008) shows that the proportion of urban male students going abroad is 0.29% if children in the age group 19-20 years are the base. Using all the children in the age group 19-20 years who have graduated from high school as the base, the proportion remains low at 0.65%. This small fraction is unlikely to qualitatively change our empirical findings.

[10] Our results are robust to including the unemployed using a dependent variable in levels.

The IHDS reports household-level enterprise profit along with the labor time contribution of household members in the enterprise. We use this information to calculate earnings for business/self-employment.[11] Using the household’s net enterprise profit (already net of costs), we apportion the net profit based on the individual’s share of the total time spent by household members on enterprise activities. Such apportioning introduces measurement error in the dependent variable but avoids the need for selection models, for which identifying variables are difficult to find.

Further, we do not look at the wage (earnings) rate but instead focus on annual earnings. This incorporates both the wage rate and the number of hours worked in the year. Given that labor is typically inelastically supplied by most male adult members of households in developing countries like India, the actual amount of work is likely to reflect demand for labor. Thus, the demand for labor is an important part of the earning payoff for an individual. For example, while public salaried employment may not offer the highest wage rate in the labor market, the fact that most public employees are assured work throughout the year ensures larger earnings from such jobs.
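The apportioning of household enterprise profit described in this section can be sketched as follows. This is a minimal illustration under our own assumptions: the function name, the use of hours as the unit of labor time, and the example household are hypothetical and do not correspond to IHDS variable names.

```python
def apportion_profit(net_profit, time_by_member):
    """Split household net enterprise profit in proportion to each member's labor time.

    net_profit: household enterprise profit, already net of costs.
    time_by_member: mapping from member id to time spent on enterprise activities
    (hours are assumed here; any common unit works).
    """
    total_time = sum(time_by_member.values())
    if total_time <= 0:
        # No recorded labor time: nothing to apportion.
        return {member: 0.0 for member in time_by_member}
    return {member: net_profit * t / total_time
            for member, t in time_by_member.items()}


# Hypothetical household: Rs. 240,000 net profit, two members working 2,000 and 1,000 hours.
shares = apportion_profit(240000, {"member_1": 2000, "member_2": 1000})
print(shares)
```

Each member's imputed earnings equal the household profit times that member's share of total enterprise time, so the imputed amounts sum to the household total.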
[11] Typically, regressions that calculate Mincerian returns in the context of India include only wage employees. This is dictated by the lack of earnings data in the employment datasets of the National Sample Survey, the most commonly used dataset for estimating earnings in India.

Table 3A.1 reports descriptive statistics for a number of relevant variables. The first row shows that mean annual earnings are Rs. 178,330 (approximately $2,830). Respondents who have completed 10th grade have, on average, 3.86 years of further education, representing completion of high school and some college. In our sample, 26% report working in public employment, 25% in private employment and 27% in business employment.

Table 3A.1 also reports descriptive statistics for various control variables used in the estimation. Twenty-five percent of the sample studied science in high school. Approximately 32% received a first division (> 60% score), 57% received a second division (50% < score < 60%), 12% received a third division (40% < score < 50%), and 12% repeated a grade. Similarly, 36% of the sample speaks fluent English, 48% speak English less fluently, while the remaining 16% cannot speak any English. Among the demographic variables, the average age is slightly less than 40 years, 83% of men are married, 33% belong to Other Backward Classes, 12% are Scheduled Castes, 3% are Scheduled Tribes, 8% are Muslims and 3% are Christians.[12]

Given the background of those surveyed, is there a difference in earnings between those studying science and those studying other majors? Figure 3A.1 plots the distribution of log earnings by major choice, showing that the distributions are different, with the mean log earnings for science students higher than for students from other majors. The mean earnings for science students are Rs. 224,194 ($3,558), while those of students from other majors are Rs. 156,000 ($2,476).
This difference remains even when conditioning on the scholastic ability of the individual, although the two density functions are far closer for those with first division (Figure 3A.2) than for those with lower divisions (Figure A.1).

Finally, in order to understand other correlates of studying science major, we conducted a primary survey of grade 12 students in six districts across the states of Bihar and Andhra Pradesh in India. Our choice of these states was guided by their high proportion of science students, even though they vary on a range of other economic and social dimensions.13

12The Scheduled Castes and Scheduled Tribes are historically disadvantaged minorities recognized by the Constitution of India. The Government of India classifies approximately 41% of the country’s population as Other Backward Class (OBC), who are socially and educationally disadvantaged.
13Of students aged 16-18 years who attend school, 65% study science in Bihar and 91% in Andhra Pradesh, in contrast to 55% for the entire country.

The survey was conducted at the beginning of the academic calendar (May to July) in 2017 in the district towns of Patna, Bhagalpur and Sitamarhi in Bihar, and Vijayawada, Kurnool and Srikakulam in Andhra Pradesh. Schools in these towns were randomly chosen, stratified on private versus public management. Students across science, business and humanities majors were chosen at random from school lists and interviewed at home. While not representative of the nation or the state, the sample is representative of high school students in these specific cities in India. The populations of Patna in Bihar and Vijayawada in Andhra Pradesh each exceed one million individuals, while the remaining four are mid-sized cities. To the best of our knowledge, these cities have no cultural or economic characteristics that would make them outliers in the determinants of STEM education. The final sample consists of 524 students in class 12 (the last year of high school) across 55 schools spread across the two states.

In addition to information on subjects studied by students and their life and career aspirations, the survey also measured various behavioral parameters: grit, ambiguity aversion, cognitive reflection ability and positive personality traits, which are typically not available for developing countries (see detailed descriptions of these in the Appendix). Appendix Tables 3B.4 and 3B.5 present summary statistics for the survey sample.

3.4 Empirical Analysis

3.4.1 Specification

We use the 2011-12 round of the IHDS with one observation per individual, and estimate the earnings associated with studying science in high school using the following specification:

$$y_{id} = \beta_0 + \beta_1 Science_{id} + \beta_2 \mathbf{X}_{id} + \lambda_d + \epsilon_{id} \qquad (3.1)$$

In equation (3.1), the main outcome of interest $y_{id}$ is the log earnings of individual $i$ residing in district $d$. The variable $Science_{id}$ is 1 if the individual studied science in high school, and 0 otherwise. The primary coefficient of interest is $\beta_1$, which is approximately the percent increase in earnings associated with studying science in high school. We add a vector of control variables ($\mathbf{X}_{id}$) which includes a measure of ability, represented by indicator variables for whether the individual obtained first, second and third division in the grade 10 exam.14 Students with higher ability are more likely to study science in high school as well as to have better jobs, leading to an upward bias in the estimate of the return to studying science if a measure of ability is omitted.15 IHDS data permit controls for academic ability using individual performance on the secondary school certificate (SSC) examination conducted at the end of grade 10.
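To make the role of the district fixed effects in equation (3.1) concrete, the following is a minimal sketch, assuming a single regressor (the science indicator) and no other controls. Including district dummies is numerically equivalent to demeaning the outcome and the regressor within each district before running OLS. The function name and the toy data are hypothetical and are not drawn from the IHDS; the last line shows the exact percent change implied by a log-earnings coefficient, which the coefficient itself only approximates.

```python
import math
from collections import defaultdict


def within_estimator(rows):
    """Slope on `science` after absorbing district fixed effects.

    rows: iterable of (district, science, log_earnings) tuples.
    Demeaning within district is equivalent to adding district dummies.
    """
    groups = defaultdict(list)
    for district, science, log_earn in rows:
        groups[district].append((science, log_earn))

    num = 0.0
    den = 0.0
    for obs in groups.values():
        s_bar = sum(s for s, _ in obs) / len(obs)
        y_bar = sum(y for _, y in obs) / len(obs)
        for s, y in obs:
            num += (s - s_bar) * (y - y_bar)
            den += (s - s_bar) ** 2
    return num / den


# Hypothetical toy data: two districts with different earnings levels
# (11.0 and 11.5 in logs) and a common science premium of 0.22 log points.
toy = [
    ("A", 0, 11.00), ("A", 1, 11.22), ("A", 0, 11.00),
    ("B", 1, 11.72), ("B", 0, 11.50), ("B", 1, 11.72),
]

beta1 = within_estimator(toy)
# Exact percent change implied by a log-point coefficient.
pct_change = 100 * (math.exp(beta1) - 1)
print(round(beta1, 4), round(pct_change, 1))
```

In this toy example the within-district slope recovers the common premium even though the two districts differ in average earnings, which is the variation the fixed effects are meant to absorb.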
To control for other aspects of ability, we add an indicator representing whether the individual ever failed or repeated a grade.16 The specification controls for average household education (excluding the respondent) to proxy for household-level ability, as well as for parental education, a traditional control to proxy for ability in the returns to education literature (Card, 1999). Fluency in English might directly impact labor market returns (Azam et al., 2013), so $\mathbf{X}_{id}$ includes measures of self-reported English fluency, represented by indicator variables for “very fluent”, “little fluent” and “not fluent”. We add a rich set of control variables for individual age, marital status, and religious and social group. The specification includes district fixed effects ($\lambda_d$), which account for differences in both labor markets and educational systems and resources across and within states, in addition to geographic, economic and social factors that are common to all individuals within a district. Finally, the term $\epsilon_{id}$ represents unobserved factors that might influence earnings; standard errors are clustered at the state level.

In addition to log earnings, we estimate equation (3.1) for a number of follow-on outcome variables. These variables represent human capital achievement (specifically, years of schooling, whether the individual completed graduate education, and whether the individual completed professional education) and employment (in particular, public sector tenured employment, private sector tenured employment, and employment in business, as well as income associated with public, private and business employment).17

14This is similar to controlling for aptitude test scores to address the ability bias while estimating the returns to schooling.
15According to the authors’ calculation using the estimation sample, 39.31% of students who receive first division (higher ability), 20.48% of students who receive second division, and only 10.22% of students who receive the third division study science in grade 10.
16Heckman and Kautz (2012) point out that students with the same achievement scores but different failure histories have different life outcomes.
17Private sector tenured employment represents employment within the private sector with self-reported security of tenure. When we consider income associated with private employment, we include tenured as well as untenured private sector employees, as the income gains may be correlated with getting tenure.

3.4.2 Main results

Table 3A.2 reports findings from estimating equation (3.1), sequentially introducing controls in Columns 1 through 4. The main result in Column 4, after including all control variables, is that studying science is associated with 22% higher earnings (𝑝 < 0.01). Once controls for ability are introduced in Column 2, the estimated coefficients for science do not change much with the addition of further control variables, offering greater confidence that the estimated coefficients for 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 are due to studying science in high school rather than to remaining omitted variables. The magnitude of this coefficient is comparable to the influence of “Fluent English” skills (+36.0%, consistent with estimates reported by Azam et al. (2013)), indicating the importance of high school curriculum for adult earnings. Also important is household education, with a one-year increase in the average education of the other household members being associated with 3% greater earnings.18 Finally, Column 5 reports the results from a specification that includes the highest level of education as a regressor. This yields a coefficient of 17% greater earnings from science education in high school. Because we regard this as overcontrolling, we use the specification in Column 4 as our main specification in subsequent analysis.19

Next, Table 3A.3 examines the relative importance of science education for students at the 10th, 25th, 50th, 75th, 90th and 99th percentiles of the earnings distribution. Studying science has a comparable, statistically significant influence on earnings at all these points of the earnings distribution except the top 1%. At the 99th percentile, we find that studying science in high school is associated with 37% greater earnings. By highlighting that stream choice is correlated with the highest incomes, our results illustrate an important driver of convexity in returns to education.20

18Appendix Table 3B.1 reports findings from estimating equation (3.1) after including the maximum of the parental education as an additional control. This sample is about half the size of the main sample owing to co-residence requirements. Column 1 presents results from the main specification with this smaller sample. Column 2 presents results with parental education, and finds that studying science is associated with 21% higher earnings (𝑝 < 0.01), consistent with the results in Column 1.
19The impact of studying science on wages is mediated through various mechanisms, one of them being the subsequent human capital choices that studying science in high school leads to. Hence, controlling for the highest degree, which is a potential mediator, would not allow us to capture the full return to studying science in high school.
20In the Appendix, we show that our main results are robust to dynamics in education and employment over time.

Examining heterogeneity in the results helps to determine the pathways through which science education transmits to earnings.
Higher ability students might be more able to translate knowledge of science into greater earnings. Table 3A.4 examines this empirically by dividing the sample into those who received a first division versus a second or third division score in tenth grade, and reporting separate results from estimating equation (3.1). We find that the point estimate of earnings associated with science is higher for students with first division scores (+0.25, or roughly 25% greater earnings) than for students with lower division scores (+0.19, or roughly 19% greater earnings) in tenth grade. Although the two estimates are not statistically different from each other, these findings, along with those from the quantile regressions, are consistent with science education being complementary to ability, with the greatest marginal value for the most capable students and workers.

We also explore the complementarity of science with spoken English and computer fluency. Such complementarities might be particularly important in the labor market, where the structure of jobs might dictate the earnings associated with different skills. If STEM jobs require extensive communication with others, especially in the business world where language skills are important, then the labor market value of science education might be influenced by English fluency. Conversely, if STEM careers require expertise in science with communications handled by other employees, then the earnings of workers with science proficiency would be independent of their language skills. For similar reasons, the value of science education could depend on knowledge of and fluency with computers. Without precisely defining a production function, the empirical exercise offers insight into the complementarities between science education and English language and computer skills.

Panel A of Table 3A.5 finds that earnings from science accrue significantly more when an individual knows English. The earnings are 28% greater with fluency in spoken English (𝑝 < 0.01), and 19% higher with little English (𝑝 < 0.01).
The earnings associated with science are statistically indistinguishable from zero without English, regardless of ability measured by tenth grade scores, indicating the critical role of English language skills in complementing science education in the job market.21 Mirroring these results are the findings associated with computer skills in Panel B. Science education is associated with high earnings (+31% for first division students, 𝑝 < 0.01; +19% for second and third division students, 𝑝 < 0.05), especially when the respondent is proficient in computers. Earnings are significantly lower (7%, 𝑝 < 0.10) for students who report no proficiency in computers. Collectively, these findings point to the critical role of communication and technical skills in operationalizing greater earnings from science education.

21Our result is consistent with Berman et al. (2003) and Lang and Siniver (2009), who find evidence of language-skill complementarity in Israel. They show that improved Hebrew and English in addition to their native language accounts for two-thirds to three-fourths of the differential in earnings growth between immigrants and natives employed in high-skilled occupations.

The returns to education literature for India shows that market oriented professional courses in engineering, medicine, business, law and accounting yield the greatest earnings (Duraiswamy, 2002). Such courses typically command higher wages as compared to “general” university education, partly because their skill sets can be readily deployed without firm-specific training. Therefore, complementarities between science education and market oriented skills are interesting to investigate. Panel C of Table 3A.5 finds that science education has no impact when comparisons are made among those who have completed professional degrees across all ability levels. This is not because completing professional degrees correlates perfectly with studying science: among those with professional degrees, science and non-science school majors are almost equally represented, since 43.6% of those with professional degrees did not study science in high school. In contrast, we find that studying science results in higher earnings across all ability levels among those without such market skills, i.e., those who did not complete a professional degree. This finding suggests that market oriented degrees and science education in higher secondary school are potentially substitutes in the labor market.

We next examine how the social environment, represented by social group and parent education, influences the value of studying science. Socially privileged individuals might benefit disproportionately more from science education, since they might have access to the job and commercial opportunities required to convert their education into higher earnings. Conversely, the marginal value of science education might be lower for individuals from such backgrounds, compared to individuals from socially and educationally disadvantaged groups. Thus, the value of science education by social and educational background is an open empirical question. We explore this question by estimating two equations, the first of which interacts 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 with an indicator variable representing membership of a Scheduled Caste, and the second of which interacts 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 with parental education. Panel D of Table 3A.5 reports greater earnings from studying science for members of castes higher in the social hierarchy.
Overall earnings are 25% greater for individuals in the highest “General” category (𝑝 < 0.01) and 20% greater for the Other Backward Classes in the middle of the social hierarchy (𝑝 < 0.01), but 15% greater and statistically indistinguishable from zero for the Scheduled Castes and Tribes.

3.4.3 Plausible channels

This section analyzes the role of two potential channels through which STEM education can lead to greater earnings (without necessarily ruling out additional explanations). First, studying science in higher secondary grades might be associated with greater participation in and completion of higher education, which would subsequently lead to increased incomes (Castello-Climent et al., 2018). Second, the combination of science in high school with more years of education might shift the sector (private or public) or type (salaried or business) in which students are employed.

Table 3A.6 estimates equation (3.1) using three different measures of educational attainment. Panel A of the table examines the association between studying science and the years of post-secondary education, Panel B reports whether the respondent completed at least a bachelor’s degree (or equivalent), and Panel C whether the respondent completed any professional program (defined in the previous section). We find that science education at the secondary school stage is associated with 0.22 additional years of post-secondary education (𝑝 < 0.01). One possible reason is selection into science, where motivation explains both the decision to study science and persistence within higher education. Alternatively, studying science could preserve more options for post-secondary education, which allows students to continue education more easily than non-science students. Corresponding to this finding, science students are also 5% more likely to complete a bachelor’s degree (Panel B, 𝑝 < 0.05), and 6% more likely to complete a professional degree (Panel C, 𝑝 < 0.01).
The labor market for educated men in urban India can be classified into four types of employment: a position in the private sector with security of tenure (we refer to it as Private Tenured Employment), a relatively secure job in the public sector, running one’s own business, and untenured jobs largely in the private sector. Panel A of Table 3A.7 shows that studying science makes one more likely to get a public sector job, but only among low ability science students. However, we do not find an effect on either private sector tenured employment (Panel B) or business employment (Panel C). Thus, among lower ability students, science education makes one more likely to be in the public sector relative to the private sector. While public sector jobs are demanded by a relatively large section of society, private jobs are competitive at the top end of the wage distribution. But such private jobs often select those with better education. Thus studying science makes high ability students equally likely to be in different kinds of employment. However, many high quality private jobs may not be available to low ability students, leading them to prefer the public sector. For such students, technical backgrounds might improve the likelihood of obtaining a job in that sector.

Table 3A.8 shows the implications of science education for income associated with employment in different sectors, conditional on being selected into a particular kind of job. We include both tenured and untenured jobs in the private sector, because security of tenure and income are likely to be highly correlated in the private sector.22 Panels A, B and C report that earnings from science are the highest among high ability students who operated their own business (+0.42, 𝑝 < 0.10). Earnings from science versus non-science for high ability students are not very different in the private (+0.21, 𝑝 < 0.10) and public sectors (+0.26, 𝑝 < 0.01).
For those with lower scholastic ability, science offers little earnings boost in business ownership and the public sector. Together with the result that science education increases the likelihood of being in the public sector for low ability students, this suggests that a science education gets such individuals over the threshold of a government job but no further. However, there are earnings gains to science education in the private sector among low ability students.

22Many public sector jobs are already tenured (permanent).

Table 3A.9 estimates equation (3.1) on a narrower definition of employment outcomes. The objective is to understand the drivers of earnings associated with science education. Panel D of Table 3A.9 examines the consequences of studying science for Science jobs, while Panels E, F, and G report the same for Computing, Managerial, and Clerical jobs respectively. Panel D in Table 3A.9 shows that studying science makes one more likely to get Science jobs, such as engineers, doctors and scientists, across all ability levels (+11%, 𝑝 < 0.01). However, the point estimate for getting a Science job is larger for high ability students (+17%, 𝑝 < 0.01) than for low ability students (+6%, 𝑝 < 0.01). Although we do not find an effect on either Computing (Panel E) or Managerial jobs (Panel F), we find a small effect on Clerical jobs in Panel G (+2%, 𝑝 < 0.10). Taken together, these results suggest that the earnings from studying science are largely driven by Science jobs.23

Table 3A.10 estimates equation (3.1), sequentially adding controls that further explain the drivers of earnings. We seek to examine how much the coefficient associated with science attenuates when we add potential mechanisms as controls. Column 1 reports the estimate from Column 4 of Table 3A.2. In Column 2, once we add whether students had studied mathematics in high school, the point estimates suggest that studying science is associated with 17% greater earnings (𝑝 < 0.01).
As discussed in Section 3.2, a student cannot study science in high school without mathematics, so part of the value of studying science can plausibly be driven by mathematics alone.24 However, even controlling for studying mathematics in high school, the point estimate only drops to 17% (𝑝 < 0.01). In Column 3, despite controlling for years of education (in addition to controlling for mathematics in high school), the point estimate remains 18% (𝑝 < 0.01). The magnitude halves when we control for occupation type (specifically, Science jobs) in Column 7.

23Appendix Tables 3B.2 and 3B.3 estimate equation (3.1) to examine the effect of studying science on a wide spectrum of employment outcomes. We do not find that studying science makes one more likely to get jobs in Police and Government Administration, Teaching, Electrician, or Construction. However, we do find the effect to be positive (+2%, 𝑝 < 0.05) for Nursing jobs in Panel K of Appendix Table 3B.2, and negative for Accountant jobs (-2%, 𝑝 < 0.01) in Panel J of Appendix Table 3B.2 and for Salesman jobs (-2%, 𝑝 < 0.01) in Panel M of Appendix Table 3B.3.
24Studying business in high school also requires mathematics.

3.4.4 Robustness

3.4.4.1 Bound Analysis

Though the main regression model controls for ability by including dummy variables for divisions, a concern is that other kinds of unobserved abilities not completely subsumed by scholastic performance, as well as household factors, could bias our estimated results. This section assesses the extent of potential bias due to the exclusion of these variables from the model, following the strategy developed by Altonji et al. (2005) and Oster (2017). This methodology is based on the idea that selection on observables can provide a useful guide to assess selection on unobservables. To elaborate further, let

$$Y = \beta_s X + \beta_z Z + W \qquad (3.2)$$

where $X$ is the main variable of interest, $Z$ is observed and $W$ contains all the unobserved components. The objective is to estimate the bias in $\beta_s$ because of $W$. Altonji et al. (2005) estimate this bias by assuming the following:

$$\frac{Cov(X, W)}{Var(W)} = \delta \, \frac{Cov(X, \beta_z Z)}{Var(\beta_z Z)} \qquad (3.3)$$

In other words, the relation of $X$ to the unobservables is proportional to the relation of $X$ to the observables, with the degree of proportionality given by $\delta$. This basic insight has been extended by Oster (2017) to incorporate the idea that one can look at coefficient movements (of $\beta_s$) when covariates are added and deduce a similar bias. This extension keeps account of movement in the $R$-squared value due to the addition of control variables. Following this method, we derive a consistent estimator for the effect of 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 as a function of two parameters, $\delta$ and $R_{max}$, denoted by $\beta_s(R_{max}, \delta)$. $R_{max}$ is the $R$-squared of a hypothetical regression which includes the complete set of controls, including the unobservable variables.

To operationalize this method, we start with a baseline regression where log of earnings is regressed on 𝑆𝑐𝑖𝑒𝑛𝑐𝑒, and then add further controls. As a second step, we posit $R_{max}$. One way this could be set is by looking at the $R$-squared values obtained in other studies in the same context that control for the omitted variables. While the literature contains Mincerian returns to education regressions for India, none look at the earnings of urban males who have completed high school.25 Given the lack of a known $R_{max}$, we follow the suggestion of Oster (2017) and set $R_{max}$ to 1.3 times the $R$-squared of the regression that controls for $Z$ (the controlled regression). Since the $R$-squared in our main specification is 0.304, we set $R_{max} = 0.4$. The robustness check suggested by Oster (2017) is that the interval $[\beta_s^{controlled}, \beta_s(\min(1.3 R^2_{controlled}, 1), 1)]$ should not contain 0. We find that the interval indeed excludes 0 (Table 3A.11). In our case, $\beta_s(0.4, 1) = 0.16$. Moreover, we provide the value of $\delta$ for which $\beta_s$ would become 0.
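The mechanics of the bound can be illustrated with the widely used approximation to the bias-adjusted coefficient from Oster's paper, $\beta^* \approx \tilde{\beta} - \delta(\mathring{\beta} - \tilde{\beta})(R_{max} - \tilde{R})/(\tilde{R} - \mathring{R})$, where the ring denotes the uncontrolled regression and the tilde the controlled one. The sketch below is only illustrative: it uses the approximation rather than the exact estimator, treats the main estimate as a coefficient of 0.22, and uses hypothetical placeholder values for the uncontrolled coefficient and $R$-squared, which are not reported in the text. It is therefore not expected to reproduce Table 3A.11.

```python
def oster_beta_star(b_unc, r_unc, b_ctl, r_ctl, r_max, delta=1.0):
    """Approximate bias-adjusted coefficient.

    b_unc, r_unc: coefficient and R-squared of the uncontrolled regression.
    b_ctl, r_ctl: coefficient and R-squared of the controlled regression.
    r_max: assumed R-squared if unobservables were also included.
    delta: assumed ratio of selection on unobservables to observables.
    """
    return b_ctl - delta * (b_unc - b_ctl) * (r_max - r_ctl) / (r_ctl - r_unc)


def delta_for_zero(b_unc, r_unc, b_ctl, r_ctl, r_max):
    """Value of delta at which the approximate adjusted coefficient is zero."""
    return b_ctl * (r_ctl - r_unc) / ((b_unc - b_ctl) * (r_max - r_ctl))


# Controlled regression: R-squared 0.304 is from the text; 0.22 assumes the
# reported 22% is the raw coefficient.
B_CTL, R_CTL = 0.22, 0.304
# Uncontrolled regression: HYPOTHETICAL placeholders, not from the chapter.
B_UNC, R_UNC = 0.30, 0.05

r_max = min(1.3 * R_CTL, 1.0)  # rule of thumb described in the text
beta_star = oster_beta_star(B_UNC, R_UNC, B_CTL, R_CTL, r_max, delta=1.0)
delta0 = delta_for_zero(B_UNC, R_UNC, B_CTL, R_CTL, r_max)

print(round(r_max, 4), round(beta_star, 4), round(delta0, 2))
```

Under these placeholder inputs the interval between the controlled coefficient and the adjusted coefficient excludes zero, and the value of $\delta$ needed to drive the coefficient to zero is well above one, which is the pattern the robustness check looks for.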
The obtained value of 3 is high, since Oster (2017) found that the average value of $\delta$ was 0.545, with 86% of the values of $\delta$ falling within [0, 1]. Alternatively, we show that the $R_{max}$ needed to make $\beta_s = 0$ when $\delta = 1$ is 0.6, almost twice the $R$-squared from the controlled regression. Thus, this exercise indicates that the estimated earnings associated with science education are robust to potential omitted variable bias. However, it is important to point out that the values taken for this bound analysis are necessarily ad hoc. Therefore we provide an additional robustness exercise below.

25Azam (2012) uses a sample of urban male wage earners to calculate returns to education. However, business employees are excluded in that analysis. Moreover, the sample considered includes all adult males and not just those who have passed high school.

3.4.4.2 Heteroscedasticity Based Instruments

Exogenous variation in stream choice is hard to find in observational data, and credible instruments are rare in such settings. For situations characterised by the absence of credible instruments, Lewbel (2012) suggested a heteroscedasticity based estimator. To elaborate, let

$$Y = \beta_x X + \beta_z \mathbf{Z} + \epsilon_1 \qquad (3.4)$$

and

$$X = \delta_z \mathbf{Z} + \epsilon_2 \qquad (3.5)$$

where $X$ is endogenous and $\mathbf{Z}$ is a vector of exogenous variables. Identification in such a setting is based on the idea that if the endogenous variable $X$ is regressed on the exogenous variables $\mathbf{Z}$, and the residuals from this regression ($\epsilon_2$) are heteroscedastic, one can use such residuals to construct instruments which can be used like standard instrumental variables. For such an application, two conditions should hold. First, as indicated above, the residuals $\epsilon_2$ should be heteroscedastic, which can be checked through a Breusch-Pagan test. If the test indicates heteroscedasticity, one can construct instruments $(\mathbf{Z}_s - \bar{\mathbf{Z}}_s)\epsilon_2$, where $\mathbf{Z}_s$ is a subset of $\mathbf{Z}$. Second, these instruments identify $\beta_x$ under the assumption that $Cov(\mathbf{Z}_s, \epsilon_1 \epsilon_2) = 0$.
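The construction of the generated instrument can be sketched as follows. This is a minimal illustration on simulated data, assuming a single exogenous variable and a continuous endogenous regressor (the chapter's endogenous variable is a binary science indicator, and its instrument set uses several variables). All names, the data-generating process, and the parameter values are hypothetical. The simulated errors share a common factor, so OLS would be biased, while the first-stage error variance depends on the exogenous variable.

```python
import math
import random


def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        piv = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        x[i] = (m[i][3] - sum(m[i][j] * x[j] for j in range(i + 1, 3))) / m[i][i]
    return x


def lewbel_iv(y, x, z):
    """Return (coefficient on x, generated instrument)."""
    n = len(y)
    z_bar = sum(z) / n
    x_bar = sum(x) / n
    # First stage: OLS of X on a constant and Z, keeping the residuals.
    szz = sum((zi - z_bar) ** 2 for zi in z)
    slope = sum((zi - z_bar) * (xi - x_bar) for zi, xi in zip(z, x)) / szz
    intercept = x_bar - slope * z_bar
    e2 = [xi - intercept - slope * zi for xi, zi in zip(x, z)]
    # Generated instrument: centered Z times the first-stage residual.
    g = [(zi - z_bar) * ei for zi, ei in zip(z, e2)]
    # Just-identified IV: instruments (1, Z, g) for regressors (1, Z, X).
    w = [[1.0, zi, gi] for zi, gi in zip(z, g)]
    r = [[1.0, zi, xi] for zi, xi in zip(z, x)]
    wtr = [[sum(wk[i] * rk[j] for wk, rk in zip(w, r)) for j in range(3)]
           for i in range(3)]
    wty = [sum(wk[i] * yk for wk, yk in zip(w, y)) for i in range(3)]
    return solve3(wtr, wty)[2], g


# Hypothetical simulation: common factor u makes X endogenous; the
# first-stage error variance rises with Z (heteroscedasticity).
random.seed(0)
n = 5000
z = [random.random() for _ in range(n)]
u = [random.gauss(0, 1) for _ in range(n)]
x = [zi + ui + math.exp(zi) * random.gauss(0, 1) for zi, ui in zip(z, u)]
y = [1.0 * xi + 0.5 * zi + ui + random.gauss(0, 1)
     for xi, zi, ui in zip(x, z, u)]

beta_iv, instrument = lewbel_iv(y, x, z)
print(beta_iv)
```

Because the first stage includes a constant and Z, the generated instrument sums to zero by construction; its usefulness comes entirely from its correlation with the squared first-stage residual, which is why heteroscedasticity is required.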
If $\mathbf{Z}_s$ has more than one variable, an overidentification test can be run to provide some evidence on this assumption.

Table 3A.12 presents the results from such an exercise. In Column 1, we present our OLS results, while in Column 2 we present the results from the Lewbel method. We take the following variables as $\mathbf{Z}_s$: the average education of the household members, age and age squared as the ability controls. To begin with, note that the Breusch-Pagan test reports a $\chi^2$ value of 435.72 and rejects the null of homoscedasticity of $\epsilon_2$. Second, the Hansen J overidentification test reports a p-value of 0.73, which gives some suggestive evidence that these generated instruments are valid. Given these, we estimate a coefficient of 0.29, which is close to the OLS estimate. Hence, this analysis lends further credence to our estimates. However, we stress that this method of identification has some weaknesses. It makes assumptions about the form of heteroscedasticity of the underlying unobserved component, which are impossible to check.

3.4.4.3 Migration Analysis

Our main analysis does not control for migration, which is potentially endogenous to educational and employment opportunities. This could bias the returns to a science degree upward if only the most successful rural men get a job in an urban area. Conversely, the estimated returns could be biased downward if rural men have lower-quality education in science than urban men. To address possible selection bias, we use the information on family migration history provided in the IHDS dataset. We create a dummy variable No Migration which is 1 if the individual’s family has lived in the current town/city forever (the category IHDS uses when the individual’s family first moved to the current town/city at least 90 years ago). We estimate equation (3.1) after including the No Migration dummy and its interaction with Science.
Appendix Table 3B.7 reports that the estimated coefficient on the interaction term is small and not statistically significant (-0.06, 𝑝 > 0.10). Further, the coefficient associated with Science (+0.24, 𝑝 < 0.01) is similar in magnitude and statistical significance to that in the main specification. This empirical exercise suggests that the estimated returns to studying science are not biased due to migration histories.

3.5 Role of Behavioral Characteristics

Recent literature has highlighted the role of behavioral characteristics in the labor market. While Section 3.4 shows that the biases due to the omission of these characteristics may not be large, we delve explicitly into how strong the correlation is between behavioral characteristics and the choice to study science. Drawing on unique data from a survey of high school students in the states of Andhra Pradesh and Bihar (described in Section 3.3), this section explores whether some of these factors play any role in students’ choice of science.

Our primary interest is unpacking the role of behavioral characteristics in students’ decision to study science. These characteristics are generally unobserved in most studies in developing countries. Our survey includes measures of students’ grit, ambiguity aversion, cognitive ability, and personality and non-cognitive skills, which are typically not available in any nationally representative data sets (see descriptions of each in the Appendix). These are supplemented with information on individual and household characteristics, from both student and parent respondents.

The literature has pointed to individual grit being correlated with educational success and other long-term goals (Bowman et al., 2015; Duckworth et al., 2007), even after controlling for IQ and Big Five conscientiousness. Thus, grit could be important in the labor market and subsequent economic returns.
We investigate the role of grit in the decision to study science by using standardized questions suggested by the psychology literature, and scoring each sampled student on the grit scale (Duckworth et al., 2007).

In order to make fully informed education choices, students should have information on the earnings associated with different types of jobs and also know the probability of obtaining such jobs. However, our qualitative survey revealed that students and parents had poor knowledge at the high school stage of the options that follow from studying different subjects.26 One possibility is that students may not want to make substantive choices until they have better information on labor market options. Since studying science leaves options open to study all subjects whereas studying business or humanities forecloses STEM options, the choice of high school major might be correlated with students’ willingness to make decisions in ambiguous situations. Hence, those who choose to study science might be relatively more ambiguity averse. To investigate this hypothesis, we measure ambiguity aversion using the ambiguity tolerance scale suggested by the psychology literature (MSTAT-II) as well as ambiguity experiments suggested by Ellsberg.

The expected labor market value from a science education is a potentially important determinant of subject choice. Greater cognitive ability facilitates understanding the relative costs and benefits associated with this decision. We use a three-item Cognitive Reflection Test (CRT), suggested by Frederick (2005), as a measure of cognitive ability. This measure is predictive of the types of choices that are used to test expected utility theory and prospect theory.

Finally, the recent literature (Heckman and Kautz, 2012; Acosta et al., 2015; Deming, 2017) emphasizes the role of socio-emotional skills (personality traits and behaviors) in the labor market. Consistent with this, we collect information on students’ personality traits.
Appendix Tables 3B.4 and 3B.5 describe several variables from the survey dataset. We collect information on major choice in high school; 60% of students were enrolled in the science stream. A large fraction of students (62%) earned first division in grade 10. Survey responses indicate that students reflected on their stream choices (75%) and that challenging careers and subsequent earnings are important to them (80% and 83%, respectively). The responses also indicate that 43% of parents are involved in their children’s educational choices. Siblings are a more frequent source of career information than friends (41% versus 28%).

Table 3A.13 summarizes the association of the behavioral measures described above with students’ decisions to take science. The full specification in Column 2, which includes the full set of control variables, shows that the grit score is uncorrelated with the choice of science (+0.01, 𝑝 > 0.10). A higher ambiguity score is negatively correlated with taking science, although the ambiguity score coefficient is relatively small (-0.006, 𝑝 < 0.10), and the ambiguity experiment score is not statistically significant (-0.085, 𝑝 > 0.10). Neither the CRT score nor the personality variable has a significant correlation with science education. Overall, we interpret these findings as suggesting that there is, at best, a weak correlation between these commonly measured behavioral characteristics and educational choices at the senior secondary stage in the context of our sample.

26 For instance, many students and parents were unable to name their dream institutions after high school or the kinds of jobs that could follow.

3.6 Conclusion

We explore the role of science education in high school, a stage where important career choices are made, in subsequent education, career, and earnings outcomes.
Our analysis shows that science education at the higher secondary level is associated with 22% higher earnings compared to humanities and business. We find that science education complements academic ability, English fluency, computer skills, parental education, and privileged social background, pointing to the importance of supporting these among disadvantaged students. We find that a large part of the returns accrues through knowledge of mathematics, as well as through entry into potentially higher-paying science jobs among science graduates. Interestingly, studying science in high school is also associated with higher returns to entrepreneurship among students with high cognitive ability, whereas among students with lower cognitive ability, those studying science are more likely to obtain a public sector position.

Our results should be read with a number of caveats. First, in the absence of an experimental or quasi-experimental research design, we cannot claim causality. Causal estimates of the effects of science education on professional outcomes might reveal the relative importance of selection versus treatment effects of science education, which is important for understanding the underlying production function as well as for suggesting policy measures. Second, due to data limitations, we include neither women nor rural residents in our sample. Since the dynamics of how science education translates into earnings and employment are potentially very different for these groups, we caution against extending our estimates to these groups. Finally, we do not analyze potential barriers to the effectiveness of different pedagogical approaches to science education, which might create significant variation in the estimates associated with professional outcomes. We hope that these issues will be addressed in future research.27

Our findings could have implications for other contexts.
In economies with a smaller fraction of science students, the payoff to studying science might be greater because of the relative scarcity of such workers, and earnings for scientific workers might fall as their numbers increase. In contrast, fewer science students might also coincide with fewer scientific and technical higher education institutions and knowledge-based industries, which would dampen earnings for workers with science education. Science education and training could also impact other sectors through greater innovation and growth of technology firms. We encourage research that extends our analysis to other contexts and examines the general equilibrium impacts of science education.

Nonetheless, our analysis informs the global debate over the value of STEM-focused education versus a traditional liberal arts curriculum. Our findings not only suggest high labor market earnings associated with high school science, but also suggest strong complementarities between science and other skills such as computer skills and English fluency. Collectively, results from this paper point to the importance of policies that facilitate the ability of socially disadvantaged individuals to undertake human capital investments.

27 Recent attempts to check causal estimates find results similar to those presented in this paper (Roychowdhury, 2021).

BIBLIOGRAPHY

Acosta, P., Muller, N., and Sarzosa, M. (2015). Beyond qualifications: Returns to cognitive and socio-emotional skills in Colombia. World Bank Policy Research Working Paper No. 7430.
Altonji, J., Arcidiacono, P., and Maurel, A. (2016). The analysis of field choice in college and graduate school: Determinants and wage effects. Handbook of the Economics of Education, 5:305–396.
Altonji, J., Blom, E., and Meghir, C. (2012). Heterogeneity in human capital investments: High school curriculum, college major, and careers. Annual Review of Economics, 4:185–223.
Altonji, J., Elder, T., and Taber, C. (2005). Selection on observed and unobserved variables: Assessing the effectiveness of Catholic schools. Journal of Political Economy, 113(1):151–184.
Arcidiacono, P. (2004). Ability sorting and the returns to college major. Journal of Econometrics, 121:343–375.
Azam, M. (2012). Changes in wage structure in urban India, 1983–2004: A quantile regression decomposition. World Development, 40(6):1135–1150.
Azam, M. (2016). Private tutoring: Evidence from India. Review of Development Economics, 20(4):739–761.
Azam, M., Chin, A., and Prakash, N. (2013). The returns to English-language skills in India. Economic Development and Cultural Change, 61(2):335–367.
Berman, E., Lang, K., and Siniver, E. (2003). Language-skill complementarity: Returns to immigrant language acquisition. Labour Economics, 10:265–290.
Black, S., He, Z., Muller, C., and Spitz-Oener, A. (2015). On the origins of STEM: The role of high school STEM coursework in occupational determination and labour market success in mid life. University of Texas working paper.
Bowman, N., Hill, P., Denson, N., and Bronkema, R. (2015). Keep on truckin’ or stay the course? Exploring grit dimensions as differential predictors of educational achievement, satisfaction, and intentions. Social Psychological and Personality Science, 6(6):639–645.
Brown, C. and Corcoran, M. (1997). Sex based differences in school content and the male-female wage gap. Quarterly Journal of Economics, 99:31–44.
Cantoni, D., Chen, Y., Yang, D., Yuchtman, N., and Zhang, Y. (2017). Curriculum and ideology. Journal of Political Economy, 125(2):338–392.
Card, D. (1999). The causal effects of education on earnings. In Ashenfelter, O. and Card, D., editors, Handbook of Labor Economics, volume 3A, chapter 30, pages 1801–1863. Elsevier Science B.V.
Castello-Climent, A., Chaudhary, L., and Mukhopadhyay, A. (2018). Higher education and prosperity: From Catholic missionaries to luminosity in India. The Economic Journal, 128:3039–3075.
Chakraborty, T. and Kapur, S. (2016). English language premium: Evidence from a policy experiment in India. Economics of Education Review, 50:1–16.
Colclough, C., Kingdon, G., and Patrinos, H. (2010). The changing pattern of wage returns to education and its implications. Development Policy Review, 28(6):733–747.
Daymont, T. and Andrisani, P. (1984). Job preferences, college major, and the gender gap in earnings. Journal of Human Resources, 19:408–428.
Deming, D. (2017). The growing importance of social skills in the labor market. Quarterly Journal of Economics, 132(4):1593–1640.
Dhar, D., Jain, T., and Jayachandran, S. (2020). Reshaping adolescents’ gender attitudes: Evidence from a school-based experiment in India. NBER Working Paper No. 25331.
Duckworth, A., Peterson, C., Matthews, M., and Kelly, D. (2007). Grit: Perseverance and passion for long-term goals. Journal of Personality and Social Psychology, 92(6):1087–1101.
Duraiswamy, P. (2002). Changes in returns to education in India, 1983–94: By gender, age-cohort and location. Economics of Education Review, 21(6):609–622.
Frederick, S. (2005). Cognitive reflection and decision making. Journal of Economic Perspectives, 19(4):25–42.
Gemici, A. and Wiswall, M. (2014). Evolution of gender differences in post-secondary human capital investments: College majors at the intensive margins. International Economic Review, 55:23–56.
Griliches, Z. (1992). Introduction to “output measurement in the service sectors”. In Output measurement in the service sectors, pages 1–225. University of Chicago Press.
Grogger, J. and Eide, E. (1995). Changes in college skills and the rise in the college wage premiums. Journal of Human Resources, 30:280–310.
Hartog, J., Praag, M. V., and Sluis, J. V. D. (2010). If you are so smart, why aren’t you an entrepreneur? Returns to cognitive and social ability: Entrepreneurs versus employees. Journal of Economics & Management Strategy, 19(4):947–989.
Hastings, J., Neilson, C., and Zimmerman, S. (2013). Are some degrees worth more than others? Evidence from college admissions cutoffs in Chile. NBER Working Paper No. 19241.
Heckman, J. and Kautz, T. (2012). Hard evidence on soft skills. Labour Economics, 19(4):451–464.
India Today (2017). How to choose your stream after Class 10 boards for a successful career.
Jain, T. (2017). Common tongue: The impact of language on educational outcomes. Journal of Economic History, 77(2):473–510.
James, E., Nabeel, A., Conaty, J., and To, D. (1989). College quality and future earnings: Where should you send your child to college? American Economic Review, 79:247–252.
Joensen, J. and Nielsen, H. (2009). Is there a causal effect of high school math on labor market outcomes? Journal of Human Resources, 44(1):171–198.
Kirkeboen, L., Leuven, E., and Mogstad, M. (2016). Field of study, earnings, and self-selection. Quarterly Journal of Economics, 131(3):1057–1111.
Klasen, S. and Pieters, J. (2015). What explains the stagnation of female labor force participation in urban India? World Bank Economic Review, 29(3):449–478.
Kremer, M., Robinson, J., and Rostapshova, O. (2016). Success in entrepreneurship: Doing the math. Chapter in NBER book African Successes, Volume II: Human Capital, University of Chicago Press, pages 281–303.
Lang, K. and Siniver, E. (2009). The return to English in a non-English speaking country: Russian immigrants and native Israelis in Israel. The B.E. Journal of Economic Analysis & Policy, 9:1–30.
Lewbel, A. (2012). Using heteroscedasticity to identify and estimate mismeasured and endogenous regressor models. Journal of Business & Economic Statistics, 30(1):67–80.
Loury, L. (1997). The gender-earnings gap among college-educated workers. Industrial and Labor Relations Review, 50:580–593.
Loury, L. and Garman, D. (1995). College selectivity and earnings. Journal of Labor Economics, 13:289–308.
McLain, D. (2009). Evidence of the properties of an ambiguity tolerance measure: The multiple stimulus types ambiguity tolerance scale–II (MSTAT–II). Psychological Reports, 105(3):975–988.
MHRD, Government of India (1998). National policy on education 1986 (modified in 1992).
Munshi, K. and Rosenzweig, M. (2006). Traditional institutions meet the modern world: Caste, gender and schooling choice in a globalizing economy. American Economic Review, 96(4):1225–1252.
OECD (2015). Education Indicators in Focus 31.
Oster, E. (2017). Unobservable selection and coefficient stability: Theory and evidence. Journal of Business & Economic Statistics, pages 1–18.
Pande, R. and Moore, C. (2015). Why aren’t India’s women working? The New York Times, August 23.
Peri, G., Shih, K., and Sparber, C. (2015). STEM workers, H-1B visas, and productivity in US cities. Journal of Labor Economics, 33(S1):S225–S255.
Roychowdhury, P. (2021). (Em)powered by science? Estimating the causal effect of high school major choice on labor market earnings in India. Economics of Education Review, 82:102–118.
Sharma, S. (2019). Courses after 12th for all streams - Arts, Commerce and Science. Times of India.
Weinberger, C. (1998). Race and gender wage gaps in the market for recent college graduates. Industrial Relations, 37:67–84.

APPENDIX 3A MAIN TABLES AND FIGURES

Figure 3A.1 Earnings Distribution by High School Major
Notes: Source: Authors’ calculation using IHDS-II (2011-12) dataset.

Figure 3A.2 Earnings Distribution by High School Major - First Division Students
Notes: Source: Authors’ calculation using IHDS-II (2011-12) dataset.

Table 3A.1 Summary Statistics

                                        Mean      Standard Deviation
DEPENDENT VARIABLES
Annual Earnings (Rs. ’000s)             178.33    212.23
Years of Education (Post Grade 10)      3.86      1.84
Dummy: At least Graduate Education      0.57      0.49
Dummy: Professional Education           0.06      0.23
Dummy: Private Tenured Employment       0.25      0.43
Dummy: Public Tenured Employment        0.26      0.44
Dummy: Business Employment              0.27      0.44
INDEPENDENT VARIABLES
Science Major                           0.25      0.43
Business Major                          0.23      0.42
Humanities Major                        0.52      0.5
Division I                              0.32      0.47
Division II                             0.57      0.5
Division III                            0.12      0.32
Repeated Grade                          0.12      0.32
Fluent English                          0.36      0.48
Less Fluent English                     0.48      0.5
DEMOGRAPHIC CONTROLS
Age                                     39.81     10.2
Married                                 0.83      0.38
Scheduled Castes                        0.12      0.32
Scheduled Tribes                        0.03      0.17
Other Backward Class                    0.33      0.47
Muslim                                  0.08      0.27
Christian                               0.03      0.17
Average Household Education             10.02     3.84
Max Parent Education                    8.26      4.96
Observations                            4763

Notes: This table reports the mean and standard deviation for the estimation sample. The four categories of employment are Business Employment, Public Tenured Employment, and Private Tenured Employment, with Non Tenured-Non Business Employment as the omitted reference group. The numbers of observations for the variables Average Household Education and Max Parent Education are 4,687 and 2,513, respectively. Source: Authors’ calculation using IHDS-II (2011-12) data.
Table 3A.2 Earnings and High School Science Major

Dependent variable: Log(Earnings)
                              (1)           (2)           (3)           (4)           (5)
                              No Control    Ability FE    District FE   Demographics  Highest Degree
Science                       0.36***       0.20***       0.25***       0.22***       0.17***
                              (0.04)        (0.04)        (0.04)        (0.04)        (0.03)
Dummy: 1st Division                         0.33***       0.22***       0.21***       0.16***
                                            (0.06)        (0.06)        (0.06)        (0.06)
Dummy: 2nd Division                         0.08          0.02          0.02          0.01
                                            (0.06)        (0.05)        (0.05)        (0.04)
Dummy: Repeated Grade                       -0.32***      -0.25***      -0.22***      -0.20***
                                            (0.05)        (0.05)        (0.05)        (0.05)
Dummy: Fluent English                       0.41***       0.42***       0.35***       0.25***
                                            (0.10)        (0.08)        (0.07)        (0.06)
Dummy: Less Fluent English                  0.08          0.14**        0.11*         0.08
                                            (0.09)        (0.07)        (0.06)        (0.06)
Age                                                                     0.06***       0.06***
                                                                        (0.02)        (0.02)
Age Square                                                              -0.00***      -0.00***
                                                                        (0.00)        (0.00)
Dummy: Married                                                          0.09*         0.09*
                                                                        (0.05)        (0.05)
Dummy: Scheduled Castes                                                 -0.06         -0.06
                                                                        (0.04)        (0.04)
Dummy: Scheduled Tribes                                                 0.02          0.04
                                                                        (0.13)        (0.12)
Dummy: Other Backward Class                                             -0.03         -0.01
                                                                        (0.04)        (0.04)
Dummy: Muslim                                                           -0.01         -0.01
                                                                        (0.06)        (0.06)
Dummy: Christian                                                        0.07          0.09
                                                                        (0.08)        (0.08)
HH Education                                                            0.03***       0.02***
                                                                        (0.00)        (0.01)
Highest Degree Controls                                                               ✓
N                             4763          4763          4763          4687          4687
R-squared                     0.03          0.10          0.25          0.30          0.32

Notes: This table reports regression results using equation (3.1). The main explanatory variable 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 is 1 if the person studied science stream in high school, and 0 otherwise. Highest Degree Controls include dummies for BA, BSc, BCom, Engineering, Medicine, Masters, PhD, Professional, Diploma, and Others. Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
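Because the dependent variable in Table 3A.2 is log earnings, each coefficient is a difference in log points. As a reading aid, the sketch below converts a log-point coefficient b into the exact percentage difference 100 × (exp(b) − 1); for small b the two are close, which is why a coefficient of 0.22 is read as roughly 22% in the text. The helper name `log_points_to_percent` is hypothetical, and the calculation is only illustrative; it is not part of the estimation.

```python
import math


def log_points_to_percent(b):
    """Exact percentage difference implied by a log-point coefficient b."""
    return 100.0 * (math.exp(b) - 1.0)


# Science coefficients from columns (1), (4), and (5) of Table 3A.2.
for label, b in [("No Control", 0.36), ("Demographics", 0.22), ("Highest Degree", 0.17)]:
    print(f"{label}: {b:.2f} log points = {log_points_to_percent(b):.1f}%")
```

Under this conversion the column (4) coefficient corresponds to a difference of about 25%, slightly above the log-point approximation.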
Table 3A.3 Earnings and High School Science Major, by quantile

Dependent variable: Log(Earnings)
                  (1)        (2)        (3)        (4)        (5)        (6)
                  10th       25th       50th       75th       90th       99th
Science           0.25***    0.18***    0.20***    0.23***    0.25***    0.37***
                  (0.03)     (0.02)     (0.02)     (0.02)     (0.02)     (0.04)
N                 4687       4687       4687       4687       4687       4687
Pseudo R-squared  0.22       0.22       0.22       0.23       0.25       0.41

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. All specifications control for ability, demographics and district fixed effects. Each cell is the estimated coefficient of choosing Science major from separate regressions. Robust standard errors in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.

Table 3A.4 Earnings and High School Science, by ability

Dependent variable: Log(Earnings)
            (1)             (2)
            Division I      Division II/III
Science     0.25***         0.19***
            (0.06)          (0.05)
N           1497            3190
R-squared   0.36            0.30

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. All specifications control for ability, demographics and district fixed effects. Each column is the estimated coefficient of choosing Science major from separate regressions. Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
Table 3A.5 Heterogeneity in Earnings Associated with High School Science

Dependent variable: Log(Earnings)
                        (1)             (2)             (3)
                        Full Sample     Division I      Division II/III
PANEL A: Language Proficiency
Fluent English          0.28***         0.28***         0.27***
                        (0.05)          (0.10)          (0.09)
Little English          0.19***         0.23**          0.21***
                        (0.05)          (0.11)          (0.04)
No English              0.13            0.18            -0.06
                        (0.10)          (0.14)          (0.17)
N                       4687            1497            3190
R-squared               0.31            0.36            0.31
PANEL B: Computer Proficiency
Computer: Yes           0.26***         0.31***         0.19**
                        (0.05)          (0.08)          (0.09)
Computer: No            0.07*           -0.03           0.11**
                        (0.04)          (0.09)          (0.05)
N                       4687            1497            3190
R-squared               0.33            0.40            0.33
PANEL C: Professional Degree
Professional Edu: Yes   0.20            0.12            0.37
                        (0.13)          (0.19)          (0.24)
Professional Edu: No    0.20***         0.23***         0.17***
                        (0.04)          (0.08)          (0.05)
N                       4687            1497            3190
R-squared               0.31            0.37            0.31
PANEL D: Caste Groups
Caste Group: General    0.25***         0.32***         0.21**
                        (0.05)          (0.07)          (0.09)
Caste Group: OBC        0.20***         0.19*           0.16***
                        (0.05)          (0.11)          (0.05)
Caste Group: SC/ST      0.15            0.12            0.21**
                        (0.09)          (0.16)          (0.10)
N                       4687            1497            3190
R-squared               0.31            0.36            0.30

Notes: This table reports the marginal effects of studying science. All specifications control for ability, demographics and district fixed effects. Column 1 reports the marginal effect by various indicators: Language Proficiency, Computer Proficiency, Professional Degree, Caste Groups and Household Education. Columns 2 and 3 report the corresponding marginal effects by division (I and II & III). Each panel is a separate regression. Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
Table 3A.6 Science Majors and Human Capital Outcomes

            (1)             (2)             (3)
            Full Sample     Division I      Division II/III
Panel A: Dependent Variable: Years of Education
Science     0.22***         0.25**          0.23***
            (0.07)          (0.11)          (0.07)
Constant    1.98***         3.68***         1.58***
            (0.51)          (0.75)          (0.53)
N           4,687           1,497           3,190
R-squared   0.33            0.36            0.30
Panel B: Dependent Variable: Graduate Education
Science     0.05**          0.06*           0.06***
            (0.02)          (0.03)          (0.02)
Constant    0.10            0.55*           0.03
            (0.16)          (0.30)          (0.13)
N           4,687           1,497           3,190
R-squared   0.30            0.33            0.27
Panel C: Dependent Variable: Professional Education
Science     0.06***         0.08***         0.04**
            (0.01)          (0.01)          (0.02)
Constant    0.13            0.24            0.07
            (0.08)          (0.14)          (0.07)
N           4,687           1,497           3,190
R-squared   0.13            0.22            0.11

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. All specifications control for ability, demographics and district fixed effects. Each column is the estimated coefficient of studying science from separate regressions by divisions (Full Sample, I and II & III). Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.

Table 3A.7 Science Majors and Employment Outcomes - I

            (1)             (2)             (3)
            Full Sample     Division I      Division II/III
PANEL A: Dependent Variable: Salaried - Public Employment
Science     0.02            -0.00           0.04*
            (0.02)          (0.04)          (0.02)
Constant    -0.73***        -0.57           -0.67***
            (0.15)          (0.37)          (0.17)
N           4,687           1,497           3,190
R-squared   0.21            0.29            0.22
PANEL B: Dependent Variable: Salaried - Private Employment
Science     -0.02           -0.01           -0.02
            (0.02)          (0.04)          (0.02)
Constant    0.55***         0.59**          0.56***
            (0.12)          (0.27)          (0.12)
N           4,687           1,497           3,190
R-squared   0.14            0.22            0.17
PANEL C: Dependent Variable: Business Employment
Science     0.01            -0.02           0.02
            (0.02)          (0.03)          (0.02)
Constant    0.39***         0.15            0.43***
            (0.11)          (0.22)          (0.12)
N           4,687           1,497           3,190
R-squared   0.17            0.24            0.20

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. All specifications control for ability, demographics and district fixed effects. Each column is the estimated coefficient of studying science from separate regressions by divisions (Full Sample, I and II & III). Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.

Table 3A.8 Science Majors and Income

            (1)             (2)             (3)
            Full Sample     Division I      Division II/III
PANEL A: Dependent Variable: Income from Public Employment
Science     0.18***         0.26***         0.13
            (0.05)          (0.07)          (0.08)
Constant    3.60***         5.01***         2.72***
            (0.64)          (1.38)          (0.50)
N           1,209           488             721
R-squared   0.48            0.52            0.56
PANEL B: Dependent Variable: Income from Private Employment
Science     0.24***         0.21**          0.29***
            (0.05)          (0.08)          (0.07)
Constant    3.26***         4.03***         3.32***
            (0.40)          (0.84)          (0.48)
N           2,143           661             1,482
R-squared   0.38            0.44            0.40
PANEL C: Dependent Variable: Income from Business Employment
Science     0.18*           0.42*           0.08
            (0.10)          (0.22)          (0.11)
Constant    2.36***         2.74            2.10***
N           1,273           320             953
R-squared   0.37            0.53            0.40

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. All specifications control for ability, demographics and district fixed effects. The samples in Panels A, B and C consist of all public employed individuals (both tenured and non tenured), all private employed individuals (both tenured and non tenured), and individuals employed in business, respectively. Each column is the estimated coefficient of studying science from separate regressions by divisions (Full Sample, I and II & III). Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
Table 3A.9 Science Majors and Employment Outcomes - II

            (1)             (2)             (3)
            Full Sample     Division I      Division II/III
PANEL D: Dependent Variable: Science Job
Science     0.11***         0.17***         0.06***
            (0.01)          (0.02)          (0.01)
Constant    -0.01           -0.06           0.01
            (0.04)          (0.12)          (0.04)
N           4,687           1,497           3,190
R-squared   0.15            0.25            0.09
PANEL E: Dependent Variable: Computing Job
Science     0.00            -0.01           0.01
            (0.01)          (0.01)          (0.01)
Constant    0.10**          0.10            0.10
            (0.05)          (0.10)          (0.06)
N           4,687           1,497           3,190
R-squared   0.06            0.14            0.08
PANEL F: Dependent Variable: Managerial Job
Science     0.00            -0.01           0.00
            (0.01)          (0.02)          (0.01)
Constant    -0.01           0.07            -0.02
            (0.08)          (0.16)          (0.09)
N           4,687           1,497           3,190
R-squared   0.08            0.16            0.13
PANEL G: Dependent Variable: Clerical Job
Science     0.02*           0.01            0.03*
            (0.01)          (0.02)          (0.02)
Constant    0.09            0.01            0.13*
            (0.08)          (0.23)          (0.08)
N           4,687           1,497           3,190
R-squared   0.06            0.15            0.09

Notes: This table reports the coefficients corresponding to specification associated with column 4 of Table 3A.2. Science Jobs in Panel D include engineers, doctors and scientists. Computing Jobs in Panel E include computing operatives. Managerial Jobs in Panel F include managerial jobs in finance, manufacturing, service sector, and other not elsewhere classified managerial jobs. Clerical Jobs in Panel G include clerical supervisors and not elsewhere classified clerical jobs. All specifications control for ability, demographics and district fixed effects. Each column is the estimated coefficient of choosing Science major from separate regressions by divisions (Full Sample, I and II & III). Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
Table 3A.10 Earnings and High School Science Major - Role of Additional Controls

Dependent variable: Log(Earnings)
            (1)           (2)           (3)           (4)           (5)           (6)           (7)
            Base          Math          Years of      Professional  Computer      Employment    Occupation
                          Background    Education     Education     Knowledge     Type          Type
Science     0.22***       0.17***       0.18***       0.16***       0.14***       0.13***       0.07*
            (0.04)        (0.05)        (0.04)        (0.04)        (0.04)        (0.04)        (0.04)
N           4687          4687          4687          4687          4687          4687          4687
R-squared   0.30          0.31          0.31          0.32          0.34          0.35          0.38

Notes: This table reports coefficients corresponding to specification associated with column 4 of Table 3A.2. Column 1 is the same as column 4 in Table 3A.2. We sequentially add more controls in each column, as indicated in the column head. Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.

Table 3A.11 Robustness to Omitted Variable Bias

Coefficient of Science
Uncontrolled:                                      $\beta_s$ = 0.36    $R^2$ = 0.03
Controlled:                                        $\beta_s$ = 0.22    $R^2$ = 0.30
Identified (Estimated Bias), $R^2_{max} = 0.4$:
    $\beta_s$ for $\delta = 1$                     0.16
    $\delta$ for $\beta_s = 0$                     3
Identified (Estimated Bias), $\delta = 1$:
    $R^2_{max}$ for $\beta_s = 0$                  0.6

Notes: We follow Oster (2017) to formally test for robustness to omitted variable bias by observing coefficient movements after inclusion of controls. $R^2_{max} = 1.3 \times R^2_{controlled} = 0.4$. This is based on recommendations made in Oster (2017). Source: Authors’ calculation using IHDS-II (2011-12) data.

Table 3A.12 Robustness: Lewbel Method

Dependent Variable: Log(Earnings)
                                 (1)         (2)
                                 OLS         IV
Science                          0.22***     0.29***
                                 (0.05)      (0.08)
First Stage F Stat                           86
Hansen J Statistic                           4.37
  p value                                    0.74
Breusch Test (𝜒2)                            435.76
  p value                                    0.00
R-squared                        0.36        0.31
Observations                     4,687       4,687

Notes: Column 1 reports coefficients corresponding to specification associated with column 4 of Table 3A.2. Column 2 reports estimates using the Lewbel method. Robust standard errors in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data.
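To make the logic of the omitted-variable-bias exercise in Table 3A.11 concrete, the sketch below applies the simplified approximation discussed in Oster (2017), $\beta^* \approx \tilde{\beta} - \delta(\beta^{\circ} - \tilde{\beta})(R^2_{max} - \tilde{R}^2)/(\tilde{R}^2 - R^{\circ 2})$, where $\circ$ denotes the uncontrolled and tilde the controlled regression. This is an assumption-laden illustration: the function names are hypothetical, the inputs are the rounded table entries, and the approximation is not the exact estimator, so the outputs need not reproduce the table.

```python
def oster_beta_star(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max, delta=1.0):
    """Approximate bias-adjusted coefficient (simplified Oster formula)."""
    movement = beta_unc - beta_ctrl  # coefficient movement when controls are added
    return beta_ctrl - delta * movement * (r2_max - r2_ctrl) / (r2_ctrl - r2_unc)


def oster_delta_for_zero(beta_unc, r2_unc, beta_ctrl, r2_ctrl, r2_max):
    """Delta at which the approximate adjusted coefficient equals zero."""
    movement = beta_unc - beta_ctrl
    return beta_ctrl * (r2_ctrl - r2_unc) / (movement * (r2_max - r2_ctrl))


# Rounded inputs taken from Table 3A.11.
inputs = dict(beta_unc=0.36, r2_unc=0.03, beta_ctrl=0.22, r2_ctrl=0.30, r2_max=0.40)
print(round(oster_beta_star(**inputs), 3))
print(round(oster_delta_for_zero(**inputs), 2))
```

With these rounded inputs the approximation gives an adjusted coefficient of roughly 0.17 and a $\delta$ of roughly 4.2, compared with 0.16 and 3 in the table; the gap plausibly reflects the use of the exact estimator and unrounded inputs in the reported results. Either way, the adjusted coefficient stays well above zero.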
Table 3A.13 Behavioral Correlates of High School Science Major

Dependent variable: Science (= 1 if student studied science in class 12)
                              (1)         (2)
Grit score                    0.077       0.100*
                              (0.051)     (0.054)
Ambiguity score               -0.003      -0.006*
                              (0.003)     (0.003)
Ambiguity experiment score    -0.082      -0.085
                              (0.078)     (0.074)
CRT score                     0.067**     0.032
                              (0.028)     (0.022)
Personality                   -0.006      -0.001
                              (0.009)     (0.009)
N                             319         313
R-squared                     0.04        0.22
Household Controls                        ✓
Demographic Controls                      ✓
Other Controls                            ✓

Notes: This table reports the results of a regression of studying science in high school on the potential behavioral correlates. Data from primary survey conducted by authors in 2017. The survey includes students from the Indian states of Andhra Pradesh (Vijayawada, Kurnool and Srikakulam districts) and Bihar (Patna, Bhagalpur and Sitamarhi districts). The sample used in this table is male only. Dependent variable is 1 if student studied science in class 12. Household controls include household size, mother completed class 10, father completed class 10, asset index, distance to closest bank, father salaried employee, mother salaried employee. Demographic controls include age, caste, religion. Other controls include city tier and state board syllabus. Standard errors clustered at school level in parentheses. Significance *** p<0.01, ** p<0.05, * p<0.1.

APPENDIX 3B ADDITIONAL TABLES AND FIGURES

Figure 3B.1 Earnings Distribution by High School Major - Second and Third Division Students
Notes: Source: Authors’ calculation using IHDS-II (2011-12) dataset.

Table 3B.1 Earnings and High School Science Major: Controlling for Parent Education

Dependent variable: Log(Earnings)
                              (1)             (2)
                              Demographics    Parent Education
Science                       0.22***         0.21***
                              (0.05)          (0.05)
Dummy: 1st Division           0.19*           0.18*
                              (0.11)          (0.11)
Dummy: 2nd Division           0.00            0.00
                              (0.08)          (0.08)
Dummy: Repeated Grade         -0.25***        -0.25***
                              (0.06)          (0.06)
Dummy: Fluent English         0.36***         0.36***
                              (0.07)          (0.07)
Dummy: Less Fluent English    0.08            0.08
                              (0.07)          (0.07)
Age                           0.03            0.03
                              (0.02)          (0.02)
Age Square                    -0.00           -0.00
                              (0.00)          (0.00)
Dummy: Married                0.07            0.07
                              (0.05)          (0.05)
Dummy: Scheduled Castes       -0.06           -0.06
                              (0.06)          (0.06)
Dummy: Scheduled Tribes       -0.25           -0.25
                              (0.17)          (0.17)
Dummy: Other Backward Class   -0.01           -0.01
                              (0.06)          (0.05)
Dummy: Muslim                 0.07            0.07
                              (0.08)          (0.08)
Dummy: Christian              0.16*           0.16*
                              (0.09)          (0.09)
HH Education                  0.04***         0.03***
                              (0.01)          (0.01)
Max Parent Education                          0.01
                                              (0.01)
Constant                      3.09***         3.07***
                              (0.44)          (0.45)
N                             2513            2513
R-squared                     0.36            0.36

Notes: This table reports regression results using equation (3.1). The main explanatory variable 𝑆𝑐𝑖𝑒𝑛𝑐𝑒 is 1 if the person studied science stream in high school, and 0 otherwise. In both columns, the sample is restricted to individuals for whom we have the maximum education values for parents. Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.

Table 3B.2 Science Majors and Employment Outcomes - III

            (1)             (2)             (3)
            Full Sample     Division I      Division II/III
PANEL H: Dependent Variable: Police and Govt. Admin Job
Science     -0.02           -0.02           -0.01
            (0.01)          (0.02)          (0.01)
Constant    -0.05           0.01            -0.04
            (0.05)          (0.10)          (0.07)
N           4,687           1,497           3,190
R-squared   0.09            0.21            0.12
PANEL I: Dependent Variable: Teaching Job
Science     0.01            0.00            0.02
            (0.01)          (0.02)          (0.01)
Constant    -0.14*          -0.17           -0.11
            (0.08)          (0.15)          (0.08)
N           4,687           1,497           3,190
R-squared   0.11            0.24            0.12
PANEL J: Dependent Variable: Accountant Job
Science     -0.02***        -0.04***        -0.01
            (0.01)          (0.01)          (0.01)
Constant    -0.03           -0.03           0.01
            (0.04)          (0.09)          (0.06)
N           4,687           1,497           3,190
R-squared   0.06            0.18            0.10
PANEL K: Dependent Variable: Nursing Job
Science     0.02**          0.02            0.02**
            (0.01)          (0.01)          (0.01)
Constant    0.00            0.04            -0.01
            (0.04)          (0.08)          (0.04)
N           4,687           1,497           3,190
R-squared   0.08            0.20            0.08

Notes: This table reports the coefficients corresponding to specification associated with column 4 of Table 3A.2. Police and Government Admin Jobs in Panel H include police and government officials. Teaching Jobs in Panel I include teachers. Accountant Jobs in Panel J include accountants. Nursing Jobs in Panel K include nursing jobs. All specifications control for ability, demographics and district fixed effects. Each column is the estimated coefficient of choosing Science major from separate regressions by divisions (Full Sample, I and II & III). Robust standard errors clustered at district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance *** p<0.01, ** p<0.05, * p<0.1.
Table 3B.3 Science Majors and Employment Outcomes - IV

Dependent Variable: PANEL L: Electrician Job

| | (1) Full Sample | (2) Division I | (3) Division II/III |
|---|---|---|---|
| Science | 0.00 (0.01) | 0.01 (0.01) | -0.00 (0.01) |
| Constant | 0.08 (0.05) | -0.01 (0.07) | 0.13** (0.06) |
| R-squared | 0.06 | 0.21 | 0.09 |
| N | 4,687 | 1,497 | 3,190 |

Dependent Variable: PANEL M: Salesman Job

| | (1) Full Sample | (2) Division I | (3) Division II/III |
|---|---|---|---|
| Science | -0.02*** (0.01) | -0.01 (0.01) | -0.03*** (0.01) |
| Constant | 0.28*** (0.07) | 0.25** (0.10) | 0.29*** (0.09) |
| R-squared | 0.09 | 0.19 | 0.10 |
| N | 4,687 | 1,497 | 3,190 |

Dependent Variable: PANEL N: Driving Job

| | (1) Full Sample | (2) Division I | (3) Division II/III |
|---|---|---|---|
| Science | -0.01 (0.01) | -0.02** (0.01) | -0.00 (0.01) |
| Constant | 0.07 (0.04) | 0.14 (0.09) | 0.05 (0.06) |
| R-squared | 0.06 | 0.18 | 0.10 |
| N | 4,687 | 1,497 | 3,190 |

Dependent Variable: PANEL O: Construction Job

| | (1) Full Sample | (2) Division I | (3) Division II/III |
|---|---|---|---|
| Science | -0.01 (0.00) | -0.01 (0.01) | -0.01 (0.01) |
| Constant | 0.05 (0.04) | -0.01 (0.03) | 0.06 (0.06) |
| R-squared | 0.11 | 0.25 | 0.15 |
| N | 4,687 | 1,497 | 3,190 |

Notes: This table reports the coefficients corresponding to the specification in column 4 of Table 3A.2. Electrician Jobs in Panel L include all electrical jobs. Salesman Jobs in Panel M include non-technical sales jobs. Driving Jobs in Panel N include drivers. Construction Jobs in Panel O include all construction-related jobs. All specifications control for ability, demographics and district fixed effects. Each column reports the estimated coefficient on choosing a Science major from a separate regression by division (Full Sample, Division I, and Divisions II & III). Robust standard errors clustered at the district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance: *** p<0.01, ** p<0.05, * p<0.1.

Table 3B.4 Survey Data Summary Statistics I

| Variable | Obs. | Mean | Std. dev. |
|---|---|---|---|
| Academic measures | | | |
| Science | 524 | 0.60 | 0.49 |
| Math score in class 10 | 373 | 70.10 | 19.02 |
| Science score in class 10 | 366 | 68.50 | 16.93 |
| English score in class 10 | 301 | 70.83 | 20.18 |
| First division in class 10 | 524 | 0.62 | 0.49 |
| Second division in class 10 | 524 | 0.14 | 0.35 |
| Third division in class 10 | 524 | 0.13 | 0.34 |
| CBSE syllabus in class 10 | 524 | 0.23 | 0.42 |
| ICSE syllabus in class 10 | 524 | 0.11 | 0.31 |
| State syllabus in class 10 | 524 | 0.65 | 0.48 |
| Behavioral characteristics | | | |
| Grit score | 524 | 3.43 | 0.62 |
| Ambiguity tolerance score | 524 | 40.29 | 8.85 |
| CRT score | 319 | 0.76 | 0.96 |
| Personality score | 524 | 31.34 | 3.35 |
| Other variables | | | |
| Student gave a lot of thought on his/her stream choice | 524 | 0.75 | 0.43 |
| Student thinks science stream is for smarter students | 524 | 0.27 | 0.44 |
| Challenging career is important for student | 524 | 0.80 | 0.40 |
| Earnings is important for student | 524 | 0.83 | 0.37 |
| Career with travel opportunities is important for student | 524 | 0.65 | 0.48 |
| Career that allows to stay in big city is important for student | 524 | 0.70 | 0.46 |
| Career that emphasizes managerial skills is important for student | 524 | 0.56 | 0.50 |
| Career that has non-transferable job is important for student | 524 | 0.52 | 0.50 |
| Parent gave a lot of thought on student’s education | 524 | 0.43 | 0.50 |
| Parent thinks stream choice is important signal | 524 | 0.59 | 0.49 |
| Parent thinks stream choice is important for job | 524 | 0.52 | 0.50 |
| Friend took science | 524 | 0.73 | 0.45 |
| Friend took commerce | 524 | 0.20 | 0.40 |
| Friends took arts | 524 | 0.11 | 0.31 |
| Referred to siblings for information | 524 | 0.41 | 0.49 |
| Referred to friends for information | 524 | 0.28 | 0.45 |

Notes: Data from primary survey conducted by authors in 2017. The survey includes grade 12 students spread across 55 schools in the Indian states of Andhra Pradesh (Vijayawada, Kurnool and Srikakulam districts) and Bihar (Patna, Bhagalpur and Sitamarhi districts). The sample includes males only.

Table 3B.5 Survey Data Summary Statistics II

| Variable | Obs. | Mean | Std. dev. |
|---|---|---|---|
| Household characteristics | | | |
| Student age | 520 | 16.93 | 0.96 |
| Bihar | 524 | 0.61 | 0.49 |
| Mother completed class 10 | 524 | 0.67 | 0.47 |
| Father completed class 10 | 524 | 0.85 | 0.36 |
| Household size | 524 | 4.84 | 2.00 |
| Distance to closest bank (in kms.) | 523 | 2.41 | 5.26 |
| Religion: Hindu | 524 | 0.89 | 0.31 |
| Religion: Muslim | 524 | 0.06 | 0.25 |
| Scheduled Caste | 524 | 0.16 | 0.37 |
| General Caste | 524 | 0.32 | 0.47 |
| Other Backward Caste | 524 | 0.51 | 0.50 |
| Tier I city | 524 | 0.41 | 0.49 |
| Tier II city | 524 | 0.35 | 0.48 |
| Tier III city | 524 | 0.23 | 0.42 |
| Electric connection | 524 | 0.98 | 0.14 |
| Land line telephone | 524 | 0.04 | 0.21 |
| Internet connection | 524 | 0.35 | 0.48 |
| Tap water supply | 524 | 0.69 | 0.46 |
| Student has access to cell phone | 524 | 0.68 | 0.47 |
| Student’s phone has internet access | 514 | 0.45 | 0.50 |

Notes: Data from primary survey conducted by authors in 2017. The survey includes grade 12 students spread across 55 schools in the Indian states of Andhra Pradesh (Vijayawada, Kurnool and Srikakulam districts) and Bihar (Patna, Bhagalpur and Sitamarhi districts). The sample includes males only.

Table 3B.6 Robustness on Age Dynamics

Dependent variable: Log(Earnings)

| | (1) Main Result | (2) Age Cohort FE | (3) Age < 40 |
|---|---|---|---|
| Science | 0.22*** (0.04) | 0.22*** (0.04) | 0.20*** (0.05) |
| N | 4687 | 4687 | 2654 |
| R-squared | 0.30 | 0.31 | 0.35 |

Notes: This table reports the coefficients corresponding to the specification in column 4 of Table 3A.2. Column 2 adds age-cohort fixed effects. Column 3 restricts the sample to men aged 40 years and below. Robust standard errors clustered at the district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance: *** p<0.01, ** p<0.05, * p<0.1.

Table 3B.7 Robustness on Migration History

Dependent variable: Log(Earnings)

| | (1) Main Result | (2) Migration |
|---|---|---|
| Science | 0.22*** (0.04) | 0.24*** (0.05) |
| Science × No-migration | | -0.06 (0.07) |
| No Migration | | -0.14*** (0.04) |
| N | 4687 | 4687 |
| R-squared | 0.30 | 0.31 |

Notes: This table reports the coefficients corresponding to the specification in column 4 of Table 3A.2. Column 2 adds a dummy for no migration history of the individual’s household and its interaction with studying science. The dummy variable “No Migration” equals 1 if the individual’s family first came to the current town/city at least 90 years ago. Robust standard errors clustered at the district level in parentheses. Source: Authors’ calculation using IHDS-II (2011-12) data. Significance: *** p<0.01, ** p<0.05, * p<0.1.

APPENDIX 3C ADDITIONAL ANALYSIS AND DETAILS

3C.1 Age cohort analysis

One concern with our analysis, which covers a population ranging from 25 to 65 years old, is that factors such as education outcomes and school availability have changed over time. To address this concern, we estimate an augmented equation that controls for age-cohort fixed effects, and we also estimate equation (1) on the subsample of individuals younger than 40 years. The results from both regressions are shown in Appendix Table 3B.6. The magnitude and significance of the coefficients from these two additional analyses (+0.22, p < 0.01 and +0.20, p < 0.01) are consistent with the main results. This exercise suggests that the dynamics of educational and employment opportunities do not significantly affect our main results.

3C.2 Description of Survey Variables

Grit: Grit is defined as perseverance and passion for long-term goals. We employ the 12-item Grit Scale of Duckworth et al. (2007). During the survey, respondents rated their agreement with each of the statements (items) in the grit scale on a 5-point scale, with 1 corresponding to ‘Very much like me’ and 5 corresponding to ‘Not like me at all’. A high score on the aggregated Grit scale indicates higher grit. Extant research has found that grit is positively associated with educational achievement, GPA scores and the probability of completing a task, which are important determinants of a successful career.
Ambiguity score: Respondents’ ambiguity tolerance was measured using the Multiple Stimulus Types Ambiguity Tolerance Scale - II (MSTAT II). This 13-item psychometric scale assesses the cognitive response of respondents to different ambiguous stimuli (McLain, 2009). Individual items were rated on a 5-point scale, with 1 corresponding to ‘Do not agree’ and 5 corresponding to ‘Completely agree’. Low scores on the Ambiguity Tolerance Scale indicate ambiguity intolerance and high scores indicate a liking for ambiguity.

Ambiguity experiment: Respondents were presented with four boxes, each containing 10 blue and red balls in varying proportions. Box 1 contained 5 red and 5 blue balls. The second, third and fourth boxes contained anywhere between 4 and 6, 2 and 8, and 0 and 10 blue balls, respectively. Respondents then picked a box from which a ball would be drawn at random; they won the game if the ball drawn was blue. From these choices, we construct binary variables by combining Box 1; Box 1 and Box 2; and Box 1, Box 2 and Box 3, which correspond to various thresholds of ambiguity aversion. The game is an adaptation of the famous “Ellsberg Paradox”, in which participants were found to prefer situations with known probabilities of events to situations with unknown probabilities.

CRT score: The Cognitive Reflection Test (CRT) is a test of how quickly respondents process and respond to basic aptitude questions, ignoring an obvious-looking incorrect answer and instead working through the question to reach the correct answer. Each correct answer was awarded one point, with a total score for each student calculated out of 3. The questions are as follows:
1) A bat and a ball cost Rs 110 in total. The bat costs Rs 100 more than the ball. How much does the ball cost?
2) If it takes 5 machines 5 minutes to make 5 phones, how long would it take 100 machines to make 100 phones?
3) In a closed container, there is an insect. Every day, the number of insects doubles. If it takes 48 days to fill the container, when was the container half filled?

Personality: This test measured respondents’ personality and non-cognitive skills. Respondents rated their agreement on a 5-point scale (with 1 corresponding to ‘Do not agree’ and 5 corresponding to ‘Completely agree’) with a set of positive statements related to personality and non-cognitive skills. These scores were aggregated across all statements to create a cumulative personality score. The statements are as follows:
1) I like to be very good at what I do.
2) I feel I can do just about anything if I put my mind to it.
3) I can be very disciplined and push myself.
4) I am often in a good mood.
5) I want to achieve more than my parents have.
6) I am looking forward to a successful career.
7) I have high goals and expectations for myself.
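The scoring rules described in 3C.2 for the CRT, the personality score and the ambiguity experiment can be illustrated with a minimal Python sketch. This is not the authors' code: the function names, the input encodings (numeric answers, box numbers 1 to 4) and the treatment of a missing answer as incorrect are all assumptions made for illustration. The CRT answer key is not stated in the text; it is worked out here from the arithmetic of the three questions. The personality score is taken to be a simple sum of the seven ratings, which is one reading of "aggregated" and gives a range of 7 to 35.

```python
# Hypothetical scoring helpers for the survey measures described in 3C.2.
# Names and input encodings are illustrative assumptions, not the authors' code.

# Answer key derived from the three CRT questions:
#   1) ball + (ball + 100) = 110      ->  ball costs Rs 5
#   2) one machine makes one phone in 5 minutes, so 100 machines
#      make 100 phones in 5 minutes   ->  5
#   3) the count doubles daily, so the container is half full
#      one day before day 48          ->  day 47
CRT_ANSWER_KEY = (5, 5, 47)


def crt_score(answers):
    """Count correct CRT answers (0 to 3); a missing answer (None) scores 0."""
    if len(answers) != len(CRT_ANSWER_KEY):
        raise ValueError("expected exactly three CRT answers")
    return sum(1 for given, correct in zip(answers, CRT_ANSWER_KEY)
               if given == correct)


def personality_score(ratings):
    """Sum the seven 1-5 ratings (assumed aggregation rule), range 7 to 35."""
    if len(ratings) != 7 or any(r not in (1, 2, 3, 4, 5) for r in ratings):
        raise ValueError("expected seven ratings, each between 1 and 5")
    return sum(ratings)


def ambiguity_indicators(box_chosen):
    """Binary variables grouping Box 1; Boxes 1 and 2; Boxes 1, 2 and 3."""
    if box_chosen not in (1, 2, 3, 4):
        raise ValueError("box must be 1, 2, 3 or 4")
    return {
        "chose_box_1": int(box_chosen <= 1),
        "chose_box_1_or_2": int(box_chosen <= 2),
        "chose_box_1_2_or_3": int(box_chosen <= 3),
    }


# Example: the intuitive but wrong answer to question 1 (Rs 10),
# with correct answers to questions 2 and 3.
print(crt_score([10, 5, 47]))
print(personality_score([5, 4, 5, 4, 5, 5, 4]))
print(ambiguity_indicators(2))
```

Under the summation assumption, the mean personality score of 31.34 reported in Table 3B.4 falls within the 7 to 35 range implied by seven items on a 5-point scale.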