THREE ESSAYS IN HEALTH ECONOMICS

By

Katlyn Christine Hettinger

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Economics – Doctor of Philosophy

2023

ABSTRACT

CHAPTER 1: Intertemporal Substitution in Response to Non-Linear Health Insurance Contracts

Health insurance contracts with high annual deductibles have become increasingly popular in the U.S. This feature of insurance contracts allows consumers to substitute healthcare in one period for healthcare in another period by, for example, increasing consumption in the year the annual deductible was met and decreasing future consumption. I obtain an estimate of the causal effect of meeting the deductible on healthcare consumption in the following year. I exploit variation in the timing of an injury that generates significant healthcare expenses and a regression discontinuity design to identify the effect of meeting the deductible. Data for the analysis are from the MarketScan database of medical claims on privately insured individuals at large firms. Estimates indicate that there is intertemporal substitution in healthcare consumption. Reaching the coinsurance arm in one year leads to $13,263 less healthcare consumed, $788 less paid out of pocket, and 7.4 fewer care dates in the following year. For those induced to consume more healthcare by reaching the coinsurance arm of their plan, I find that for every dollar of discretionary healthcare consumed in the year the coinsurance arm is reached, roughly $0.56 less is consumed in the following year.

CHAPTER 2: Postpartum Medicaid Eligibility and Postpartum Health Measures (with Claire Margerison)

Maternal mortality and morbidity in the US are high compared to similar countries, and racial/ethnic disparities exist, with many of these events occurring in the later postpartum period.
Proposed federal and recently enacted state policy interventions extend pregnancy Medicaid from covering 60 days to a full year postpartum. This work estimates the association of maintaining Medicaid eligibility in the later postpartum period (relative to only having pregnancy Medicaid eligibility) with postpartum checkup attendance and depressive symptoms using regression analysis, overall and stratified by race/ethnicity. People with postpartum Medicaid eligibility were 1.0-1.4% more likely to attend a postpartum checkup relative to those with only pregnancy Medicaid eligibility overall, primarily driven by a 3.8-4.0% higher likelihood among Hispanic postpartum people. Conversely, postpartum Medicaid is associated with a 2.2-2.3% lower likelihood of postpartum checkup attendance for Black postpartum people. Postpartum eligibility is also associated with a 9.7-11.6% lower likelihood of self-reported depressive symptoms compared to only pregnancy Medicaid eligibility, for white postpartum people only. Postpartum Medicaid eligibility is associated with some improvements in maternal health care utilization and mental health, but differences by race and ethnicity imply that inequitable systems and structures that cannot be overcome by insurance alone may also play an important role in postpartum health.

CHAPTER 3: Effects of State Medical Amnesty Policies on Alcohol Use

Medical Amnesty policies (MAP) eliminate legal consequences relating to underage drinking when minors seek emergency assistance. I exploit the variation in state policy enactment in a difference-in-differences framework to examine the effects of MAP on alcohol use. Data on self-reported drinking behaviors for 18–20-year-olds come from the 2011-2018 Behavioral Risk Factor Surveillance System. Using 95% confidence intervals, my results can rule out increases larger than 3.5 and 3.3 percentage points for drinking and binge drinking, respectively.
My main results support the conclusion that there is no significant long-term increase in underage drinking behaviors due to state MAP implementations.

ACKNOWLEDGEMENTS

This dissertation would not have been possible without the support of many. First, I would like to thank my advisor, Todd Elder, for his kindness, pep talks, and guidance. Next, thank you to Claire Margerison for providing so many opportunities and years of collaboration. Thank you to my additional committee members, John Goddeeris and Ajin Lee, for providing the most thoughtful feedback every step of the way. Thank you to the Economics faculty of Hope College for encouraging and investing in my talents. To my mentors, formal and informal, thank you for taking the time to share your wisdom. I am grateful for collaborations throughout graduate school with my coauthors Robert Kaestner, Colleen MacCallum-Bridges, Danielle Gartner, and Yasamean Zamani-Hanks.

To my cohort-mates, thank you for your support and help finding laughter throughout the journey. To my Women of Economics, I am so thankful that this community existed. Your support and friendship are unmatched.

To my parents, Jeff and Brenda Hettinger, I could not be more grateful for your love and support. Thank you for a lifetime of supporting my ambitions and investing in me to make this pursuit possible. To my family and friends, thank you for your love and support even when you had no clue what I was spending all my time doing. The sense of normalcy you provide is invaluable.

Finally, I would like to thank this program for introducing me to Alex Johann. I cannot imagine this process without you by my side every step of the way. Your love and support have gotten me through the hardest days.
TABLE OF CONTENTS

CHAPTER 1: INTERTEMPORAL SUBSTITUTION IN RESPONSE TO NON-LINEAR HEALTH INSURANCE CONTRACTS
    BIBLIOGRAPHY
    APPENDIX
CHAPTER 2: POSTPARTUM MEDICAID ELIGIBILITY AND POSTPARTUM HEALTH MEASURES
    BIBLIOGRAPHY
    APPENDIX
CHAPTER 3: EFFECTS OF STATE MEDICAL AMNESTY POLICIES ON ALCOHOL USE
    BIBLIOGRAPHY
    APPENDIX

CHAPTER 1: INTERTEMPORAL SUBSTITUTION IN RESPONSE TO NON-LINEAR HEALTH INSURANCE CONTRACTS

Introduction

Insurance plans are increasingly using annual deductibles, which force consumers to take on the full price at the beginning of the plan year, to address moral hazard. However, when individuals meet their deductible, they face a much lower out-of-pocket price based on a coinsurance rate until the start of the next calendar year, when the deductible resets. This non-linear plan design causes individuals to face sharp price changes that may create dynamic incentives. Little is currently known about the degree to which consumers respond strategically to these dynamically changing out-of-pocket prices.

Given the dynamically changing out-of-pocket price, consumers have the ability to substitute consumption from the high-price periods to the low-price period, for example, by increasing discretionary consumption after meeting the deductible and before the end of the plan year. This response includes "ex post moral hazard," where individuals change their consumption in response to the current out-of-pocket prices they face (i.e., paying only the coinsurance rate), and intertemporal substitution, where individuals pull forward consumption that they would have consumed at some point in the future in response to the change in relative prices across time periods.
The ability to avoid the deductible, at least for consumption that is discretionary and for which the timing is not significant, suggests that high-deductible plans may not be as effective in reducing consumption as generally thought.

My work contributes to this research area by investigating whether individuals decrease their healthcare consumption in the year after unexpectedly meeting their deductible in the context of private health insurance. A reduction in the following year suggests that individuals are not just increasing healthcare consumption in response to lower prices, but rather changing the timing of, or intertemporally substituting, their healthcare consumption. My research design is able to isolate what portion of the increase in spending is intertemporal substitution versus moral hazard. It is important to differentiate these responses because they may have varying implications for optimal plan design.

Further, this specific setting is significant as high deductibles have become an increasingly common part of private health insurance benefit designs in recent years. From 2010 to 2020, the share of employer-sponsored health insurance plans with a deductible over $1,000 for singles rose from 27% to 57%. Among plans with a deductible, the average deductible rose from $917 to $1,644 (Kaiser Family Foundation, 2020).

To obtain estimates of the effect of meeting the deductible, I use a fuzzy regression discontinuity design. I exploit differences in the timing of experiencing a major healthcare event (an injury) that cause differences in the timing of meeting a deductible among otherwise similar people. Those who are injured earlier in the year and hit the deductible earlier as a result have more opportunity to exploit the lower cost-sharing associated with meeting the deductible to substitute current healthcare for future healthcare.
I use the 2010-2012 IBM MarketScan Commercial Claims Database, which is well-suited to answer the research question because it follows privately insured individuals and their dependents throughout the healthcare system for three years. Further, the data include individuals from a variety of large firms and private insurers, making them more representative of a broad population than previous research using a single employer or insurer.

Results of the analysis indicate that individuals substitute healthcare consumption across years in response to dynamically changing out-of-pocket prices. Comparing those with similar injuries in late 2010 and early 2011, I find that those meeting their deductible in one year (i.e., 2011) consume $13,263 less healthcare and spend $788 less out of pocket in the following year (i.e., 2012). For those induced to consume more healthcare by reaching the coinsurance arm of their plan, I find that for every dollar of discretionary healthcare consumed in the year the coinsurance arm is reached, roughly $0.56 less is consumed in the following year.

Broadly, this work contributes to the literature on consumer responses to health insurance contract design. Within a single year, dynamic incentives, spot prices, and future prices all matter for healthcare consumption choices (Aron-Dine et al., 2015; Brot-Goldberg, 2017; Dalton et al., 2019; Guo & Zhang, 2019; Kowalski, 2016). If within-year dynamic incentives are relevant for healthcare consumption decisions, it is likely that incentives across plan years are also important to consider (Klein et al., 2022). These works focus on a single plan year either for simplicity or due to data limitations; however, below I show that certain estimates using only a single year could overstate savings from high-deductible health insurance plans.
My paper complements recent studies on intertemporal substitution in response to other insurance features, including dental insurance annual maximum benefits (Cabral, 2016), a Swedish policy eliminating primary care copayments for the elderly (Johansson, 2023), and the nonlinear contract design (Einav et al., 2015) and anticipation of program implementation (Alpert, 2016) of Medicare Part D. The most closely related work, Lin & Sacks (2019), uses the RAND Health Insurance Experiment to conclude that failing to account for intertemporal substitution could cause estimates to overstate savings from high-deductible health insurance plans by 20% or more. However, little is known about how individuals respond to non-linear health insurance contracts across years in the modern private US health insurance market.

My contributions to the literature are threefold. First, I add to the literature on responses to dynamically changing prices by documenting consumers' intertemporal substitution in response to varying out-of-pocket prices across years. While this response in the health insurance context is interesting in its own right, as health care spending is a sizable portion of the US economy, the findings in this article may also shed light on how consumers are more generally able to strategically respond to non-linear pricing schemes and substitute intertemporally. Second, I use a new fuzzy regression discontinuity design leveraging accidental injuries around the plan year change to identify the effect of meeting a deductible. Third, I contribute to the literature on consumer responses to health insurance by quantifying the causal effect of meeting the deductible on healthcare consumption in the following year.
This further demonstrates the importance of accounting for across-year intertemporal substitution in estimates of cost savings from non-linear health insurance plans and is especially important as recent works have estimated price elasticities using a single plan year, which fails to account for across-year substitution that undermines cost savings.

Background on Non-Linear Health Insurance Contracts

Non-Linear Health Insurance Contract Design

First, I discuss the simple case of an individual with a non-linear (or high-deductible) insurance plan¹, and then explain the relevant variations for family plans. The most common form a non-linear plan takes includes a deductible arm, a coinsurance arm, and a maximum out-of-pocket limit (or stoploss). The deductible is usually an annual limit, and before reaching it the consumer is responsible for the entirety of their healthcare costs.

¹ I use the term high-deductible plan broadly and do not tie it to the legal definition for Health Savings Account eligibility.

In practice, several services, for example, an annual physical or contraception, are often not included in the deductible and instead are subject to a copay or zero out-of-pocket cost. These exemptions make it less likely that effects will be observed among preventive care outcomes.² However, despite these exceptions, the majority of medical procedures and diagnostic tests are subject to the deductible.

Once the deductible is met, the coinsurance arm is reached. In this period, the insured is only responsible for a coinsurance amount. If a consumer is within the coinsurance arm of their plan and has a 20% coinsurance rate, the out-of-pocket cost for a $500 scan would be $100. Once a certain amount has been paid out of pocket (through the deductible, coinsurance, and copays), the stoploss or out-of-pocket maximum applies, and the insured person has no cost-sharing during this period.
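The three arms described above can be illustrated with a minimal sketch. The function name `out_of_pocket` and the plan parameters in the usage note are hypothetical and chosen only for illustration; the sketch assumes every charge is subject to the deductible and ignores copays and exempt services.

```python
def out_of_pocket(charges, deductible, coinsurance_rate, oop_max):
    """Return the consumer's payment for each charge, in order, within one plan year.

    Illustrative assumptions: all charges count toward the deductible,
    there are no copays, and the plan resets after the last charge.
    """
    deductible_left = deductible   # amount still owed before the coinsurance arm
    paid_so_far = 0.0              # cumulative out-of-pocket spending this year
    payments = []
    for charge in charges:
        # Deductible arm: the consumer pays the full price until the deductible is met.
        to_deductible = min(charge, deductible_left)
        deductible_left -= to_deductible
        # Coinsurance arm: the consumer pays only a share of the remaining charge.
        coinsured = (charge - to_deductible) * coinsurance_rate
        # Stoploss: total payments can never exceed the out-of-pocket maximum.
        owed = min(to_deductible + coinsured, oop_max - paid_so_far)
        paid_so_far += owed
        payments.append(owed)
    return payments


# Hypothetical plan: $1,000 deductible, 20% coinsurance, $3,000 out-of-pocket maximum.
print(out_of_pocket([500, 800, 500], 1000, 0.20, 3000))
```

In this hypothetical plan the first two charges exhaust the deductible, so the third $500 charge falls entirely in the coinsurance arm and costs the consumer 20% of its price, as in the scan example in the text.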
The complexity of the typical high-deductible plan increases in the context of a family unit. Many plans include both individual and family deductibles and out-of-pocket maximums. Most commonly, the family deductible and stoploss are two to three times the individual deductible and stoploss. Consider a family plan where each family member faces an individual deductible of $1,000 and a family deductible of $2,000. For a family of two this is equivalent to individual deductibles of $1,000, but it is not equivalent for larger families. For a larger family, the family deductible means that instead of an individual being guaranteed to pay a $1,000 deductible, they could pay a maximum of $1,000 before reaching the coinsurance arm. The coinsurance arm could be met by no family member meeting their individual deductible and instead multiple family members contributing a sum larger than the family deductible. A single family member could still hit their individual deductible of $1,000 and then individually move to the coinsurance arm of the plan. Then the threshold for the other family members to reach the coinsurance arm is either $1,000 individually or summed among the other members. I study the effect of reaching the deductible by identifying individuals that reach the coinsurance arm of their plan by meeting either their individual or family deductible.

² Additionally, the Affordable Care Act (ACA) increased the number of services insurance plans must provide without consumer cost-sharing.

Additional Related Literature

Within a single year, dynamic incentives, spot prices, and future prices all matter for healthcare consumption choices (Aron-Dine et al., 2015; Brot-Goldberg, 2017; Dalton et al., 2019; Guo & Zhang, 2019; Kowalski, 2016). If within-year dynamic incentives are relevant for healthcare consumption decisions, it is likely that incentives across plan years are also important to consider (Klein et al., 2022).
Guo & Zhang (2019) conclude that, relative to fully forward-looking behavior, the myopia of fathers in responding to nonlinear health insurance plans in the year of childbirth leads to a 21-24% decrease in annual medical spending. Brot-Goldberg (2017) exploits a firm switching from a free-healthcare to a high-deductible plan and estimates that the firm saved 11.8-13.8% on healthcare spending from switching. These works focus on a single plan year either for simplicity or due to data limitations; however, below I show that estimates using only a single year could overstate savings from high-deductible health insurance plans.

Data Description

Dataset

Data for the analysis come from the 2010-2012 IBM Truven Health MarketScan Commercial Database, which is obtained from large companies and private health insurers across the United States. The sample is not representative of the U.S. population overall but may reflect reasonably well the population of people with large-firm private insurance, a non-trivial group. Further, because the dataset comes from a variety of employers and companies across the U.S., I am able to study a broader privately insured population than studies with data from a single employer or insurer.

The MarketScan database contains all insurance claims for individuals and their spouse/dependents, including inpatient, outpatient, and pharmaceutical claims. Each claim provides information on the total amount paid by the insurer and the out-of-pocket payment, categorized as payment towards the deductible, coinsurance, or copayment. I aggregate this claim-level information to the individual level for all analyses. The dataset also includes limited demographic information on the age and sex of all individuals.

Claims data allow me to observe, in detail, the spending, diagnoses, and procedures of individuals. The major limitation of these claims data is that I do not observe individuals without any claims in a year.
However, any small claim within the year, such as an influenza vaccine or prescription refill, would lead to inclusion. To follow changes in healthcare consumption over time, I must limit my sample to those observed in all three years. This means that the sample represents a group where the primary enrollee is linked to the same employer or private insurer for three consecutive years. I also exclude individuals under the age of 18 out of concern that guardians may be more altruistic towards their children and unwilling to delay their care, but I show robustness to their inclusion in Appendix Table A1.

To select a sample of those with unexpected injuries, I use the injuries selected by Kowalski (2016), which were chosen based on the fact that individuals whose family members have the injury do not spend more on their own medical care before the injuries occur. Those that have one of these injuries would make significant progress towards, if not meet, their deductible in the year that the injury occurs. Based on their selection, this class of injuries does not appear to be strategically timed in any manner.

In Table 1, I show the identifying injuries, their ICD-9 codes, and the counts overall and by year of injury for the 90-day bandwidth sample. The most common identifying injury is sprains and strains of joints and adjacent muscles, occurring in 34 percent of the sample. Other common injuries, each occurring in roughly ten percent of the sample, include fractures, dislocations, open wounds, contusions, and complications of trauma.

Summary statistics of observed covariates overall and by injury year are displayed in Table 2. The summary statistics across the two sides of the discontinuity appear quite similar. The sample is 58% women, and the most common age range is 45-54. The mean individual deductible in 2011 is $608. Table 3 presents summary statistics of the 2012 outcomes overall.
The mean of total spending is $11,631, while the 99th percentile of total spending is $125,763.³ The out-of-pocket mean is $1,468, while the 99th percentile, $7,121, is closer to the mean in relative terms because out-of-pocket maximums exist on nearly all plans. The average number of care dates in 2012 is 16, with the majority being outpatient, while only 9% of individuals have an inpatient claim. In 2012, 31% and 60% of individuals have at least one elective or preventive claim observed, respectively.

³ See Section 3.3 for definitions of outcomes.

We would expect those with the same injuries occurring a short time apart to be very similar. However, when in the year the injury occurs determines how long the individual has to benefit from meeting their deductible and thus reaching the coinsurance arm of their plan. Therefore, an individual with an injury occurring at the end of the year would have little time to react to the increased likelihood of meeting their deductible, while an individual with the same injury at the beginning of the subsequent year could have twelve months to react to the increased likelihood of reaching their deductible. In Figure 1, I show the probability of reaching the coinsurance arm (i.e., meeting the deductible) by the first date the injury is observed. The probabilities of reaching the coinsurance arm in 2011 for those with a 2010 or 2011 injury are 70.4 and 77.6 percent, respectively (Table 4).

In my main specification, I focus on those with the first observed date of the injury occurring within 90 days of the 2010–2011 year change. This selection of a 90-day bandwidth is based on the average across outcomes for data-driven bandwidth selection methods following the procedures of Calonico, Cattaneo, and Titiunik (2014a, b, 2015b). I additionally show robustness to other bandwidths in Section 7.

Determining Deductible Amount

Another major limitation of this dataset is that it does not include specific plan details such as the deductible or out-of-pocket maximum.
But I am able to back out deductible amounts for over 90% of individuals through a few reasonable assumptions and data features, following procedures very similar to those used in Guo and Zhang (2019).

First, I assume that all family members will have the same deductible and that the family deductible is two times the individual deductible. I also have a plan key that identifies all individuals on the same insurance plan for a portion of the sample and assume that all individuals with the same plan key have the same deductible. Because deductibles are most often chosen at common increments, I assume that individual deductibles must be multiples of 25 and family deductibles must be multiples of 50.

For each individual and each family, I sum the deductible spending throughout each calendar year. If the individual sum of deductible spending is a multiple of 25 or the family sum is a multiple of 50, I consider that to be the most likely deductible. Then for each plan key, I consider the most common deductible amount to be the deductible for all with the plan key, with the exception of zero. If the most common deductible amount is zero, I consider the deductible to be zero if greater than 90% of those on the plan have a deductible of zero. If not, then the second most common deductible on the plan is chosen. This is done because some plans have deductibles but offer many deductible exemptions for things like standard office visits.

For those with no plan key, I assume the deductible is the sum of deductible spending observed for the individual (family) if it is a multiple of 25 (50). If I still have not found a deductible amount for a family in 2011, then I use the deductible estimated for 2012 or 2010, respectively, since it is reasonable to assume that those enrolled at the same employer would enroll in similar plans across years. In my main analysis, I exclude those with a deductible of $100 or less, as the instrument shows little validity for this group (Table 7).
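The plan-key step of the procedure above can be sketched as follows. This is a simplified illustration, not the code used in the analysis: the function names are hypothetical, only individual-level sums are handled (family sums and the cross-year fallback are omitted), the 90% share is assumed to be computed over all members of the plan, and ties between equally common amounts are broken arbitrarily.

```python
from collections import Counter


def candidate_deductible(deductible_spending, step=25):
    """Return the yearly deductible spending if it is a multiple of `step`, else None."""
    if deductible_spending % step == 0:
        return deductible_spending
    return None


def plan_deductible(member_sums, step=25, zero_share=0.90):
    """Infer one plan's deductible from its members' yearly deductible spending.

    Assumptions for illustration: the zero-share is taken over all members,
    and an unresolved plan returns None.
    """
    candidates = [s for s in member_sums if candidate_deductible(s, step) is not None]
    if not candidates:
        return None
    ranked = Counter(candidates).most_common()  # (amount, count), most common first
    top_amount, top_count = ranked[0]
    if top_amount != 0:
        return top_amount
    # Zero is the most common amount: accept it only if it covers more than
    # 90% of the plan, since plans with many exemptions can show zero spending.
    if top_count / len(member_sums) > zero_share:
        return 0
    # Otherwise fall back to the second most common amount, if there is one.
    return ranked[1][0] if len(ranked) > 1 else None


# Hypothetical plan: three members stop at exactly $1,000, others are still below it.
print(plan_deductible([1000, 1000, 437.12, 0, 1000, 250]))
```

A member who has paid $437.12 toward the deductible has presumably not met it yet, so that sum is not treated as a candidate, while the repeated $1,000 sums identify the plan's deductible.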
2012 Outcomes

The primary outcome of interest is total healthcare spending in year \(t + 1\) (i.e., 2012). Total spending is defined as the total amount spent by the insurance company and by the individual out of pocket on all healthcare claims (outpatient, inpatient, and pharmaceutical) in year \(t + 1\). To examine the spending most relevant to the healthcare consumer, I define a second outcome, total out-of-pocket spending on all healthcare claims (outpatient, inpatient, and pharmaceutical) in year \(t + 1\). Out-of-pocket spending is the sum of all payments by the individual towards the deductible, coinsurance, or copayments. To address concerns about the skewed distributions of these spending outcomes, I also examine both outcomes with natural log, \(\ln(1 + y_{it+1})\), and inverse hyperbolic sine, \(\operatorname{arsinh}(y_{it+1})\), transformations.

To examine a different margin and avoid concerns about the propensity to consume more medical care and the propensity to consume more expensive medical care being related, I also examine total care dates. A count of dates rather than a count of services is used because it is often difficult to distinguish separate services in claims, or they may be charged as a bundle of services. I define care dates as a count of the number of service dates in year \(t + 1\) on which an individual has outpatient or inpatient claims.

To better understand in what areas individuals respond, I examine outpatient and inpatient care dates separately, and classes of elective and preventive healthcare. Outpatient and inpatient care dates are counts of the number of service dates in year \(t + 1\) on which an individual has outpatient or inpatient claims, respectively. The elective care dates outcome is a count of the number of dates in year \(t + 1\) with elective services, defined using the elective procedures identified from Berenson-Eggers Type of Service (BETOS) codes by Clemens and Gottlieb (2014) and also used by Guo and Zhang (2019).
This class of elective services includes procedures such as cataract removal, joint replacement, colonoscopy, and minor skin procedures. The preventive care dates outcome is a count of the number of dates in year \(t + 1\) with preventive services, defined based on the Centers for Medicare & Medicaid Services' list of preventive services. Examples of these preventive services include annual wellness visits, influenza vaccinations, and disease screenings. The full list of elective and preventive services is available in the Appendix. For outcomes where zeros are common (inpatient, elective, and preventive), I also examine the outcome as an indicator of any of that type of care consumed in the year.

Tables 2 and 3 present summary statistics on the year \(t + 1\) (i.e., 2012) outcomes. Mean total healthcare spending and out-of-pocket spending in year \(t + 1\) are $11,631 and $1,468, respectively. The average number of care dates is 16.2, with outpatient and inpatient averages of 15.8 and 0.5. Only nine percent of the sample has any inpatient care, while 31 and 60 percent have any elective or preventive care, respectively.

2011 Outcomes

To give an estimate of the tradeoff between discretionary spending in the year the deductible is met and the following year, I also estimate the models for the outcomes in year \(t\) (i.e., 2011). First, I separate the outcomes in year \(t\) into those related to the injury and those that are not. For the total healthcare spending outcomes, I measure spending related to the injury as the totals from any claim that has one of the injury ICD-9 codes. I then also measure the complement, spending unrelated to the injury, which is the total of any spending in year \(t\) that does not have one of the injury ICD-9 codes. However, it is possible that the injury codes are not always used on claims for follow-up care.
This would lead to an underestimate of the tradeoff of discretionary spending because the unidentified follow-up care would inflate the denominator. To avoid issues of errors in measurement of discretionary care, I also measure all spending and care dates in year \(t\), regardless of injury. These measures in year \(t\) contain the cost of treating the original injury for those with the first date the injury is observed in the beginning of year \(t\). Follow-up care for these injuries could be contained in the measures for anyone in the sample.

In Table 4, I present averages of year \(t\) (2011) and year \(t + 1\) (2012) outcomes by the year of injury to show that the expected patterns are observed in the raw data. Year \(t\) total spending is higher for those with an injury in year \(t\) ($14,804 versus $11,761). Then in the following year, average spending is lower for those with a year \(t\) injury relative to a year \(t - 1\) injury ($11,526 versus $11,726).

Empirical Strategy

I use a fuzzy regression discontinuity design to identify the effect of meeting a deductible in one year on healthcare consumption in the following year. I exploit the fact that most health insurance deductibles reset at the beginning of the calendar year by comparing those with a class of unexpected injuries, which would be unlikely to be strategically delayed, in late 2010 and early 2011. We would expect that those who suffer injuries on either side of the calendar year change are similar, except that the year in which their injury occurs changes the probability of them meeting their deductible in 2011. My model relies on this variation to identify the effect of meeting a deductible (i.e., reaching the coinsurance arm) in one year (2011) on healthcare consumption in the following year (2012).

First Stage

I implement the fuzzy regression discontinuity using a local linear regression with a bandwidth of 90 days and a rectangular kernel.
I use an instrumental variables estimation framework (Imbens and Lemieux, 2008) where the first stage of the model is:

\[
w_{it} = \gamma_0 + \gamma_1 \mathit{Injury}_{it} + \gamma_2 \mathit{Injury}_{it-1} \cdot \mathit{Date}_{it-1} + \gamma_3 \mathit{Injury}_{it} \cdot \mathit{Date}_{it} + \boldsymbol{\gamma}_4 \boldsymbol{X}_{it}' + \upsilon_{it} \tag{1}
\]

where \(w_{it}\) is an indicator for individual \(i\) reaching the coinsurance arm of their plan in year \(t\). An individual could reach the coinsurance arm of their plan by either meeting their individual deductible or their family deductible. The year is indexed with \(t - 1\) for 2010 measures, \(t\) for 2011 measures, and \(t + 1\) for 2012 measures. The excluded instrument, \(\mathit{Injury}_{it}\), is an indicator equaling one if the injury was first observed in year \(t\) (early 2011) and zero if in year \(t - 1\) (late 2010). The running variable is represented by \(\mathit{Date}_{it-1}\) and \(\mathit{Date}_{it}\), which range from 1 to 90 in years \(t - 1\) and \(t\), respectively. The interactions \(\mathit{Injury}_{it-1} \cdot \mathit{Date}_{it-1}\) and \(\mathit{Injury}_{it} \cdot \mathit{Date}_{it}\) between the running variable and the year in which the injury occurs allow the two sides of the discontinuity to have separate slopes. The vector \(\boldsymbol{X}_{it}'\) contains age, sex, individual deductible amount, and number of family members observed on the plan, in ranges. Although there appears to be balance across the discontinuity in these observables (Figure 2), I include the deductible amount and number of family members because they are mechanical factors in predicting the likelihood of an individual reaching the coinsurance arm of the plan. I also show robustness to the exclusion of these covariates in Table 7.

Figure 1 shows a visual of the first-stage identification. Visually, the discontinuity is clear, with little overlap across the year change in the probability of meeting the deductible. The first-stage point estimate of \(\gamma_1\), 0.059, represents a roughly six percentage point increase at the discontinuity in the probability of an individual meeting their deductible in year \(t\). The first stage meets all recent standards for power, with a first-stage F-statistic of 237.0.
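As a rough illustration of how the first stage in Eq. (1) could be set up, the sketch below builds the regressors from a signed day variable and estimates them by ordinary least squares. The names `rd_design` and `first_stage`, the signed coding of `day` (negative for late 2010, positive for early 2011), and the synthetic data are all assumptions made for this example; covariates are optional and the clustered standard errors are omitted.

```python
import numpy as np


def rd_design(day, covariates=None):
    """Regressors of Eq. (1): constant, Injury_t, Injury_{t-1}*Date_{t-1}, Injury_t*Date_t, [X].

    `day` is assumed to be signed days from the year change: -90..-1 for
    injuries in year t-1 and 1..90 for injuries in year t.
    """
    day = np.asarray(day, dtype=float)
    injury_t = (day > 0).astype(float)           # excluded instrument
    date_prev = np.where(day < 0, -day, 0.0)     # slope on the year t-1 side
    date_curr = np.where(day > 0, day, 0.0)      # slope on the year t side
    cols = [np.ones_like(day), injury_t, date_prev, date_curr]
    if covariates is not None:
        cols.append(np.asarray(covariates, dtype=float))
    return np.column_stack(cols)


def first_stage(w, day, covariates=None):
    """OLS coefficients of Eq. (1); element 1 is the jump gamma_1 at the year change."""
    X = rd_design(day, covariates)
    gamma, *_ = np.linalg.lstsq(X, np.asarray(w, dtype=float), rcond=None)
    return gamma


# Synthetic data only: a 6-point jump in the probability of reaching the coinsurance arm.
rng = np.random.default_rng(0)
day = rng.choice(np.r_[-90:0, 1:91], size=20_000)
prob = 0.70 + 0.06 * (day > 0) + 0.0002 * np.abs(day)
w = rng.random(day.size) < prob
print(f"estimated jump at the year change: {first_stage(w, day)[1]:.3f}")
```

Restricting the sample to the 90 days on either side and weighting all observations equally corresponds to the 90-day bandwidth and rectangular kernel described in the text.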
To further support the validity of the research design, Figure 2 presents visualizations of the balance of covariates including the 2011 deductible, number of family members observed on the plan, birth year, and sex. None of them shows a major pattern or discontinuity at the cutoff, supporting the validity of the design. Figure 3 shows the density of observations across the first observed service date for injuries. There are clear weekly patterns in the frequencies and some variations that can be attributed to holidays. Seasonal variation in these injuries is also plausible since many of the injuries could be connected to risky behaviors. However, the density plot still appears relatively smooth across the discontinuity.

Second Stage

The second stage is as follows:

\[ y_{it+1} = \beta_0 + \beta_1 w_{it} + \beta_2 \, \mathit{Injury}_{it-1} \times \mathit{Date}_{it-1} + \beta_3 \, \mathit{Injury}_{it} \times \mathit{Date}_{it} + \boldsymbol{\beta}_4 \mathbf{X}_{it}' + \varepsilon_{it} \tag{2} \]

where \(y_{it+1}\) is a healthcare consumption outcome in year \(t+1\). The second stage contains the same date-injury year interactions and covariates as the first stage. Here \(w_{it}\), the indicator for reaching the coinsurance arm in year \(t\), is instrumented for by \(\mathit{Injury}_{it}\), which is an indicator for having an injury in year \(t\) versus \(t-1\). The estimates of \(\beta_1\) are therefore identified by the discontinuity in the probability of reaching the coinsurance arm in year \(t\) at the year change.

Results and Discussion

2012 Outcomes

Table 5 presents estimates of \(\beta_1\) based on Eqs. (1) and (2). Because injury date, \(\mathit{Date}_{it}\), is discrete, all models report standard errors clustered by injury date following the inference procedure proposed by Lee and Card (2008). The first row has outcomes in the form of dollars spent or count of care dates, the second and third rows present the natural log and inverse hyperbolic sine of spending outcomes, and the fourth row presents outcomes as an indicator for the type of care occurring during the year.
In the first column, the outcome is total spending on healthcare in year 𝑡 + 1, and the point estimate of -13,263 implies that an individual meeting their deductible in the prior year is associated with a $13,263 decrease in total spending. This effect is large relative to the sample mean of $11,631 for total spending. In the second column, the point estimate for total out-of-pocket spending is -788. This implies that an individual meeting their deductible in the prior year is associated with a $788 decrease in out-of-pocket spending. The magnitude of this estimate is still large relative to the mean of $1,468, but not proportional to the total spending estimates because of the non-linear plan structure and out-of-pocket maximums.

To remove variations in costs as a factor, I examine total care dates in the third column. I estimate that meeting the deductible in the year prior leads to a decrease of 7.4 care dates relative to a mean of 16.2 care dates. The effect appears to be driven by decreases in both outpatient and inpatient care dates. When examining a class of elective care dates, the point estimates imply a 12.3 percentage point decrease in the probability of consuming any elective care in the year after the deductible is met, relative to a mean of 31 percent. In my main specification, I fail to detect any changes in preventive care dates. This result suggests that insurer and public policy efforts to exempt preventive care from consumer cost-sharing may successfully prevent the decreases seen in other types of care.

To support the estimates of Table 5 visually, Figures 4 and 5 show the reduced-form relationship between the first date of service for the injury and outcomes. Even though these plots show only the reduced form, the discontinuity in the expected direction is visually apparent.
2011 Outcomes

I use outcomes in the year the deductible is met to (1) verify that there is an observed increase in care in the year the deductible is met, and (2) estimate the tradeoff between spending across years. The outcomes in the first three rows of estimates in Table 6 include care for the identifying injury for those with an injury in year 𝑡 and potentially follow-up care for all individuals. The fourth and fifth rows decompose total spending based on whether an injury code is present on the claim. Overall, the point estimate for total spending is $36,706, with the estimate for injury spending in year 𝑡 being $12,818 and for non-injury spending being $23,887. Relative to the mean of total spending in year 𝑡 of $13,215, these estimates are large and economically significant increases in consumption.

Combining the year 𝑡 overall estimates with those from year 𝑡 + 1 implies that, for those induced to consume more healthcare by meeting their deductible in one year, every dollar of healthcare consumed in the year the deductible is met is followed by $0.37 less consumed in the following year ($13,263/$36,706). Further, if we are interested only in the tradeoff for discretionary spending beyond that related to the injury, my estimates imply that, for the same group, every dollar of discretionary (non-injury) care consumed in the year the deductible is met is followed by $0.56 less consumed in the following year ($13,263/$23,887). From the consumer’s perspective, the out-of-pocket estimates imply that, for those induced to consume more healthcare by meeting their deductible in one year, for every dollar spent out of pocket in the year the deductible is met, $0.24 less is spent in the following year ($788/$3,288).
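The tradeoff ratios quoted above are simple quotients of the year 𝑡 + 1 and year 𝑡 point estimates. The short sketch below recomputes them from the figures reported in the text; the variable names are hypothetical. Computed from these rounded point estimates, the total-spending ratio comes out at roughly 0.36, marginally below the $0.37 quoted, which may reflect rounding or the use of unrounded estimates.

```python
# Point estimates reported in the text (Tables 5 and 6), in dollars.
decrease_total_next_year = 13_263   # total spending, year t+1
decrease_oop_next_year = 788        # out-of-pocket spending, year t+1
increase_total = 36_706             # total spending, year t
increase_non_injury = 23_887        # non-injury (discretionary) spending, year t
increase_oop = 3_288                # out-of-pocket spending, year t

# Dollars less consumed in year t+1 per dollar consumed in year t.
tradeoffs = {
    "total": decrease_total_next_year / increase_total,
    "discretionary": decrease_total_next_year / increase_non_injury,
    "out_of_pocket": decrease_oop_next_year / increase_oop,
}
for name, ratio in tradeoffs.items():
    print(f"{name}: {ratio:.3f} less per dollar in the following year")
```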
While it is difficult to categorize what care is directly related to the injury, as diagnosis and billing codes may vary, measurement error from failing to match all injury-related claims to the injury would cause an overestimate of discretionary (non-injury related) spending. Additionally, if those with injuries in year 𝑡 − 1 are participating in intertemporal substitution to any degree, the year 𝑡 spending estimates will overestimate the true effect. However, the year 𝑡 spending estimates serve as the denominator of the tradeoff, so any overestimate makes the ratio smaller, and my estimates therefore represent a conservative lower bound on the true tradeoff.

Examining elective and preventive care dates, which are unrelated to the injury, provides insights into where the increases in discretionary spending are occurring. The point estimate for elective care dates of 1.96, relative to a mean in year 𝑡 of 0.65, suggests that these elective procedures are a major channel through which this spending occurs. Interestingly, there is a 37.1 percentage point increase in the probability of consuming any preventive care relative to a mean of 59.4 percent. This contrasts with the absence of any detectable effect on preventive care in year 𝑡 + 1. A possible explanation is that increased interaction with the healthcare system increases preventive care usage, while cost-sharing exemptions are effective at minimizing decreases in preventive care due to non-linear plans.

Back-of-the-Envelope Calculation of Economic Impact

To understand the scope of this intertemporal substitution, I conduct a back-of-the-envelope calculation of the cost savings to the privately insured in the U.S. from intertemporal substitution, evaluated at the average out-of-pocket cost and using 2020 insurance rates.
I calculate

150,000,000 × 0.57 × 0.83 × 0.06 × 0.24 × $1,468 = $1,500,143,328 (3)

where there are roughly 150 million employed workers in the US, 57% of workers are covered by employer-provided insurance, 83% of those have a deductible, roughly 6% of the population are compliers, the average out-of-pocket spending is $1,468, and the estimated out-of-pocket spending tradeoff is $0.24 less spent per dollar spent in the previous year (KFF, 2020).4,5 I find that U.S. consumers are saving at least $1.5 billion per year through intertemporal substitution. While a strategic subpopulation is benefiting from these savings, all individuals with employer-provided health insurance likely bear the cost through slightly higher premiums (or lower wages).

Estimates of the effect of meeting the deductible from instrumental variables strategies that use within-year variation and a single year of data cannot capture that consumers are strategically substituting across years to consume healthcare at lower prices. Because intertemporal substitution is not accounted for in such estimates of the cost-saving benefits of high-deductible plans, extrapolating from them will overstate savings for insurers and employers. Similarly, I estimate the savings that certain single-year estimates would suggest but that health insurance companies do not actually receive. I calculate

150,000,000 × 0.57 × 0.83 × 0.37 × $11,631 × 0.06 = $18,323,744,913 (4)

where there are roughly 150 million employed workers in the US, 57% of workers are covered by employer-provided insurance, 83% of those have a deductible, roughly 6% of the population are compliers, the average of total spending is $11,631, and the estimated total spending tradeoff is $0.37 less spent per dollar consumed in the previous year (KFF, 2020).4,5 Thus, I find that estimates using only a single year would overestimate the savings of high-deductible plans by $18.3 billion.

4 While there should be standard errors on all the terms of the calculation, some are unpublished, so I am unable to provide a confidence interval.
5 These back-of-the-envelope calculations require the assumption that those with an accidental injury in my sample are representative of a broader population of those with employer-provided health insurance with a deductible.

Further, if intertemporal substitution extends beyond the year after the deductible is met, all of these estimates will underestimate the true effect. It seems plausible that intertemporal substitution would extend beyond a single year, as some types of care could be strategically timed across multiple years and some individuals may expect to meet their deductible only every few years. Additionally, the size of deductibles has continued to grow over time, which would likely lead to both increased rates and magnitudes of intertemporal substitution. Thus, these results are a conservative estimate of the economic impact of intertemporal substitution.

Sensitivity, Placebo, and Heterogeneity Analyses

Placebo Tests and Heterogeneity by Deductible Amount

In Figure 6, I show reduced-form plots of 2011 and 2012 total spending for the main sample and then for those who are excluded from the sample because they have no deductible. Those with no deductible serve as a placebo test in the sense that we would expect to see a mechanical difference in 2011 spending because they experience the same injuries. However, we would expect to see little to no change in spending in 2012 because the group has no deductible and has no incentive to intertemporally substitute unless they reached their out-of-pocket maximum in 2011, which would be unlikely for the average injury in the sample. The 2011 total spending plots look similar across groups, while the 2012 total spending plots show different patterns, as expected.
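Returning to the back-of-the-envelope calculations in Eqs. (3) and (4), the arithmetic can be reproduced from the inputs stated in the text. The sketch below is illustrative only; the function name `national_total` is hypothetical, and exact fractions are used simply to avoid floating-point rounding in the dollar totals.

```python
from fractions import Fraction

def national_total(per_person_dollars, tradeoff):
    """Scale a per-person dollar amount to the population described in the text."""
    workers = 150_000_000                         # employed workers in the US (approximate)
    share_employer_insured = Fraction(57, 100)    # covered by employer-provided insurance
    share_with_deductible = Fraction(83, 100)     # of those, share with a deductible
    share_compliers = Fraction(6, 100)            # first-stage complier share
    affected = workers * share_employer_insured * share_with_deductible * share_compliers
    return affected * tradeoff * per_person_dollars

# Eq. (3): out-of-pocket savings, using the $0.24 tradeoff and $1,468 mean.
oop_savings = national_total(1_468, Fraction(24, 100))
# Eq. (4): overstated savings, using the $0.37 tradeoff and $11,631 mean.
overstated_savings = national_total(11_631, Fraction(37, 100))

print(f"Eq. (3): ${float(oop_savings):,.0f}")
print(f"Eq. (4): ${float(overstated_savings):,.0f}")
```

Writing the calculation as a function makes it straightforward to see how the totals move when one input, such as the complier share, is varied.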
For those with a deductible greater than $100 there is a noticeable decrease in total spending at the discontinuity, as expected, while for those without a deductible there is an increase in 2012 total spending at the discontinuity of a much smaller magnitude. The fact that the placebo estimate for those without a deductible is of the opposite sign indicates that the identification is coming from those with deductibles and supports the validity of the identification strategy.

In Table 7, I conduct a heterogeneity analysis stratifying by deductible amount. The first group, those with a deductible less than or equal to $100, serves as a placebo test because a deductible that small is much less likely to induce these behavioral responses across years. I find no significant results for this group, and the instrument is weak, with a first-stage F-statistic of 0.17.

The second through fifth columns of Table 7 present the results from the main sample stratified by deductible amount. I would predict that individuals with mid-sized deductibles (i.e., $300-$999) are the most likely to meet their deductible in only some years, leading to strong incentives to substitute care across years. Those with smaller deductibles (i.e., <$300) are more likely to reach the coinsurance arm of their plan consistently. Similarly, those with much larger deductibles (i.e., ≥$1000) are less likely to meet their deductible often or from the injury. The first-stage coefficient is largest for mid-range deductibles and smaller for both smaller and larger deductibles, aligning with these predictions. Further, point estimates are largest for deductibles of $300-$999, consistent with these individuals being the most likely to meet their deductible in only some years and therefore to substitute care across years. Sample size limitations make it difficult to comment on deductibles of $1000 or more.
Sensitivity/Robustness

In my main specifications, I include covariates to address concerns about observables that may affect the likelihood of an individual meeting their deductible. I show robustness to this choice in Table 8, where the covariates in the vector 𝑿𝒊𝒕′ (age, sex, individual deductible amount, and number of family members observed on the plan in ranges) are excluded. Overall, results appear quite similar, with estimates having slightly larger magnitudes, making the main specification a conservative estimate. Specifically, the estimates without covariates for total and out-of-pocket spending of -15,085 and -925 are close to the corresponding estimates with covariates of -13,263 and -788.

Due to a lack of plan information, I assume that the family deductible is twice the individual deductible, meaning it is only relevant for families with more than two individuals enrolled. To show robustness to this assumption, I present results stratified by the number of observed family members enrolled in Table 9. Point estimates are similar and of a larger magnitude for those with one or two family members observed on the plan. Since the family deductible is not relevant for these individuals, assumptions about the family deductible are not driving results.

To address concerns about individuals strategically shifting treatment around the year change, I run two donut regression discontinuities that exclude 7 and 14 days on each side of the year change (Table 10). These specifications also address potential concerns about the uniqueness of injuries during the winter holiday season. These estimates are quite similar to the main specification, with slightly larger magnitudes for the point estimates. Similarly, Figure 7 shows estimates for bandwidths of 30-180 days. Point estimates are similar for bandwidths of roughly 60-180 days. For smaller bandwidths, standard errors increase and the proportion of individuals with 2011 injuries who may have injury care in 2012 increases.
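The donut and bandwidth checks can be sketched in the same spirit. The example below, run on simulated data, drops observations within a chosen number of days of the year change and re-estimates the ratio of the reduced-form jump to the first-stage jump for several window widths. All names and parameter values (`fuzzy_rd`, `donut`, the simulated jump and effect) are illustrative assumptions rather than the specification behind Table 10 or Figure 7.

```python
import numpy as np

def fuzzy_rd(days, treated, outcome, bandwidth=90, donut=0):
    """Wald-type fuzzy RD estimate with a rectangular kernel and an optional donut hole."""
    # Distance from the year change on each side: Jan 1 is day 0 on the right,
    # Dec 31 (coded -1) is day 0 on the left.
    dist = np.where(days >= 0, days, -days - 1)
    keep = (dist >= donut) & (dist < bandwidth)
    d, w, y = days[keep], treated[keep], outcome[keep]
    right = (d >= 0).astype(float)
    design = np.column_stack([np.ones_like(d), right, (1 - right) * d, right * d])
    first_stage = np.linalg.lstsq(design, w, rcond=None)[0][1]
    reduced_form = np.linalg.lstsq(design, y, rcond=None)[0][1]
    return reduced_form / first_stage

# Simulated data; the jump of 0.3 and the effect of -5 are assumptions.
rng = np.random.default_rng(1)
n = 80_000
days = rng.integers(-180, 180, size=n).astype(float)
right = (days >= 0).astype(float)
treated = (rng.random(n) < 0.4 + 0.3 * right).astype(float)
outcome = 2.0 + 0.01 * days - 5.0 * treated + rng.normal(size=n)

# Donut holes of 0, 7, and 14 days on each side, at the 90-day bandwidth.
donut_estimates = {hole: fuzzy_rd(days, treated, outcome, donut=hole) for hole in (0, 7, 14)}
# Bandwidths from 30 to 180 days, with no donut hole.
bandwidth_estimates = {bw: fuzzy_rd(days, treated, outcome, bandwidth=bw) for bw in (30, 60, 90, 180)}
print(donut_estimates)
print(bandwidth_estimates)
```

In this simulation the outcome is exactly linear in the running variable, so the estimates differ across specifications only through sampling noise; with real data, differences could also reflect curvature or strategic timing near the cutoff.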
Conclusion

In this work, I show that intertemporal substitution across years in response to non-linear health insurance plans exists in the modern U.S. healthcare context. This finding is especially important as high-deductible health insurance plans have become increasingly common, and many estimates of the cost savings of high deductibles come from sources that do not capture intertemporal substitution.

Using claims data following privately insured individuals over three years, I find that for every dollar of discretionary healthcare consumed in the year the coinsurance arm is reached, roughly $0.56 less is consumed in the following year by those induced to consume more care by meeting their deductible. The local average treatment effects indicate that reaching the coinsurance arm in one year leads to $13,263 less healthcare consumed and $788 less paid out of pocket in the following year. These results align with the conclusions of Lin and Sacks (2019), based on a simulation using the RAND Health Insurance Experiment, that failing to account for intertemporal substitution could cause estimates to overstate cost savings from high-deductible plans by more than 20 percent.

While my fuzzy regression discontinuity produces a local average treatment effect with strong internal validity, it does not immediately translate to total cost estimates for the entire population. To understand the scope of this intertemporal substitution, I conduct a back-of-the-envelope calculation and find that U.S. consumers are saving at least $1.5 billion per year through intertemporal substitution. Further, single-year estimates would overstate the national savings from high-deductible plans by at least $18.3 billion. This underscores that ignoring across-year intertemporal substitution would cause many previous estimates based on a single year to overstate the cost-saving benefits of high-deductible plans.

TABLES AND FIGURES

Figure 1.1.
Probability of Reaching Coinsurance Arm (Meeting Deductible) in 2011

Figure 1.2. Balance Tests

Figure 1.3. Number of Individuals Observed by First Service Date of Injury

Figure 1.4. Reduced Form Plots for Spending Outcomes

Figure 1.5. Reduced Form Plots for Care Date Outcomes

Figure 1.6. Reduced-Form Plot Placebo Test

Figure 1.7. Estimates for 30-180 Day Bandwidths
[Plot not reproduced. Axis ticks: -30,000 to 40,000 (vertical); 30 to 180 (horizontal).]

Table 1.1. Identifying Injuries in Sample, Overall and By Year of Injury

| Injuries from Kowalski (2016) | ICD-9 | Total Injuries (Percent) | Late 2010 Injuries (Percent) | Early 2011 Injuries (Percent) |
|---|---|---|---|---|
| Entire Sample | | 254,902 (100) | 133,069 (100) | 121,833 (100) |
| Fractures | 800-829 | 23,847 (9.4) | 12,187 (9.2) | 11,660 (9.6) |
| Dislocation | 830-839 | 24,742 (9.7) | 13,509 (10.2) | 11,233 (9.2) |
| Sprains and Strains of Joints and Adjacent Muscles | 840-849 | 87,431 (34.3) | 44,608 (33.5) | 42,823 (35.1) |
| Intracranial Injuries, Excluding Skull Fractures | 850-859 | 4,334 (1.7) | 2,099 (1.6) | 2,235 (1.8) |
| Internal Injury of Thorax, Abdomen, and Pelvis | 860-869 | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Open Wounds | 870-899 | 25,453 (10.0) | 13,749 (10.3) | 11,704 (9.6) |
| Injury to Blood Vessels | 900-904 | 267 (0.1) | 131 (0.1) | 136 (0.1) |
| Late Effects of Injuries, Poisonings, Toxic Effects, and Other External | 905-909 | 1,018 (0.4) | 609 (0.5) | 409 (0.3) |
| Superficial Injuries | 910-919 | 13,535 (5.3) | 7,748 (5.8) | 5,787 (4.7) |
| Contusion with Intact Skin Surface | 920-924 | 23,908 (9.4) | 12,256 (9.2) | 11,652 (9.6) |
| Crushing Injuries | 925-929 | 675 (0.3) | 383 (0.3) | 292 (0.2) |
| Foreign Body Injuries | 930-939 | 5,031 (2.0) | 2,685 (2.0) | 2,346 (1.9) |
| Burns | 940-949 | 2,346 (0.9) | 1,268 (1.0) | 1,078 (0.9) |
| Injuries to Nerves and Spinal Cord | 950-957 | 1,165 (0.5) | 651 (0.5) | 514 (0.4) |
| Complications of Trauma | 958-959 | 32,188 (12.6) | 15,882 (11.9) | 16,306 (13.4) |
| Poisoning by Drugs, Medicinal and Biological Substances | 960-979 | 2,254 (0.9) | 1,117 (0.8) | 1,137 (0.9) |
| Toxic Effects of Substances Chiefly Nonmedicinal and Other External | 980-995 | 17,729 (7.0) | 9,177 (6.9) | 8,552 (7.0) |
| Complications of Surgical and Medical Care, Not Elsewhere Classified | 996-999 | 16,789 (6.6) | 8,469 (6.4) | 8,320 (6.8) |

Table 1.2. Covariate Summary Statistics, Overall and By Year of Injury

| | Overall | 2010 Injury | 2011 Injury |
|---|---|---|---|
| 2011 Deductible (in USD) | | | |
| Mean | 607.9 | 606.8 | 609.1 |
| 101-199 | 6.9 | 6.7 | 7.0 |
| 200-299 | 18.6 | 18.8 | 18.3 |
| 300-399 | 20.0 | 20.2 | 19.9 |
| 400-499 | 6.6 | 6.5 | 6.7 |
| 500-749 | 22.0 | 22.1 | 21.8 |
| 750-999 | 5.3 | 5.2 | 5.4 |
| 1000-1249 | 9.7 | 9.5 | 9.9 |
| 1250-1499 | 3.3 | 3.1 | 3.5 |
| 1500-1749 | 4.3 | 4.3 | 4.2 |
| 1750-1999 | 0.4 | 0.5 | 0.4 |
| 2000-2499 | 1.2 | 1.2 | 1.1 |
| 2500-2999 | 0.6 | 0.6 | 0.6 |
| 3000-4999 | 1.1 | 1.2 | 1.1 |
| 5000-10000 | 0.0 | 0.0 | 0.0 |
| Sex | | | |
| Male | 42.0 | 42.3 | 41.7 |
| Female | 58.0 | 57.7 | 58.3 |
| Age (in 2010) | | | |
| 18-34 | 24.7 | 24.4 | 25.0 |
| 35-44 | 24.6 | 24.5 | 24.8 |
| 45-54 | 33.0 | 33.3 | 32.6 |
| 55-64 | 17.7 | 17.9 | 17.6 |
| Number of Family Members Enrolled | | | |
| 1 | 20.0 | 19.5 | 20.6 |
| 2 | 23.6 | 24.0 | 23.3 |
| 3 | 18.1 | 18.2 | 18.0 |
| 4 | 22.7 | 22.8 | 22.7 |
| 5-6 | 13.4 | 13.4 | 13.4 |
| 7-8 | 1.7 | 1.7 | 1.7 |
| ≥9 | 0.4 | 0.4 | 0.4 |

Table 1.3. Summary Statistics of Year 𝒕 + 𝟏 (2012) Outcomes

| 2012 Outcomes | Mean | SD | P1 | P25 | P50 | P75 | P99 |
|---|---|---|---|---|---|---|---|
| Total Spending | 11631 | 32495 | 52 | 1368 | 3867 | 10403 | 125763 |
| Out of Pocket | 1468 | 1900 | 0 | 383 | 958 | 2038 | 7121 |
| Care Dates | 16.17 | 19.43 | 0 | 5 | 10 | 21 | 88 |
| Outpatient | 15.75 | 18.39 | 0 | 5 | 10 | 20 | 83 |
| Inpatient | 0.53 | 3.64 | 0 | 0 | 0 | 0 | 11 |
| Elective | 0.54 | 1.47 | 0 | 0 | 0 | 1 | 5 |
| Preventive | 1.08 | 1.31 | 0 | 0 | 1 | 2 | 5 |
| 1(Inpatient>0) | 0.09 | 0.28 | 0 | 0 | 0 | 0 | 1 |
| 1(Elective>0) | 0.31 | 0.46 | 0 | 0 | 0 | 1 | 1 |
| 1(Preventive>0) | 0.60 | 0.49 | 0 | 0 | 1 | 1 | 1 |

Table 1.4. Means of Outcomes Overall and by Year of Injury 2011 Outcomes 2010 2011 2012 Outcomes 2010 2011 (𝑦!"
) Overall Injury Injury (𝑦!"#$ ) Overall Injury Injury Probability of 73.83 70.42 77.55 Meeting Deductible Total Spending 13,215 11,761 14,804 Total Spending 11,631 11,726 11,526 Out of Pocket 1,583 1,461 1,717 Out of Pocket 1,468 1,474 1,461 Care Dates 19.08 18.02 20.23 Care Dates 16.17 16.32 16.01 Outpatient 18.52 17.57 19.57 Outpatient 15.75 15.89 15.60 Inpatient 0.69 0.56 0.83 Inpatient 0.52 0.53 0.52 Elective 0.65 0.59 0.72 Elective 0.54 0.55 0.53 Preventive 1.08 1.07 1.08 Preventive 1.08 1.08 1.08 1(Inpatient>0) 0.110 0.096 0.126 1(Inpatient>0) 0.087 0.087 0.087 1(Elective>0) 0.375 0.337 0.416 1(Elective>0) 0.309 0.313 0.305 1(Preventive>0) 0.594 0.594 0.595 1(Preventive>0) 0.603 0.603 0.604 32 Table 1.5. Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes Total Outpatien Inpatient Elective Prevent. Outcome Total Total Out Care t Care Care Care Care Form Spending of Pocket Dates Dates Dates Dates Dates - 𝑦!"#$ -788.4** -7.39** -6.19* -1.56*** -0.36* -0.20 13,263*** (5,005) (367.5) (3.50) (3.29) (0.59) (0.19) (0.18) ln(1 + 𝑦!"#$ ) -0.503* -0.559** - - - - - (0.276) (0.235) 𝑎𝑟𝑠𝑖𝑛ℎ(𝑦!"#$ ) -0.504* -0.577** - - - - - (0.277) (0.277) - 𝟏(𝑦!"#$ > 0) - - - - -0.123* -0.0667 0.0863** (0.042) (0.064) (0.059) N 254,902 254,902 254,902 254,902 254,902 254,902 254,902 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑦!"#$ 11,631 1,468 16.17 15.8 0.52 0.54 1.08 𝑆𝐷 𝑜𝑓 𝑦!"#$ 32,495 1,900 19.42 18.4 3.64 1.47 1.31 Note: Each coefficient is estimated from a single regression and represents 𝛽$ in Eq. (2). All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 33 Table 1.6. Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 Outcomes Total Total Outpatien Inpatient Elective Prevent. Total Outcome Form Out of Care t Care Care Care Care Spending Pocket Dates Dates Dates Dates Dates 36,706** 3,288** 23.09** 2.159** 1.962** 𝑦!" 21.46*** 0.122 * * * * * (5,938) (263.5) (3.838) (3.654) (0.686) (0.277) (0.216) 4.423** ln(1 + 𝑦!" 
) 4.896*** - - - - - * (0.344) (0.212) 4.502** 𝑎𝑟𝑠𝑖𝑛ℎ(𝑦!" ) 4.905*** - - - - - * (0.345) (0.216) 0.486** 1.544** 0.371** 𝟏(𝑦!" > 0) - - - - * * * (0.0372) (0.104) (0.0427) 𝑦!" 𝑜𝑛 𝐼𝑛𝑗𝑢𝑟𝑦 12,818** - - - - - - (5,297) 23,887** 𝑦!" 𝑁𝑜𝑡 𝑜𝑛 𝐼𝑛𝑗𝑢𝑟𝑦 - - - - - - * (1,918) N 254,902 254,902 254,902 254,902 254,902 254,902 254,902 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑦!" 13,215 1,583 19.08 18.52 0.691 0.651 1.075 𝑆𝐷 𝑜𝑓 𝑦!" 32,861 1900 20.4 19.12 4.187 1.450 1.343 Note: Each coefficient is estimated from a single regression and represents 𝛽! in Eq. (2) with year 𝒕 (2011) outcomes. All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 34 Table 1.7. Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes Deductible (d) 500 ≤ d < 1000 ≤ d 2012 𝑂𝑢𝑡𝑐𝑜𝑚𝑒 (𝑦!"#$ ) 0 < d ≤ 100 100 < d < 300 300 ≤ d < 500 1000 < 10000 Total Spending -342,356 -7,004 -18,791** -15,106** -9,222 (878,255) (9,629) (8,970) (7,498) (15,000) Total Out of Pocket -18,383 -644.3 -294.9 -1,788*** 148.8 (47,906) (465.1) (357.1) (596.0) (1,195) Total Care Dates -76.56 -1.561 -6.719 -14.19*** -2.410 (234.5) (6.255) (4.870) (5.240) (8.845) Outpatient Care -26.37 -0.605 -5.138 -12.75*** -1.919 Dates (130.7) (5.971) (4.575) (4.854) (8.252) Inpatient Care Dates -61.27 -1.248 -1.890** -1.887** -0.839 (150.8) (1.268) (0.889) (0.896) (1.488) Elective Care Dates -0.0106 0.0699 -0.190 -0.579** -0.665 (11.12) (0.485) (0.383) (0.263) (0.509) Preventive Care -8.509 0.164 -0.249 -0.389 -0.132 Dates (22.02) (0.329) (0.364) (0.260) (0.0987) Sample Size 27,538 64,822 67,919 69,492 52,669 First-stage 0.004 0.048 0.058 0.073 0.054 coefficient First-stage F- 0.17 43.33 93.48 91.48 50.69 statistics Note: Each coefficient is estimated from a single regression and represents 𝛽! in Eq. (2) stratified by deductible amount. All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 35 Table 1.8. 
Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes without Covariates Total Total Outpatient Inpatient Elective Prevent. Outcome Total Out of Care Care Care Care Care Form Spending Pocket Dates Dates Dates Dates Dates - - 𝑦!"#$ -925.2** -8.954** -3.822 -0.486** -0.356* 15,085*** 1.594*** (5,133) (463.9) (3.822) (3.640) (0.578) (0.203) (0.209) ln(1 + 𝑦!"#$ ) -0.712** -0.709** (0.322) (0.309) - 1(𝑦!"#$ > 0) -0.185** -0.130* 0.0880** (0.0418) (0.0718) (0.0728) 𝑀𝑒𝑎𝑛 𝑜𝑓 𝑦!"#$ 11,631 1,468 16.17 15.8 0.52 0.54 1.08 𝑆𝐷 𝑜𝑓 𝑦!"#$ 32,495 1,900 19.42 18.4 3.64 1.47 1.31 Note: Each coefficient is estimated from a single regression and represents 𝛽! in Eq. (2) without covariates (deductible amount, number of family members observed on plan, age, sex). All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 36 Table 1.9. Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes by Number of Family Members Observed on Plan Number of Family Members Observed on Plan 2012 𝑂𝑢𝑡𝑐𝑜𝑚𝑒 (𝑦!"#$ ) 1-2 3-4 ³5 Total Spending -18,762*** -5,424 -12,603 (6,221) (8,241) (10,279) Total Out of Pocket -844.9* -754.6 -721.6 (434.5) (571.1) (564.7) Total Care Dates -11.23** -4.094 -1.523 (4.395) (5.164) (6.888) Outpatient Care Dates -9.909** -3.713 1.100 (4.212) (4.827) (6.370) Inpatient Care Dates -1.782** -0.494 -3.209** (0.765) (0.885) (1.297) Elective Care Dates -0.676*** 0.0273 -0.163 (0.245) (0.371) (0.475) Preventive Care Dates -0.520** 0.165 0.0568 (0.213) (0.364) (0.351) Sample Size 111,247 104,121 39,534 First-stage coefficient 0.070 0.049 0.056 First-stage F-statistics 125.68 71.33 47.00 Note: Each coefficient is estimated from a single regression and represents 𝛽! in Eq. (2) stratified by number of family members observed on the plan. All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 37 Table 1.10. 
Donut Hole Specification of Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes

| 2012 Outcome (y_{t+1}) | Donut hole length in each year: 7 days | Donut hole length in each year: 14 days |
|---|---|---|
| Total Spending | -17,300*** (5,877) | -22,092*** (7,060) |
| Total Out of Pocket | -1,176*** (404.2) | -1,524*** (471.8) |
| Total Care Dates | -8.973** (4.115) | -14.38*** (4.765) |
| Outpatient Care Dates | -7.386* (3.948) | -12.44*** (4.535) |
| Inpatient Care Dates | -2.051*** (0.637) | -2.400*** (0.719) |
| Elective Care Dates | -0.351 (0.238) | -0.699*** (0.268) |
| Preventive Care Dates | -0.297 (0.213) | -0.312 (0.279) |
| Sample Size | 234,965 | 215,995 |
| First-stage coefficient | 0.058 | 0.059 |
| First-stage F-statistic | 191.46 | 118.51 |

Note: Each coefficient is estimated from a single regression and represents 𝛽1 in Eq. (2) with donut holes of 7 and 14 days. Standard errors, in parentheses, are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.01.

BIBLIOGRAPHY

Alpert, A. (2016). The anticipatory effects of Medicare Part D on drug utilization. Journal of Health Economics, 49, 28-45.

Aron-Dine A, Einav L, Finkelstein A, Cullen M. Moral Hazard In Health Insurance: Do Dynamic Incentives Matter? Rev Econ Stat. 2015 Oct;97(4):725-741.

Brot-Goldberg Z, Chandra A, Handel B, Kolstad J. What does a Deductible Do? The Impact of Cost-Sharing on Health Care Prices, Quantities, and Spending Dynamics. The Quarterly Journal of Economics. 2017;132(3):1261-1318.

Cabral, M. (2016). Claim Timing and Ex Post Adverse Selection. The Review of Economic Studies, 84(1), 1-44.

Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014a). Robust Nonparametric Confidence Intervals for Regression-Discontinuity Designs. Econometrica, 82(6), 2295–2326.

Calonico, S., Cattaneo, M. D., & Titiunik, R. (2014b). Robust Data-Driven Inference in the Regression-Discontinuity Design. The Stata Journal, 14(4), 909–946.

Calonico, S., Cattaneo, M. D., & Titiunik, R. (2015). Optimal Data-Driven Regression Discontinuity Plots. Journal of the American Statistical Association, 110(512), 1753–1769.

U.S. Centers for Medicare and Medicaid Services. Preventive & Screening Services, https://www.medicare.gov/coverage/preventive-screening-services.

Clemens J, Gottlieb JD. Do physicians’ financial incentives affect medical treatment and patient health? American Economic Review. 2014;104(4):1320-1349.

Dalton, C. M., Gowrisankaran, G., & Town, R. J. (2019). Salience, Myopia, and Complex Dynamic Incentives: Evidence from Medicare Part D. The Review of Economic Studies.

Einav L, Finkelstein A, Schrimpf P. The Response Of Drug Expenditure To Non-Linear Contract Design: Evidence From Medicare Part D. Q J Econ. 2015 May;130(2):841-899.

Einav L, Finkelstein A. Moral Hazard in Health Insurance: What We Know and How We Know It. J Eur Econ Assoc. 2018 Aug;16(4):957-982.

Guo A, Zhang J. What to expect when you are expecting: Are health care consumers forward-looking? J Health Econ. 2019 Sep;67:102216.

Imbens, G. W., & Lemieux, T. (2008). Regression discontinuity designs: a guide to practice. Journal of Econometrics, 142(2), 615-635.

Johansson, N., de New, S. C., Kunz, J. S., Petrie, D., & Svensson, M. (2023). Reductions in out-of-pocket prices and forward-looking moral hazard in health care demand. Journal of Health Economics, 87, 102710.

Kaiser Family Foundation, Health Research and Educational Trust, 2020. Employer Health Benefits, 2020 Annual Survey.

Klein, T. J., Salm, M., & Upadhyay, S. (2022). The response to dynamic incentives in insurance contracts with a deductible: Evidence from a differences-in-regression-discontinuities design. Journal of Public Economics, 210.

Kowalski A. Censored Quantile Instrumental Variable Estimates of the Price Elasticity of Expenditure on Medical Care. J Bus Econ Stat. 2016 Jan 2;34(1):107-117.

Lee, D. S., & Card, D. (2008). Regression discontinuity inference with specification error. Journal of Econometrics, 142(2), 655–674.

Lin, H., & Sacks, D. W. (2019). Intertemporal substitution in health care demand: Evidence from the RAND Health Insurance Experiment. Journal of Public Economics, 175, 29-43.

APPENDIX

• Count of Elective Care Dates in 2012 is a count of the number of elective service dates in 2012, with elective services defined using the Berenson-Eggers Type of Service (BETOS) codes for elective procedures from Clemens and Gottlieb (2014), which are also used by Guo and Zhang (2019).
o List of BETOS codes used: P2A: Major procedure, cardiovascular – CABG, P2C: Major procedure, cardiovascular – thrombo-endarterectomy, P2D: Major procedure, cardiovascular – coronary angioplasty (PTCA), P3B: Major procedure, orthopedic – hip replacement, P3C: Major procedure, orthopedic – knee replacement, P4B: Eye procedure – cataract removal/lens insertion, P5A: Ambulatory procedures – skin, P5B: Ambulatory procedures – musculoskeletal, P6A: Minor procedures – skin, P6B: Minor procedures – musculoskeletal, P8A: Endoscopy – arthroscopy, P8B: Endoscopy – upper gastrointestinal, P8C: Endoscopy – sigmoidoscopy, P8D: Endoscopy – colonoscopy, P8E: Endoscopy – cystoscopy, P8F: Endoscopy – bronchoscopy, P8G: Endoscopy – laparoscopic cholecystectomy, P8H: Endoscopy – laryngoscopy, I4A: Imaging/procedure – heart including cardiac catheter
• Count of Preventive Care Dates in 2012 is a count of the number of preventive service dates in 2012, with preventive services defined based on the Centers for Medicare and Medicaid Services’ list of preventive services.
o Preventive Service Categories include: Alcohol Misuse Screening & Counseling, Annual Wellness Visit, Bone Mass Measurements, Cardiovascular Disease Screening Tests, Cervical Cancer Screening, Colorectal Cancer Screening, 41 Counseling to Prevent Tobacco Use, Depression Screening, Diabetes Screening, Diabetes Self-Management Training, Flu Shot & Administration, Glaucoma Screening, Hepatitis B Screening, Hepatitis B Shot & Administration, Hepatitis C Screening, HIV screening, IBT for Cardiovascular Disease, IBT for Obesity, Initial Preventive Physical Exam, Ling Cancer Screening, STI Screening & HIBC to Prevent STIs, Screening Pelvic Exams, Ultrasound AAA Screening 42 Table A1.1. Effect of Meeting Deductible in Year 𝒕 on Year 𝒕 + 𝟏 Outcomes Sample Exclusions Group Included along with Main Sample 2012 𝑂𝑢𝑡𝑐𝑜𝑚𝑒 (𝑦!"#$ ) Under 18 Deductible < 100 Total Spending -9,888*** -15,270*** (3,748) (5,426) Total Out of Pocket -476.0* -892.6** (261.5) (399.3) Total Care Dates -4.74** -7.79** (2.42) (3.65) Outpatient Care Dates -3.79* -6.28* (2.30) (3.42) Inpatient Care Dates -1.21*** -1.94*** (0.40) (0.63) Elective Care Dates -0.27* -0.36* (0.15) (0.21) Preventive Care Dates -0.13 -0.25 (0.12) (0.19) Sample Size 375,739 282,440 First-stage coefficient 0.063 0.054 First-stage F-statistics 370.35 211.81 Note: Each coefficient is estimated from a single regression and represents 𝛽! in Eq. (2) with varying observations included. All standard errors are clustered at the first service date of the injury. *p<0.10, **p<0.05, ***p<0.10. 43 Table A1.2. Main Estimate Stratified by Identifying Injury and Main Sample Excluding Each Injury Late Outco Effects me Dislocati Intracran Open of Superfic Contusi Form Fractures ons Sprains ial Wounds Vessels Poison ial on Only 41,812** 36,972** 14,054* 96,520* 𝑦!" 
21,729* 463,632 6,367 21,594 9,086 * * * ** (10,174) (11,569) (7,044) (30,952) (11,890 (545,816 (63,755) (17,127) (7,941) ) ) 𝑦!"#$ -2,115 -14,409 -2,978 -1,395 -18,813 -97,881 -34,581 4,786 -5,328 (8,948) (10,760) (7,897) (14,468) (12,384 (248,501 (52,880) (16,476) (9,413) ) ) N 23,847 24,742 87,431 4,334 25,453 267 1,018 13,535 23,908 First- stage coeffici ent 0.105 0.059 0.041 0.141 0.080 0.073 -0.067 0.057 0.076 First- stage SE (0.011) (0.009) (0.006) (0.027) (0.011) (0.079) (0.056) (0.014) (0.011) First- stage F- statistic 88.79 39.13 44.94 27.50 56.03 0.85 1.45 16.70 44.75 Less 35,703** 36,522** 46,137* 33,549* 38,755* 35,850* 36,542* 37,055* 40,597* 𝑦!" * * ** ** ** ** ** ** ** (6,700) (6,365) (6,950) (5,982) (7,201) (5,904) (5,887) (6,123) (6,747) - - - - - - - - - 𝑦!"#$ 15,539** 14,932* 13,910* 12,707* 13,185* 13,374* 14,339* 14,413* 13,501** * * ** * ** ** ** ** (5,910) (5,340) (5,940) (5,178) (5,823) (5,037) (4,953) (5,202) (5,394) N 231,055 230,160 167,471 250,568 229,449 254,635 253,884 241,367 230,994 First- stage coeffici ent 0.054 0.059 0.069 0.057 0.056 0.059 0.059 0.058 0.057 First- stage SE (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) F-stat 194.95 209.62 250.94 229.50 199.65 236.88 237.29 213.75 201.74 44 Table A1.2. (cont’d) Outcome Complicati Form Crushing Foreign Burns Nerve Trauma Poison Toxic ons Only 𝑦!" 11,789 69,844 173,041 273,968 16,365* 192,216 2,987 148,992*** (14,675) (109,289 (168,140 (335,290 (8,760) (144,225 (31,920) (30,440) ) ) ) ) - 𝑦!"#$ 2,155 -70,976 64,271 -130,958 -58,502 -59,856 -32,788 13,939** (12,367) (102,413 (86,344) (208,248 (6,693) (130,720 (36,410) (22,414) ) ) ) N 675 5,031 2,346 1,165 32,188 2,254 17,729 16,789 First- stage coefficien t 0.218 0.022 0.034 0.038 0.101 0.051 0.045 0.079 First- stage SE (0.073) (0.024) (0.034) (0.043) (0.009) (0.030) (0.015) (0.011) F-stat 8.78 0.81 1.02 0.78 123.71 2.94 9.27 54.60 Less 36,948** 36,496** 36,022** 36,139** 42,769** 35,320** 38,653** 𝑦!" 
22,336*** * * * * * * * (5,974) (5,970) (5,944) (5,940) (7,066) (6,109) (5,775) (4,533) - - - - - - - 𝑦!"#$ 13,335** 13,736** 12,819** 13,199** -12,279*** 12,900** 13,146** 10,877** * * * * (5,045) (5,042) (5,051) (4,946) (6,460) (4,979) (5,069) (4,210) N 254,227 249,871 252,556 253,737 222,714 252,648 237,173 238,113 First- stage coefficien t 0.059 0.059 0.059 0.059 0.052 0.059 0.060 0.058 First- stage SE (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) (0.004) F-stat 233.05 241.53 234.24 234.85 176.78 233.44 274.96 197.96

CHAPTER 2: POSTPARTUM MEDICAID ELIGIBILITY AND POSTPARTUM HEALTH MEASURES

Introduction

While maternal mortality has been falling in recent decades for nearly all high-income countries, it has risen in the United States,¹ and occurrences of severe maternal morbidity more than doubled from 1988-89 to 2010-11.² Additionally, within the United States there exist significant racial disparities in maternal mortality, with 37.3 deaths per 100,000 births for non-Hispanic (NH) Black people compared to 14.9 for NH white people in 2018.³ Because many of these deaths and morbidity events occur after pregnancy, the postpartum period is a key time in which preventive care and interaction with the health care system are necessary and may offer opportunities to reduce racial inequities.⁴

Of growing interest to policymakers at both the state⁵ and federal level⁶ is the possibility of expanding the Medicaid coverage that many low-income pregnant people receive to cover the entire first year postpartum. Since the 1980s and 1990s, Medicaid has prioritized low-income pregnant people and now covers over 40% of births in the US.³ The income guideline for pregnant people to qualify for Medicaid varies by state but must be at least 138% of the federal poverty level (FPL) and is over 300% FPL in some states. However, this pregnancy Medicaid coverage typically ends 60 days after delivery.
After that period, non-disabled, adult postpartum people can only retain Medicaid coverage if they qualify under income eligibility thresholds for parental Medicaid, which are typically much less generous. There is the option to purchase subsidized Marketplace coverage for those between 100-400% FPL, but there is no option available for those below 100% FPL in states that did not expand Medicaid. This gap in coverage between pregnancy and parental Medicaid coverage means that many postpartum people may lose or change health insurance during a particularly vulnerable period and may thus lose contact with the health care system.⁷ In fact, Johnston and colleagues found that 21.9% of people enrolled in Medicaid for prenatal care became uninsured 2-6 months postpartum.⁸

The American Rescue Plan Act gives states the option to extend pregnancy Medicaid to a full year postpartum, and federal legislation has also proposed a requirement of this extension, but there is limited knowledge on the potential effects of these policies. A recent study using American Community Survey (ACS) data estimated that 28% of uninsured and 16% of privately insured postpartum people would gain eligibility if all pregnancy Medicaid was extended to a full year.⁹ A recent policy report from Gordon and coauthors estimates that 720,000 postpartum people would increase their Medicaid coverage to the full postpartum year.¹⁰ Thus, we suspect that extending Medicaid to a full year postpartum has the potential to extend coverage to a substantial number and percent of postpartum people with Medicaid-covered births. The policies of interest increase Medicaid eligibility, although eligibility does not always translate into Medicaid coverage. Importantly, however, we do not know whether extending pregnancy Medicaid to the first year postpartum would improve health care utilization or health outcomes for people who would gain eligibility.
A comparison between Colorado (which raised its parental Medicaid limit from 105% to 138% FPL in 2014) and Utah (which did not change its parental Medicaid eligibility threshold in this time period) found that new mothers in Colorado were more likely to utilize outpatient care,¹¹ but we do not know whether these findings would generalize nationally or differ by race/ethnicity. To address this gap in the literature, we use a multistate sample to compare, among people who qualify for Medicaid during pregnancy, the likelihood of 1) attending a postpartum checkup and 2) self-reported postpartum depressive symptoms between those who are eligible for Medicaid in the later postpartum period and those who are not. We examine impacts of postpartum Medicaid eligibility on both outcomes overall and stratified by race/ethnicity to assess whether impacts of a postpartum Medicaid extension would contribute to reducing racial and ethnic inequities in postpartum health.

Methods

Data: We use the only multistate postpartum survey, the Centers for Disease Control and Prevention's Pregnancy Risk Assessment Monitoring System (PRAMS) Phases 7 & 8, which provides a representative sample of people with live births for the years 2012 to 2018 from 42 participating states (N = 253,865). Approximately 97% of responses occur in our period of interest, i.e., 3-12 months postpartum, with 90% occurring in the period 3-6 months postpartum.

Eligibility Measures: We focus on measuring Medicaid eligibility (not Medicaid coverage) because the policies of interest target Medicaid eligibility criteria (e.g., allowing people eligible for pregnancy Medicaid to keep coverage for a year postpartum). People who are eligible for Medicaid may choose to enroll and be covered by Medicaid, may use another source of insurance, or may not use any health insurance.
Therefore, a policy increasing Medicaid eligibility does not lead directly to an increase of the same magnitude in Medicaid coverage, but eligibility expansions are still likely to increase Medicaid utilization. We estimate respondents’ eligibility for a) pregnancy Medicaid and b) parental Medicaid by comparing self-reported household income for the year prior to birth to the FPL.¹²,¹³ While we use the term parental Medicaid, this measure captures eligibility for the most generous Medicaid option of either Medicaid for low-income adults or parents specifically (i.e., it captures state Medicaid expansions for all adults). In PRAMS, household income is provided in ranges, so we calculate eligibility using both the minimum and maximum of the range. (Using the minimum of the income range, we will capture all respondents that are eligible but may also capture some ineligible respondents. Using the maximum, we will include only eligible respondents but may exclude some eligible respondents.) This method allows us to define two groups of pregnancy Medicaid-eligible people: those with postpartum Medicaid eligibility and those falling in the pregnancy-parental Medicaid gap. Postpartum Medicaid-eligible people are defined as those eligible for both pregnancy and parental Medicaid coverage, which allows them to maintain Medicaid eligibility throughout the year postpartum. People in the pregnancy-parental Medicaid gap are defined as those who qualify for pregnancy Medicaid but not for parental Medicaid. Thus, those in the pregnancy-parental eligibility gap currently lose Medicaid eligibility around 60 days postpartum and represent people who would benefit from an extension of pregnancy Medicaid.

Sample: We limit our analytic sample to those we estimate eligible for pregnancy Medicaid based on household income, year, and state of residence.
To provide a range of estimates, we construct two samples to determine Medicaid eligibility: one using the minimum and one using the maximum of the income ranges. We exclude mothers younger than 18 because they are likely eligible for programs targeted towards children. We also exclude observations with missing data on the maternal characteristics we use as covariates, including income and household size (Figure A2.1). For each outcome separately, we exclude observations with a missing value for postpartum checkup or depressive symptoms. Specific sample sizes for each outcome and each subsample based on the minimum or maximum of the income range are displayed in Figure A2.1.

Outcome Measures: Our outcome measures are postpartum checkup attendance and self-reported postpartum depressive symptoms. Postpartum checkup is constructed as a binary variable that equals 1 if the respondent answered yes to “Since your new baby was born, have you had a postpartum checkup for yourself? A postpartum checkup is the regular checkup a woman has about 4-6 weeks after she gives birth.” and 0 otherwise. While the recommended time of 4-6 weeks postpartum for a checkup falls within the pregnancy Medicaid coverage period, any difficulty scheduling or attending the appointment could cause it to fall outside the pregnancy Medicaid coverage period. Self-reported postpartum depressive symptoms is constructed as a binary variable that equals 1 if a postpartum person reports “Always” or “Often” to either “Since your new baby was born, how often have you felt down, depressed, or hopeless?” or “Since your new baby was born, how often have you had little interest or little pleasure in doing things you usually enjoyed?” and 0 otherwise.

Statistical Analysis: To study the association of postpartum Medicaid eligibility compared to only pregnancy Medicaid eligibility with an outcome, we use a linear probability model.
Our exposure of interest is an indicator for whether a postpartum person is postpartum Medicaid-eligible or in the pregnancy-parental Medicaid gap. Models also include respondents’ years of education, age, race/ethnicity, parity, marital status, and income as covariates. We include income as a percentage of the FPL as a covariate to account for the fact that postpartum-eligible people have lower income on average than those who fall into the pregnancy-parental eligibility gap. All statistical analyses use the provided survey weights, which account for nonresponse, noncoverage, and stratification by state and other sampling factors.

Sensitivity Analyses: In Table A6, we additionally show specifications using only observations where Medicaid is the payer noted on the birth certificate rather than estimating pregnancy Medicaid eligibility. In Table A7, we show results for births occurring from 2015-2018, the time period after which all required components of the ACA (Affordable Care Act) had been implemented, so that ACA-related changes in insurance cannot confound our findings.

Results

Just under 50% of the sample is NH white, around 19% of the sample is NH Black, and around 23% of the sample is Hispanic, with other racial/ethnic groups each making up less than 5% of the sample. Postpartum checkup attendance is highest for NH white people compared to NH Black and Hispanic people, overall and across all our insurance eligibility classifications. Self-report of postpartum depressive symptoms is highest for NH Black people compared to NH white and Hispanic people, overall and across all our insurance eligibility classifications (Table 2.1, Table A3). Results of regression analyses are displayed in Table 2.2 overall and stratified by race/ethnicity, with additional racial and ethnic groups (NH Native American/Alaskan Native/Hawaiian Native, NH Asian, NH Mixed/Other) displayed in Tables A4 & A5.
We present all estimates as ranges of the point estimates from the two samples created by using the minimum and maximum of an income range; where two p-values are listed, the first refers to the minimum sample and the second to the maximum sample. Among postpartum people who would have been eligible for pregnancy Medicaid, having postpartum Medicaid eligibility (relative to having only pregnancy Medicaid) was associated with a 0.9 to 1.2 percentage point higher likelihood of postpartum checkup attendance (p<0.1, p<0.01). For NH white postpartum people, there was a marginally significant association between having postpartum Medicaid eligibility and postpartum checkup attendance (0.6 to 0.9 percentage point increase; p>0.1, p<0.1). For Hispanic postpartum people, there was a larger positive association (3.2 to 3.4 percentage point increase; p<0.01, p<0.01). For NH Black postpartum people, there was a negative association between postpartum Medicaid eligibility and likelihood of reporting postpartum checkup attendance (1.9 to 2.0 percentage point decrease; p<0.05).

Having postpartum Medicaid eligibility compared to only pregnancy Medicaid eligibility was associated with a 1.2 percentage point lower likelihood of self-reported postpartum depressive symptoms overall (p<0.05, p<0.01). The association between postpartum Medicaid eligibility and postpartum depressive symptoms was negative and statistically significant for NH white people (1.6 to 1.8 percentage point decrease; p<0.01) but not statistically significant for NH Black and Hispanic people.

Our regression analyses are robust to defining our sample based on Medicaid-covered births rather than pregnancy Medicaid eligibility and to using a 2015-2018 sample (instead of 2012-2018), in that results are all of the same sign and similar magnitudes to our main results (Tables A6 & A7).

Discussion

In this article, we found that, for people who qualify for pregnancy Medicaid, maintaining Medicaid eligibility in the postpartum period is associated with a higher likelihood of attending a postpartum checkup, with the largest difference in likelihood among Hispanic people.
Postpartum Medicaid eligibility is also associated with a lower likelihood of self-reported postpartum depressive symptoms, with the largest difference in likelihood among NH white people.

For people eligible for Medicaid during pregnancy, postpartum Medicaid eligibility is associated with a 1.0 to 1.4% increase in postpartum checkup attendance. When stratifying by race, however, this finding was positive and statistically significant only among Hispanic people, suggesting that insurance coverage may represent a larger barrier to utilization of care for Hispanic people compared to other groups. On the other hand, postpartum Medicaid eligibility is associated with a decrease in postpartum checkup attendance for NH Black people. This counterintuitive finding suggests that health insurance coverage may not translate to health care utilization equitably across race and may be driven by structural racism in the health care system and discriminatory medical care, resulting in medical distrust or delay of care.¹⁵ Our results of an increase in postpartum checkup attendance overall align with the increased use of outpatient care previously found using a difference-in-differences framework to study a single state’s Medicaid expansion.¹¹

Postpartum Medicaid eligibility is associated with a decrease in self-reported postpartum depressive symptoms of 7.4 to 7.6% among people who are eligible for pregnancy Medicaid. These improvements appear to be driven by a negative association of 9.7 to 11.6% for NH white postpartum people.
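The relative changes quoted in this discussion appear to be consistent with dividing the percentage point estimates by the corresponding sample means reported in Table 2.2. The text does not state this conversion explicitly, so the following Python sketch rests on that assumption; the function and variable names are illustrative only.

```python
def relative_change(pp_diff, baseline_mean_pct):
    """Percent change implied by a percentage point difference,
    relative to a baseline mean that is itself expressed in percent."""
    return 100.0 * pp_diff / baseline_mean_pct


# (label, percentage point estimate, sample mean), values copied from Table 2.2
ROWS = [
    ("All, checkup, min sample", 0.9, 86.8),
    ("All, checkup, max sample", 1.2, 86.1),
    ("All, depressive symptoms, min sample", -1.2, 15.7),
    ("All, depressive symptoms, max sample", -1.2, 16.3),
    ("NH white, depressive symptoms, min sample", -1.8, 15.5),
    ("NH white, depressive symptoms, max sample", -1.6, 16.5),
]


def summarize(rows=ROWS):
    """Return the relative change for each row, rounded to one decimal place."""
    return {label: round(relative_change(pp, mean), 1) for label, pp, mean in rows}


if __name__ == "__main__":
    for label, pct in summarize().items():
        print(f"{label}: {pct:+.1f}%")
```

Under this assumption, the overall checkup estimates of 0.9 and 1.2 percentage points correspond to the 1.0 to 1.4% range, and the depressive symptom estimates correspond to the 7.4 to 7.6% and 9.7 to 11.6% ranges.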
Potential mechanisms by which health insurance eligibility may impact mental health include reduced financial stress and increased access to affordable treatment to address symptoms.¹⁶ Additionally, being eligible for public insurance reduces the pressure to return to work to maintain employer-provided insurance, which may improve mental health by allowing time to heal from delivery or adjust to life changes.¹⁷ Associations between postpartum Medicaid eligibility and postpartum depressive symptoms are still negative but of a smaller magnitude or close to zero and not statistically significant for NH Black and Hispanic people. In our data, NH Black people have a higher occurrence of self-reported postpartum depressive symptoms relative to NH white people, but unlike NH white people, Medicaid eligibility in the later postpartum period does not have a significant association with improved symptoms. These results suggest that policies beyond extending postpartum Medicaid eligibility are necessary to address postpartum depression for NH Black people. Policies addressing structural racism and discrimination in multiple sectors, including housing, education, employment, health care, and criminal justice, are likely also necessary to increase equity across race and ethnicity in postpartum health outcomes.¹⁸

Our analyses are primarily limited by the data available in PRAMS. First, we lack usable data from 8 states and DC, which prevents us from having a fully national sample. The absence of data for California and Florida, which together accounted for 31.4% of Hispanic births in 2018, should be noted when interpreting results for the Hispanic population. Another limitation of our data is that household income is only reported for the year prior to birth while Medicaid eligibility would be recalculated for many around 60 days postpartum. Further, we are missing income information for 10% of our sample and income is reported in ranges.
To address income being reported in ranges, we produce estimates using both the minimum and maximum of income ranges and show that results are similar. Additionally, we do not have data on citizenship or lawful resident status, so we must assume all postpartum people would meet these eligibility criteria despite 6% of births being to undocumented immigrants in 2016.¹⁹ Further, our postpartum checkup variable is not the ideal indicator for health care access or utilization from 60 days to a year postpartum because the checkup is intended to occur around 6 weeks postpartum; however, it is the best measure we have of postpartum health care utilization. Conversely, a major strength of our study is that we use PRAMS data, which is the only multistate source of data focusing on the experiences of people with a recent live birth. Another strength of our study is that we focus on eligibility measures, not reported insurance coverage, because the policies of interest target eligibility criteria.

Implications for Policy

Currently, the American Rescue Plan Act of 2021 gives states the option to extend pregnancy Medicaid to a full year postpartum using a state plan amendment beginning on April 1, 2022. Moreover, recent legislation at the federal level proposed a requirement for pregnancy Medicaid to be extended to 12 months postpartum in all states.⁶ Twenty-seven states including DC have implemented, and 7 states are considering, such state plan amendments or section 1115 waivers (as of 11-10-22),⁵ and the federal policy debate continues. Yet, little empirical evidence exists on whether postpartum extension of Medicaid will achieve the intended goals of reducing maternal morbidity and mortality, coverage gaps, and racial disparities, making findings from our study critically important to this policy debate.
Conclusions

To our knowledge, the current study is the first to provide multistate estimates of the associations between continuous Medicaid eligibility in the later postpartum period and postpartum health outcomes. Our results suggest that postpartum Medicaid eligibility is associated with some improvements in maternal health care utilization and mental health. However, differences by race and ethnicity imply that inequitable systems and structures that cannot be overcome by insurance alone may also play an important role in postpartum health. Thus, more comprehensive policies beyond insurance eligibility may be necessary to improve maternal health outcomes.

Table 2.1. Survey-weighted means of maternal characteristics among those with pregnancy Medicaid eligibility

Column key: (1) All Pregnancy Medicaid Eligible; (2) Fall in Eligibility Gap; (3) Postpartum Medicaid Eligible; (4) Chi2 Test p-valueᵇ

                              Using Minimum of Income Rangesᵃ     Using Maximum of Income Rangesᵃ
                              (1)     (2)     (3)     (4)         (1)     (2)     (3)     (4)
Outcomes
  Postpart. Checkup           86.8    89.7    85.3    0.00        86.1    86.7    85.6    0.00
  Postpart. Depress. Symptoms 15.7    13.1    16.9    0.00        16.3    16.1    16.5    0.23
Maternal Characteristics
 Race/Ethnicity
  NH white                    49.2    58.8    44.7    0.00        47.2    51.2    43.8    0.00
  NH Black                    18.7    14.3    20.8    0.00        19.4    18.8    20.0    0.00
  Hispanic                    22.9    19.4    24.7    0.00        24.1    22.4    25.5    0.00
  NH NA/NAK/NHI               1.4     1.0     1.5     0.00        1.4     1.3     1.5     0.00
  NH Asian                    4.2     3.4     4.6     0.00        4.2     3.0     5.3     0.00
  NH Mixed/Other              3.6     3.1     3.8     0.00        3.6     3.4     3.8     0.02
 Marital Status
  Married                     44.4    60.9    36.8    0.00        41.8    47.0    37.3    0.00
 Parity
  Previous Birth              68.1    69.8    67.3    0.00        68.4    67.6    69.2    0.00
 Age
  18-19                       5.8     3.5     6.8     0.00        6.2     6.2     6.2     0.74
  20-24                       29.5    24.9    31.6    0.00        19.2    31.0    30.1    0.05
  25-29                       31.7    33.5    30.8    0.00        30.2    31.4    31.2    0.66
  30-34                       21.0    24.8    19.2    0.00        29.9    20.3    20.1    0.63
  35-39                       9.7     10.7    9.2     0.00        14.4    8.9     9.8     0.00
  ≥ 40                        2.4     2.7     2.3     0.01        3.1     2.2     2.5     0.04
 Years of Education
  0-8                         4.4     2.8     5.2     0.00        4.8     3.8     5.7     0.00
  9-11                        13.6    7.8     16.2    0.00        14.6    12.6    16.4    0.00
  12                          36.0    30.7    38.4    0.00        37.3    36.3    38.2    0.00
  13-15                       33.6    38.5    31.3    0.00        32.8    34.7    31.2    0.00
  ≥ 16                        12.4    20.2    8.8     0.00        10.4    12.6    8.5     0.00
 Household Income
  0-49% FPL                   39.9    1.6     57.8    0.00        5.1     2.7     7.1     0.00
  50-99% FPL                  30.3    35.3    28.0    0.00        61.5    49.2    72.2    0.00
  100-149% FPL                19.6    33.0    13.4    0.00        21.9    24.8    19.3    0.00
  150-199% FPL                8.2     24.0    0.9     0.00        10.0    19.9    1.4     0.00
  200-249% FPL                1.6     4.9     0.0     0.00        1.3     2.8     0.0     0.00
  250-299% FPL                0.5     1.3     0.0     0.00        0.3     0.6     0.0     0.00
  ≥300% FPL                   0.0     0.0     0.0     0.00        0.0     0.0     0.0     0.00

NOTES: ᵃ Min. and max. samples are the samples constructed using either the minimum or maximum of the income range. See Methods Appendix for further details. ᵇ Reports the p-value for a Chi2 test of the differences in survey-weighted means for the Fall in Eligibility Gap and Postpartum Medicaid Eligible groups.

Table 2.2.
Differences in postpartum health measures associated with postpartum Medicaid eligibility among those with pregnancy Medicaid eligibility

                      All                                    NH white
                      Postpartum      Postpartum             Postpartum      Postpartum
                      Checkup         Depressive Symptoms    Checkup         Depressive Symptoms
Postpartum Medicaid Eligibleᵃ: Percentage Point Difference (Beta x 100)ᵇ
  Min. Sampleᶜ        0.9*            -1.2**                 0.6             -1.8***
  (95% CI)            (-0.0, 1.8)     (-2.1, -0.2)           (-0.5, 1.6)     (-3.1, -0.6)
  Max. Sampleᶜ        1.2***          -1.2***                0.9*            -1.6***
  (95% CI)            (0.4, 2.0)      (-2.0, -0.3)           (-0.1, 2.0)     (-2.8, -0.4)
Observations
  Min. Sampleᶜ        123,441         123,800                49,550          49,635
  Max. Sampleᶜ        111,646         111,975                42,810          42,879
Mean
  Min. Sampleᶜ        86.8            15.7                   88.0            15.5
  Max. Sampleᶜ        86.1            16.3                   87.3            16.5

                      NH Black                               Hispanic
                      Postpartum      Postpartum             Postpartum      Postpartum
                      Checkup         Depressive Symptoms    Checkup         Depressive Symptoms
Postpartum Medicaid Eligibleᵃ: Percentage Point Difference (Beta x 100)ᵇ
  Min. Sampleᶜ        -2.0**          0.3                    3.2***          -1.3
  (95% CI)            (-3.9, -0.1)    (-2.1, 2.6)            (0.8, 5.6)      (-3.5, 0.9)
  Max. Sampleᶜ        -1.9**          -0.3                   3.4***          -0.9
  (95% CI)            (-3.5, -0.4)    (-2.1, 1.6)            (1.3, 5.5)      (-2.7, 0.9)
Observations
  Min. Sampleᶜ        29,249          29,340                 25,089          25,205
  Max. Sampleᶜ        27,448          27,535                 23,609          23,722
Mean
  Min. Sampleᶜ        86.2            18.9                   84.9            12.0
  Max. Sampleᶜ        85.8            19.4                   84.7            12.1

NOTES: Significance: *p<0.10, **p<0.05, ***p<0.01. ᵃ Postpartum Medicaid eligible refers to those eligible for both pregnancy and parental Medicaid. Reference group is those that fall in the pregnancy-parental Medicaid eligibility gap (i.e., eligible for pregnancy Medicaid and ineligible for parental Medicaid). ᵇ Models control for household income as a percent of FPL, years of education, age, race/ethnicity, marital status, and parity. ᶜ Min. and max. samples are the samples constructed using either the minimum or maximum of the income range. See Methods Appendix for further detail.

BIBLIOGRAPHY

MacDorman MF, Declercq E, Cabral H, Morton C. Recent Increases in the U.S. Maternal Mortality Rate: Disentangling Trends From Measurement Issues.
Obstet Gynecol. 2016 Sep;128(3):447-455.

Geller SE, Koch AR, Garland CE, MacDonald EJ, Storey F, Lawton B. A global view of severe maternal morbidity: Moving beyond maternal mortality. Reproductive Health. 2018;15(S1).

Martin JA, Hamilton BE, Osterman MJK. Births in the United States, 2019. NCHS Data Brief. 2020 Oct;(387):1-8.

ACOG Committee on Obstetric Practice. Optimizing Postpartum Care. ACOG. 2018 May;(736).

Medicaid postpartum coverage extension tracker [Internet]. KFF. 2021 [cited 2021 Dec 18]. Available from: https://www.kff.org/medicaid/issue-brief/medicaid-postpartum-coverage-extension-tracker/

Sullivan J, Bailey A, Wagner J. Build back better legislation makes major Medicaid improvements [Internet]. Center on Budget and Policy Priorities. 2021 [cited 2021 Dec 18]. Available from: https://www.cbpp.org/research/health/build-back-better-legislation-makes-major-medicaid-improvements

Eliason EL, Daw JR, Allen HL. Association of Medicaid vs Marketplace Eligibility with Maternal Coverage and Access With Prenatal and Postpartum Care. JAMA Netw Open. 2021 Dec 1;4(12):e2137383.

Johnston EM, McMorrow S, Alvarez Caraveo C, Dubay L. Post-ACA, More than One-Third of Women with Prenatal Medicaid Remained Uninsured before or after Pregnancy. Health Aff (Millwood). 2021 Jan;40(4):571-57.

Johnston EM, Haley JM, McMorrow S, Kenney GM, Thomas TW, Pan CW, et al. Closing Postpartum Coverage Gaps and Improving Continuity and Affordability of Care through a Postpartum Medicaid/CHIP Extension. Urban Institute. 2021 Jan.

Gordon S, Sugar S, Chen L, Peters C, De Lew N, Sommers BD. Medicaid After Pregnancy: State-Level Implications of Extending Postpartum Coverage. Assistant Secretary of Planning and Evaluation Office of Health Policy. 2021 Dec 7.

Gordon SH, Sommers BD, Wilson IB, Trivedi AN. Effects Of Medicaid Expansion On Postpartum Coverage And Outpatient Utilization. Health Aff (Millwood). 2020 Jan;39(1):77-84.
Medicaid and CHIP income eligibility limits for Pregnant Women, 2003-2021 [Internet]. KFF. 2021 [cited 2021 Dec 18]. Available from: https://www.kff.org/medicaid/state-indicator/medicaid-and-chip-income-eligibility-limits-for-pregnant-women/

Medicaid income eligibility limits for parents, 2002-2021 [Internet]. KFF. 2021 [cited 2021 Dec 18]. Available from: https://www.kff.org/medicaid/state-indicator/medicaid-income-eligibility-limits-for-parents/

Martin JA, Hamilton BE, Osterman MJK, Driscoll AK. Births: Final Data for 2018. Natl Vital Stat Rep. 2019 Nov;68(13):1-47.

Chambers BD, Arega HA, Arabia SE, Taylor B, Barron RG, Gates B, et al. Black women’s perspectives on structural racism across the reproductive lifespan: A conceptual framework for measurement development. Maternal and Child Health Journal. 2021;25(3):402–13.

Miller S, Hu L, Kaestner R, Mazumder B, Wong A. The ACA Medicaid expansion in Michigan and financial health. J Policy Anal Manage. 2021;40(2):348–75.

Kornfeind KR, Sipsma HL. Exploring the Link between Maternity Leave and Postpartum Depression. Women’s Health Issues. 2018 Jul-Aug;28(4):321-326.

Bailey ZD, Krieger N, Agénor M, Graves J, Linos N, Bassett MT. Structural racism and health inequities in the USA: evidence and interventions. Lancet. 2017 Apr 8;389(10077):1453-1463.

Passel JS, Cohn DV, Gramlich J. U.S. births to unauthorized immigrants have fallen since 2007 [Internet]. Pew Research Center; 2020 [cited 2021 Dec 18]. Available from: https://www.pewresearch.org/fact-tank/2018/11/01/the-number-of-u-s-born-babies-with-unauthorized-immigrant-parents-has-fallen-since-2007/

Levis B, Sun Y, He C, Wu Y, Krishnan A, Bhandari PM, et al. Accuracy of the PHQ-2 alone and in combination with the PHQ-9 for screening to detect major depression. JAMA. 2020;323(22):2290.

Liu CH, Tronick E. Prevalence and predictors of maternal postpartum depressed mood and anhedonia by race and ethnicity. Epidemiology and Psychiatric Sciences. 2013;23(2):201–9.
Margerison CE, Hettinger K, Kaestner R, Goldman-Mellor S, Gartner D. Medicaid expansion associated with some improvements in Perinatal Mental Health. Health Affairs. 2021;40(10):1605–11.

APPENDIX

Methods Appendix

Objectives: To estimate the association between maintaining Medicaid eligibility in the later postpartum period and postpartum checkup attendance and depressive symptoms among the pregnancy Medicaid eligible.

Sample: To construct the sample for our regression analyses, we begin with all PRAMS Phase 7 & 8 data, which sample live births from 2012-2018 (N = 253,865). First, we exclude mothers younger than 18 because they are likely eligible for programs targeted towards children (n = 4,306). Then we exclude women without usable data for household income and size because they are needed to calculate Medicaid eligibility (n = 23,948). We also exclude observations without the maternal characteristics we use as covariates: education level, age, race/ethnicity, marital status, and parity (n = 10,285). For each outcome separately, we exclude observations with a missing value for postpartum checkup (n = 1,142) or depressive symptoms (n = 689). Thus, prior to Medicaid eligibility calculations, we have a sample size of N = 214,184 and N = 214,637 for postpartum checkup and depressive symptoms, respectively.

Eligibility Calculation Details

To determine whether a woman is eligible for pregnancy and parental Medicaid, we compare self-reported household income for the year prior to birth to the federal poverty level (FPL). In PRAMS, household income is provided in ranges, so we calculate eligibility using both the minimum and maximum of the range. When using the maximum of the income range, we assume that those in the top income range are ineligible due to top coding. Due to limitations of the top-coded income ranges, we focus on family sizes below 11, which covers 99.9% of the sample.
We take income and transform it to a percentage of the FPL based on family size, state of residence, and year of birth. To calculate pregnancy eligibility, we use the reported number of dependents on family income for the year prior to birth. For parental eligibility we assume that the birth(s) would be added as a family member in eligibility calculations (infant no longer alive: add none, singleton birth: add one, twins: add two, triplets or more: add three). We then compare income as a percentage of the FPL to the pregnancy and parental thresholds for Medicaid eligibility based on state and year. This method gives us measures of whether a woman is eligible for pregnancy and parental Medicaid for both the minimum and maximum of her reported income range. Of the 215,326 usable observations, we estimate N = 124,237 and N = 112,379 to be eligible for pregnancy Medicaid using the minimum and maximum of the income ranges, respectively. (Using the minimum of the income range, we will include all that are eligible but may also include some individuals who are not actually eligible. Using the maximum of the range, we will include only those that are eligible but exclude some individuals who may be eligible.)

Outcomes:

Postpartum Checkup Attendance

Question: Since your new baby was born, have you had a postpartum checkup for yourself? A postpartum checkup is the regular checkup a woman has about 4-6 weeks after she gives birth.

Responses:
• No
• Yes

Our postpartum checkup outcome is defined as whether the respondent reported having attended a postpartum checkup. Postpartum checkups are recommended to occur 4-6 weeks postpartum and are the primary point of health care interaction in the postpartum period.⁴ While the recommended visit timing does occur within the timeframe of pregnancy Medicaid eligibility, any difficulties in attending the appointment or delays in scheduling could lead to the appointment falling outside of pregnancy Medicaid coverage.
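The eligibility calculation described earlier in this appendix can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' code: the poverty guideline amounts and the 200% and 138% thresholds are hypothetical placeholders (the actual values vary by state and year), the function names are invented for the example, and treating income exactly at a threshold as eligible is an assumption.

```python
def fpl_dollars(family_size, base=12000.0, per_additional=4000.0):
    """Poverty guideline in dollars for a family size.
    The dollar amounts are hypothetical placeholders, not official guidelines."""
    return base + per_additional * (family_size - 1)


def pct_fpl(income, family_size):
    """Income expressed as a percentage of the poverty guideline."""
    return 100.0 * income / fpl_dollars(family_size)


def classify(income, family_size, births, pregnancy_limit, parental_limit):
    """Return 'ineligible', 'gap', or 'postpartum_eligible' for one income value.
    income=None stands for a top-coded range, assumed ineligible as in the text."""
    if income is None:
        return "ineligible"
    # Pregnancy eligibility uses the family size reported for the year before birth.
    if pct_fpl(income, family_size) > pregnancy_limit:
        return "ineligible"
    # Parental eligibility adds the newborn(s): none, one, two, or at most three.
    postpartum_size = family_size + min(births, 3)
    if pct_fpl(income, postpartum_size) <= parental_limit:
        return "postpartum_eligible"
    return "gap"


def classify_range(income_min, income_max, family_size, births,
                   pregnancy_limit=200.0, parental_limit=138.0):
    """Classify a respondent twice, once at each end of the reported income range."""
    return {
        "min_sample": classify(income_min, family_size, births,
                               pregnancy_limit, parental_limit),
        "max_sample": classify(income_max, family_size, births,
                               pregnancy_limit, parental_limit),
    }


if __name__ == "__main__":
    # A two-person family with a singleton birth and a reported range of $26,000-$30,000.
    print(classify_range(26000, 30000, family_size=2, births=1))
```

With these placeholder values the same respondent is classified as postpartum eligible in the minimum sample and as falling in the eligibility gap in the maximum sample, which illustrates why the two samples can yield different group sizes.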
Self-Reported Postpartum Depressive Symptoms
Depression Question: Since your new baby was born, how often have you felt down, depressed, or hopeless?
Anhedonia Question: Since your new baby was born, how often have you had little interest or little pleasure in doing things you usually enjoyed?
Responses:
• Always
• Often
• Sometimes
• Rarely
• Never

Our self-reported postpartum depressive symptoms indicator is constructed as yes if a postpartum woman reports “Always” or “Often” to either the Depression or Anhedonia Question and no otherwise. While this measure is not a standardized screening measure, it is very similar to the Patient Health Questionnaire-2 (PHQ-2), which is used for depression screenings and has been shown to have good validity.20 Further, previous research has used these PRAMS survey questions to measure postpartum maternal mental health.21, 22

Analysis Details:
For the PRAMS Phases 7 & 8 sample (2012-2018 births) of pregnancy Medicaid eligible mothers, we estimate the equation:

$$Y_i = \beta_0 + \beta_1 \, \text{Postpartum Medicaid Eligible}_i + \boldsymbol{\beta}_2 \mathbf{X}_i + \varepsilon_i ,$$

where $Y_i$ represents our postpartum outcomes; $\text{Postpartum Medicaid Eligible}_i$ is an indicator variable for whether a postpartum woman is eligible for parental Medicaid based on our eligibility calculations; and $\mathbf{X}_i$ is a vector of maternal characteristics including income as a percent of the FPL, education level, age, race/ethnicity, marital status, and parity. We include household income as a percent of the FPL (0-49, 50-99, 100-149, 150-199, 200-249, 250-299, ≥300) and additionally include the following maternal characteristics as covariates: years of education (0-8, 9-11, 12, 13-15, ≥ 16), age (18-19, 20-24, 25-29, 30-34, 35-39, ≥40), race/ethnicity (Non-Hispanic white, NH Black, Hispanic, NH Native American/Alaskan Native/Hawaiian Native, NH Asian, NH Mixed/Other), parity (previous live birth, no previous live birth), and marital status (married, unmarried).
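The indicator construction and the regression above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the estimation code behind the tables: the function names are hypothetical, the covariates are passed in as an already-encoded numeric matrix, and the model is fit as weighted least squares with the survey weights, which reproduces survey-weighted point estimates but not design-based standard errors.

```python
import numpy as np

# Responses that count as reporting depressive symptoms.
HIGH_FREQUENCY = {"Always", "Often"}


def depressive_symptoms(depression_answer, anhedonia_answer):
    """1 if either question is answered 'Always' or 'Often', else 0."""
    return int(depression_answer in HIGH_FREQUENCY
               or anhedonia_answer in HIGH_FREQUENCY)


def weighted_lpm(y, eligible, covariates, weights):
    """Weighted linear probability model.

    Returns coefficients ordered as [intercept, eligibility, covariates...].
    `covariates` is assumed to be an (n, k) array of dummy-coded controls.
    """
    y = np.asarray(y, dtype=float)
    design = np.column_stack([
        np.ones(len(y)),
        np.asarray(eligible, dtype=float),
        np.asarray(covariates, dtype=float),
    ])
    # Scaling rows by sqrt(weight) turns ordinary least squares into WLS.
    root_w = np.sqrt(np.asarray(weights, dtype=float))
    beta, *_ = np.linalg.lstsq(design * root_w[:, None], y * root_w, rcond=None)
    return beta
```

The second element of the returned vector plays the role of $\beta_1$; multiplying it by 100 gives the percentage point difference reported in the regression tables.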
The parameter $\beta_1$ is our coefficient of interest and can be interpreted as the association between being eligible for Medicaid continuously throughout the later postpartum period and the outcome of interest. We focus on the association of postpartum Medicaid eligibility rather than actual reported insurance here because changing eligibility guidelines is the most likely policy change.

Figure A2.1. Analytic Sample Flow Chart

PRAMS Phases 7 & 8 (2012-2018 births in U.S. States): N = 253,865
    Excluded: under 18 (n = 4,306)
Adult postpartum people: N = 249,559
    Excluded: unusable/missing income or number of dependents on income (n = 23,948)
Adult postpartum people with usable income data: N = 225,611
    Excluded: missing education, age, previous birth, marital status, race/ethnicity (n = 10,285)
Adult postpartum people with usable maternal characteristics data: N = 215,326

                                                   Using Minimum       Using Maximum
                                                   of Income Range     of Income Range
Starting sample                                    N = 215,326         N = 215,326
Excluded: ineligible for pregnancy Medicaid        n = 91,089          n = 102,947
Summary Statistics Sample
  (pregnancy Medicaid eligible)                    N = 124,237         N = 112,379
Excluded: missing data on outcomes
  Postpartum Checkup                               n = 796             n = 733
  Postpartum Depression                            n = 437             n = 404
Regression Sample
  Postpartum Checkup                               N = 123,441         N = 111,646
  Postpartum Depression                            N = 123,800         N = 111,975

Table A2.1.
Postpartum Checkup Attendance Survey-Weighted Means Overall and by Medicaid Eligibility

Using Minimum of Income Ranges
                  Overall  Ineligible    Fall in       Postpartum    All Pregnancy
                           for Medicaid  Eligibility   Medicaid      Medicaid
                                         Gap           Eligible      Eligible
Overall           91.0     96.0          89.7          85.3          86.8
NH white          92.8     96.4          90.9          86.2          88.0
NH Black          87.8     94.3          90.7          84.8          86.2
Hispanic          86.8     94.4          86.1          84.5          84.9
NH NA/NAK/NHI     83.2     93.9          87.4          78.5          80.7
NH Asian          92.6     95.4          90.2          87.8          88.4
NH Mixed/Other    87.1     93.9          85.0          82.9          83.5

Using Maximum of Income Ranges
                  Overall  Ineligible    Fall in       Postpartum    All Pregnancy
                           for Medicaid  Eligibility   Medicaid      Medicaid
                                         Gap           Eligible      Eligible
Overall           91.0     95.6          86.7          85.6          86.1
NH white          92.8     96.0          88.1          86.4          87.3
NH Black          87.8     93.8          87.6          84.3          85.8
Hispanic          86.8     93.5          83.6          85.4          84.6
NH NA/NAK/NHI     83.2     92.4          82.8          78.6          80.4
NH Asian          92.6     95.3          87.3          88.3          88.0
NH Mixed/Other    87.1     93.6          81.2          83.9          82.7

Note: These survey-weighted means are not adjusted for household income or maternal characteristics.

Table A2.2. Self-Reported Postpartum Depressive Symptoms Survey-Weighted Means Overall and by Medicaid Eligibility

Using Minimum of Income Ranges
                  Overall  Ineligible    Fall in       Postpartum    All Pregnancy
                           for Medicaid  Eligibility   Medicaid      Medicaid
                                         Gap           Eligible      Eligible
Overall           12.1     7.8           13.1          16.9          15.7
NH white          10.6     6.8           12.4          17.5          15.5
NH Black          17.4     11.2          15.8          19.9          18.9
Hispanic          11.4     9.2           11.5          12.2          12.0
NH NA/NAK/NHI     15.8     7.3           13.4          19.1          17.7
NH Asian          16.8     14.1          19.4          21.2          20.8
NH Mixed/Other    15.1     9.7           15.3          19.1          18.0

Using Maximum of Income Ranges
                  Overall  Ineligible    Fall in       Postpartum    All Pregnancy
                           for Medicaid  Eligibility   Medicaid      Medicaid
                                         Gap           Eligible      Eligible
Overall           12.1     8.1           16.1          16.5          16.3
NH white          10.6     7.0           15.8          17.3          16.5
NH Black          17.4     11.4          19.2          19.6          19.4
Hispanic          11.4     9.5           12.6          11.7          12.1
NH NA/NAK/NHI     15.7     8.2           18.0          18.1          18.1
NH Asian          16.9     14.6          21.8          20.4          20.9
NH Mixed/Other    15.1     10.3          18.8          18.1          18.3

Note: These survey-weighted means are not adjusted for household income or maternal characteristics.

Table A2.3.
Survey-Weighted Means of Maternal Characteristics Overall and by Medicaid Eligibility

Using Minimum of Income Ranges
                       Overall  Ineligible    All Pregnancy  Fall in       Postpartum    Chi2 Test
                                for Medicaid  Medicaid       Eligibility   Medicaid      p-value
                                              Eligible       Gap           Eligible
Race/Ethnicity
  NH white             62.2     77.6          49.2           58.8          44.7          0.00
  NH Black             12.7     5.5           18.7           14.3          20.8          0.00
  Hispanic             15.6     6.9           22.9           14.3          20.8          0.00
  NH NA/NAK/NHI        0.9      0.4           1.4            1.0           1.5           0.00
  NH Asian             5.6      7.3           4.2            3.4           4.6           0.00
  NH Mixed/Other       3.0      2.3           3.6            3.1           3.8           0.00
Marital Status
  Married              64.5     88.5          44.4           60.9          36.8          0.00
Parity
  Previous Live Birth  61.7     54.0          68.1           69.8          67.3          0.00
Age
  18-19                3.3      0.4           5.8            3.5           6.8           0.00
  20-24                19.2     6.8           29.5           24.9          31.6          0.00
  25-29                30.1     28.3          31.7           33.5          30.8          0.00
  30-34                29.9     40.5          21.0           24.8          19.2          0.00
  35-39                14.4     19.9          9.7            10.7          9.2           0.00
  ≥ 40                 3.1      4.0           2.4            2.7           2.3           0.01
Years of Education
  0-8                  2.6      0.3           4.4            2.8           5.2           0.00
  9-11                 7.8      0.9           13.6           7.8           16.2          0.00
  12                   23.4     8.4           36.0           30.7          38.4          0.00
  13-15                28.6     22.5          33.6           38.5          31.3          0.00
  ≥ 16                 37.7     67.9          12.4           20.2          8.8           0.00

Using Maximum of Income Ranges
                       Overall  Ineligible    All Pregnancy  Fall in       Postpartum    Chi2 Test
                                for Medicaid  Medicaid       Eligibility   Medicaid      p-value
                                              Eligible       Gap           Eligible
Race/Ethnicity
  NH white             62.2     76.5          47.2           51.2          43.8          0.00
  NH Black             12.7     6.2           19.4           18.8          20.0          0.00
  Hispanic             15.6     7.5           24.1           22.4          25.5          0.00
  NH NA/NAK/NHI        0.9      0.4           1.4            1.3           1.5           0.00
  NH Asian             5.6      7.0           4.2            3.0           5.3           0.00
  NH Mixed/Other       3.0      2.4           3.6            3.4           3.8           0.02
Marital Status
  Married              64.5     86.3          41.8           47.0          37.3          0.00
Parity
  Previous Live Birth  61.7     55.2          68.4           67.6          69.2          0.00
Age
  18-19                3.3      0.6           6.2            6.2           6.2           0.74
  20-24                19.2     8.3           19.2           31.0          30.1          0.05
  25-29                30.2     29.0          30.2           31.4          31.2          0.66
  30-34                29.9     39.1          29.9           20.3          20.1          0.63
  35-39                14.4     19.1          14.4           8.9           9.8           0.00
  ≥ 40                 3.1      3.9           3.1            2.2           2.5           0.04
Years of Education
  0-8                  2.6      0.4           4.8            3.8           5.7           0.00
  9-11                 7.8      1.2           14.6           12.6          16.4          0.00
  12                   23.4     10.1          37.3           36.3          38.2          0.00
  13-15                28.6     24.5          32.8           34.7          31.2          0.00
  ≥ 16                 37.7     63.8          10.4           12.6          8.5           0.00

Table A2.4.
Survey-Weighted Regressions for Postpartum Checkup Attendance Overall and by Race/Ethnicity

                      All            NH white       NH Black       Hispanic       NH NA/NAK/NHI  NH Asian       NH Mixed/Other
Postpartum Medicaid Eligible (Using Minimum of Income Range)
  Coefficient         0.9*           0.6            -2.0**         3.2**          2.4            1.2            4.5*
  95% CI              (-0.0, 1.8)    (-0.5, 1.6)    (-3.9, 0.1)    (0.8, 5.6)     (-1.8, 6.6)    (-3.2, 5.6)    (-0.2, 9.3)
Postpartum Medicaid Eligible (Using Maximum of Income Range)
  Coefficient         1.2***         0.9*           -1.9**         3.4***         0.7            3.6*           5.0**
  95% CI              (0.4, 2.0)     (-0.1, 2.0)    (-3.5, -0.4)   (1.3, 5.5)     (-3.2, 4.6)    (-0.2, 7.4)    (1.1, 8.9)
Observations
  Minimum Sample      123,441        49,550         29,249         25,089         6,587          6,201          6,765
  Maximum Sample      111,646        42,810         27,448         23,609         6,171          5,434          6,174
Mean
  Minimum Sample      86.8           88.0           86.2           84.9           80.8           88.5           83.5
  Maximum Sample      86.1           87.3           85.8           84.7           80.5           88.0           82.7

95% Confidence Intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Coefficients presented as percentage point difference (β x 100).

Table A2.5. Survey-Weighted Regressions for Self-Reported Postpartum Depressive Symptoms Overall and by Race/Ethnicity

                      All            NH white       NH Black       Hispanic       NH NA/NAK/NHI  NH Asian       NH Mixed/Other
Postpartum Medicaid Eligible (Using Minimum of Income Range)
  Coefficient         -1.2**         -1.8***        0.3            -1.3           2.6            1.1            -2.9
  95% CI              (-2.1, -0.2)   (-3.1, -0.6)   (-2.1, 1.6)    (-3.5, 0.9)    (-2.9, 8.0)    (-4.4, 6.7)    (-7.2, 1.4)
Postpartum Medicaid Eligible (Using Maximum of Income Range)
  Coefficient         -1.2***        -1.6***        -0.2           -0.9           -0.9           -1.0           -3.2
  95% CI              (-2.0, -0.3)   (-2.8, -0.4)   (-2.1, 1.6)    (-2.7, 0.9)    (-4.9, 3.2)    (-5.8, 3.8)    (-7.1, 0.7)
Observations
  Minimum Sample      123,800        49,635         29,340         25,205         6,604          6,238          6,778
  Maximum Sample      111,975        42,879         27,535         23,722         6,189          5,466          6,184
Mean
  Minimum Sample      15.7           15.5           18.9           12.0           17.7           20.7           18.0
  Maximum Sample      16.3           16.5           19.4           12.1           18.0           20.8           18.3

95% Confidence Intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Coefficients presented as percentage point difference (β x 100).

Table A2.6.
Survey-Weighted Regressions for Postpartum Checkup Attendance and Self-Reported Postpartum Depressive Symptoms Overall and by Race/Ethnicity for Postpartum People with Medicaid Paid Births

Postpartum Medicaid Eligible: Percentage Point Difference (β x 100)

                      All                             NH white                        NH Black                        Hispanic
                      Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.
                      Checkup         Depr. Symptoms  Checkup         Depr. Symptoms  Checkup         Depr. Symptoms  Checkup         Depr. Symptoms
Min. Sample           0.9             -1.4**          0.9             -1.7*           -2.1*           0.7             2.8*            -1.4
  (95% CI)            (-0.2, 2.2)     (-2.7, -0.1)    (-0.6, 2.6)     (-3.7, 0.2)     (-4.5, 0.3)     (-2.3, 3.6)     (-0.3, 5.9)     (-4.1, 1.3)
Max. Sample           0.7             -1.5***         0.9             -2.0**          -2.6***         0.3             2.4*            -1.3
  (95% CI)            (-0.3, 1.7)     (-2.6, -0.5)    (-0.5, 2.3)     (-3.6, -0.4)    (-4.4, -0.9)    (-1.8, 2.4)     (-0.0, 4.9)     (-3.5, 1.0)
Observations
  Min. Sample         80,645          80,897          29,678          29,732          21,678          21,751          16,830          16,902
  Max. Sample         79,575          79,824          29,073          29,125          21,501          21,573          16,703          16,775

95% Confidence Intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Coefficients presented as percentage point difference (β x 100).

Table A2.7. Survey-Weighted Regressions for Postpartum Checkup Attendance and Self-Reported Postpartum Depressive Symptoms Overall and by Race/Ethnicity for Postpartum People with 2015-2018 Births

Postpartum Medicaid Eligible: Percentage Point Difference (β x 100)

                      All                             NH white                        NH Black                        Hispanic
                      Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.       Postpart.
                      Checkup         Depr. Symptoms  Checkup         Depr. Symptoms  Checkup         Depr. Symptoms  Checkup         Depr. Symptoms
Min. Sample           1.2*            -1.8***         1.3*            -2.8***         -3.1**          0.5             3.6**           -1.4
  (95% CI)            (-0.1, 2.4)     (-3.1, -0.4)    (-0.3, 2.9)     (-4.7, -1.0)    (-5.5, -0.7)    (-2.5, 3.5)     (0.4, 6.7)      (-4.3, 1.4)
Max. Sample           1.5***          -1.6***         2.4***          -3.2***         -2.9***         -0.3            3.5**           -0.5
  (95% CI)            (0.4, 2.6)      (-2.8, -0.5)    (0.8, 3.9)      (-4.9, -1.4)    (-4.9, -0.9)    (-2.6, 2.0)     (0.8, 6.2)      (-2.8, 1.8)
Observations
  Min. Sample         71,999          72,198          27,670          27,718          18,260          18,302          15,125          15,199
  Max. Sample         65,059          65,243          23,768          23,808          17,120          17,158          14,222          14,294

95% Confidence Intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1. Coefficients presented as percentage point difference (β x 100).

CHAPTER 3: EFFECTS OF STATE MEDICAL AMNESTY POLICIES ON ALCOHOL USE

Introduction
Although alcohol consumption is illegal for those under the age of 21 throughout the United States, it is widespread and a factor in approximately 4,300 youth deaths each year (Stahre et al., 2014). Policymakers are concerned that the disincentive to report illegal activities, even when reporting might be beneficial for someone’s health, causes preventable alcohol-related fatalities. In response to this concern, Alcohol Medical Amnesty Policies (MAP), also referred to as Good Samaritan Laws or 911 Lifelines, have been enacted in numerous jurisdictions. These policies eliminate culpability relating to underage drinking when intoxicated minors seek emergency assistance. MAPs have been controversial because some argue that eliminating the legal consequences of these actions perpetuates them and that policy should focus on discouraging illegal actions.

While first enacted on college campuses, over the past 15 years MAPs have been approved by 40 states. Still, little is known about the causal effects of MAPs, and an empirical analysis of MAPs is especially salient since the theoretical effects are ambiguous. While the policies are implemented with the intention of increasing medical care usage and saving lives, an increase in substance use could counteract this effect if weaker disincentives affect drinking decisions.

In this paper, I examine the effects of state MAPs on drinking decisions in the past 30 days for those ages 18-20 using self-reported data from the 2011-2018 Behavioral Risk Factor Surveillance System (BRFSS). I use difference-in-differences models exploiting the variation in the implementation of state MAPs between 2012 and 2018.
My main results support the conclusion that there is no significant long-term increase in drinking behaviors due to state MAP implementations. I also consider specifications using event studies, placebo tests, and triple-difference estimates using 21-23-year-olds as an additional control group, and the results from these specifications are broadly similar to those from the central specifications.

Policy Background
Alcohol MAPs, also referred to as Good Samaritan Laws or 911 Lifelines, eliminate legal consequences relating to underage drinking when intoxicated minors seek emergency assistance. They do not provide legal immunity for crimes outside of alcohol possession or consumption. In some jurisdictions these policies cover use of other illegal substances as well, but this research focuses on policies pertaining to underage alcohol consumption.

These policies commonly include one or more of three main types of amnesty: individual amnesty, victim amnesty, and caller amnesty. Individual amnesty provides amnesty for the individual experiencing a medical emergency if they seek medical assistance. Victim amnesty gives amnesty to any individual receiving medical treatment regardless of who requested emergency assistance. Caller amnesty provides amnesty for those who seek emergency assistance for another person needing medical assistance. Caller amnesty policies often include provisions requiring the caller to stay with the victim and provide information to first responders. Additionally, many limit the number of individuals who may receive amnesty to between one and five. Every state with a MAP provides at least caller amnesty, with most states providing additional protections. I hope to explore variations in policy generosity in later work.

Alcohol MAPs first began on individual college campuses and have been adopted by 40 states since 2005 (Table 1). The implementation of MAPs has been controversial because their effects are not fully understood.
The disincentive to report an illegal activity, such as underage drinking, may be high enough to discourage seeking emergency assistance even when it may be beneficial to one’s health. The goal of an alcohol MAP is to eliminate this disincentive and thus reduce alcohol-related fatalities. Those proposing the policy often argue that young adults are going to drink anyway, so providing a safeguard against unintended deaths is only beneficial. However, reducing the cost of the illegal behavior through a MAP could affect underage drinking decisions and cause the policy to have unintended consequences such as increased underage alcohol-poisoning incidents.

The controversy over this policy is well represented by the differing views of the Maine legislature and governor in 2015. The Maine legislature passed an alcohol MAP, and eventually overrode a veto of it, because legislators believed that minors are often too concerned about legal consequences when someone is in need of life-saving medical attention after an alcohol overdose, and they wanted to eliminate that barrier. However, Governor Paul LePage vetoed the MAP because he felt it “pampers children who engaged in illegal behavior” and contributes to the “growing pattern of babying” young adults (Associated Press, 2015).

Literature Review
This work fits most closely into the literature on drug policy evaluations and the moral hazard of drug policies. Many works have studied the effects of similar policies, such as Naloxone Access Laws, opioid MAPs, and syringe exchange programs, on mortality, the outcome policymakers intend to affect. Even more closely related are works evaluating the effects of harm-reduction policies on unintended consequences like substance use and crime. There is little economic work on alcohol MAPs, so I additionally discuss any academic works pertaining to MAPs.
Economics
Carpenter (2004), the most closely related work in the economics literature, uses a methodology similar to mine to study another alcohol policy affecting only those underage: zero tolerance drunk driving laws. He also utilizes BRFSS for data on drinking behaviors and variations in state law enactment for identification in both DD and DDD frameworks. Carpenter (2004) finds that zero tolerance drunk driving laws significantly reduce binge drinking, by 13%. However, he finds no clear effect for females or on the drinking participation margin. This suggests that alcohol policies targeted at excessive drinking may have heterogeneous effects across genders and may not affect drinking participation.

Studies of opioid Naloxone Access Laws, such as Rees et al. (2019) and Doleac and Mukherjee (2019), are among the most closely related works in the literature. Rees et al. (2019) study the effects of Naloxone Access Laws and Good Samaritan Laws on opioid-related deaths and additionally alcohol-related deaths. They find negative but statistically insignificant effects of Good Samaritan Laws, and that Naloxone Access Laws reduce opioid-related deaths by 9-10 percent. Doleac and Mukherjee (2019) are unable to replicate the results of Rees et al. (2019) and find that Naloxone Access Laws do not decrease opioid-related mortality and increase opioid-related emergency room visits and theft. The results of Rees et al. (2019) suggest that Naloxone Access Laws function as intended in that they reduce opioid-related deaths; however, Doleac and Mukherjee’s work suggests that moral hazard is a factor in these policies.

Deiana and Giua (2018) examine the effects of a variety of opioid-related policies on labor market participation and crime rates in a difference-in-differences setting. They find drug Good Samaritan Laws have no significant effect on labor market outcomes and significantly reduce crime.
However, this effect is mechanical to some extent since this policy provides amnesty for a certain set of crimes.

Packham (2019) uses variation in the opening of syringe exchange programs to identify their effects on intended and unintended outcomes. She finds that the programs decrease HIV diagnoses, indicating that syringe exchanges make drug use safer. Additionally, she finds that syringe exchange programs increase opioid-related deaths, hospital admissions, and crimes, suggesting moral hazard is a factor in harm-reduction drug policies.

Outside Economics
Lewis and Marcell (2006) examine Cornell University’s Medical Amnesty Protocol using data from emergency room visits, health center records, calls for emergency services, and student surveys. While its methodology limits the conclusions that can be drawn, Lewis and Marcell (2006) does suggest some possible effects of alcohol MAPs. Based on student surveys, fear of getting a person in trouble as a barrier to calling for help decreased by 61% after the policy. Self-reported data on changes in student drinking levels before and after the policy’s implementation are described as ambiguous and cannot be causally connected to the policy. My work looks to fill this gap in the literature. There is also an observed increase in alcohol-related calls to Cornell’s Emergency Medical Services.

Martinez et al. (2016) examine the effects of a MAP being instated at a residential college. Using an online questionnaire, they compare the behaviors and opinions of cohorts before and after the policy. They observe higher drinking rates and fewer harmful outcomes after the policy; however, the empirical strategy does not allow for causal inference.

Mohanan et al. (2019) study the implementation of an alcohol MAP at Georgetown University using data on calls to the collegiate-based emergency medical services where the chief complaint is intoxication.
They find the policy is associated with students calling earlier in the evening and an increase in calls.

Haas et al. (2018) compare first-year cohorts’ survey responses and administrative data in the years surrounding a MAP being implemented at a four-year university. They find suggestive evidence that drinking behaviors did not change and of a modest increase in seeking assistance in emergencies.

Oster-Aaland et al. (2011) propose a hypothetical alcohol MAP to survey respondents and find it increases students’ intentions to seek help.

Nguyen and Parker (2018) study the effect of the passing of the New York opioid MAP on hospital utilization, using New Jersey as a control state. They find a significant effect on accidental heroin overdose hospitalizations but not on those for non-heroin overdoses. This result indicates that the policy has an impact; however, it is unclear whether this effect is purely driven by increased hospital utilization or also by increased drug use.

Atkins et al. (2019) use the variation in state opioid Good Samaritan Law enactment to study their effects on opioid-related deaths while controlling for other harm-reduction policies. They find no statistically significant effect of the state laws on opioid-related deaths.

These studies all suggest potential outcomes and mechanisms that I may be able to identify as causal with a stronger empirical strategy.

Theoretical Model
While the effects of MAPs on drinking behaviors are of interest in themselves, they are of further interest because they provide insight into interpreting the other effects of MAPs, such as those on hospital utilization and deaths. Many of the theoretical effects of alcohol MAPs are ambiguous and dependent upon whether there is moral hazard, which makes an empirical analysis especially valuable.
Theoretical predictions on the effects of implementing MAPs are dependent on whether the possibility of punishment is considered in the minor’s decision to drink and obtain medical care. I assume that drinking monotonically increases the probability of needing emergency medical care and of death through channels such as alcohol poisoning and riskier decision making. Additionally, I assume that choosing to utilize medical care monotonically decreases the probability of death.

If minors consider punishments when choosing both whether to drink and whether to access medical care, the expected cost of drinking would decrease when a MAP is enacted. The decrease in the expected cost of drinking is attributable to a decrease in the probability of facing punishment and an increase in the probability of receiving medical care, which in turn decreases the probability of dying. Thus, we would expect to see an increase in drinking at both the intensive and extensive margins. We would expect the effects of MAPs to be stronger at the binge drinking margin than at the margin to drink because the risk of medical emergency increases as alcohol consumption increases. We would predict an increase in hospital visits related to underage drinking because there is a decrease in the cost of visiting the hospital due to the amnesty and an increase in alcohol consumption for minors. The net effect on alcohol-related deaths of minors is theoretically ambiguous in this situation. Increased hospital utilization could cause a decrease in alcohol-related deaths of minors if the increased treatment rates outweigh the increase in drinking, or an increase in alcohol-related deaths of minors if the increase in drinking outweighs the reduction of the probability of death from medical care.

If alcohol MAPs do not factor into a minor’s drinking decisions but do affect the choice to seek medical care, we would predict no change in alcohol-consumption levels.
Still, we would predict an increase in hospital utilization because the expected cost of medical care has decreased due to the MAP. Since there is no change in alcohol consumption and an increase in those seeking treatment, we would anticipate a decrease in deaths related to underage drinking.

Finally, if minors do not take into account punishments when choosing how much alcohol to consume or whether to access medical care, we would expect to see no change in drinking rates, hospital visits, or deaths.

Data
I utilize data from the 2011-2018 Behavioral Risk Factor Surveillance System (BRFSS). I focus on these years because BRFSS underwent major methodologic changes to include cellphones in 2011, making it difficult to compare trends in prevalence around 2011. This survey samples adults from all 50 states and Washington D.C. Because MAPs are only a relevant policy for those under 21, I restrict my primary sample to those ages 18-20.6 I consider those ages 21-23 as a counterfactual group in placebo tests and triple-difference specifications. While minors are also affected by the policy, they are not covered in BRFSS and are a population requiring further research using other surveys.

The dependent variables come from survey questions about self-reported alcohol consumption. The first is an indicator variable representing whether the respondent has consumed any alcohol in the past 30 days. The second is an indicator variable representing whether the respondent has engaged in binge drinking in the past 30 days. For men the variable is defined as whether the respondent has consumed five or more drinks on any occasion in the past thirty days. Women are asked whether they had consumed four or more drinks on any occasion in the past thirty days. Demographic control variables including gender, age, marital status, educational level, race, and employment status in categories all come from BRFSS.
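The two dependent variables can be illustrated with a short Python sketch. It is only a sketch of the definitions in the text: the function and argument names are hypothetical and do not correspond to BRFSS variable names, and the maximum number of drinks on one occasion is assumed to be available as an input.

```python
def drinking_outcomes(sex, any_drink_past_30_days, max_drinks_on_one_occasion):
    """Return the two outcome indicators for one respondent.

    Binge drinking uses the sex-specific cutoff from the text:
    five or more drinks for men, four or more for women.
    """
    binge_cutoff = 5 if sex == "male" else 4
    drink = int(bool(any_drink_past_30_days))
    # A respondent who did not drink cannot have binged in the same window.
    binge = int(drink == 1 and max_drinks_on_one_occasion >= binge_cutoff)
    return {"drink": drink, "binge": binge}
```

Under these definitions a woman reporting four drinks on one occasion is coded as binge drinking, while a man reporting the same amount is not.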
Table 2 provides summary statistics from the BRFSS dataset of demographic characteristics. As seen in Table 3, underage adult women drink at a lower rate for every measure. For this reason, I estimate all models for the entire population and broken down by gender. I also consider that educational institutions may play a role in the effects of MAPs, either through policy education or a different drinking culture, so I provide estimates for students. I identify students in the survey as those who report their employment status as being a student.

6 I additionally restrict my sample to those that respond to both questions about drinking. This excludes less than one percent of the sample.

I collected information on the existence of state MAPs by searching for and reading state laws with assistance from the resources published by the Medical Amnesty Initiative. States with MAPs and their dates of effect can be found in Table 1.

Empirical Strategy
My first approach for estimating effects of state MAPs is a dynamic difference-in-differences design that compares states with a state MAP implemented from 2012-2018 to states without a MAP. While my dataset begins in 2011, I focus on MAPs implemented in 2012 and later to allow for at least a year to test for pre-trends in each state. Because of the staggered treatment adoption and the likelihood of heterogeneous effects in this setting, conventional regression-based estimators would fail to provide unbiased estimates due to the “forbidden comparisons” of newly and previously treated units (Borusyak et al., 2022). To obtain unbiased estimates of the causal effect of state MAPs, I implement the Borusyak, Jaravel, Spiess imputation estimator (BJS) throughout, which places no homogeneity assumptions on the treatment effect (Borusyak et al., 2022).

First, I estimate a simple difference-in-differences with an average treatment effect across all time periods using the BJS estimator.
The first step of the imputation estimator uses the regression:

$$Y_{ist} = \beta_0 + \beta_1 \, MAP_{st} + \boldsymbol{\beta}_2 \mathbf{X}_{ist} + \mu_s + \delta_t + \epsilon_{ist} , \qquad (1)$$

where $Y_{ist}$ denotes the drinking outcome for individual $i$ in state $s$ in year $t$. $MAP_{st}$ is an indicator equaling one if state $s$ has a MAP in year $t$ and zero otherwise. $\mathbf{X}_{ist}$ is a vector of individual demographic variables including age, race/ethnicity, gender, marital status, education level, and employment status in categories. $\mu_s$ is a set of state fixed effects and $\delta_t$ is a set of time fixed effects. $\delta_t$ is at the month and year level because there has been shown to be important seasonality to drinking behaviors (Cho et al., 2001 and Carpenter, 2003). These fixed effects control for any time-invariant differences across states and any trends across states in any month or year time frame. I calculate robust standard errors clustered at the state level for all models in order to allow for correlation in the error terms for individuals within states.

Next, I produce estimates similar to an event study using the same specification except four separate years of treatment exposure are estimated. Those observed more than four years after a MAP is implemented in their state are excluded and estimates are relative to this excluded group. While the pre-treatment estimates are displayed in the same plots and tables like an event study, the BJS procedures produce these estimates from a separate imputation procedure using only untreated observations. Similarly, those observed more than four years prior to a MAP being implemented in their state are excluded and estimates for pre-treatment are relative to this group.

The underlying assumption of these models is that the states with and without treatment had parallel trends prior to the treatment, allowing the untreated states to serve as a counterfactual. While this assumption cannot be directly examined, significant pre-trends would raise concerns about the validity of this assumption.
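The logic of an imputation estimator can be illustrated with a stripped-down Python sketch. This is not the BJS implementation used for the estimates in this chapter: it omits covariates, survey weights, month fixed effects, and clustered standard errors, and the function names and toy panel are hypothetical. It only shows the three steps of the approach: fit state and year fixed effects on untreated observations, impute the untreated outcome for treated observations, and average the differences.

```python
import numpy as np


def imputation_att(y, state, year, treated):
    """Average treatment effect on the treated via fixed-effect imputation."""
    y = np.asarray(y, dtype=float)
    treated = np.asarray(treated, dtype=bool)
    states, s_idx = np.unique(np.asarray(state), return_inverse=True)
    years, t_idx = np.unique(np.asarray(year), return_inverse=True)

    # Design matrix of state and year dummies (no separate intercept).
    n = y.size
    design = np.zeros((n, len(states) + len(years)))
    design[np.arange(n), s_idx] = 1.0
    design[np.arange(n), len(states) + t_idx] = 1.0

    # Step 1: estimate the fixed effects using untreated observations only.
    coef, *_ = np.linalg.lstsq(design[~treated], y[~treated], rcond=None)
    # Step 2: impute the untreated potential outcome for every observation.
    y0_hat = design @ coef
    # Step 3: average observed minus imputed outcomes over treated observations.
    return float(np.mean(y[treated] - y0_hat[treated]))


def toy_panel(effect=2.0):
    """Hypothetical state-year panel with staggered adoption and no noise."""
    adoption_year = {0: None, 1: None, 2: 2014, 3: 2016}
    rows = []
    for s, start in adoption_year.items():
        for t in range(2011, 2019):
            d = start is not None and t >= start
            rows.append((10.0 + 1.5 * s + 0.3 * (t - 2011) + effect * d, s, t, d))
    y, state, year, treated = map(np.array, zip(*rows))
    return y, state, year, treated
```

On the noiseless toy panel, `imputation_att(*toy_panel(2.0))` recovers the assumed effect because every treated state has untreated years and every year contains never-treated states, so all fixed effects needed for imputation are identified from untreated observations.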
I evaluate this using the BJS parallel trends assumption F-test, which uses only untreated observations.

One threat to my empirical strategy would be endogeneity in the implementation of MAPs. However, the short timeframe within which many states implemented this policy suggests that state-specific time trends are not a factor in the treatment. Additionally, my identification relies on the assumption of parallel trends across states with and without MAPs. To address these concerns and verify that I capture a true policy effect, I implement multiple other specifications.

In an attempt to detect spurious results, I additionally run placebo tests running my main difference-in-differences specification on ages 21-23. While it is possible for there to be spillover effects on behavior to those just over the legal drinking age if they drink with those underage or were exposed to the policy when they were underage, we would expect the effects to be much smaller than those for people who are underage and directly affected by the policy.

Additionally, if there are concerns that there is a difference in the underage drinking behavior trends between states that did and did not implement MAPs but that similar differences in trends exist for those just over the drinking age, then a triple-difference specification can be used to produce unbiased estimates of the causal effect of a state MAP. I implement a triple-difference specification including ages 21-23 as an additional counterfactual group. More specifically, the first step of the BJS imputation is to estimate:

$$Y_{ist} = \beta_0 + \beta_1 \, (Underage \times MAP\ State \times Post)_{ist} + \beta_2 \, (Underage \times MAP\ State)_{ist} + \beta_3 \, (MAP\ State \times Post)_{ist} + \beta_4 X_{ist} + \mu_s + \delta_t + \epsilon_{ist} ,$$

where $Y_{ist}$ denotes the same outcomes previously studied. The interaction $(Underage \times MAP\ State \times Post)_{ist}$ indicates those directly affected by the policy. I additionally include the two-way interactions $(Underage \times MAP\ State)_{ist}$ and $(MAP\ State \times Post)_{ist}$.
I account for individuals being underage by controlling for age; being in a MAP state is accounted for through the state fixed effects, $\mu_s$, and being in the post period through the time fixed effects, $\delta_t$. Additionally, $\mathbf{X}_{ist}$ represents individual characteristics including age, race/ethnicity, gender, marital status, education level, and employment status. I also repeat this triple-difference specification using four separate one-year post periods and separate estimates of the four pre-periods, in a similar manner to the difference-in-differences event study specification.

In these specifications there are two outcomes of interest: $\text{Drink}_{ist}$, an indicator for whether the individual has consumed any alcohol in the past 30 days, and $\text{Binge}_{ist}$, an indicator for whether the individual has engaged in binge drinking in the past 30 days. I examine $\text{Binge}_{ist}$ for two different populations. First, I examine binge drinking unconditionally, looking at all survey respondents. Next, I examine binge drinking conditional on a respondent indicating that they had consumed alcohol in the past 30 days, to better understand the two different margins of drinking decision making.

Results

All estimates presented use the provided survey weights. Table 2 presents summary statistics of the demographic characteristics used as controls for both ages 18-20 and 21-23. Table 3 includes the means of the outcomes of interest for ages 18-20 and 21-23 overall, and by sex and student status. Figures 1-6 plot these outcome means by survey year and age group. Males are more likely to engage in drinking behaviors than females, and those of legal age are more likely to engage in drinking behaviors than those underage.

Table 4 shows difference-in-differences estimates for a single post-period using the BJS imputation estimator. All estimates are positive, but there are no statistically significant estimates.
Point estimates for drinking and binge drinking for the whole age 18-20 population are less than one and two percentage points, respectively. Appendix Table 1 presents additional specifications excluding demographic controls and fixed effects.

Figures 7-12 and Table A2 present results from the BJS difference-in-differences event study specification for ages 18-20. Table A2 additionally includes the F-statistics from the BJS pre-trends test, which fails to reject the null for all outcomes, providing no evidence that the parallel trends assumption is invalid. There is a statistically significant 3 percentage point increase in drinking among males in the year that a state MAP is implemented. However, estimates in the following years are not consistently positive or statistically significant. Estimates of the effect of MAPs on drinking overall and for females are not of a consistent sign in the years following implementation.

In the year of MAP implementation, there is a statistically significant increase of 1.6 percentage points in binge drinking for ages 18-20 overall (Figures 7-12, Table A2). In the following years, point estimates are consistently positive but of a smaller magnitude and not statistically significant. For females, there is a statistically significant increase of 2.2 percentage points in binge drinking in the first year post-MAP implementation. In other years with a MAP, point estimates for female binge drinking are positive but of a smaller magnitude and not statistically significant.

Table 5 presents difference-in-differences specifications using the BJS imputation estimator for ages 21-23 as a placebo test, since the policy is not directly relevant to those of legal drinking age. There are statistically significant increases in binge drinking overall and for females ages 21-23 that are larger than any point estimates for underage adults.
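As an illustration of how the significance of the placebo estimates can be read from Table 3.5, the sketch below computes z-statistics and two-sided p-values for the unconditional binge drinking estimates for ages 21-23 (all respondents and females). It assumes a standard normal reference distribution, which is a simplification: the reference distribution used with the state-clustered standard errors is not stated here, and the function name is hypothetical.

```python
import math

def two_sided_p_value(estimate: float, std_error: float) -> float:
    """Two-sided p-value under a standard normal approximation (an assumption)."""
    z = estimate / std_error
    return math.erfc(abs(z) / math.sqrt(2.0))

# Point estimates and standard errors copied from Table 3.5
# (ages 21-23, unconditional binge drinking).
placebo = {"all": (0.0262, 0.0107), "females": (0.0345, 0.0102)}

p_values = {}
for group, (est, se) in placebo.items():
    p_values[group] = two_sided_p_value(est, se)
    print(f"{group}: z = {est / se:.2f}, p = {p_values[group]:.4f}")
```

Under this approximation, the overall estimate falls between the 1% and 5% thresholds and the estimate for females falls below the 1% threshold, in line with the significance stars in the table.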
Figures 13-15 show BJS event study plots for ages 21-23, with statistically significant increases in binge drinking in some years post-MAP implementation. These results suggest that there may be differences in trends in drinking behavior beyond MAP implementation between states with and without a MAP. If similar trends in drinking behavior exist on both sides of the legal drinking age in these states, then a triple-difference specification with ages 18-20 and 21-23 will provide unbiased estimates of the effect of MAP.

Table 6 presents estimates from BJS triple-difference specifications for a single post-period. There are no statistically significant results, and point estimates for drinking and binge drinking overall are both less than one percentage point.

Figures 16-21 and Table A3 present results from BJS triple-difference event study specifications. Table A3 also presents F-statistics from the BJS pre-trends test. All outcomes but binge drinking overall fail to reject the null, but the significant F-statistic for binge drinking overall raises concerns about the validity of the parallel trends assumption for that outcome. In the year of MAP implementation, there are statistically significant increases of 1.9 and 2.6 percentage points in drinking overall and for males, respectively. However, in the other years post-MAP the estimates are not statistically significant, and the sign of the point estimates varies. The same patterns exist for binge drinking, with estimates in the year of MAP implementation of 2.1 percentage points overall and 4.2 percentage points for males.

Discussion

Both difference-in-differences and triple-difference specifications support the conclusion that there are no long-term effects of state MAP implementation on underage drinking behavior. As a policy issue, there is interest in ruling out increases in underage drinking due to MAP.
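The largest increase that the data can rule out corresponds to the upper bound of a confidence interval around the point estimate. As a minimal sketch, the code below computes 95% upper bounds for the Table 3.4 estimates for all respondents ages 18-20 as estimate + 1.96 × SE. The 1.96 normal critical value and the function name are assumptions made for illustration; the critical value actually used is not stated.

```python
Z_95 = 1.96  # normal critical value for a 95% interval (an assumption)

def ci_upper_bound(estimate: float, std_error: float, z: float = Z_95) -> float:
    """Upper bound of a symmetric confidence interval: estimate + z * SE."""
    return estimate + z * std_error

# Point estimates and standard errors copied from Table 3.4 (all respondents, ages 18-20).
table_3_4_all = {
    "drink": (0.0076, 0.0138),
    "binge": (0.0133, 0.0103),
}

# Convert from proportions to percentage points.
upper_pp = {
    outcome: 100.0 * ci_upper_bound(est, se)
    for outcome, (est, se) in table_3_4_all.items()
}

for outcome, bound in upper_pp.items():
    print(f"{outcome}: increases above {bound:.2f} percentage points can be ruled out")
```

Under these assumptions the bounds come out at roughly 3.5 percentage points for drinking and between 3.3 and 3.4 percentage points for binge drinking.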
Using 95% confidence intervals, my results can rule out increases larger than 3.5 and 3.3 percentage points for drinking and binge drinking, respectively (Table 4). Event study specifications suggest that there may be increases in underage drinking behaviors soon after MAP implementation but that these changes are not sustained over the longer term of multiple years. One plausible explanation for these results is that when a state MAP is first implemented there is a large amount of publicity, so it factors into underage drinking decisions, but over time awareness of the policy and its impact on drinking behavior fades. However, all these results must be interpreted with caution, as the placebo test of the policy effect on those of legal drinking age produces significant results and a statistically significant BJS pre-trend F-statistic raises concerns about the validity of the parallel trends assumption in triple-difference specifications.

TABLES AND FIGURES

Figure 3.1. Survey-Weighted Means of Drinking by Survey Year and Age
Figure 3.2. Survey-Weighted Means of Binge Drinking by Survey Year and Age
Figure 3.3. Survey-Weighted Means of Drinking for Males by Survey Year and Age
Figure 3.4. Survey-Weighted Means of Binge Drinking for Males by Survey Year and Age
Figure 3.5. Survey-Weighted Means of Drinking for Females by Survey Year and Age
Figure 3.6. Survey-Weighted Means of Binge Drinking for Females by Survey Year and Age
Figure 3.7. BJS Difference-in-Differences Event Study of Drinking for Ages 18-20
Figure 3.8. BJS Difference-in-Differences Event Study of Drinking for Males Ages 18-20
Figure 3.9. BJS Difference-in-Differences Event Study of Drinking for Females Ages 18-20
Figure 3.10. BJS Difference-in-Differences Event Study of Binge Drinking for Ages 18-20
Figure 3.11. BJS Difference-in-Differences Event Study of Binge Drinking for Males Ages 18-20
Figure 3.12.
BJS Difference-in-Differences Event Study of Binge Drinking for Females Ages 18-20
Figure 3.13. BJS Difference-in-Differences Placebo Specification of Binge Drinking for Ages 21-23
Figure 3.14. BJS Difference-in-Differences Placebo Specification of Binge Drinking for Males Ages 21-23
Figure 3.15. BJS Difference-in-Differences Placebo Specification of Binge Drinking for Females Ages 21-23
Figure 3.16. BJS Triple Difference Event Study of Drinking
Figure 3.17. BJS Triple Difference Event Study of Drinking for Males
Figure 3.18. BJS Triple Difference Event Study of Drinking for Females
Figure 3.19. BJS Triple Difference Event Study of Binge Drinking
Figure 3.20. BJS Triple Difference Event Study of Binge Drinking for Males
Figure 3.21. BJS Triple Difference Event Study of Drinking for Females

Table 3.1. State Medical Amnesty Policy Dates

State            Alcohol MAP   Date Enacted     State            Alcohol MAP   Date Enacted
Alabama          YES           6/4/15           Nebraska         YES           8/29/15
Alaska           NO            X                Nevada           YES           5/29/15
Arizona          NO            X                New Hampshire    NO            X
Arkansas         YES           3/11/15          New Jersey       YES           10/1/09
California       YES           9/24/10          New Mexico       NO            X
Colorado         YES           7/1/05           New York         YES           9/18/11
Connecticut      NO            X                North Carolina   YES           4/9/13
Delaware         YES           8/31/13          North Dakota     YES           8/1/07
Florida          NO            X                Ohio             NO            X
Georgia          YES           4/24/14          Oklahoma         YES           11/1/13
Hawaii           YES           7/7/15           Oregon           YES           1/1/15
Idaho            YES           7/1/16           Pennsylvania     YES           9/5/11
Illinois         YES           6/1/16           Rhode Island     YES           7/2/18
Indiana          YES           7/1/12           South Carolina   YES           6/10/17
Iowa             NO            X                South Dakota     YES           7/1/16
Kansas           YES           2/23/16          Tennessee        NO            X
Kentucky         YES           6/24/13          Texas            YES           9/1/11
Louisiana        YES           8/1/14           Utah             YES           3/26/13
Maine            YES           6/10/15          Vermont          YES           6/5/13
Maryland         YES           10/1/14          Virginia         YES           7/1/15
Massachusetts    YES           4/13/18          Washington       YES           6/10/10
Michigan         YES           6/1/12           West Virginia    YES           6/12/15
Minnesota        YES           5/24/13          Wisconsin        YES           X
Mississippi      YES           7/1/18           Wyoming          NO            X
Missouri         YES           7/14/17          D.C.             YES           3/19/13
Montana          YES           10/1/15

Table 3.2.
Demographic Characteristics Summary Statistics

Ages 18-20                                                Ages 21-23
Variable                Unweighted N  Weighted Proportion  Variable                Unweighted N  Weighted Proportion
Sex                                                        Sex
  Male                  27,731        0.5241                 Male                  32,345        0.5058
  Female                30,952        0.4759                 Female                32,252        0.4942
Age                                                        Age
  18                    20,172        0.3768                 21                    21,021        0.3469
  19                    19,050        0.3164                 22                    21,210        0.3268
  20                    19,461        0.3067                 23                    22,366        0.3263
Race/Ethnicity                                             Race/Ethnicity
  White                 38,171        0.6089                 White                 43,379        0.6201
  Black                 5,936         0.1494                 Black                 6,036         0.1431
  Hispanic              7,286         0.1376                 Hispanic              7,252         0.1311
  Other                 7,290         0.1040                 Other                 7,930         0.1057
Education Level                                            Education Level
  Less Than High School 6,999         0.1808                 Less Than High School 3,519         0.088
  High School Grad.     6,558         0.4533                 High School Grad.     19,207        0.2976
  Some College          29,049        0.358                  Some College          26,507        0.4457
  College Graduate      21,888        0.008                  College Graduate      15,364        0.1688
Employment                                                 Employment
  Employed              24,981        0.3918                 Employed              38,904        0.5826
  Unemployed            5,750         0.0984                 Unemployed            6,136         0.1011
  Student               26,376        0.4846                 Student               16,424        0.267
  Other                 1,576         0.0252                 Other                 3,133         0.0493

Table 3.3.
Outcome Summary Statistics

                                         Ages 18-20                               Ages 21-23
Variable                                 Sample Size  Weighted Mean  Std. Error   Sample Size  Weighted Mean  Std. Error
All
  Drink                                  58,683       0.3290         0.0031       64,597       0.6738         0.0029
  Binge Drink (Unconditional)            58,683       0.1657         0.0024       64,597       0.3473         0.003
  Binge Drink (Conditional on Drinking)  19,353       0.5036         0.0058       43,365       0.5154         0.0038
Male
  Drink                                  30,952       0.3487         0.0043       32,345       0.7115         0.004
  Binge Drink (Unconditional)            30,952       0.1935         0.0035       32,345       0.4044         0.0043
  Binge Drink (Conditional on Drinking)  10,933       0.5549         0.0076       23,011       0.5684         0.0051
Female
  Drink                                  27,731       0.3074         0.0045       32,252       0.6353         0.0043
  Binge Drink (Unconditional)            27,731       0.1351         0.0032       32,252       0.2889         0.004
  Binge Drink (Conditional on Drinking)  8,420        0.4394         0.0087       20,354       0.4548         0.0055
Students
  Drink                                  26,376       0.3199         0.0045       16,424       0.7002         0.0057
  Binge Drink (Unconditional)            26,376       0.1586         0.0035       16,424       0.372          0.0059
  Binge Drink (Conditional on Drinking)  8,588        0.4957         0.0086       11,421       0.5313         0.0072

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion.

Table 3.4. BJS Difference-in-Differences for Ages 18-20

                                         All        Males      Females    Students
Drink Any
  MAP                                    0.0076     0.0135     0.0018     0.0007
  SE                                     (0.0138)   (0.0175)   (0.0185)   (0.0151)
  N                                      58,683     30,952     27,731     26,376
Binge Drink (Unconditional)
  MAP                                    0.0133     0.0120     0.0137     0.0064
  SE                                     (0.0103)   (0.0146)   (0.0115)   (0.0105)
  N                                      58,683     30,952     27,731     26,376
Binge Drink (Conditional on Drinking)
  MAP                                    0.0268     0.0146     0.0396     0.0176
  SE                                     (0.0184)   (0.0246)   (0.0285)   (0.0230)
  N                                      19,353     10,933     8,420      8,588

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion.
Specification includes state and year fixed effects, and individual characteristics controlled for in categories including survey month, sex, age, education level, marital status, employment status, and race/ethnicity. *** p<0.01, ** p<0.05, * p<0.1.

Table 3.5. BJS Difference-in-Differences Placebo Specification for Ages 21-23

                                         All        Males      Females    Students
Drink Any
  MAP                                    0.0130     0.00818    0.0158     0.0171
  SE                                     (0.0135)   (0.0133)   (0.0183)   (0.0144)
  N                                      64,597     32,345     32,252     16,424
Binge Drink (Unconditional)
  MAP                                    0.0262**   0.0157     0.0345***  0.0127
  SE                                     (0.0107)   (0.0137)   (0.0102)   (0.0167)
  N                                      64,597     32,345     32,252     16,424
Binge Drink (Conditional on Drinking)
  MAP                                    0.0285***  0.0102     0.0444***  0.0073
  SE                                     (0.0089)   (0.0122)   (0.0103)   (0.0197)
  N                                      43,365     23,011     20,354     11,421

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion. Specification includes state and year fixed effects, and individual characteristics controlled for in categories including survey month, sex, age, education level, marital status, employment status, and race/ethnicity. *** p<0.01, ** p<0.05, * p<0.1.

Table 3.6. BJS Triple-Difference Specification for Ages 18-20 and 21-23

                                         All        Males      Females    Students
Drink Any
  MAP                                    0.0056     0.0025     0.0056     0.0000
  SE                                     (0.0148)   (0.0150)   (0.0175)   (0.0169)
  N                                      123,280    63,297     59,983     42,800
Binge Drink (Unconditional)
  MAP                                    0.0011     0.0074     -0.0091    0.0002
  SE                                     (0.0113)   (0.0108)   (0.0139)   (0.0163)
  N                                      123,280    63,297     59,983     42,800
Binge Drink (Conditional on Drinking)
  MAP                                    -0.0042    0.0087     -0.0197    -0.0079
  SE                                     (0.0180)   (0.0163)   (0.0299)   (0.0314)
  N                                      62,718     33,944     28,774     20,009

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion.
Specification includes the set of interactions of state, year, and underage status, and individual characteristics controlled for in categories including survey month, sex, age, education level, marital status, employment status, and race/ethnicity. *** p<0.01, ** p<0.05, * p<0.1.

BIBLIOGRAPHY

Abadie, A., Diamond, A., & Hainmueller, J. (2010). Synthetic Control Methods for Comparative Case Studies: Estimating the Effect of California’s Tobacco Control Program. Journal of the American Statistical Association, 105(490), Applications and Case Studies.

Associated Press. (2015, June 10). Bill to prevent alcohol overdoses goes into law. Citizen Times.

Borusyak, K., Jaravel, X., & Spiess, J. (2022). Revisiting event study designs: robust and efficient estimation. SSRN.

Carpenter, C. (2003). Seasonal variation in self-reports of recent alcohol consumption: racial and ethnic differences. Journal of Studies on Alcohol, 64(3), 415–418.

Carpenter, C. (2004). How do Zero Tolerance Drunk Driving Laws work? Journal of Health Economics, 23(1), 61–83.

Center for Disease Control. (2019). Fact Sheets - Underage Drinking.

Cho, Y. I., Johnson, T. P., & Fendrich, M. (2001). Monthly variations in self-reports of alcohol consumption. Journal of Studies on Alcohol, 62(2), 268–272.

Deiana, C., & Giua, L. (2018). The U.S. Epidemic: Prescription Opioids, Labour Market Conditions and Crime. MPRA Paper.

Doleac, J. L., & Mukherjee, A. (2019). The Moral Hazard of Lifesaving Innovations: Naloxone Access, Opioid Abuse, and Crime. IZA Discussion Paper Series.

Lewis, D. K., & Marchell, T. C. (2006). Safety first: A medical amnesty approach to alcohol poisoning at a U.S. university. International Journal of Drug Policy, 17(4), 329-338.

Nguyen, H., & Parker, B. R. (2018). Assessing the Effectiveness of New York’s 911 Good Samaritan Law: Evidence from a Natural Experiment. International Journal of Drug Policy, 58, 149-156.

Oster-Aaland, L., Thompson, K., & Eighmy, M. (2011).
The Impact of an Online Educational Video and a Medical Amnesty Policy on College Students Intentions to Seek Help in the Presence of Alcohol Poisoning Symptoms. Journal of Student Affairs Research and Practice, 48(2), 147-164.

Packham, A. (2019). Are Syringe Exchange Programs Helpful or Harmful? New Evidence in the Wake of the Opioid Epidemic.

Rees, D., Sabia, J., Argys, L., Latshaw, J., & Dave, D. (2019). With a Little Help from My Friends: The Effects of Naloxone Access and Good Samaritan Laws on Opioid-Related Deaths. Journal of Law & Economics.

Stahre, M., Roeber, J., Kanny, D., Brewer, R. D., & Zhang, X. (2014). Contribution of Excessive Alcohol Consumption to Deaths and Years of Potential Life Lost in the United States. Preventing Chronic Disease, 11.

APPENDIX

Table A3.1. BJS Difference-in-Differences Specifications for Ages 18-20

                     Drink                         Binge Drink (Unconditional)   Binge Drink (Conditional on Drinking)
                     (1)       (2)       (3)       (1)       (2)       (3)       (1)       (2)       (3)
All
  MAP                -0.003    -0.003    0.008     0.006     0.007     0.013     0.021     0.023     0.027
  SE                 (0.013)   (0.013)   (0.014)   (0.010)   (0.010)   (0.010)   (0.020)   (0.019)   (0.018)
  N                  58,683    58,683    58,683    58,683    58,683    58,683    19,353    19,353    19,353
Males
  MAP                0.003     0.004     0.014     0.003     0.004     0.012     0.003     0.004     0.015
  SE                 (0.016)   (0.016)   (0.018)   (0.013)   (0.013)   (0.015)   (0.024)   (0.024)   (0.025)
  N                  30,952    30,952    30,952    30,952    30,952    30,952    10,933    10,933    10,933
Females
  MAP                -0.008    -0.01     0.002     0.008     0.01      0.014     0.037     0.040     0.040
  SE                 (0.019)   (0.018)   (0.019)   (0.013)   (0.012)   (0.012)   (0.029)   (0.029)   (0.029)
  N                  27,731    27,731    27,731    27,731    27,731    27,731    8,420     8,420     8,420
Students
  MAP                -0.015    -0.015    0.001     -0.004    -0.003    0.006     0.012     0.015     0.018
  SE                 (0.015)   (0.014)   (0.015)   (0.011)   (0.011)   (0.011)   (0.026)   (0.025)   (0.023)
  N                  26,376    26,376    26,376    26,376    26,376    26,376    8,588     8,588     8,588
State & Year FEs     N         Y         Y         N         Y         Y         N         Y         Y
Controls             N         N         Y         N         N         Y         N         N         Y

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion. Individual characteristics controlled for in categories are survey month, sex, age, education level, marital status, employment status, and race/ethnicity.

Table A3.2. BJS Event Study Specification for Ages 18-20

                       Drink Any                          Binge Drink (Unconditional)
                       All        Males      Females     All        Males      Females
4 Years Pre-MAP        0.0092     0.0163     0.0032      -0.0008    0.0003     -0.0015
                       (0.0133)   (0.0152)   (0.0272)    (0.0106)   (0.0180)   (0.0138)
3 Years Pre-MAP        0.0026     0.007      -0.0015     0.0043     0.0102     -0.0027
                       (0.0151)   (0.0199)   (0.0292)    (0.0097)   (0.0180)   (0.0184)
2 Years Pre-MAP        -0.0187    -0.0283    -0.0094     -0.0150    -0.0131    -0.0192
                       (0.0167)   (0.0228)   (0.0311)    (0.0129)   (0.0198)   (0.0211)
1 Year Pre-MAP         -0.0082    -0.0237    0.0076      -0.0040    0.0001     -0.0098
                       (0.0201)   (0.0248)   (0.0368)    (0.0134)   (0.0190)   (0.0222)
Year MAP Implemented   0.0122*    0.0300***  -0.0063     0.0160**   0.0185*    0.0129
                       (0.0074)   (0.0104)   (0.0087)    (0.0066)   (0.0101)   (0.0084)
1 Year Post-MAP        0.0113     0.0103     0.0107      0.0141     0.0060     0.0222**
                       (0.0118)   (0.0197)   (0.0139)    (0.0102)   (0.0183)   (0.0108)
2 Years Post-MAP       0.0027     -0.0041    0.0135      0.0054     -0.0033    0.0158
                       (0.0109)   (0.0165)   (0.0146)    (0.0082)   (0.0107)   (0.0114)
3 Years Post-MAP       -0.0061    -0.0031    -0.0105     0.0071     0.0087     0.0037
                       (0.0138)   (0.0174)   (0.0221)    (0.0129)   (0.0165)   (0.0136)
Observations           52,713     27,641     25,072      52,713     27,641     25,072
Pre-trend F-stat       1.29       1.54       0.30        1.22       1.03       0.84
Pre-trend p-value      0.29       0.21       0.88        0.32       0.40       0.51

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion. Specification includes state and year fixed effects, and individual characteristics controlled for in categories including survey month, sex, age, education level, marital status, employment status, and race/ethnicity. *** p<0.01, ** p<0.05, * p<0.1.

Table A3.3.
BJS Triple Difference Event Study Specification Comparing Ages 18-20 to 21-23

                       Drink Any                          Binge Drink (Unconditional)
                       All        Males      Females     All        Males      Females
4 Years Pre-MAP        0.0032     0.0255     -0.0178     -0.0203    -0.0247    -0.0172
                       (0.0191)   (0.0214)   (0.0225)    (0.0185)   (0.0236)   (0.0176)
3 Years Pre-MAP        0.0117     0.0070     0.0128      0.0158     0.0020     0.0255
                       (0.0122)   (0.0142)   (0.0191)    (0.0148)   (0.0192)   (0.0200)
2 Years Pre-MAP        0.0046     0.0060     0.0023      -0.0090    -0.0167    -0.0044
                       (0.0139)   (0.0200)   (0.0163)    (0.0130)   (0.0165)   (0.0149)
1 Year Pre-MAP         0.0152     0.0161     0.0093      0.0140     0.0091     0.0149
                       (0.0162)   (0.0199)   (0.0201)    (0.0160)   (0.0220)   (0.0158)
Year MAP Implemented   0.0188**   0.0260**   0.0072      0.0209**   0.0422***  -0.0003
                       (0.0088)   (0.0116)   (0.0094)    (0.0083)   (0.0119)   (0.0087)
1 Year Post-MAP        0.0093     -0.0008    0.0138      -0.0039    -0.0061    -0.0048
                       (0.0113)   (0.0186)   (0.0103)    (0.0137)   (0.0209)   (0.0152)
2 Years Post-MAP       -0.0059    -0.0291    0.0261      -0.0099    -0.0186    0.0037
                       (0.0153)   (0.0202)   (0.0164)    (0.0116)   (0.0131)   (0.0139)
3 Years Post-MAP       -0.0121    -0.0204    -0.0118     -0.0063    -0.0030    -0.0195
                       (0.0149)   (0.0185)   (0.0265)    (0.0159)   (0.0184)   (0.0163)
Observations           117,310    59,986     57,324      117,310    59,986     57,324
Pre-trend F-stat       0.60       0.55       0.54        2.63       1.02       1.95
Pre-trend p-value      0.66       0.70       0.71        0.05**     0.41       0.12

Men (women) binge drinking measured by whether 5+ (4+) drinks consumed on an occasion. Specification includes the set of interactions of state, year, and underage status, and individual characteristics controlled for in categories including survey month, sex, age, education level, marital status, employment status, and race/ethnicity. *** p<0.01, ** p<0.05, * p<0.1.