THREE ESSAYS ON RISK MANAGEMENT AND IRRIGATION WATER DEMAND IN AGRICULTURE

By

Pin Lu

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Agricultural, Food, and Resource Economics - Doctor of Philosophy

2022

ABSTRACT

Both extensive (share of insured acres in total insurable acres) and intensive (coverage level choice) margin participation rates in the U.S. crop insurance program have increased due to generous subsidies. On a national scale, the program has been rated well enough by the USDA Risk Management Agency to satisfy the actuarial fairness requirement. However, sizeable spatial heterogeneity remains across the Great Plains and Corn Belt regions. If subsidies were to be reduced in the future because of financial constraints, such heterogeneity might be detrimental to the sustainability of the crop insurance program. A central theme of this dissertation is to investigate how farmers make participation decisions when risk factors exist. In a separate but related line of work, this dissertation also explores irrigation water usage in the Great Lakes region because farmers' irrigation behavior reflects their risk preferences and affects their incentives for enrolling in the program.

The dissertation consists of three essays on farmers' decisions regarding premium mispricing, basis risk, and irrigation water usage.

The first essay proposes a novel resampling procedure to estimate farm-level actuarially fair premiums. The resampling procedure has two main parts: (i) semi-parametric quantile regression; and (ii) the rejection method. Many previous studies explore whether county-level mispricing exists based on historical loss ratio records. In contrast, we identify farm-level mispricing by imputing actuarially fair premiums based on historical yield records, consistent with theory.
We find that farmers with lower-quality cropland paid lower premiums than they should, while the opposite holds for farmers with higher-quality cropland. Empirical evidence shows that farmers may be more concerned about mispricing than about subsidy transfer. Regression results support the conclusion that such farm-level mispricing deters farmers' crop insurance demand. Our analysis has two policy implications: (i) reducing mispricing may substitute for subsidies, so mitigating mispricing can maintain high participation while saving subsidies; and (ii) premiums can be imputed from historical yield records.

The second essay focuses on the impact of basis risk on participation rates in the U.S. crop insurance program. In recent years, basis risk has been increasingly recognized as an essential deterrent to insurance uptake. Most research concentrates on index insurance contracts; few studies investigate the effect of the mismatch between cash and futures markets on farmers' insurance decisions. We first build a conceptual model to show farmers' acreage response to basis risk within the expected utility framework. Next, we apply the Fractional Probit with Control Function for the empirical analysis and find that the effects of basis risk on participation rates are significantly negative for nearly all insurance contracts. Our analysis implies that: (i) revising revenue contracts may be considered as a way to remove basis risk; and (ii) the subsidy structure may be adjusted to be consistent with the underlying basis risk.

The third essay investigates irrigation water usage in the Great Lakes region. Although water conservation policy has been implemented, irrigation water demand, including irrigated acres and total water usage, trended upward from 2003 to 2018. We employ farm-level irrigation data to examine which factors affect farmers' irrigation water usage.
We find that: (i) price elasticities vary significantly according to model specifications and water costs; (ii) demand at both the extensive (irrigated acres) and intensive (water application per acre) margins is input price inelastic; and (iii) price elasticities are homogeneous across crops but heterogeneous across states. For policymaking, a 10% tax on irrigation water cost would decrease total water usage by about 4% for both corn and soybean.

I dedicate this dissertation to my parents, Yiming Lu and Xiaozhi Zhang.

ACKNOWLEDGEMENTS

I am extremely grateful to my advisors, Dr. Hongli Feng and Dr. David A. Hennessy, for their careful guidance and kind support since the second year of my Ph.D. journey at Michigan State University. I could not have completed this dissertation without their insights and suggestions. I would like to express my deepest gratitude to my current advisor, Dr. Scott M. Swinton, for his strong support in the last year of my Ph.D., and to my committee member, Dr. Kyoo il Kim. They have provided valuable comments and suggestions for my dissertation. Finally, I would like to thank my parents, Yiming Lu and Xiaozhi Zhang, for their endless support.

TABLE OF CONTENTS

INTRODUCTION
CHAPTER 1 Crop Insurance Rate Making, Land Quality, and Adverse Selection
    Abstract
    Introduction
    Basic Model
    Data Description
    Empirical Results
    Conclusion
    REFERENCES
    APPENDIX A: Resampling and Premium Estimation
    APPENDIX B: Supplemental Figures and Tables
CHAPTER 2 Basis Risk and Farmers' Participation in the U.S. Federal Crop Insurance Program: A Conceptual Framework and its Application
    Abstract
    Introduction
    A Conceptual Model
    Data Description
    Empirical Results
    Conclusion
    REFERENCES
    APPENDIX A: Monte Carlo Simulation
    APPENDIX B: Supplemental Figures and Tables
CHAPTER 3 Extensive and Intensive Margins of Irrigation Water Demand in the Great Lakes Region
    Abstract
    Introduction
    Data and Main Variables
    Empirical Analysis
    Conclusion
    REFERENCES
    APPENDIX

INTRODUCTION

Understanding farmers' decision-making in the crop insurance program is crucial because the federal government has provided generous subsidies for the development of the program. The dual goals of the federal government are higher crop insurance participation and higher coverage levels (USGAO 2015). However, although participation rates have been boosted over the past twenty years, significant regional disparities still exist in the Great Plains and Corn Belt regions (Babcock et al. 2004; Glauber 2004; Woodard et al. 2012; Chen et al. 2020). If subsidies were reduced in the future due to financial constraints, such heterogeneity might be detrimental to the sustainability of the crop insurance program because farmers with lower risks might opt out of the program.
One potential consequence is that the insurance pool will become very risky and indemnities will far exceed premiums. This dissertation explores how farmers make insurance purchasing decisions when yield or price risks exist. Our analysis might shed light on policymaking for premium rate-setting, contract design, and irrigation water restriction.

The first essay explores whether systematic mispricing exists in the yield contract and how farmers respond to such a bias when choosing among multiple coverage levels. The motivation is that many studies investigate farmers' coverage choices under the assumption that farmers' premiums (including subsidies) are actuarially fair (Du et al. 2017; Jensen et al. 2018). Although some papers contribute to a deep understanding of the rating methods of the USDA Risk Management Agency (Goodwin 1993; Babcock et al. 2004; Woodard et al. 2011), to the best of our knowledge no research investigates whether farmers' premiums are actuarially fair based on yield records. Therefore, one contribution of this study is a novel resampling procedure to estimate farm-level actuarially fair premiums at all coverage choices (see, for example, Chen and Yu 2016; Price et al. 2019). With the estimated premiums, we can identify the mispricing for each coverage level choice for each farm. Unlike many previous studies, which examine mispricing based on the loss ratio, our analysis is based on actual yields, consistent with theory. After comparing premiums charged by the USDA Risk Management Agency with the estimated actuarially fair premiums, we find that a farm with lower (higher) quality cropland in a county has a better (worse)-than-actuarially-fair premium, which implies that a farm with lower (higher) quality cropland paid less (more) than it should. Regression results show that such mispricing deters farmers' crop insurance demand (i.e., coverage level choices).
Our analysis shows that the adjustment of mispricing might be a supplement to federal subsidies.

The second essay investigates how farmers respond to basis risk when participating in the crop insurance program. Many previous studies have explored whether basis risk impedes insurance uptake for index contracts (Doherty and Richter 2002; Deng et al. 2007; Cole et al. 2014; Cai et al. 2020; Ohashi 2022). The basis risk in index contracts is the imperfect association between experienced losses and indemnification based on index values (Jensen et al. 2018). This essay investigates whether the mismatch between cash and futures markets affects farmers' insurance decisions. The reason is that the crop revenue contract has been the most popular product offered by the U.S. Federal Crop Insurance Corporation, and the mismatch between cash and futures prices is inherent in revenue contracts because of their design. As in the literature, we define basis as the difference between cash and futures prices and basis risk as basis variation. We develop a conceptual model within the expected utility framework to identify farmers' acreage response to basis risk. With elevator-level basis data, we apply the Fractional Probit with Control Function and find that basis risk deters farmers' insurance demand for nearly all contracts. Our analysis implies that mitigating basis risk by revising revenue contract designs might be an option to maintain high participation rates while reducing subsidies.

In the third essay, we turn to irrigation water usage in the Great Lakes region because there has been an upward trend from 2003 to 2018. Such increased water withdrawal raises concern about future water scarcity, especially given that current water restriction policies are inefficient.
With farm-level irrigation data, this essay constructs two water costs (average energy cost, AEC, and marginal irrigation cost, MIC) to estimate various price elasticities of irrigation water usage. Marginal irrigation cost is widely used in studies of western U.S. states (Caswell and Zilberman 1986; Gonzalez-Alvarez et al. 2006; Mieno and Brozović 2017), but some recent research, such as Ito (2014), finds that consumers are more concerned about average cost than marginal cost. Therefore, we extend Kornelis and Norris (2020) with marginal irrigation cost, error tolerances, model specifications, and state-specific price elasticities. Our findings are: (i) AEC and MIC perform similarly for the extensive (irrigated acres) and intensive (water application per acre) margins, although price elasticities based on MIC are slightly smaller; (ii) for crop-specific values, price elasticities based on MIC are sensitive to error tolerances and might be underestimated; (iii) demand at both the extensive and intensive margins is input price inelastic; (iv) price elasticities are homogeneous across crops but heterogeneous across states; and (v) if there is an arbitrary 10% tax on irrigation water cost, total water usage (acre-inch) will decrease by 4% for both corn and soybean.

CHAPTER 1 Crop Insurance Rate Making, Land Quality, and Adverse Selection

Abstract

Over the past two decades, the U.S. federal government has devoted considerable effort to raising participation rates in the crop insurance program. However, many farmers within a county still feel that their premium rates are too high even after large subsidies have been provided. This paper proposes a novel resampling procedure based on large volumes of unit-level yield data to estimate actuarially fair premiums at all coverage level choices, consistent with theory. By comparing RMA's premiums with our estimated actuarially fair premiums, we find systematic mispricing in the U.S.
crop insurance program. That is, farmers with higher-quality cropland have worse-than-actuarially-fair premiums, implying that farmers who own good-quality land pay higher premiums than they should. We find that farmers might be more concerned about mispricing than about subsidy transfer. The regression results show that such mispricing deters farmers' crop insurance demand. Therefore, the adjustment of mispricing might be a supplement to federal subsidies.

Introduction

U.S. federal crop insurance was authorized in the 1930s, and the Federal Crop Insurance Corporation (FCIC) was created in 1938 to carry out the program. The crop insurance program was experimental for a long time, and participation rates were low. After the 2000 Agricultural Risk Protection Act (ARPA), the number of insurance products and participation rates were boosted by increased premium subsidies. The resulting increase in individual-level loss experience provides crucial information for future loss expectations. The USDA Risk Management Agency (RMA) relies on historical loss experience data to evaluate underlying yield risks, aiming to obtain actuarially sound premium rates equal to expected indemnities. Actuarial fairness is required for this program, and a loss ratio of 1.0 is the target set in the 2008 Farm Bill. Over the past two decades, the crop insurance program has been well rated nationally, indicating that total premiums (including subsidies) equal total indemnities. However, significant regional disparities in participation rates persist even though aggregate actuarial fairness has improved (Babcock et al. 2004; Glauber 2004; Sherrick et al. 2004; Woodard et al. 2012; Chen et al. 2020).

Figure 1.1 The Extensive and Intensive Margin Participation Rates for Corn Over 1989-2020. Note: The extensive margin is defined as the ratio of insured acres to total insurable acres; the intensive margin is defined as the acreage-weighted coverage level choices.
Actuarial fairness is crucial because it matters to the program's sustainability. Without generous subsidies, producers whose expected indemnities exceed their premium costs are more likely to purchase insurance; those whose costs exceed their expected indemnities are less likely to purchase (Glauber 2004). After generous subsidies were provided, farmers enrolled in the program if their expected indemnities exceeded their self-paid premiums. In other words, generous subsidies can cover potential mispricing and support high participation rates. However, an adverse selection problem might arise if subsidies were reduced and farm-level premiums were not actuarially sound. Farmers with higher expected yields or lower risks might opt out of the crop insurance program. The entire insurance pool would then become riskier, and indemnities might exceed premiums, which would be detrimental to the development of the program (Skees and Reed 1986; U.S. General Accounting Office 1993; Coble and Knight 2002; Babcock et al. 2004; Glauber 2004). In addition, because of the increased indemnities, RMA might adjust premium rates upward and crowd more farmers with good-quality land out of the insurance pool, further exacerbating the problem (Coble et al. 2010; Price et al. 2019).

Many previous studies have explored USDA RMA's ratemaking approach. Topics include the Actual Production History (APH) rating method (Goodwin 1993), the assumption of constant rate relativities across coverage levels (Babcock et al. 2004), the actuarial implications of the loss cost ratio (LCR) rate-making methodology (Woodard et al. 2011), and the improvement of rating efficiency by integrating widely available high-resolution soil data (Woodard and Verteramo-Chiu 2017). However, most research focuses directly on historical premiums and indemnities because the targeted loss ratio (expected indemnity/premium) is 1.0.
One disadvantage of the USDA/RMA loss ratio approach is that expected indemnity is estimated from the limited length of time for which historical experience is available. In theory, however, expected indemnity should be estimated from the yield distribution. Therefore, we employ large volumes of farm-level actual yield records to estimate actuarially fair premiums, with which systematic mispricing can be identified. Specifically, this study aims to: (i) estimate farm-level actuarially fair premiums at all coverage choices; (ii) investigate whether actuarial unfairness exists; and (iii) examine the impact of the unfairness on farmers' coverage choices. The motivation to investigate mispricing is that many previous studies analyze farmers' choices assuming premiums are actuarially fair (LaFrance et al. 2002; Du et al. 2017; Jensen et al. 2018). However, the assumption of actuarial fairness might be too restrictive to support a proper conclusion.

Our analysis builds on and contributes to three strands of literature. First, we propose a novel resampling procedure to obtain a representative subsample, with which the underlying risk can be estimated reasonably. Many methods for modeling yield distributions have been applied in the literature, including parametric distributions (Babcock and Blackmer 1992; Borges and Thurman 1994; Babcock and Hennessy 1996; Coble et al. 1996; Sherrick et al. 2004; Claassen and Just 2011) and nonparametric distributions (Goodwin and Ker 1998; Ker and Goodwin 2000; Ker and Coble 2003; Woodard et al. 2011; Zhu et al. 2011). Due to limited subsidies, participation rates in the crop insurance program were low before 1999. Therefore, if we estimated the yield distribution directly from yield records, actuarially fair premiums might be underestimated because there was no disaster between 1999 and 2008, the time interval in which most observations lie.
The advantages of our proposed resampling procedure are that it: (i) effectively imputes missing observations before 1999; and (ii) reweights farms based on long-term county-level historical records. Our procedure has two main parts: (i) semi-parametric quantile regression with penalized B-splines (see, for example, Chen and Yu 2016); and (ii) the rejection-acceptance method.

Second, this study highlights whether systematic rate bias exists and how such bias affects farmers' insurance purchasing decisions. After obtaining a representative subsample, we estimate county-level actuarially fair premiums and then adjust county-level premiums to individual-level premiums at all coverage choices per RMA's rules (Coble et al. 2010; Price et al. 2019). We thus have both RMA premiums and estimated actuarially fair premiums at all coverage level choices (see, for example, Du et al. 2017). We then construct a wedge variable, the ratio of RMA's premium to the estimated actuarially fair premium. Since premiums are always positive, the wedge ranges from zero to positive infinity, and a wedge equal to 1 indicates that RMA's premium is actuarially sound. A wedge higher (lower) than 1 indicates that farmers paid more (less) than they should. With farm-level RMA premiums and estimated actuarially fair premiums, we can identify: (i) whether farms with better (worse) land quality have higher (lower) wedges; and (ii) whether counties with better (worse) land capability have higher (lower) wedges. More information about wedges (e.g., their distribution) can help policymakers understand potential actuarial unfairness, which might help improve the rate-setting procedure in the crop insurance program.

Third, our analysis provides insights into saving subsidies while maintaining high participation rates. Large subsidies have been provided to encourage a high participation rate and to ensure the sustainability of the crop insurance program.
From 2007 to 2027, the Federal Crop Insurance Program has cost and will cost the federal government about $7.5 billion yearly (Rosa 2018a; Rosa 2018b). In this study, farmers may not be as concerned about the subsidy transfer as in some previous studies (see, for example, Du et al. 2017). However, the empirical evidence shows that farmers might be concerned about actuarial fairness because the coverage level with the lowest wedge is the most popular. Regression results robustly support the conclusion that mispricing adversely affects farmers' coverage level choice. Therefore, the mitigation of mispricing can supplement subsidies, which implies that the subsidy structure can be adjusted to increase farmers' crop insurance demand while providing lower subsidies (Babcock et al. 2004).

The rest of this paper is organized as follows. In Section 2, we apply the standard expected utility framework to derive how actuarial fairness affects farmers' coverage level choices. Section 3 describes the data employed in this study. Section 4 presents empirical evidence for our hypotheses and the regression results. Section 5 concludes with some discussion.

Basic Model

This study focuses on yield contracts because they contain only yield risk. Revenue contracts, by contrast, also include price risk, which complicates the analysis, although our main conclusion would not change. Let \(z\) denote the approved yield history (i.e., the institutional estimate of farm-level mean yield), which represents land quality on a given insured unit and is used by RMA to estimate the insurance premium cost.
Let \(\varphi\) denote the coverage level, with \(0 < \varphi < 1\). For a stochastic yield \(y\) on a given insured unit, the actuarially fair premium (expected payout) can be calculated as

\[
\Gamma(\varphi, z) = E[\max(\varphi z - y, 0)] = \int_0^{\varphi z} (\varphi z - y)\, dF(y), \tag{1.1}
\]

where \(E[\cdot]\) is the expectation operator; \(\varphi z\) is the guaranteed yield below which farmers receive indemnities from insurers; and \(F(y)\) is a cumulative distribution function with density function \(f(y)\) on \(0 < y < +\infty\). Without loss of generality, we assume the crop price is equal to 1, so total liability is \(\varphi z\).¹ The response of the actuarially fair premium to the coverage level is \(d\Gamma(\varphi, z)/d\varphi \equiv \Gamma_\varphi(\varphi, z) = zF(\varphi z)\).

¹ In the empirical part, we assume the corn price is $4/bu and the soybean price is $10/bu.

We define a wedge factor as \(w(\varphi) = APR(\varphi)/OPR(\varphi)\), where \(APR(\varphi)\) is the actual premium rate set by RMA at coverage level \(\varphi\) and \(OPR(\varphi)\) is the estimated actuarially fair premium rate.² The wedge range is \((0, +\infty)\) because both \(APR(\varphi)\) and \(OPR(\varphi)\) are positive. Then, a farmer's actual premium payment (including subsidy) is \(w(\varphi)\Gamma(\varphi, z)\), and its response to the coverage level is \(w_\varphi(\varphi)\Gamma(\varphi, z) + w(\varphi)\Gamma_\varphi(\varphi, z)\), where \(w_\varphi(\varphi)\) and \(\Gamma_\varphi(\varphi, z)\) are the corresponding derivatives. Denote by \(s(\varphi)\) the subsidy rate for coverage level \(\varphi\), with \(0 < s(\varphi) < 1\). The subsidy's dollar value is \(S(\varphi, z) = s(\varphi)w(\varphi)\Gamma(\varphi, z)\), which represents the portion borne by the federal government. Finally, a farmer's self-paid premium is \(\chi(\varphi, z) = [1 - s(\varphi)]w(\varphi)\Gamma(\varphi, z)\). Denote by \(s_\varphi(\varphi)\) and \(w_\varphi(\varphi)\) the derivatives of the subsidy rate function and the wedge function, respectively. Then the responses of the subsidy's dollar value \(S(\varphi, z)\) and the farmer's self-paid premium \(\chi(\varphi, z)\) to the coverage level \(\varphi\) are

\[
S_\varphi(\varphi, z) = \left[ \frac{s_\varphi(\varphi)}{s(\varphi)} + \frac{w_\varphi(\varphi)}{w(\varphi)} + \frac{zF(\varphi z)}{\Gamma(\varphi, z)} \right] \times S(\varphi, z); \tag{1.2a}
\]

\[
\chi_\varphi(\varphi, z) = \overbrace{w_\varphi(\varphi)\Gamma(\varphi, z) + w(\varphi)zF(\varphi z)}^{\text{Wedge Effect}} - \overbrace{S_\varphi(\varphi, z)}^{\text{Subsidy Transfer Effect}}. \tag{1.2b}
\]

When RMA premium rates are actuarially fair, \(w(\varphi) \equiv 1\) and \(w_\varphi(\varphi) \equiv 0\), which indicates that the wedge effect vanishes.
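The quantities defined in Eqns. (1.1)-(1.2b) can be illustrated numerically. The following is a minimal sketch, not the dissertation's estimation procedure: it replaces \(F(y)\) with the empirical distribution of a small hypothetical yield sample, normalizes the crop price to 1 as in the text, and uses illustrative values for the wedge and subsidy rate. All function names and inputs are assumptions introduced here.

```python
import numpy as np

def fair_premium(yields, phi, z):
    """Empirical analogue of Eqn. (1.1): average shortfall below the guarantee phi * z."""
    yields = np.asarray(yields, dtype=float)
    return float(np.maximum(phi * z - yields, 0.0).mean())

def marginal_fair_premium(yields, phi, z):
    """Empirical analogue of Gamma_phi = z * F(phi * z)."""
    yields = np.asarray(yields, dtype=float)
    return float(z * np.mean(yields <= phi * z))

def premium_split(yields, phi, z, wedge, subsidy_rate):
    """Return (Gamma, charged premium w*Gamma, subsidy S, self-paid premium chi)."""
    gamma = fair_premium(yields, phi, z)
    charged = wedge * gamma                    # w(phi) * Gamma(phi, z)
    subsidy = subsidy_rate * charged           # S = s * w * Gamma
    self_paid = (1.0 - subsidy_rate) * charged # chi = (1 - s) * w * Gamma
    return gamma, charged, subsidy, self_paid

# Hypothetical yield history (bu/acre); the approved yield z is taken to be its mean.
yields = [100.0, 120.0, 140.0, 160.0, 180.0]
z = float(np.mean(yields))

# Illustrative wedge of 1.2 (premium 20% above fair) and subsidy rate of 0.38.
gamma, charged, subsidy, self_paid = premium_split(
    yields, phi=0.85, z=z, wedge=1.2, subsidy_rate=0.38
)
print(gamma, charged, subsidy, self_paid)
```

With a wedge above 1 the charged premium exceeds the fair premium, and the subsidy and self-paid premium always sum to the charged premium, which is the accounting identity behind Eqn. (1.2b).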
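Under actuarial fairness, Eqn. (1.2a) becomes

\[
\tilde{S}_\varphi(\varphi, z) = \left[ \frac{s_\varphi(\varphi)}{s(\varphi)} + \frac{zF(\varphi z)}{\Gamma(\varphi, z)} \right] \times S(\varphi, z). \tag{1.2a'}
\]

² In the literature, the wedge is sometimes defined as the difference between the actual premium payout and the actuarially fair premium (i.e., expected indemnity) for an insurance product (see more in Deng et al. 2007). As in Du et al. (2017), the actual premium payout equals the actuarially fair premium multiplied by (1 + loading factor). The wedge factor in this study is the same as (1 + loading factor).

Denote by \(U(\cdot)\) the Bernoulli utility function of the farmer's income \(\pi\), and by \(U_\pi(\cdot)\) and \(U_{\pi\pi}(\cdot)\) its first and second derivatives, with \(U_\pi(\cdot) > 0 > U_{\pi\pi}(\cdot)\). Let \(\varepsilon\) represent the individual farm's other sources of income, with cumulative distribution function \(G(\varepsilon)\) on a compact set \([\underline{\varepsilon}, \bar{\varepsilon}]\). Then, if the farmer purchases yield insurance, income is \(\pi(\varphi, z, y, \varepsilon) \equiv \max(\varphi z, y) - C(\varphi, z, \bar{c}, \varepsilon)\), where \(C(\varphi, z, \bar{c}, \varepsilon) \equiv \chi(\varphi, z) + \bar{c} - \varepsilon\) and \(\bar{c}\) is fixed cost. Therefore, the farmer's expected utility is

\[
E[U(\pi)] = F(\varphi z) \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} U\big(\varphi z - C(\varphi, z, \bar{c}, \varepsilon)\big)\, dG(\varepsilon) + \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} \int_{\varphi z}^{+\infty} U\big(\pi(\varphi, z, y, \varepsilon)\big)\, dF(y)\, dG(\varepsilon). \tag{1.3}
\]

Suppose the farmer is a rational, risk-averse expected utility (EU) maximizer. The first derivative of \(E[U(\cdot)]\) with respect to the coverage level \(\varphi\) is

\[
\frac{\partial E[U(\cdot)]}{\partial \varphi} = \underbrace{\overbrace{zF(\varphi z) \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} U_\pi\big(\varphi z - C(\varphi, z, \bar{c}, \varepsilon)\big)\, dG(\varepsilon)}^{\text{Insurance Effect}}}_{\geq 0} - \underbrace{\overbrace{\chi_\varphi(\varphi, z)}^{\text{Wedge Effect} - \text{Subsidy Transfer Effect}}}_{?}\, E\big[U_\pi(\pi(\varphi, z, y, \varepsilon))\big], \tag{1.4}
\]

where \(E[U_\pi(\pi(\varphi, z, y, \varepsilon))] = \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} \int_{\varphi z}^{+\infty} U_\pi(\pi(\varphi, z, y, \varepsilon))\, dF(y)\, dG(\varepsilon) \geq 0\). We call the first term of Eqn. (1.4) the Insurance Effect because neither the subsidy rate nor the wedge factor enters it. The second term of Eqn. (1.4) reflects the difference between the Wedge Effect and the Subsidy Transfer Effect. If premium rates are actuarially fair, the second term of Eqn. (1.4) reduces to the case in which only the Subsidy Transfer Effect is included (see, for example, Du et al. 2017).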
If premium rates are not actuarially fair and \(\chi_\varphi(\varphi, z) \leq 0\) (i.e., Case 2 of Figure 1.2), then \(\partial E[U(\cdot)]/\partial\varphi > 0\) and a rational farmer will choose the highest coverage level because expected utility increases with the coverage level. However, we know this is not true in the real world because it would mean that farmers pay lower premiums when they choose higher coverage levels. Therefore, we assume \(\chi_\varphi(\varphi, z) > 0\) (i.e., Case 1 of Figure 1.2), which implies that farmers' self-paid premium increases as the coverage level increases.

Figure 1.2 Possible Relationships Between Farmer's Self-Paid Function and Coverage Level. Note: In Case 1, \(\chi_\varphi > 0\); in Case 2, \(\chi_\varphi < 0\). The farmer's self-paid premium is the premium rate after subsidy.

Remark 1. Given \(\chi_\varphi(\varphi, z) > 0\), suppose \(\chi_{\varphi\varphi}(\varphi, z) > 0\). If \(\exists\, \bar{\varphi}\) with \(0 < \bar{\varphi} < 1\) such that

\[
\chi_{\bar{\varphi}} = \frac{zF(\varphi z) \int_{\underline{\varepsilon}}^{\bar{\varepsilon}} U_\pi\big(\varphi z - C(\cdot)\big)\, dG(\varepsilon)}{E\big[U_\pi(\pi(\cdot))\big]},
\]

then: (i) an interior solution exists whenever \(\chi_{\varphi=0.50} < \chi_{\bar{\varphi}} < \chi_{\varphi=0.85}\); (ii) coverage level 0.50 will be chosen whenever \(\chi_{\bar{\varphi}} < \chi_{\varphi=0.50}\); and (iii) coverage level 0.85 will be chosen whenever \(\chi_{\bar{\varphi}} > \chi_{\varphi=0.85}\).

Hypothesis 1 (H1). Farmers' coverage choices are clustered at either a single coverage level or two adjacent levels.

We now turn to the effect of the wedge factor \(w(\varphi)\) on the subsidy's dollar value \(S(\varphi, z)\). The motivations are: (i) the subsidy transfer is considered an incentive for farmers to choose a higher coverage level (see, for example, Du et al. 2017; Cai et al. 2020); and (ii) as in Eqn. (1.2a), the subsidy transfer is affected by the subsidy rate \(s(\varphi)\) and the wedge factor \(w(\varphi)\). Returning to Eqn. (1.1), we obtain \(zF(\varphi z)/\Gamma(\varphi, z) \geq 1/\varphi\). Then, Eqn. (1.2a) can be rewritten as

\[
S_\varphi(\varphi, z) \overset{\text{sign}}{=} \frac{s_\varphi(\varphi)}{s(\varphi)} + \frac{w_\varphi(\varphi)}{w(\varphi)} + \frac{1}{\varphi},
\]

which implies that the sign of \(S_\varphi(\varphi, z)\) depends only on the subsidy rate, the wedge factor, and their first derivatives.
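The inequality \(zF(\varphi z)/\Gamma(\varphi, z) \geq 1/\varphi\) used above follows from Eqn. (1.1) in one step. A short sketch of the argument, assuming yields are non-negative (so \(\varphi z - y \leq \varphi z\) on the range of integration) and \(\Gamma(\varphi, z) > 0\), \(S(\varphi, z) > 0\):

```latex
\begin{align*}
\Gamma(\varphi,z) &= \int_0^{\varphi z}(\varphi z-y)\,dF(y)
  \;\le\; \varphi z\int_0^{\varphi z}dF(y) \;=\; \varphi z\,F(\varphi z), \\[4pt]
\frac{zF(\varphi z)}{\Gamma(\varphi,z)} &\;\ge\; \frac{1}{\varphi}, \\[4pt]
\frac{S_\varphi(\varphi,z)}{S(\varphi,z)} &\;\ge\;
  \frac{s_\varphi(\varphi)}{s(\varphi)}+\frac{w_\varphi(\varphi)}{w(\varphi)}+\frac{1}{\varphi}.
\end{align*}
```

The last line substitutes the bound into Eqn. (1.2a). Because the bound is an inequality, positivity of the right-hand side is sufficient, though not necessary, for \(S_\varphi(\varphi, z) > 0\); multiplying the right-hand side by \(\varphi\) expresses the same condition in elasticity form.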
Assume that \(S_\varphi(\varphi, z) > 0\), since higher subsidies are used to offset the excessive premium charge and make higher coverage levels more attractive (Babcock et al. 2004). Then, we have

Remark 2. A sufficient condition for \(S_\varphi(\varphi, z) > 0\) is

\[
\frac{\varphi s_\varphi(\varphi)}{s(\varphi)} + \frac{\varphi w_\varphi(\varphi)}{w(\varphi)} \equiv \Delta_s(\varphi) + \Lambda_w(\varphi) > -1, \tag{1.5}
\]

where \(\Delta_s(\varphi)\) and \(\Lambda_w(\varphi)\) are the elasticities of the subsidy rate \(s(\varphi)\) and the wedge factor \(w(\varphi)\) with respect to \(\varphi\), respectively.

Table 1.1 Premium Subsidies on Yield Contracts for BaU and OpU.

Coverage level \(\varphi\)   CAT    0.50   0.55   0.60   0.65   0.70   0.75   0.80   0.85
\(s(\varphi)\)               1.00   0.67   0.64   0.64   0.59   0.59   0.55   0.48   0.38

Source: The 2014 Farm Bill. BaU represents Basic Unit; OpU represents Optional Unit.

As in Table 1.1, the RMA subsidy rate \(s(\varphi)\) decreases as the coverage level \(\varphi\) increases, which implies that \(s_\varphi(\varphi) \leq 0\). Since \(0 < \varphi < 1\) and \(s(\varphi) > 0\), we have \(\Delta_s(\varphi) \leq 0\). If Eqn. (1.5) holds, the lower bound of \(\Lambda_w(\varphi)\) is \(-1 - \Delta_s(\varphi)\). Since \(w(\varphi) > 0\), we have: (i) when \(\Delta_s(\varphi) < -1\), \(\Lambda_w(\varphi) > 0\) and \(w_\varphi(\varphi) > 0\); and (ii) when \(\Delta_s(\varphi) > -1\), the sign of \(\Lambda_w(\varphi)\) is undetermined, which implies that \(w_\varphi(\varphi)\) may be either negative or positive. Why does this discussion of elasticities matter? If the premium is assumed to be actuarially fair, then \(w_\varphi(\varphi) \equiv 0\), so the sufficient condition for \(S_\varphi(\varphi, z) > 0\) reduces to \(\Delta_s(\varphi) > -1\). A puzzle therefore arises if farmers do not choose the highest coverage levels when \(\Delta_s(\varphi) > -1\). Our analysis helps explain this puzzle by exploring the role of actuarial unfairness, captured by \(w(\varphi)\) and \(w_\varphi(\varphi)\).

Hypothesis 2 (H2). Farmers choose the highest coverage level at which the sum of the elasticities of the subsidy rate and the wedge factor is larger than \(-1\), i.e., \(\Delta_s(\varphi) + \Lambda_w(\varphi) > -1\).

H2 is based on the prior belief that farmers prefer a higher subsidy transfer, i.e., \(S_\varphi(\varphi, z) > 0\). If \(\Delta_s(\varphi) + \Lambda_w(\varphi) \leq -1\), then \(S_\varphi(\varphi, z) \leq 0\), implying that a higher coverage level will not be attractive to farmers, which violates RMA's target.
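To give a feel for the magnitudes in Eqn. (1.5), the subsidy schedule in Table 1.1 can be turned into approximate elasticities. The sketch below is an illustration only: because \(s(\varphi)\) is observed at discrete coverage levels, it uses midpoint (arc) elasticities between adjacent levels as a stand-in for the derivative-based \(\Delta_s(\varphi)\), which is an assumption made here and not the dissertation's calculation. The CAT level is omitted, and all names are hypothetical.

```python
# Buy-up coverage levels and subsidy rates from Table 1.1 (CAT omitted).
coverage = [0.50, 0.55, 0.60, 0.65, 0.70, 0.75, 0.80, 0.85]
subsidy = [0.67, 0.64, 0.64, 0.59, 0.59, 0.55, 0.48, 0.38]

def arc_elasticities(phi, s):
    """Midpoint (arc) elasticity of s with respect to phi between adjacent levels.

    This discretizes Delta_s = phi * s_phi / s; it is an approximation, not the
    point elasticity defined in Eqn. (1.5).
    """
    out = []
    for k in range(len(phi) - 1):
        d_s = s[k + 1] - s[k]
        d_phi = phi[k + 1] - phi[k]
        mid_s = (s[k + 1] + s[k]) / 2.0
        mid_phi = (phi[k + 1] + phi[k]) / 2.0
        out.append((d_s / d_phi) * (mid_phi / mid_s))
    return out

delta_s = arc_elasticities(coverage, subsidy)

# Lower bound on the wedge elasticity implied by Eqn. (1.5): Lambda_w > -1 - Delta_s.
wedge_lower_bound = [-1.0 - d for d in delta_s]

for k, (d, lb) in enumerate(zip(delta_s, wedge_lower_bound)):
    print(f"{coverage[k]:.2f} -> {coverage[k + 1]:.2f}: "
          f"Delta_s = {d:.3f}, Lambda_w must exceed {lb:.3f}")
```

Wherever the approximate \(\Delta_s\) falls below \(-1\), the implied lower bound on \(\Lambda_w\) is positive, which corresponds to case (i) in the text: the wedge must rise with the coverage level for the subsidy transfer to keep increasing.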
It is noteworthy that the subsidy rate is readily obtained from USDA/RMA. However, the wedge factor is known only to farmers, assuming that farmers know their underlying production risks. As such, we may ask whether farmers are more concerned about the wedge information.

Hypothesis 3 (H3). Farmers choose the coverage level at which the wedge (including subsidies) reaches its minimum value.

If H3 holds, a crucial policy implication is that the federal government may investigate the wedge conditions empirically and adjust wedges at all coverage levels. For example, suppose the federal government's goal is still to encourage farmers to purchase higher coverage levels. Then wedges at higher coverage levels can be adjusted downward to make those levels more attractive. If farmers are aware of the wedge, as discussed above, we can further investigate the effect of the wedge on farmers' coverage level choices for cropland of different quality.

Hypothesis 4 (H4). Farmers with higher (lower) land quality pay more (fewer) premiums than they should. In other words, a county's higher (lower) quality cropland has a worse (better)-than-actuarially-fair premium rating.

In what follows, we test our hypotheses based on the estimated wedge factor, which is the ratio of the RMA premium to the estimated actuarially fair premium. The wedge values can be computed at all coverage levels. We provide empirical evidence to test H1-H4. Furthermore, to examine the effect of the wedge factor on coverage level choice, we employ the Ordered Logit Model (OLM) for farm-level coverage choice, and Ordinary Least Squares (OLS) and Weighted Least Squares (WLS) for county-level coverage choices.

Data Description

We focus on APH yield contracts in this study because yield contracts only contain yield risk and crop prices are constant.³ If revenue contracts were included, unnecessary modeling complications would be introduced.
When price risks are in a limited range, the analysis in this study can be extended to revenue contracts, although all conclusions are based on yield contracts.

³ Insurance plan code is 90; abbreviation is APH; the name is Actual Production History.

Table 1.2 Definition of Variables.
Variable                  Description
Premium
  $APR$                   Premium rates (including subsidies) from RMA
  $OPR$                   Estimated actuarially fair premium rates based on RMA actual yield data
Wedge
  $w_i$                   Farm-level wedge at each coverage level
  $\Delta w_i$            Farm-level wedge differential between higher and lower coverage levels
  $w_c$                   County-level average wedge (acreage-weighted)
  $\Delta w_c$            County-level wedge differential between higher and lower coverage levels (acreage-weighted)
Land Capability
  $LCC$                   Percentage of Class I-II in Class I-VIII
Weather Determinants
  $\bar{G}_c$             Average GDD over 1989-2007
  $\bar{S}_c$             Average SDD over 1989-2007

The targeted area is the twelve states in the Midwest and Great Plains regions (IL, IN, IA, KS, MI, MN, MO, NE, ND, OH, SD, and WI). Corn and soybean are selected for crop-specific analysis. Table 1.2 reports the definitions of the main variables, including premiums, wedge factors, land capability, and weather determinants. Table 1.3 reports summary statistics of the main variables.

Table 1.3 Summary of Main Variables.
Variable                    Obs.      Mean      Std.Dev.  Min       Max
Corn
  $APR(\varphi = 0.65)$     94,401    22.27     13.85     6.83      174.66
  $OPR(\varphi = 0.65)$     94,401    15.42     14.75     2.31      282.94
  $w_i$                     94,401    1.73      0.65      0.17      20.76
  $\Delta w_i$              94,401    0.06      0.17      -9.24     1.02
  $w_c$                     818       1.72      0.51      0.43      4.35
  $\Delta w_c$              818       0.04      0.15      -0.91     0.47
Soybean
  $APR(\varphi = 0.65)$     108,792   6.23      3.27      2.22      66.9
  $OPR(\varphi = 0.65)$     108,791   3.84      3.63      0.85      196.1
  $w_i$                     108,791   2.07      0.99      0.14      19.18
  $\Delta w_i$              108,791   0.11      0.19      -7.12     1.95
  $w_c$                     791       1.88      0.87      0.15      9.26
  $\Delta w_c$              791       0.095     0.16      -0.43     0.67
Land Capability
  $LCC$                     2,902     0.25      0.21      0.00006   0.93
Weather Determinants
  $\bar{G}_c$               1,006     1248.7    191.1     186.6     1692.2
  $\bar{S}_c$               1,006     20.48     22.9      0         126.02

Estimated Actuarially Fair Premium Rates by RMA Rules

Denote $OPR_{i,c}(\varphi)$ as the estimated premium rate at coverage level $\varphi$ for farm $i$ in county $c$; $APH_i$ as the Actual Production History, reflecting the institutional estimate of mean yield by USDA/RMA; $\text{ref yield}_c$ as the county average yield; $-E_c$ as the exponent reflecting the correlation between county and individual yields; $ULR_c(\varphi)$ as the county unloaded rate estimated from our proposed procedure, which is also the main independent variable in this study; $\text{fixed load}_c$ as the load reflecting potential disaster and production failure; and $\text{cov diff}(\varphi)$ as the coverage level differential used to adjust rates from the base 65% coverage level to other coverage levels. Then our estimated farm-level premium rate at each coverage level $\varphi$ is

$$OPR_{i,c}(\varphi) = \left(\frac{APH_i}{\text{ref yield}_c}\right)^{-E_c} \times ULR_c(\varphi) + \text{fixed load}_c \times \text{cov diff}(\varphi). \tag{1.6}$$

In Eqn. (1.6), our proposed procedure is used to estimate the county-level unloaded premium rate $ULR_c(\varphi)$ at coverage level $\varphi$, with $\varphi \in \{0.50, \ldots, 0.85\}$. All other parameters are from USDA/RMA. It is noteworthy that in RMA's methodology, the county-level unloaded premium rate $ULR_c$ is estimated only for the 65% coverage level (see Coble et al. 2010). Premium rates at other coverage levels are obtained by applying the coverage level differentials $\text{cov diff}(\varphi)$. For example, in RMA's methodology, the premium at $\varphi = 0.75$ equals $ULR(\varphi = 0.65) \times \text{cov diff}(\varphi = 0.75)$.
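As a minimal sketch of how Eqn. (1.6) combines its inputs, the function below evaluates the farm-level premium rate for a single farm and coverage level. The function name, argument names, and the numbers in the usage lines are hypothetical illustrations and are not RMA parameters.

```python
def farm_premium_rate(aph, ref_yield, exponent, unloaded_rate, fixed_load, cov_diff):
    """Evaluate Eqn. (1.6) for one farm at one coverage level.

    aph           -- farm APH yield (APH_i)
    ref_yield     -- county reference yield (ref yield_c)
    exponent      -- E_c; the yield ratio is raised to the power -E_c
    unloaded_rate -- county unloaded rate at this coverage level (ULR_c(phi))
    fixed_load    -- county fixed load (fixed load_c)
    cov_diff      -- coverage level differential (cov diff(phi))
    """
    if aph <= 0 or ref_yield <= 0:
        raise ValueError("yields must be positive")
    yield_ratio = aph / ref_yield
    return yield_ratio ** (-exponent) * unloaded_rate + fixed_load * cov_diff


# Hypothetical inputs: a farm at the county reference yield, and a farm above it.
rate_at_reference = farm_premium_rate(150.0, 150.0, 1.5, 0.05, 0.01, 1.0)
rate_above_reference = farm_premium_rate(180.0, 150.0, 1.5, 0.05, 0.01, 1.0)
print(rate_at_reference, rate_above_reference)
```

With a positive exponent $E_c$, a farm whose APH exceeds the county reference yield receives a lower rate than a farm at the reference yield, because the yield ratio enters with a negative power.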
Since our procedure can estimate the county-level premium rates at all coverage levels, the coverage level differential affects the farm-level premium rates only through the county-level fixed load $\text{fixed load}_c$.

We now describe how $ULR_c(\varphi)$ is estimated from the 2008 USDA/RMA farm-level yield data. In this yield dataset, each farm has a specific APH and a yield history of four to ten years. There are 18 yield types, including actual yield, transitional yield, the exceptional transitional yield for a new producer, and simple average transitional yield for added land. In this study, we only employ the actual yield because the other types are imputed by RMA and account for a small proportion of the whole sample (less than 20% in the western Great Plains and less than 15% in the Corn Belt). The final sample contains about 9 million yield records in 882 counties for corn and about 8.6 million records in 828 counties for soybean.

Our proposed procedure contains four parts: (i) yield detrending: all historical farm- and county-level actual yields are adjusted to the 2009 technological level; (ii) semiparametric quantile regression imputation (SQRI): missing observations over 1970-1998 are imputed based on penalized B-splines and quantile regression, because most observations in the yield dataset lie between 1999 and 2008 and no extreme climate change occurred in this 10-year interval; (iii) rejection sampling: a more representative sample of farm-level records is obtained from long-term county-level yields; and (iv) actuarially fair premium estimation: $ULR_c(\varphi)$ is imputed at all coverage levels based on the univariate penalized B-spline method. After obtaining $ULR_c(\varphi)$, we employ the 2009 USDA/RMA farm-level contract choice data to obtain farm-level premiums as in Eqn. (1.6).
The primary information in this dataset includes the farm's location (state and county), the coverage level chosen, the premium rate paid (including subsidies), insured acres, elected crop price, approved production history (APH), actual yield, and various parameters such as the fixed load and the coverage differential.

Figure 1.3 reports the coefficient of variation (i.e., standard deviation divided by mean) of actual yields for corn in each Crop Reporting District (CRD). The upper panel of Figure 1.3 shows the coefficient of variation for the original data; the lower panel shows that for the resampled data. We find that the pattern of the coefficient of variation does not change after our resampling procedure, implying no systematic bias. Figure 1.4 reports the county-level yield distributions after the resampling procedure. We find that the yield distributions in the main production area (e.g., IA, IL, IN) have similar patterns. However, the distributions become more dispersed as one moves toward the western Great Plains.⁴

⁴ After resampling, we obtain 1,000 yield records in each county. Denote $y^{det}$ as the detrended farm-level yields; then the nonparametric kernel estimate at a given point $y$ is defined as $\hat{g}(y) = \frac{1}{Nh} \sum_{i=1}^{N} K[(y_i^{det} - y)/h]$, where $N = 1{,}000$ and $K(\cdot)$ is the Gaussian kernel function, which is nonnegative and satisfies $\int K(z)\,dz = 1$, $\int zK(z)\,dz = 0$, and $\int z^2 K(z)\,dz < \infty$. The smoothing parameter $h$ is set as $\hat{h} = 0.9 \times \min[\sigma_y, IQR_y/1.34] \times N^{-0.2}$, where $\sigma_y$ is the standard deviation and $IQR_y$ is the interquartile range (i.e., the difference between the 75th and 25th quantiles) of $y^{det}$.

Figure 1.3 Coefficient of Variation from Original and Resampling Data for Corn by Crop Reporting District (CRD).

Wedge Factor

RMA's premium rates at all coverage levels can be imputed from the 2009 USDA/RMA farm-level contract choice data (see, for example, Du et al. 2017).
After obtaining the estimated actuarially fair premium rates, we can measure farm- and county-level actuarial fairness. Denote $APR_i(\varphi)$ as RMA's actual premium rate (including subsidies) at coverage level $\varphi$ for farm $i$; $OPR_i(\varphi)$ as the actuarially fair premium rate; $a_{i,c}$ as the insured acreage of farm $i$; and $s_{i,c} = a_{i,c}/\sum_i a_{i,c}$ as the acreage weight. The index $m$ corresponds to the coverage level, running from $\varphi_1 = 0.50$ to $\varphi_8 = 0.85$. Then four wedge factors can be constructed as:

$$\text{farm average: } w_i = \frac{\sum_{m=1}^{8} APR_i(\varphi_m)}{\sum_{m=1}^{8} OPR_i(\varphi_m)}; \tag{1.7a}$$

$$\text{farm differential: } \Delta w_i = \frac{\sum_{m=5}^{8} APR_i(\varphi_m)}{\sum_{m=5}^{8} OPR_i(\varphi_m)} - \frac{\sum_{m=1}^{4} APR_i(\varphi_m)}{\sum_{m=1}^{4} OPR_i(\varphi_m)}; \tag{1.7b}$$

$$\text{county average: } w_c = \frac{\sum_i \sum_{m=1}^{8} APR_i(\varphi_m) \times s_{i,c}}{\sum_i \sum_{m=1}^{8} OPR_i(\varphi_m) \times s_{i,c}}; \tag{1.7c}$$

$$\text{county differential: } \Delta w_c = \frac{\sum_i \sum_{m=5}^{8} APR_i(\varphi_m) \times s_{i,c}}{\sum_i \sum_{m=5}^{8} OPR_i(\varphi_m) \times s_{i,c}} - \frac{\sum_i \sum_{m=1}^{4} APR_i(\varphi_m) \times s_{i,c}}{\sum_i \sum_{m=1}^{4} OPR_i(\varphi_m) \times s_{i,c}}. \tag{1.7d}$$

Figure 1.4 Detrended Unit-Level Imputed Yield Densities for Corn by County and State. Note: The dashed curves show the density for each county; the yellow curve shows the density for the state (randomly drawn from all observations in the state).

Figure 1.5 Distributions of Farm-Level Wedges (before and after subsidies) for Corn and Soybean at the 65% Coverage Level. Note: The green curve shows the normal distribution. The red line shows the actuarial fairness condition (loss ratio = 1).

The average wedges (Eqn. (1.7a) and (1.7c)) capture farmers' response to systematic pricing bias; the differential wedges (Eqn. (1.7b) and (1.7d)) capture farmers' response to the benefits from higher coverage levels. The motivation for constructing both average and differential wedges is that USDA RMA encourages farmers to choose higher coverage levels, so wedges at higher coverage levels may differ from those at lower coverage levels. Figure 1.5 shows the distributions of the farm-level wedges (before and after subsidies) for corn and soybean at the 65% coverage level.
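The four wedge factors in Eqn. (1.7a)-(1.7d) can be sketched as follows. This is an illustration only: the function names and the two-farm example are hypothetical, and each farm is assumed to carry eight premium rates ordered from the 50% to the 85% coverage level.

```python
def farm_wedges(apr, opr):
    """Return (w_i, delta_w_i) from Eqn. (1.7a) and (1.7b).

    apr, opr -- eight RMA and actuarially fair premium rates, ordered 50%..85%.
    """
    if len(apr) != 8 or len(opr) != 8:
        raise ValueError("expected eight coverage levels")
    average = sum(apr) / sum(opr)
    # Upper four levels (70%..85%) minus lower four levels (50%..65%).
    differential = sum(apr[4:]) / sum(opr[4:]) - sum(apr[:4]) / sum(opr[:4])
    return average, differential


def county_wedges(farms):
    """Return (w_c, delta_w_c) from Eqn. (1.7c) and (1.7d).

    farms -- list of (apr, opr, insured_acres) tuples for one county.
    """
    total_acres = sum(acres for _, _, acres in farms)

    def weighted_ratio(lo, hi):
        # Acreage shares s_ic weight both the numerator and the denominator.
        num = sum(sum(apr[lo:hi]) * acres / total_acres for apr, _, acres in farms)
        den = sum(sum(opr[lo:hi]) * acres / total_acres for _, opr, acres in farms)
        return num / den

    return weighted_ratio(0, 8), weighted_ratio(4, 8) - weighted_ratio(0, 4)


# Hypothetical county with two equally sized farms.
farm_a = ([1.5] * 4 + [2.0] * 4, [1.0] * 8, 100.0)
farm_b = ([1.0] * 8, [1.0] * 8, 100.0)
w_a, dw_a = farm_wedges(farm_a[0], farm_a[1])
w_c, dw_c = county_wedges([farm_a, farm_b])
print(w_a, dw_a, w_c, dw_c)
```

In this example farm A is overcharged relative to its fair rates, more so at higher coverage levels, while farm B is priced fairly; the county-level wedges lie between the two because each farm receives half of the acreage weight.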
Comparing Panels A and B of Figure 1.5 shows the effect of subsidies on the farm-level wedges for corn. Before subsidy, most farm-level wedges are larger than the actuarial target (i.e., 1), which indicates that RMA's premiums for most farms are higher than the actuarially fair premiums. After subsidy, most farm-level wedges are less than 1, which indicates that the subsidy policy may cover actuarial unfairness.⁵ A similar pattern applies to soybeans (Panels C and D).

⁵ For example, the mean wedge before subsidy for corn is 1.732, which changes to 0.71 after subsidy, consistent with the subsidy rate at the 65% coverage level (0.71 = 1.73 × (1 − 0.59)).

A question arises because RMA's premiums (including subsidy) appear to be systematically higher than our estimated premiums. To investigate whether and why this systematic bias occurs, we compare the inverse of the wedge with the loss ratio in Figure 1.6. In our construction, the inverse of the wedge is the ratio of the expected indemnity (the estimated actuarially fair premium) to RMA's premium, which corresponds to the loss ratio used by RMA. Therefore, the inverse of the wedge and the loss ratio should be consistent. Panels A and B of Figure 1.6 report the inverse of the wedge and the loss ratio for corn, respectively. We find that: first, the actuarially fair premiums (i.e., expected payouts) are higher than actual indemnities in the Corn Belt region, implying that RMA's premiums might be underestimated in the main production area; second, the inverses of the wedge are lower than the loss ratios outside the main production area, implying that RMA's premiums might be overestimated there. Patterns for soybean are similar (Panels C and D of Figure 1.6).

The findings above indicate that actuarial unfairness might be correlated with land quality. To examine whether the wedge is related to land quality, we set six quantiles of land quality (10th, 25th, 50th, 75th, 90th, and 95th) based on the farm's APH in each Crop Reporting District (CRD).
It is noteworthy that we are interested in whether farmers with higher-quality cropland receive a worse-than-actuarially-fair rating. Therefore, the 5th quantile of land quality is not included. Finally, we may ask why the quantiles are defined within each CRD. The reasons are: first, the yields corresponding to the quantiles are not too concentrated, so we avoid assigning many identical observations to one pool; second, yield risk within one CRD is similar, so the yields corresponding to the quantiles are not too dispersed. Figure 1.7 reports the wedge means for each land-quality quantile. We find: (i) the wedge (before subsidy) decreases with the coverage level for lower land quality, whereas the opposite holds for higher land quality; (ii) the wedge (after subsidy) increases with the coverage level for all land qualities; (iii) wedges (after subsidy) for lower land quality (10th and 25th quantiles) are less than 1 (the actuarially fair target), but some wedges for higher land quality exceed 1; and (iv) for higher land quality (50th-95th quantiles), wedges (after subsidy) increase with the coverage level and cross 1 at the 80% coverage level.

Figure 1.6 Comparison between Expected Loss Ratio (1/Wedge) and Average Loss Ratio. Note: As constructed in this study, 1/Wedge represents the ratio of expected indemnity to actual premium, i.e., the expected loss ratio.

Land Capability

Soil quality data are from the National Resources Inventory (NRI). As in USDA (2015), there are eight Land Capability Classes (LCC). Specifically, Class I soils have slight limitations that restrict their use, whereas Class VIII soils and miscellaneous areas have limitations that preclude their use for commercial plant production and limit their use to recreation, wildlife, water supply, or aesthetic purposes. As in the literature, we measure land capability by the percentage of good land in total land. We employ Class I-II as in Goodwin et al. (2004), although Du et al.
(2017) choose Class I-IV.

We now examine whether actuarial unfairness exists at a broader regional level, since land capability is heterogeneous across counties. In Figure 1.8, the x-axis shows the ranking of land capability (01 = lowest; 12 = highest) and the state name. The distribution for each state shows county-level wedge heterogeneity at the 65% coverage level. For corn, we find that wedges (before subsidies) are lower in states with lower land capability (e.g., WI, SD, MN), whereas wedges (before subsidies) are higher in the main production area (e.g., IA, IL, IN). This pattern implies that the wedge (before subsidies) might be positively correlated with land quality. It is noteworthy that: (i) for corn, some states with lower land capability have high wedges, e.g., MO, MI, and NE; and (ii) for soybean, the relationship between wedges and land capability is not entirely clear. Therefore, we apply empirical methods for further analysis.

Weather Determinants

As in the literature (e.g., Schlenker and Roberts 2009; Xu et al. 2013), we construct two variables, Growing Degree Days (GDD) and Stress Degree Days (SDD), to represent beneficial heat and heat stress during the growing season (April to September):

$$GDD_{c,t} = \sum_{d \in \Omega_t} \left[ 0.5 \left( \min\left(\max\left(T^{max}_{c,d,t}, T^{l}\right), T^{h}\right) + \min\left(\max\left(T^{min}_{c,d,t}, T^{l}\right), T^{h}\right) \right) - T^{l} \right] \tag{1.8a}$$

$$SDD_{c,t} = \sum_{d \in \Omega_t} \left[ 0.5 \left( \max\left(T^{max}_{c,d,t}, T^{k}\right) + \max\left(T^{min}_{c,d,t}, T^{k}\right) \right) - T^{k} \right] \tag{1.8b}$$

where $c$ is county; $d$ is day; $\Omega_t$ is the set of growing season days in year $t$; and $T^{max}_{c,d,t}$ and $T^{min}_{c,d,t}$ are the daily maximum and minimum temperatures. The thresholds are $T^{l} = 10^\circ C$, $T^{h} = 30^\circ C$, and $T^{k} = 32.2^\circ C$. We use climatological normals over roughly 20 years, $\{1989, \ldots, 2007\}$, to control for weather effects on coverage level choice:

$$\bar{G}_c = \frac{1}{19} \sum_{j=1989}^{2007} GDD_{c,j}; \tag{1.9a}$$

$$\bar{S}_c = \frac{1}{19} \sum_{j=1989}^{2007} SDD_{c,j}. \tag{1.9b}$$

Figure 1.7 Comparisons of Wedges (before and after subsidies) among Different Land Qualities for Corn. Note: Land quality is measured by farm-level APH within a Crop Reporting District.
The tolerance is 20 bushels for each quantile. For instance, if the 25th quantile of APH is 112, then observations with APH within the interval (102, 122) are used for the 25th quantile.

Figure 1.8 Wedge Comparison at 65% Coverage Level. Note: The distribution in each box is based on farm-level observations in each state. The central mark is the median; the edges are the 25th (Q1) and 75th (Q3) quantiles. The upper and lower whiskers include data points within ±1.5×IQR, where IQR = Q3 − Q1.

Empirical Results

Evidence on Hypotheses

We now examine hypotheses H1-H4. Recall that H1 states that farmers' coverage choices are clustered at a single coverage level or at two adjacent levels. From Table 1.4, we find that farmers choosing the 65% and 70% coverage levels account for about 55% of the whole sample, which supports H1. This conclusion holds for different crops and unit types. Furthermore, the interior solution indicates that the gradient of the self-paid premium function ($\chi_\varphi$) lies between its values at the two end coverage levels (0.50 and 0.85).

Table 1.4 Evidence for H1.
Crop      Plan 90   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.
                    at 50%   at 55%   at 60%   at 65%   at 70%   at 75%   at 80%   at 85%
Corn      All       0.12     0.02     0.05     0.30     0.25     0.18     0.06     0.03
          OpU       0.10     0.01     0.05     0.29     0.27     0.18     0.06     0.03
          BaU       0.14     0.02     0.05     0.32     0.21     0.18     0.06     0.03
Soybean   All       0.12     0.01     0.04     0.30     0.25     0.18     0.06     0.03
          OpU       0.10     0.01     0.05     0.28     0.28     0.19     0.06     0.03
          BaU       0.14     0.02     0.05     0.33     0.21     0.17     0.06     0.02
Note: The sample only includes OpU and BaU.

Recall that H2 states that farmers choose the highest coverage level at which the sum of the elasticities of the subsidy rate and the wedge factor is larger than −1.
To test this hypothesis, we first define two common elasticities:

$$\text{Conservative: } \Delta_s^1(\varphi) = \frac{s(\varphi_+) - s(\varphi_-)}{\varphi_+ - \varphi_-} \times \frac{\varphi_+}{s(\varphi_+)}, \quad \Lambda_w^1(\varphi) = \frac{w(\varphi_+) - w(\varphi_-)}{\varphi_+ - \varphi_-} \times \frac{\varphi_+}{w(\varphi_+)}; \tag{1.10a}$$

$$\text{Arc elasticity: } \Delta_s^2(\varphi) = \frac{s(\varphi_+) - s(\varphi_-)}{\varphi_+ - \varphi_-} \times \frac{\varphi_+ + \varphi_-}{s(\varphi_+) + s(\varphi_-)}, \quad \Lambda_w^2(\varphi) = \frac{w(\varphi_+) - w(\varphi_-)}{\varphi_+ - \varphi_-} \times \frac{\varphi_+ + \varphi_-}{w(\varphi_+) + w(\varphi_-)}, \tag{1.10b}$$

where $\varphi_+$ and $\varphi_-$ are adjoining coverage levels with $\varphi_+ > \varphi_-$ and $\varphi_+, \varphi_- \in \{0.50, \ldots, 0.85\}$. Both elasticities are employed to eliminate potential mismeasurement due to the choice of start and end points. From Table 1.5, 75% should be farmers' preferred coverage level because it is the highest level at which the sum of elasticities is larger than −1. Since the share of observations is highest at the 65% coverage level in Table 1.4, H2 is not supported, which implies that we cannot conclude that farmers are primarily concerned about the subsidy transfer.

Table 1.5 also reports evidence for H3, which states that farmers choose the coverage level at which the wedge (including subsidies) reaches its minimum value. Wedges for corn and soybeans reach their minimum at the 65% coverage level, implying that farmers are concerned about wedges (including subsidies). Therefore, policymakers might consider reducing actuarial unfairness rather than maintaining large subsidies. For example, if the federal government's target is to encourage farmers to choose higher coverage levels, wedges at higher levels can be adjusted downward. It is noteworthy that the adjustment of wedges cannot be uniform nationwide. If the actuarially fair premiums are held constant, a downward adjustment of wedges means that RMA's premiums should be reduced, which might increase the loss ratio.

Recall that H4 states that farmers with higher (lower) land quality cropland pay more (fewer) premiums than they should. However, as in Figure 1.9, the mean values of RMA's premiums (before subsidy) are higher than those of the actuarially fair premiums for all land qualities.
Moreover, the difference between RMA's premiums and the actuarially fair premiums becomes larger as the coverage level increases. Therefore, H4 is not robustly supported if we compare the mean values of RMA's premiums and the actuarially fair premiums.

Table 1.5 Evidence for H2 and H3.
Coverage Level $\varphi$                       CAT    50%    55%    60%    65%    70%    75%    80%    85%
Subsidy Rate $s(\varphi)$                      1      0.67   0.64   0.64   0.59   0.59   0.55   0.48   0.38
  $\Delta_s^1(\varphi)$                                      -0.52  0      -1.32  0      -1.09  -2.33  -4.47
  $\Delta_s^2(\varphi)$                                      -0.48  0      -1.02  0      -1.02  -2.11  -3.84
Corn Wedge $w(\varphi)$                        NA     1.73   1.72   1.69   1.66   1.71   1.75   1.78   1.80
  $\Lambda_w^1(\varphi)$                                     -0.12  -0.18  -0.24  0.43   0.37   0.26   0.18
  $\Lambda_w^2(\varphi)$                                     -0.12  -0.18  -0.23  0.42   0.36   0.25   0.17
Soybeans Wedge $w(\varphi)$                    NA     1.93   1.89   1.85   1.82   1.87   1.93   1.98   2.02
  $\Lambda_w^1(\varphi)$                                     -0.27  -0.24  -0.26  0.43   0.44   0.40   0.35
  $\Lambda_w^2(\varphi)$                                     -0.25  -0.23  -0.25  0.42   0.43   0.39   0.34
Corn
  $\Delta_s^1(\varphi) + \Lambda_w^1(\varphi)$               -0.64  -0.18  -1.56  0.43   -0.72  -2.07  -4.29
  $\Delta_s^2(\varphi) + \Lambda_w^2(\varphi)$               -0.60  -0.18  -1.25  0.42   -0.66  -1.86  -3.67
Soybeans
  $\Delta_s^1(\varphi) + \Lambda_w^1(\varphi)$               -0.79  -0.24  -1.58  0.43   -0.65  -1.93  -4.12
  $\Delta_s^2(\varphi) + \Lambda_w^2(\varphi)$               -0.73  -0.23  -1.27  0.42   -0.59  -1.72  -3.50

Table 1.6 Coverage Choices by Land Qualities.
Land Quality   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.   % obs.
(quantile)     at 50%   at 55%   at 60%   at 65%   at 70%   at 75%   at 80%   at 85%
Corn
  10th         12.8     1.9      6.5      36.9     28.5     11.3     1.8      0.2
  25th         11.7     1.8      6.2      32.7     28.0     16.3     2.8      0.6
  50th         11.7     1.6      4.8      27.9     25.6     20.5     5.5      2.5
  75th         10.4     1.4      3.6      26.0     21.1     22.8     9.4      5.2
  90th         10.6     1.2      4.0      26.2     20.9     21.6     9.5      6.1
  95th         11.4     1.4      3.7      25.0     20.8     20.9     10.5     6.3
Soybeans
  10th         7.8      1.0      3.9      46.8     31.9     7.8      0.6      0.1
  25th         11.3     1.3      6.1      39.7     31.3     8.3      1.8      0.2
  50th         13.1     1.6      5.4      33.5     28.5     14.8     2.5      0.5
  75th         14.7     1.5      4.4      28.2     24.7     19.6     5.2      1.8
  90th         13.3     1.5      4.3      25.9     20.6     22.3     8.3      3.8
  95th         10.2     1.6      3.6      24.3     19.9     23.6     10.6     6.2

Figure 1.9 RMA Premiums (including subsidies), Farm's Self-Paid Premium and Our Estimated Actuarially Fair Premiums for Corn. Note: Only mean values are reported here.
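The selection rules behind H2 and H3 can be made concrete with the corn figures reported in Table 1.5. The sketch below is illustrative only: the function names are hypothetical, and it assumes that the seven reported elasticity sums correspond to the upper coverage level of each adjoining pair (55% through 85%).

```python
LEVELS = [50, 55, 60, 65, 70, 75, 80, 85]

# Corn values as reported in Table 1.5.
CORN_WEDGE = [1.73, 1.72, 1.69, 1.66, 1.71, 1.75, 1.78, 1.80]
# Conservative elasticity sums; assumed here to align with levels 55%..85%.
CORN_ELASTICITY_SUM = [-0.64, -0.18, -1.56, 0.43, -0.72, -2.07, -4.29]


def h2_level(levels, sums, threshold=-1.0):
    """Highest coverage level whose elasticity sum exceeds the threshold (H2)."""
    eligible = [lvl for lvl, total in zip(levels[1:], sums) if total > threshold]
    return max(eligible) if eligible else None


def h3_level(levels, wedges):
    """Coverage level at which the wedge reaches its minimum (H3)."""
    return min(zip(wedges, levels))[1]


predicted_by_h2 = h2_level(LEVELS, CORN_ELASTICITY_SUM)
predicted_by_h3 = h3_level(LEVELS, CORN_WEDGE)
print(predicted_by_h2, predicted_by_h3)
```

Under these assumptions the H2 rule points to the 75% coverage level and the H3 rule points to the 65% level, which is the level with the largest share of observations in Table 1.4.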
Table 1.6 reports farmers' choices by land quality. We find that: (i) the 65% coverage level is the most popular level for all land qualities; and (ii) the share of higher coverage level choices increases with land quality. For example, the share choosing the 75% coverage level is 11.3% for the 10th land-quality quantile, but increases to 20.9% for the 95th quantile. Results in Table 1.6 imply that farmers with better land quality are inclined to choose higher coverage levels.

Regression Results

We first employ the ordered logit model (OLM) to analyze the farm-level coverage choice. The dependent variable is each farm's actual coverage choice; the independent variables of most interest are the average wedge across all coverage levels (i.e., $w_i$) and the wedge differential (i.e., $\Delta w_i$). The model specification is as follows:

$$Prob(\varphi_i = \varphi_1 \mid \mathbf{X}_i, \mathbf{b}, c) = 1 - \frac{\exp(\mathbf{X}_i \mathbf{b} - c_1)}{1 + \exp(\mathbf{X}_i \mathbf{b} - c_1)}; \tag{1.10a}$$

$$Prob(\varphi_i = \varphi_j \mid \mathbf{X}_i, \mathbf{b}, c) = \frac{\exp(\mathbf{X}_i \mathbf{b} - c_{j-1})}{1 + \exp(\mathbf{X}_i \mathbf{b} - c_{j-1})} - \frac{\exp(\mathbf{X}_i \mathbf{b} - c_j)}{1 + \exp(\mathbf{X}_i \mathbf{b} - c_j)}, \quad j = 2, 3, \ldots, 7; \tag{1.10b}$$

$$Prob(\varphi_i = \varphi_8 \mid \mathbf{X}_i, \mathbf{b}, c) = \frac{\exp(\mathbf{X}_i \mathbf{b} - c_7)}{1 + \exp(\mathbf{X}_i \mathbf{b} - c_7)}, \tag{1.10c}$$

where $i \in \{1, 2, \ldots, N\}$ indexes the farms in the sample; $\varphi \in \{0.50, \ldots, 0.85\}$ with $\varphi_1 = 0.50$, $\varphi_2 = 0.55$, up to $\varphi_8 = 0.85$; $\mathbf{X}_i$ is the vector of exogenous variables $\{w_i, \Delta w_i, LCC, \bar{G}_c, \bar{S}_c\}$; $\mathbf{b}$ is the coefficient vector; and $c_j$, $j = 1, \ldots, 7$, are the cut points of the distribution. To investigate the effect of land quality on farmers' coverage level choice, we add an indicator variable $index\_CRD$ showing the quantile of land quality within a Crop Reporting District. Specifically, $index\_CRD = 1$ represents the 10th quantile and $index\_CRD = 6$ represents the 95th quantile. For robustness checks, we also include a county's longitude and latitude to control for non-climate features such as technology adoption and infrastructure availability.
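The three probability expressions of the ordered logit model above can be evaluated with a few lines of code. The sketch below is a minimal illustration with a hypothetical function name and made-up cut points; it computes the eight choice probabilities for a given linear index $\mathbf{X}_i \mathbf{b}$ and does not estimate the model.

```python
import math


def logistic(x):
    """exp(x) / (1 + exp(x)), written in the equivalent form 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))


def ordered_logit_probs(index, cuts):
    """Choice probabilities for len(cuts) + 1 ordered coverage levels.

    index -- the linear index X_i b for one farm
    cuts  -- strictly increasing cut points c_1 < ... < c_7
    """
    if any(b <= a for a, b in zip(cuts, cuts[1:])):
        raise ValueError("cut points must be strictly increasing")
    # exceed[j] corresponds to exp(X_i b - c_{j+1}) / (1 + exp(X_i b - c_{j+1})).
    exceed = [logistic(index - c) for c in cuts]
    probs = [1.0 - exceed[0]]                                         # lowest level
    probs += [exceed[j - 1] - exceed[j] for j in range(1, len(cuts))]  # interior levels
    probs.append(exceed[-1])                                          # highest level
    return probs


# Hypothetical cut points and two values of the linear index.
CUTS = [-2.0, -1.0, -0.5, 0.0, 0.5, 1.0, 2.0]
low_index = ordered_logit_probs(0.0, CUTS)
high_index = ordered_logit_probs(1.0, CUTS)
print([round(p, 3) for p in low_index])
```

The probabilities sum to one by construction, and a larger index shifts probability mass toward the higher coverage levels, which is why a negative coefficient on the wedge implies a lower likelihood of choosing high coverage.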
Next, using weights based on farm-level insured acres, we investigate how wedges (i.e., $w_c$ and $\Delta w_c$) affect coverage choices from a county-level perspective. We employ Ordinary Least Squares (OLS) for the primary estimation and Weighted Least Squares (WLS) for robustness checks. In the WLS estimation, suppose the model specification is $y = \mathbf{X}^T \boldsymbol{\beta} + \varepsilon$ and $\varepsilon^2 = \mathbf{X}^T \boldsymbol{\gamma} + u$. We employ the following procedure: (i) regress $y$ on $\mathbf{X}$ by OLS to obtain the residuals $\hat{\varepsilon}$; (ii) regress $\hat{\varepsilon}^2$ on $\mathbf{X}$ to obtain $\hat{\boldsymbol{\gamma}}$; and (iii) use $1/\sqrt{\mathbf{X}^T \hat{\boldsymbol{\gamma}}}$ as the weight to correct for heteroskedasticity in $\varepsilon$.

Table 1.7 reports the regression results for the farm-level coverage choices. The responses of farmers' coverage choices to the wedge average and the wedge differential are significantly negative for corn, although they are unclear for soybean. However, the land quality variables (land quantile and land capability) have significantly positive effects on farmers' choices, implying that farmers with better land quality are inclined to choose higher coverage levels.

Table 1.8 reports the OLS estimation results, and Table 1.9 reports the WLS results. As in the farm-level analysis, the effect of the wedge average on county-level coverage choice is significantly negative for corn but unclear for soybean. As in Table 1.8, for corn, the coefficient on the wedge average is around -1.3, which is also the marginal effect because the model is estimated by OLS. The corresponding semi-elasticity of coverage choice is -2.2, implying that the mean coverage level would rise from 0.67 to about 0.69 for a one-unit decrease in the log of the wedge average (equivalently, by about 0.2 percentage points for a 10% decrease). Furthermore, the effect of the wedge differential is unclear for both corn and soybean.

Conclusion

Over the past twenty years, participation rates in the U.S. federal crop insurance program have improved nationwide; however, significant regional disparities still exist.
This study contributes to the literature by proposing a novel resampling procedure, estimating the actuarially fair premiums at all coverage levels, and then examining the effect of actuarial unfairness on farmers' insurance demand. We find that: first, farmers are more concerned about actuarial unfairness than about the subsidy transfer, implying that mitigating mispricing can supplement the subsidy; second, actuarial unfairness is positively correlated with land quality, indicating that farmers with higher land quality cropland pay more premiums than they should; and third, regression results show that such mispricing deters farmers' insurance uptake, a finding that is robust at both the farm and county levels.

Our analysis might help the sustainability of the crop insurance program because an adverse selection problem might arise if federal subsidies were reduced in the future. When farm-level mispricing exists and subsidy reduction starts, farmers with higher land quality cropland will opt out of the program. If this is the case, the entire insurance pool becomes riskier, and indemnities might exceed premiums, which will be detrimental to the development of the program. In addition, due to the increased indemnities, RMA might adjust premium ratings upward and crowd more farmers with good land quality out of the insurance pool, further exacerbating the problem.

The effect of mispricing on coverage level choice is significant for corn but not for soybean. According to our estimation, if the wedge (= RMA's premium / the actuarially fair premium) decreases by 10%, county-level coverage choices for corn will increase by 0.2 percentage points. Therefore, the mitigation of mispricing provides an opportunity to adjust the current subsidy structure.

Table 1.7 Estimation Results for Ordered Logit Model.
Dependent Variable: Farm-Level Coverage Choice. Estimation Method: Ordered Logit Model.
                      Corn                                        Soybeans
Wedge Average         -0.082***  -0.090***  -0.074***  -0.081***  0.020***   0.020***   0.008      0.009
                      (-7.42)    (-8.01)    (-6.66)    (-7.15)    (3.36)     (3.38)     (1.26)     (1.49)
Wedge Differential               -0.158***             -0.133***             -0.007                -0.125***
                                 (-3.49)               (-2.93)               (-0.22)               (-3.76)
Land Quantile         0.007      0.010**    0.009**    0.012***   0.039***   0.040***   0.036***   0.039***
                      (1.63)     (2.29)     (2.17)     (2.69)     (9.22)     (9.13)     (8.47)     (9.07)
Land Capability       1.349***   1.367***   1.353***   1.368***   1.887***   1.888***   1.879***   1.883***
                      (36.93)    (36.98)    (36.95)    (36.94)    (54.50)    (54.42)    (53.75)    (53.81)
Control Variable
Growing Degree Days   -0.000***  -0.000***  0.000      0.000      -0.001***  -0.001***  -0.000***  -0.000**
                      (-3.02)    (-2.76)    (0.30)     (0.37)     (-16.74)   (-16.68)   (-2.62)    (-2.52)
Stress Degree Days    -0.014***  -0.014***  -0.011***  -0.011***  -0.008***  -0.008***  -0.005***  -0.005***
                      (-24.52)   (-24.75)   (-15.47)   (-15.69)   (-14.46)   (-14.13)   (-8.38)    (-7.85)
Longitude                                   0.025***   0.025***                         0.014**    0.011**
                                            (4.71)     (4.72)                           (2.48)     (1.97)
Latitude                                    0.081***   0.079***                         0.176***   0.178***
                                            (8.61)     (8.38)                           (23.81)    (23.97)
State FE              Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Obs.                  81,088     81,088     81,088     81,088     104,209    104,209    104,209    104,209
Pseudo R2             0.042      0.042      0.042      0.042      0.046      0.046      0.048      0.048
Note: The t-statistics are in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 1.8 The OLS Estimation Results for County-Level Choice.
Dependent Variable: Acreage-Weighted Coverage Choice. Estimation Method: OLS.
                      Corn                                        Soybeans
Wedge Average         -1.288***  -1.314***  -1.222***  -1.244***  -0.406*    -0.362     -0.409*    -0.357
                      (-3.79)    (-3.83)    (-3.66)    (-3.69)    (-1.81)    (-1.58)    (-1.84)    (-1.58)
Wedge Differential               -0.740                -0.614                -0.851                -1.027
                                 (-0.63)               (-0.52)               (-0.72)               (-0.86)
Land Capability       5.053***   5.111***   4.918***   4.967***   4.022***   4.030***   3.949***   3.955***
                      (4.77)     (4.80)     (4.68)     (4.69)     (4.58)     (4.58)     (4.51)     (4.51)
Growing Degree Days   -0.001     -0.001     -0.001     -0.001     -0.001     -0.001     -0.000     -0.000
                      (-0.72)    (-0.67)    (-0.46)    (-0.43)    (-0.58)    (-0.56)    (-0.09)    (-0.05)
Stress Degree Days    -0.031*    -0.032*    -0.041**   -0.041**   0.003      0.002      -0.001     -0.002
                      (-1.71)    (-1.74)    (-2.09)    (-2.10)    (0.16)     (0.14)     (-0.08)    (-0.11)
Longitude                                   -0.503*    -0.493*                          -0.377     -0.392
                                            (-1.94)    (-1.92)                          (-1.52)    (-1.57)
Latitude                                    0.040      0.035                            0.456      0.474
                                            (0.09)     (0.08)                           (1.04)     (1.09)
State FE              Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
CRD FE                Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Obs.                  776        776        776        776        751        751        751        751
Pseudo R2             0.58       0.58       0.58       0.58       0.58       0.58       0.58       0.58
Note: The t-statistics are in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

Table 1.9 The WLS Estimation Results for County-Level Choice.
Dependent Variable: Acreage-Weighted Coverage Choice. Estimation Method: WLS.
                      Corn                                        Soybeans
Wedge Average         -1.305***  -1.305***  -1.253***  -1.25***   -0.257     -0.218     -0.241     -0.193
                      (-3.95)    (-3.94)    (-3.76)    (-3.69)    (-1.09)    (-0.92)    (-1.02)    (-0.81)
Wedge Differential               0.001                 0.119                 -1.167                -1.349
                                 (-0.00)               (0.11)                (-1.06)               (-1.22)
Land Capability       4.453***   4.452***   4.375***   4.367***   3.636***   3.614***   3.554***   3.526***
                      (5.14)     (5.12)     (4.99)     (4.96)     (4.13)     (4.11)     (3.96)     (3.93)
Growing Degree Days   -0.001     -0.001     -0.001     -0.001     0.00001    0.0001     0.003      0.001
                      (-0.61)    (-0.60)    (-0.37)    (-0.39)    (0.01)     (-0.08)    (0.23)     (0.35)
Stress Degree Days    -0.025     -0.025     -0.031     -0.03**    0.011      0.01       0.005      0.004
                      (-1.28)    (-1.28)    (-1.49)    (-1.48)    (0.51)     (0.48)     (0.23)     (0.17)
Longitude                                   -0.317     -0.32                            -0.31      -0.34
                                            (-1.29)    (-1.3)                           (-1.2)     (-1.30)
Latitude                                    0.011      0.01                             0.12       0.15
                                            (0.03)     (0.03)                           (0.32)     (0.39)
State FE              Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
CRD FE                Yes        Yes        Yes        Yes        Yes        Yes        Yes        Yes
Obs.                  770        770        770        770        733        733        733        733
Pseudo R2             0.52       0.52       0.52       0.52       0.50       0.50       0.50       0.50
Note: The t-statistics are in parentheses. *** p<0.01, ** p<0.05, * p<0.1.

REFERENCES

Babcock, B.A. and A.M. Blackmer. 1992. The value of reducing temporal input nonuniformities. Journal of Agricultural and Resource Economics: 335-347.
Babcock, B.A., E.K. Choi and E. Feinerman. 1993. Risk and probability premiums for CARA utility functions. Journal of Agricultural and Resource Economics: 17-24.
Babcock, B.A. and D.A. Hennessy. 1996. Input demand under yield and revenue insurance. Amer. J. of Agr. Econ. 78(2): 416-427.
Babcock, B.A., C.E. Hart and D.J. Hayes. 2004. Actuarial fairness of crop insurance rates with constant rate relativities. Amer. J. of Agr. Econ. 86(3): 563-575.
Borges, R.B. and W.N. Thurman. 1994. Marketing quotas and random yields: marginal effects of inframarginal subsidies on peanut supply. Amer. J. of Agr. Econ. (4): 809-817.
Calvin, L. 1992. Participation in the US federal crop insurance program (No. 1800).
US Department of Agriculture, Economic Research Service. Chen, S. and L.Y. Cindy. 2016. Parameter estimation through semiparametric quantile regression imputation. Electronic Journal of Statistics, 10(2): 3621-3647. Chen, Z., Dall'Erba, S. and Sherrick, B.J., 2020. Premium misrating in federal crop insurance programs: scale, geography, and fiscal impacts. Agricultural Finance Review, 80(5): 693- 713. Chite, R. 1988. Federal crop insurance: background and current issues. In CRS report for Congress (USA). Congressional Research Service. Claassen, R. and R.E Just, 2011. Heterogeneity and distributional form of farm‐level yields. Amer. J. of Agr. Econ. 93(1): 144-160. Coble, K.H., T.O. Knight, R.D. Pope and J.R. Williams, 1996. Modeling farm‐level crop insurance demand with panel data. Amer. J. of Agr. Econ. 78(2): 439-447. Coble, K.H., T.O. Knight, R.D. Pope and J.R. Williams. 1997. An expected‐indemnity approach to the measurement of moral hazard in crop insurance. Amer. J. of Agr. Econ. 79(1): 216- 226. Coble, K.H., T.O. Knight, B.K. Goodwin, M.F. Miller, R.M. Rejesus and G. Duffield. 2010. A comprehensive review of the rma aph and combo rating methodology: Final report. prepared by sumaria systems for the risk Management agency. Deng, X., B.J. Barnett and D.V. Vedenov. 2007. Is there a viable market for area-based crop insurance?. Amer. J. of Agr. Econ. 89(2): 508-519. Du, X., D.A. Hennessy and H. Feng. 2014. A natural resource theory of US crop insurance 38 contract choice. Amer. J. of Agr. Econ. 96(1): 232-252. Du, X., H. Feng and D.A. Hennessy. 2017. Rationality of choices in subsidized crop insurance markets. Amer. J. Agr. Econ. 99(3): 732-756. Feng, H., X. Du and D.A. Hennessy. 2020. Depressed demand for crop insurance contracts, and a rationale based on third generation Prospect Theory. Agricultural Economics, 51(1): 59-73. Firpo, S., A.F. Galvao, M. Kobus, T. Parker and P. Rosa-Dias. 2020. Loss aversion and the welfare ranking of policy interventions. 
arXiv preprint arXiv:2004.08468. Gardner, B.L. and R.A. Kramer. 1986. Experience with crop insurance programs in the United States. Glickman, D., 2000. Testimony of Dan Glickman before the House Committee on Agriculture. Gollier, C. and J.W. Pratt. 1996. Risk vulnerability and the tempering effect of background risk. Econometrica: Journal of the Econometric Society: 1109-1123. Goodwin, B.K., 1993. An empirical analysis of the demand for multiple peril crop insurance. Amer. J. Agr. Econ. 75(2): 425-434. Goodwin, B.K. 1994. Premium rate determination in the federal crop insurance program: what do averages have to say about risk?. Journal of Agricultural and Resource Economics: 382-395. Goodwin, B.K. and A.P Ker, 1998. Nonparametric estimation of crop yield distributions: implications for rating group‐risk crop insurance contracts. Amer. J. Agr. Econ. 80(1): 139-153. Goodwin, B.K. 2001. Problems with market insurance in agriculture. Amer. J. of Agr. Econ. (3): 643-649. Hojjati, B. and N.E. Bockstaal. 1988. Modeling the demand for crop insurance. International Food Policy Research Institute. Hirshleifer, J., H. Jack and J.G. Riley, 1992. The analytics of uncertainty and information. Cambridge University Press. Jensen, N.D., A.G. Mude and C.B. Barrett, 2018. How basis risk and spatiotemporal adverse selection influence demand for index insurance: Evidence from northern Kenya. Food Policy, 74: 172-198. Joe, H. and J.J. Xu. 1996. The estimation method of inference functions for margins for multivariate models. Just, R. E., and Q. Weninger. 1999. "Are Crop Yields Normally Distributed?" Amer. J. Agr. 39 Econ. 81: 287-304. Ker, A.P. and B.K., Goodwin. 2000. Nonparametric estimation of crop insurance rates revisited. Amer. J. Agr. Econ. 82(2): 463-478. Ker, A.P. and Coble, K., 2003. Modeling conditional yield densities. Amer. J. Agr. Econ. 85(2): 291-304. Knight, T.O. and K.H., Coble. 1999. 
Actuarial effects of unit structure in the US actual production history crop insurance program. Journal of Agricultural and Applied Economics, 31(3): 519-535. Koenker, R. and G. Bassett Jr. 1978. Regression quantiles. Econometrica: journal of the Econometric Society: 33-50. Kousky, C. 2017. Disasters as learning experiences or disasters as policy opportunities? Examining flood insurance purchases after hurricanes. Risk analysis, 37(3): 517-530. LaFrance, J. T., Shimshack, J. P., and Wu, S. 2002. The Environmental Impacts of Subsidized Crop Insurance: Crop Insurance & the Extensive Margin. Miranda, M.J.1991. Area‐yield crop insurance reconsidered. Amer. J. Agr. Econ.73(2): 233- 242. Miranda, M. J., and J. W. Glauber. 1997. "Systemic Risk, Reinsurance, and the Failure of Crop Insurance Markets." Amer. J. Agr. Econ. 79:206-215. Nelson, C. H., & Loehman, E. T. (1987). Further toward a theory of agricultural insurance. Amer. J. Agr. Econ. 69(3): 523-531. Nelson, C.1990. "The Influence of Distribution Assumptions of the Calculation of Crop Insurance Premia." N. Cent. J. Agr. Econ. 12:71-78. Nelson, C., and P. Preckel.1989. "The Conditional Beta Distribution as a Stochastic Production Function." Amer. J. Agr. Econ. 71(2):370-378. Norwood, B., M.C. Roberts and J.L. Lusk. 2004. Ranking crop yield models using out‐of‐sample likelihood functions. Amer. J. Agr. Econ. 86(4): 1032-1043. O'Donoghue, E. 2014. The effects of premium subsidies on demand for crop insurance. USDA- ERS economic research report, (169). Price, M.J., C.L. Yu, D.A. Hennessy and X. Du. 2019. Are actuarial crop insurance rates fair?: an analysis using a penalized bivariate B‐spline method. Journal of the Royal Statistical Society: Series C (Applied Statistics), 68(5): 1207-1232. Quiggin, J. C., G. Karagiannis and J. Stanton. 1993. Crop insurance and crop production: an empirical study of moral hazard and adverse selection. Australian Journal of Agricultural 40 Economics, 37(429-2016-29192), 95-113. 
Ramirez, O. 1997. "Estimation and Use of a Multivariate Parametric Model for Simulating Heteroskedastic, Correlated, Nonnormal Random Variables: The Case of Corn Belt Corn, Soybean, and Wheat Yields." Amer. J. Agr. Econ. 79:191-205. Ramirez, O.A. and J.S. Shonkwiler. 2017. A probabilistic model of the crop insurance purchase decision. Journal of Agricultural and Resource Economics: 10-26. Ramirez, O., S. Misra and J. Field.2003. "Crop-Yield Distributions Revisited." Amer. J. Agr. Econ. 85(1): 108-120. Robinson, P.M., 1987. Asymptotically efficient estimation in the presence of heteroskedasticity of unknown form. Econometrica: Journal of the Econometric Society: 875-891. Rosch, Stephanie. 2021. “Federal Crop Insurance: A Primer”, Report No. R46686. Rosa, I. 2018a. Federal crop insurance: Program overview for the 115th congress. Report R45193. Rosa, I. 2018b. Farm bill primer: Federal crop insurance. Congressional Research Service. Schmidt, U., C. Starmer and R. Sugden. 2008. Third-generation prospect theory. Journal of Risk and Uncertainty, 36(3): 203-223. Schwarz, G.1978. "Estimating the Dimension of a Model." Annals of Statis. 6:461-464. Sheather, S.J. and M.C. Jones. 1991. A reliable data‐based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society: Series B (Methodological), 53(3): 683-690. Sherrick, B.J., Zanini, F.C., Schnitkey, G.D. and Irwin, S.H., 2004. Crop insurance valuation under alternative yield distributions. Amer. J. Agr. Econ. 86(2): 406-419. Skees, J.R., 1987. Future research needs on federal multiple peril crop insurance (No. 2096- 2018-3251). Skees, J. R., and Reed. M.R. 1986. Rate making for farm‐level crop insurance: Implications for adverse selection. Amer. J. Agr. Econ. 68(3): 653-659. Smith, V.H. and B.K. Goodwin. 1996. Crop insurance, moral hazard, and agricultural chemical use. Amer. J. Agr. Econ. 78(2): 428-438. Swinton, S., and R. King.1991. 
"Evaluating Robust Regression Techniques for Detrending Crop Yield Data with Non-Normal Errors." Amer. J. Agr. Econ. 73:446-461. Taylor, C. 1990. "Two Practical Procedures for Estimating Multivariate Nonnormal Probability Density Functions." Amer. J. Agr. Econ. 72: 210-217. 41 U.S. General Accounting Office. 1989. “A Disaster Assistance: Crop Insurance Can Provide Assistance More Effectively than Other Programs: GAO/RCED-89-211, September. U.S. General Accounting Office. 1993. “A Crop Insurance: Federal Program Has Been Unable to Meet Objectives of 1980 Act.” GAO/T-RCED-93-12, March 3. US GAO.2014 Considerations in Reducing Federal Premium Subsidies. Report GAO-14-700. US Gov. Account. Off., Washington DC. US GAO. 2015. In areas with higher crop production risks, costs are greater, and premiums may not cover expected losses. Report GAO-15-215. US Gov. Account. Off., Washington DC. Vuong, Q. H. 1989. "Likelihood Ratio Tests for Model Selection and Non-nested Hypotheses." Econometrica 57: 307-333. Wood, S.N., N. Pya and B. Säfken. 2016. Smoothing parameter and model selection for general smooth models. Journal of the American Statistical Association, 111(516): 1548-1563. Woodard, J.D., Sherrick, B.J. and Schnitkey, G.D., 2011. Actuarial impacts of loss cost ratio ratemaking in US crop insurance programs. Journal of Agricultural and Resource Economics: 211-228. Woodard, J.D., Schnitkey, G.D., Sherrick, B.J., Lozano‐Gracia, N. and Anselin, L., 2012. A spatial econometric analysis of loss experience in the US crop insurance program. Journal of Risk and Insurance, 79(1):261-286. Woodard, J.D. and Verteramo‐Chiu, L.J., 2017. Efficiency impacts of utilizing soil data in the pricing of the federal crop insurance program. Amer. J. Agr. Econ. 99(3): 757-772. Xu, Z., D.A. Hennessy, K. Sardana and G. Moschini. 2013. The realized yield effect of genetically engineered crops: US maize and soybean. Crop Science, 53(3): 735-745. Yoshida, T., 2013. 
Asymptotics for penalized spline estimators in quantile regression. Communications in Statistics-Theory and Methods, (just-accepted). Yu, J. and D.A. Sumner. 2018. Effects of subsidized crop insurance on crop choices. Agricultural Economics, 49(4): 533-545. Zulauf, C., J. Coppess, G. Schnitkey and N. Paulson. 2018. Premium Subsidy and Insured US Acres: Differential Impact by Crop. farmdoc daily, 8. Zhu, Y., B.K. Goodwin and S.K. Ghosh. 2011. Modeling yield risk under technological change: Dynamic yield distributions and the US crop insurance program. Journal of Agricultural and Resource Economics: 192-210.

APPENDIX A: Resampling and Premium Estimation

Step 1: Detrended Yields

Let $y_{i,c,t}$ be the unit-level yield for farm $i$ in county $c$ and year $t$, and let $y_{c,t}$ be the county-level yield. Two log-linear trend equations are estimated (Miranda and Glauber 1997; Deng et al. 2007):

(A1.1) $\log(y_{i,c,t}) = \beta_0 + \beta_1 (2009 - t) + \varepsilon$;

(A1.2) $\log(y_{c,t}) = \gamma_0 + \gamma_1 (2009 - t) + u$,

where $t \in \{1970, \ldots, 2008\}$ for unit-level yields and $t \in \{1951, \ldots, 2018\}$ for county-level yields. The longer window is used for county-level yields because the long county series better represents long-term variation and is therefore more reliable. The detrended unit- and county-level yields are then calculated as:

(A1.3) $y^{det}_{i,c,t} = \dfrac{y_{i,c,t}}{\hat{y}_{i,c,t}} \times \hat{y}_{i,c,2009}$

(A1.4) $y^{det}_{c,t} = \dfrac{y_{c,t}}{\hat{y}_{c,t}} \times \hat{y}_{c,2009}$

where $\hat{y}_{i,c,t}$ is the predicted unit-level yield and $\hat{y}_{c,t}$ is the predicted county-level yield. Both yields are adjusted to the 2009 technological level.

Step 2: Semiparametric Quantile Regression Imputation (SQRI)

Observations in our yield sample run from 1970 to 2008 but mostly lie between 1999 and 2008. This may induce temporal bias because no extreme climatic change occurred during that 10-year interval. To make the sample more representative of a long-term period, we adopt semiparametric quantile regression imputation using penalized B-splines as in Chen and Yu (2016).
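Before turning to the imputation model, the Step 1 detrending in (A1.1) and (A1.3) can be illustrated with a minimal Python sketch. It is not the dissertation's code: the function names are hypothetical, the trend is fitted by plain OLS, and the predicted yield is taken as the exponential of the fitted log yield with no retransformation correction, which is an assumption.

```python
import math

def fit_log_linear_trend(years, yields, base_year=2009):
    """OLS fit of log(y_t) = b0 + b1 * (base_year - t) + error, as in (A1.1)."""
    x = [base_year - t for t in years]
    log_y = [math.log(y) for y in yields]
    n = len(x)
    x_bar = sum(x) / n
    ly_bar = sum(log_y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (li - ly_bar) for xi, li in zip(x, log_y))
    b1 = sxy / sxx
    b0 = ly_bar - b1 * x_bar
    return b0, b1

def detrend(years, yields, base_year=2009):
    """Detrended yield = (y_t / predicted y_t) * predicted y at base_year, as in (A1.3)."""
    b0, b1 = fit_log_linear_trend(years, yields, base_year)
    y_hat_base = math.exp(b0)  # the regressor (base_year - t) is zero at the base year
    detrended = []
    for t, y in zip(years, yields):
        y_hat_t = math.exp(b0 + b1 * (base_year - t))  # assumed predictor of y_t
        detrended.append(y / y_hat_t * y_hat_base)
    return detrended

if __name__ == "__main__":
    # Made-up yields growing 2% per year toward 150 bu/acre in 2009.
    yrs = list(range(2000, 2009))
    ys = [150.0 * math.exp(-0.02 * (2009 - t)) for t in yrs]
    print(detrend(yrs, ys))
```

With a yield series that follows the trend exactly, every detrended value collapses to the predicted 2009 yield, which is the sense in which yields are "adjusted to the 2009 technological level".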
Consider a model for $(x_i, y_i)^T$, $i \in \{1, \ldots, n\}$, a set of i.i.d. observations of the random variable $(X, Y)$, where $Y$ is the response variable that may be subject to missingness and $X$ is a univariate variable that is always observed. Let $q_\tau(x)$ be the unknown conditional $\tau$-th quantile of the response $Y$ given $X = x$. For a given $\tau \in (0,1)$, the conditional quantile function $q_\tau(x)$ is defined by $P(Y < q_\tau(x) \mid X = x) = \tau$. Then

(A1.5) $q_\tau(x) = \arg\min_{h(x)} E\{\rho_\tau(Y - h(x)) \mid X = x\}$,

where

$\rho_\tau(u) = u \times (\tau - I(u < 0)) = \begin{cases} \tau |u|, & u \ge 0 \\ (1 - \tau)|u|, & u < 0 \end{cases}$

is the check function (see Koenker and Bassett 1978). Here $q_\tau(x)$ will be estimated with penalized B-splines, and $h(x)$ will be represented via a basis expansion. Let $K_n - 1$ be the number of knots within the support of $X$, and let $p$ be the degree of the B-splines. Define equidistantly located knots as $\kappa_k = K_n^{-1} k$, $k = -p+1, \ldots, K_n + p$. The $p$-th degree B-spline basis is

(A1.6) $B(x) = \left(B^{[p]}_{-p+1}(x), B^{[p]}_{-p+2}(x), \ldots, B^{[p]}_{K_n}(x)\right)^T$

where $B^{[p]}_k(x)$, $k = -p+1, \ldots, K_n$, are defined recursively as:

o For $s = 0$:

(A1.7) $B^{[0]}_k(x) = \begin{cases} 1, & \kappa_{k-1} < x < \kappa_k \\ 0, & \text{otherwise} \end{cases}$ where $k \in \{-p+1, \ldots, K_n + p\}$;

o For $s \in \{1, 2, \ldots, p\}$:

(A1.8) $B^{[s]}_k(x) = \dfrac{x - \kappa_{k-1}}{\kappa_{k+s-1} - \kappa_{k-1}} B^{[s-1]}_k(x) + \dfrac{\kappa_{k+s} - x}{\kappa_{k+s} - \kappa_k} B^{[s-1]}_{k+1}(x)$,

where $k \in \{-p+1, \ldots, K_n + p - s\}$.

The estimated conditional quantile regression function is:

(A1.9) $\hat{q}_\tau(x) = B^T(x)\hat{b}(\tau)$

where $\hat{b}(\tau)$ is a $(K_n + p) \times 1$ vector obtained by:

(A1.10) $\hat{b}(\tau) = \arg\min_{b(\tau)} \sum_{i=1}^{n} \delta_\tau \rho_\tau\left[y_i - B^T(x_i) b(\tau)\right] + \dfrac{\lambda_n}{2} b^T(\tau) D_m^T D_m b(\tau)$,

where $\lambda_n (> 0)$ is the smoothing parameter, $\delta_\tau$ is the “learning rate”, and $D_m$ is the $m$-th difference matrix, which is $(K_n + p - m) \times (K_n + p)$ dimensional with elements defined as

(A1.11) $d_{ij} = \begin{cases} (-1)^{|i-j|} \binom{m}{|i-j|}, & 0 \le j - i \le m \\ 0, & \text{otherwise} \end{cases}$,

where $\binom{m}{k}$ is the choose function given by $(k!\,(m-k)!)^{-1} m!$, and $m$ is the order of the penalty.
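Two building blocks of (A1.10) are easy to check numerically: the check function $\rho_\tau$ and the difference matrix $D_m$ of (A1.11). The sketch below is an illustration only, using the Python standard library; the function names are hypothetical, and the quantile fit itself (the minimization over $b(\tau)$) is deliberately left out.

```python
from math import comb

def check_loss(u, tau):
    """Check function rho_tau(u) = u * (tau - 1{u < 0})."""
    return u * (tau - (1.0 if u < 0 else 0.0))

def difference_matrix(n_coef, m):
    """m-th difference matrix of shape (n_coef - m) x n_coef.

    Entry (i, j) is (-1)^|i-j| * C(m, |i-j|) when 0 <= j - i <= m, else 0,
    following (A1.11). Here n_coef plays the role of K_n + p.
    """
    rows = []
    for i in range(n_coef - m):
        row = [0] * n_coef
        for j in range(i, i + m + 1):
            row[j] = (-1) ** (j - i) * comb(m, j - i)
        rows.append(row)
    return rows

def roughness_penalty(b, m, lam):
    """Penalty (lam / 2) * b' D_m' D_m b, computed as (lam / 2) * ||D_m b||^2."""
    d_mat = difference_matrix(len(b), m)
    d_b = [sum(d * bj for d, bj in zip(row, b)) for row in d_mat]
    return 0.5 * lam * sum(v * v for v in d_b)

if __name__ == "__main__":
    for row in difference_matrix(5, 2):
        print(row)
    # Coefficients that are linear in their index have zero second differences.
    print(roughness_penalty([1, 2, 3, 4, 5], m=2, lam=1.0))
```

For $m = 2$ each row of $D_m$ carries the familiar second-difference stencil (1, -2, 1), so the penalty leaves coefficient sequences that are linear in the index unpenalized and shrinks curvature in the fitted quantile curve.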
The difference penalty $b^T(\tau) D_m^T D_m b(\tau)$ is used to ease the computational difficulty, as in Yoshida (2013). As shown above, the smoothing penalty varies as $\lambda_n$ changes. From a Bayesian viewpoint, $\hat{b}(\tau)$ is a posterior mode for the true parameter $b(\tau)$. As in Wood et al. (2016), the conditional distribution of $b(\tau)$ given $y$ is

$b(\tau) \mid y \sim N\left(\hat{b}(\tau), \left(\Xi + \lambda_n D_m^T D_m\right)^{-1}\right)$

where $\Xi$ is the expected negative Hessian of the log-likelihood at $\hat{b}(\tau)$, the log-likelihood is $l(b(\tau), q_\tau(x)) = \log f(y \mid b(\tau), q_\tau(x))$, and the prior density $f_\lambda$ is given by $N\left(0, (\lambda_n D_m^T D_m)^{-}\right)$, with the superscript $-$ denoting a generalized inverse. The smoothing parameter $\lambda_n$ can then be estimated by maximizing the log marginal likelihood

(A1.12) $V(\lambda) = \log \int f(y \mid b(\tau)) f_\lambda(b(\tau))\, db(\tau)$

Ideally, $\tau$ should be randomly drawn from $[0,1]$. To save computational time, the alternative procedure for semiparametric quantile regression imputation is:

▪ Divide $[0,1]$ into $J = 100$ equally spaced sub-intervals;

▪ Obtain $\hat{b}(\tau_j)$, $j \in \{1, \ldots, J-1\}$, using the best “learning rate” $\delta_\tau$ and the smoothing parameter $\lambda_n$ as in Wood et al. (2016);

▪ For a given detrended county-level yield $y^{det}_{c,t}$, where $c$ represents the county and $t$ the year, the imputed detrended unit-level yield at the $\tau_j$-th quantile is $y^{det*}_{j,c,t} = B^T(y^{det}_{c,t})\hat{b}(\tau_j)$. Randomly draw 1,000 observations from all years for one county. For instance, if $t \in \{1970, \ldots, 2008\}$, there are 39 years in total and 3,861 (= 39 × 99) imputed unit-level yields for this county, from which 1,000 observations are chosen.

Step 3: Rejection Sampling

Based on SQRI using penalized B-splines, we have 1,000 unit-level yields for each county in 12 states (IA, IL, IN, KS, MI, MN, MO, NE, ND, OH, SD and WI). The imputed yields in one county form the target density for resampling. The rejection sampling method is adopted in this paper to obtain a more representative sample. The steps are as follows:

▪ All yields are adjusted to 2009 technological level.
Let $y_c^*$ be the imputed unit-level yield for county $c$ from the SQRI part, and set the distribution $f(y_c^*)$ as the target distribution;

▪ Let $y_c$ be the detrended unit-level yield for county $c$ with an auxiliary distribution $f(y_c)$ (both distributions are obtained with “density” and “approxfun” in R under the default settings);

▪ Set the “envelope constant” $A$, such that $0 < A < +\infty$, and multiply it by the auxiliary distribution to create a “blanket function”, $A f(y_c)$. The selection of $f(y_c)$ and $A$ must satisfy the condition $A f(y_c) \ge f(y_c^*)$;

▪ Choose $y_c$ by the criteria: when $A f(y_c) \ge f(y_c^*)$, $y_c$ is accepted; when $A f(y_c) \le f(y_c^*)$, $y_c$ is rejected. To reach the desired sample size, we again randomly choose 1,000 observations for each county.

Step 4: Actuarially Fair Premium

After Steps 1-3, there is a more representative sample of 1,000 observations for each county. We calculate the actuarially fair premium based on this new sample and the procedure in Price et al. (2019). The main idea is to use a modified stochastic yield function:

(A1.13) $y^{det}_{i,c,t} = \alpha_0 + m(z_i) + \sigma(z_i)\varepsilon_{i,t}$

where $y^{det}_{i,c,t}$ is the detrended farm-level yield for farm $i$ in county $c$ in year $t$; $m(z_i)$ and $\sigma(z_i)$ are, respectively, the yield mean and yield standard deviation functions, which depend on the unit’s four-to-ten-year average actual yield (APH), labelled $z_i$; and the residual $\varepsilon$ has zero mean and CDF $G(\varepsilon): [\underline{\varepsilon}, \bar{\varepsilon}] \to [0,1]$. The distribution of the residuals is not restricted. Let $E[\cdot]$ be the expectation operator and $\psi = [\varphi z - m(z)]/\sigma(z)$. The actuarially fair premium rate from the buyer’s perspective (i.e., expected indemnity) is:

(A1.14) $ULR(\varphi, z) = \dfrac{E[\max(\varphi z - y, 0)]}{\varphi z} = \dfrac{\sigma(z) E[\max(\psi - \varepsilon, 0)]}{\varphi z} = \dfrac{\sigma(z)}{\varphi z} \int_{\underline{\varepsilon}}^{\psi} G(\varepsilon)\, d\varepsilon$,

where $ULR$ is the County Unloaded Rate; $y$ is the actual yield for each unit; $z$ is the county yield in 2009; and $\varphi$ is the coverage level.

It is clear that $\hat{m}(z)$, $\hat{\sigma}(z)$ and $\hat{G}(\varepsilon)$ need to be estimated from the regression equation. The insured price $p$ is calculated by averaging February futures prices for the upcoming year’s harvest contracts (December contracts for corn and November contracts for soybean). The procedure to obtain those estimators is:

▪ Regress $y^{det}_{i,c,t}$ on $z_i$ using the univariate penalized B-spline method to estimate coefficients for the mean function $\hat{m}(z)$;

▪ Extract the residuals $\hat{r}_{i,t}$ from the first step and regress $\hat{r}^2_{i,t}$ on $z_i$ with the penalized B-spline method to estimate $\hat{\sigma}(z)$;

▪ Standardize the residuals as $\hat{\varepsilon}_{i,t} = \hat{r}_{i,t}/\hat{\sigma}(z)$ and obtain the empirical cumulative distribution function estimate $\hat{G}(\varepsilon_{i,t})$ (see the “ecdf” function in R);

▪ Integrate the cumulative distribution function estimate and obtain the actuarially fair premium for each (coverage level, APH) pair.

Parameters are specified as: (a) the degree of the spline, $p$, is 3, i.e., a cubic B-spline; (b) the degree of the penalty, $m$, is 2 as in Yoshida (2013); (c) the number of knots is 5; (d) the smoothing parameter $\lambda$ is determined by GCV (generalized cross-validation).

Let $z$ be the median APH in the resampled data; we then have the reference rate $ULR$ for all coverage levels. By a slight modification of the Base Producer Rate (BPR) as in Coble and Goodwin (2010), the estimated actuarially fair BPR is:

(A1.15) $\Gamma_i(\varphi, z_i) = \left(\dfrac{z_i}{\text{ref yield}_{i,c}}\right)^{-E_c} \times ULR_c(\varphi) + \text{fixed load} \times \text{cov diff}(\varphi)$

where $z_i$ is the Actual Production History of farm $i$; $\text{ref yield}_{i,c}$ is the reference yield in county $c$ where farm $i$ is located; $-E_c$ is the negative exponent provided by RMA, by which a farm with better land quality has a lower premium rate; $ULR_c(\varphi)$ is the county unloaded rate we have estimated; fixed load includes the prevented planting rate load, replant rate load, quality adjustment rate load and state catastrophic rate load; and cov diff is equal to 1 when $\varphi = 0.65$.

APPENDIX B: Supplemental Figures and Tables

Figure 1B.1 Imputed County-Level Densities for Soybeans.
Figure 1B.2 Comparisons of Wedges (before and after subsidies) among Different Land Qualities for Soybean.

Figure 1B.3 RMA Premiums (including subsidies), Farm’s Self-Paid Premium and Estimated Actuarially Fair Premiums for Soybean. Note: Only mean values are reported here.

CHAPTER 2

Basis Risk and Farmers’ Participation in the U.S. Federal Crop Insurance Program: A Conceptual Framework and its Application

Abstract

Since 2000, both insured acres and coverage level choices in the U.S. Federal Crop Insurance Program have been boosted by generous subsidy policies. However, spatially heterogeneous participation patterns exist across the Great Plains and Corn Belt regions. Recently, basis risk has been increasingly recognized as an essential driver deterring the uptake of weather index contracts. This study investigates whether basis risk inversely affects the insurance demand for revenue and yield contracts. We build a conceptual model to explore farmers’ acreage response to basis risk within an Expected Utility framework. With elevator-level basis risk, we apply the Fractional Probit with Control Function and find that the effects of basis risk on participation rates are significantly negative for nearly all insurance contracts. Our analysis implies that: (i) to remove basis risk, revision of the revenue contract may be considered; and (ii) the subsidy structure may also be adjusted to be consistent with the underlying basis risk. Policy adjustments for basis risk might help save large subsidies while keeping participation rates high.

Introduction

Farmers face multiple risks that can generate large fluctuations in their income. The U.S. government provides large subsidies to boost participation rates in the Federal Crop Insurance Program (FCIP) to shield farmers from risks. Crops and contracts have also been expanded to satisfy farmers’ demand for diversification.
The 2000 Agricultural Risk Protection Act (ARPA) provided $8.2 billion for expanding participation through increased premium subsidies (Goodwin 2001; Coble and Goodwin 2010). Unsurprisingly then, both extensive (share of insured acres in total insurable acres) and intensive (coverage level choice) margin participation rates have increased over the past twenty years (Che et al. 2020). However, significant regional disparities in insurance uptake remain a concern, especially across the Great Plains and Corn Belt (Innes and Ardila 1994; LaFrance et al. 2002; Goodwin et al. 2004; Clarke 2016; Jensen et al. 2018).

In this study, we focus on the effect of basis risk on the participation rate. As in the literature, the basis is defined as the difference between spot and futures prices, and basis risk is the variation of the basis, measured, for example, by the standard deviation of the basis. Our motivation comes mainly from three facts: (i) basis risk is inherent in revenue contracts because the indemnity is measured using the futures price, whereas a farmer's sales revenue is based on the cash price; (ii) crop revenue insurance, first introduced in the 1990s, has been the most popular product offered by the Federal Crop Insurance Corporation (Du et al. 2014; Goodwin and Hungerford 2014); and (iii) many studies of the effect of basis risk on insurance uptake focus on weather index insurance products (Doherty and Richter 2002; Deng et al. 2007; Cole et al. 2014; Jensen et al. 2018; Cai et al. 2020; Gaurav and Chaudhary 2020; Lichtenberg and Martinez 2022; Ohashi 2022), but few investigate this topic using information from the cash and futures markets.

There is considerable interest in whether basis risk impedes crop insurance uptake in the United States.
Clarke (2016) provides a formal analysis of how and why downside basis risk can discourage farmers’ purchase of index insurance and argues that basis risk may pose a considerable obstacle to designing index insurance programs. The reason is that index insurance can decrease farmers’ income in the wrong states of nature (they suffer losses but are not indemnified) and increase it in the good states of nature (they do not suffer losses but are indemnified). Risk-averse farmers then have little incentive to purchase index insurance (Miranda and Farrin 2012; Jensen and Barrett 2017; Lichtenberg and Martinez 2022). In this study, the best state of nature (no basis risk) is a perfect match between cash and futures prices, and the worst state of nature (considerable basis risk) is a fundamental mismatch between them. If basis risk exists, a weakening (strengthening) basis indicates that a farmer’s income will decrease (increase) compared with the best state of nature.

Spatial disparities in participation rates show two patterns. First, the extensive margin (share of insured acres in total insurable acres) is in general highest in North Dakota and decreases when moving from west-southwest toward the eastern Corn Belt. Second, the intensive margin (coverage level choice) is highest in the central and eastern Corn Belt, especially in central Illinois and northern Indiana, but decreases when moving outside this region. A linkage between participation rates and basis risk may exist because basis risk is low in the production center of gravity (e.g., the Corn Belt region) and increases when moving outside the main production area (Figlewski 1984; Haushalter 2000; Doherty and Richter 2002).

Our analysis builds on and contributes to three main strands of literature. First, our study provides insights on observed spatial disparities in the participation rate in the U.S. Federal Crop Insurance Program.
Existing research has analyzed factors affecting insurance uptake, such as the expected return to insurance (Gardner and Kramer 1986; Goodwin 1993), premium expenditures (Binswanger-Mkhize 2012; Cole et al. 2013; Du et al. 2017), recency effects (Cole et al. 2014; Gallagher 2014; Kousky 2017; Stein 2018; Che et al. 2020), premium payment at harvest rather than planting time (Casaburi and Willis 2018; Liu et al. 2020) and farmers’ Willingness-to-Pay (Feng et al. 2020). However, to our knowledge, little research has investigated the effect of basis risk (mismatch between cash and futures markets) on participation rates.

Second, our empirical results contribute to the literature on the effect of basis risk on participation rates. Regression results show that basis risk inversely affects participation rates at both the extensive and the intensive margin of insurance demand. This conclusion is robust to alternative model specifications and to additional control variables. The results imply that revising insurance contract design to eliminate basis risk can improve participation rates. Furthermore, if farmers’ indemnity were calculated based on the cash rather than the futures price, basis risk would vanish and participation rates would increase. Eliminating basis risk therefore supplements the subsidy structure, which might help save large subsidies while maintaining high participation rates.

Third, this study sheds light on the impact of basis risk on farmers’ willingness-to-pay (WTP) using Monte Carlo simulation with the Multivariate Gaussian Copula (MGC) method. This method allows a dependence structure among multiple risk sources such as the futures price, cash price, and actual yield. We employ unpublished farm-level insurance purchasing data maintained by the USDA Risk Management Agency (RMA).
Then we estimate the farmer’s WTP ($/acre) at all individual coverage levels, assuming a constant absolute risk aversion (CARA) utility function (see, for example, Lapan and Moschini 1994; Babcock and Hennessy 1996; Du and Hennessy 2012; Feng et al. 2020). As in our conceptual framework, a farmer’s WTP should decrease when basis risk increases. The implication is that a farmer will be inclined to insure fewer acres under high basis risk than under low basis risk. We assume that a state should rank high (towards 12) in participation rate when it ranks low (towards 1) in basis risk. Simulation results partly support our argument, but further analysis is needed because the risk-aversion coefficient is set as a constant across all states, which may not be the case in the real world.

This paper is organized as follows. First, a conceptual model is derived to explain the effect of basis risk on the participation rate within a standard Expected Utility framework. Then, some testable hypotheses are provided for further investigation. A subsequent section describes the data and the approaches we employ to construct variables of interest, such as participation rates and basis risks. In the empirical section, we report the empirical results. The final section contains concluding comments.

A Conceptual Model

This section aims to better understand the effects of basis risk on both extensive (share of insured acres in total insurable acres) and intensive (coverage level) margin participation rates. As in Feng et al. (2020), within the expected utility setting, a farmer’s WTP for insured acres (or coverage level) can be implicitly estimated by equating the expected value of the objective function with and without insurance products. Such WTP is equivalent to a farmer’s desire to enroll in the crop insurance program.
Given a coverage level and a pre-determined USDA/RMA premium, a high (low) WTP can raise (deter) crop insurance demand, which also indicates that the extensive margin participation rate will go up (down).

Revenue Protection with the Harvest Price Exclusion (RPHPE) is a simple revenue contract since the revenue guarantee is based on the projected price only, and so we commence by analyzing this contract form. At harvest time, two income sources are realized: (a) income from crop sales (cash price × actual yield); and (b) the indemnity payment after any crop loss. If actual income is lower than the revenue guarantee, the difference is paid from the insurer to the insured, and the payment is 0 otherwise. Denoting $p(\phi)$ as the insurance premium and $c$ as the production cost, the farmer’s net revenue after purchasing a corn crop revenue insurance product is:

(2.1) $\tilde{R}^{In} = -c - p(\phi) + \overbrace{\tilde{P}^{c}_{\mathrm{Oc}} \tilde{y}}^{\text{Revenue in local market}} + \overbrace{\max[\phi F_{\mathrm{De,Fe}}\, y^{APH} - \tilde{F}_{\mathrm{De,Oc}}\, \tilde{y}, 0]}^{\text{Indemnity based on futures prices and actual yields}}$,

where a tilde indicates variables that are random at planting time, $\tilde{P}^{c}_{\mathrm{Oc}}$ is the local cash price when the crop is sold, which is presumed to be October, $\tilde{F}_{\mathrm{De,Oc}}$ is the harvest contract futures price,⁸ $F_{\mathrm{De,Fe}}$ is the Springtime price (i.e., the projected price) for the corn December futures contract,⁹ $\tilde{y}$ is the actual yield at harvest time, $\phi$ is the coverage level with $\phi \in \{0.5, \ldots, 0.85\}$, and $y^{APH}$ is the approved production history (APH) for each farm.
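For a single realization of prices and yield, the net revenue in (2.1) can be evaluated directly. The sketch below is only an illustration in Python: the function name and all numerical inputs are made up and are not taken from the data used in this chapter.

```python
def rphpe_net_revenue(cash_price, harvest_futures, projected_price,
                      actual_yield, aph_yield, coverage, premium, cost):
    """Per-acre net revenue under RPHPE for one realized state, following (2.1).

    Prices are in $/bu, yields in bu/acre, premium and cost in $/acre.
    """
    crop_sales = cash_price * actual_yield                  # revenue in the local market
    guarantee = coverage * projected_price * aph_yield      # phi * F_{De,Fe} * y^APH
    revenue_to_count = harvest_futures * actual_yield       # F_{De,Oc} * y
    indemnity = max(guarantee - revenue_to_count, 0.0)
    return -cost - premium + crop_sales + indemnity

if __name__ == "__main__":
    # Hypothetical state: projected price 4.00, harvest futures 3.50, cash 3.20,
    # realized yield 150 against an APH of 180, 75% coverage.
    print(rphpe_net_revenue(cash_price=3.20, harvest_futures=3.50,
                            projected_price=4.00, actual_yield=150.0,
                            aph_yield=180.0, coverage=0.75,
                            premium=20.0, cost=400.0))
```

In this made-up state the guarantee is 540 $/acre and the revenue counted against it is 525 $/acre, so an indemnity of 15 $/acre is paid even though the farmer actually sells at the lower cash price; the gap between the cash and futures prices is not covered, which is the source of basis risk.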
With $\tilde{B}^{c}_{\mathrm{Oc}} = \tilde{P}^{c}_{\mathrm{Oc}} - \tilde{F}_{\mathrm{De,Oc}}$ as the basis in October in county $c$, (2.1) may be rewritten as

(2.2) $\tilde{R}^{In} = -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}} \tilde{y} + \tilde{F}_{\mathrm{De,Oc}} \tilde{y} + \max[\phi F_{\mathrm{De,Fe}}\, y^{APH} - \tilde{F}_{\mathrm{De,Oc}} \tilde{y}, 0]$

$\quad = -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}} \tilde{y} + \max[\phi F_{\mathrm{De,Fe}}\, y^{APH}, \tilde{F}_{\mathrm{De,Oc}} \tilde{y}]$

$\quad = -c - p(\phi) + \begin{cases} \tilde{B}^{c}_{\mathrm{Oc}} \tilde{y} + \tilde{F}_{\mathrm{De,Oc}} \tilde{y}, & \text{whenever no loss;} \\ \tilde{B}^{c}_{\mathrm{Oc}} \tilde{y} + \phi F_{\mathrm{De,Fe}}\, y^{APH}, & \text{otherwise.} \end{cases}$

⁸ The harvest price for a revenue contract is usually defined as the average of the futures contract price during October. For more information, please refer to: https://www.extension.iastate.edu/agdm/crops/html/a1-54.html.
⁹ For soybean, the futures contract for the new crop is November.

To account for basis risk, we assume that the basis follows a location-scale family with $\tilde{B}^{c}_{\mathrm{Oc}}(\sigma) \equiv \bar{B} + \tilde{\varepsilon}\sigma$, where $E[\tilde{\varepsilon} \mid \tilde{y}] = 0$ and $\sigma$ scales basis risk such that risk vanishes whenever $\sigma = 0$ (Sandmo 1971; Meyer 1987; Meyer and Rasche 1992). The requirement $E[\tilde{\varepsilon} \mid \tilde{y}] = 0$ is the independent background risk assumption. Rewriting Eqn. (2.2), we have

(2.3) $\tilde{R}^{In} = -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}}(\sigma) \tilde{y} + \max[\phi F_{\mathrm{De,Fe}}\, y^{APH}, \tilde{F}_{\mathrm{De,Oc}} \tilde{y}]$

$\quad = -c - p(\phi) + (\bar{B} + \tilde{\varepsilon}\sigma)\tilde{y} + \max[K, \tilde{F}_{\mathrm{De,Oc}} \tilde{y}]; \quad K \equiv \phi F_{\mathrm{De,Fe}}\, y^{APH}$

We may ask: what is the WTP to remove basis risk? WTP as a function of basis risk and coverage level, $W(\sigma; \phi)$, may be defined implicitly as

(2.4) $E\left[U\left([\bar{B} + \tilde{\varepsilon}\sigma]\tilde{y} + \max[K, \tilde{F}_{\mathrm{De,Oc}} \tilde{y}] - W(\sigma; \phi) - c - p(\phi)\right)\right] = E\left[U\left(\bar{B}\tilde{y} + \max[K, \tilde{F}_{\mathrm{De,Oc}} \tilde{y}] - c - p(\phi)\right)\right]$,

where independent background risk (Gollier and Pratt, 1996) affects both sides through random yield. A crucial point to note is that if the distributions of $\tilde{F}_{\mathrm{De,Oc}}$ and $\tilde{y}$ do not change, then the right-hand term is constant at, say, value $\kappa$. Eqn. (2.3) provides a key observation for understanding the role of basis risk.

Observation 1: For RPHPE, the expected value of the outside option of no insurance is independent of basis risk while the expected value of RPHPE depends on basis risk only through its effect on basis times yield.
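Thus, we may write

$$E\left[U\left([\bar{B} + \tilde{\varepsilon}\sigma]\tilde{y} + \max[K, \tilde{F}_{\mathrm{De,Oc}}\tilde{y}] - W(\sigma;\phi) - c - p(\phi)\right)\right] = \kappa. \tag{2.5}$$

Applying the implicit function theorem, we obtain

$$E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{\varepsilon}\tilde{y}\right] - E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\right]\frac{dW(\sigma;\phi)}{d\sigma} = 0, \tag{2.6}$$

and so, using the covariance relation $E[x_1 x_2] = E[x_1]E[x_2] + \mathrm{Cov}(x_1, x_2)$, it follows that

$$\begin{aligned}
\frac{dW(\sigma;\phi)}{d\sigma} &= \frac{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{\varepsilon}\tilde{y}\right]}{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\right]} \\
&= \frac{E\left\{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y}\,\middle|\,\tilde{y}\right]\right\}\ \overbrace{E\left\{E[\tilde{\varepsilon}\,|\,\tilde{y}]\right\}}^{=0}}{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\right]} + \frac{E\left\{\mathrm{Cov}\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y},\ \tilde{\varepsilon}\,\middle|\,\tilde{y}\right]\right\}}{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\right]} \\
&= \frac{E\left\{\mathrm{Cov}\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y},\ \tilde{\varepsilon}\,\middle|\,\tilde{y}\right]\right\}}{E\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\right]},
\end{aligned} \tag{2.7}$$

where the $|\,\tilde{z}$ notation indicates conditioning on $\tilde{z}$. Observe that $d\tilde{\varepsilon}/d\tilde{\varepsilon} > 0$ while $d\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y}\right]/d\tilde{\varepsilon} = U''\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\sigma\tilde{y}^{2} \le 0$ whenever $U''(\cdot) \le 0$. Therefore, by the covariance rule (Gollier 2001, p. 32), $\mathrm{Cov}\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y},\ \tilde{\varepsilon}\,\middle|\,\tilde{y}\right] \le 0\ \ \forall \tilde{y}$, and so $E\left\{\mathrm{Cov}\left[U'\left(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)\right)\tilde{y},\ \tilde{\varepsilon}\,\middle|\,\tilde{y}\right]\right\} \le 0$. Therefore,

$$\frac{dW(\sigma;\phi)}{d\sigma} \le 0, \tag{2.8}$$

a strong inference given the presence of independent background risk. Eqn. (2.8) indicates that farmers' WTP decreases as basis risk increases when the coverage level is held constant.

Hypothesis 1: Given a coverage level, a farmer's WTP for crop insurance will decrease whenever basis risk increases.

Standard revenue insurance (without the harvest price exclusion, labelled RP) may be written as:

$$\begin{aligned}
\tilde{R}^{\mathrm{In}} &= -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}}(\sigma)\tilde{y} + \max\left[\phi\max[F_{\mathrm{De,Fe}}, \tilde{F}_{\mathrm{De,Oc}}]\,y^{APH},\ \tilde{F}_{\mathrm{De,Oc}}\tilde{y}\right] \\
&= -c - p(\phi) + (\bar{B} + \tilde{\varepsilon}\sigma)\tilde{y} + \max\left[\phi\max[F_{\mathrm{De,Fe}}, \tilde{F}_{\mathrm{De,Oc}}]\,y^{APH},\ \tilde{F}_{\mathrm{De,Oc}}\tilde{y}\right].
\end{aligned} \tag{2.9}$$

Eqn. (2.9) shows that Observation 1 also applies for RP, in that basis risk matters only through its effect on basis times yield. In summary:

Proposition 1: All else equal, under expected utility preferences, whenever a location has larger futures market basis risk as measured by $\sigma$, willingness to pay for revenue insurance will be smaller. This is true for both standard revenue insurance and revenue insurance with the harvest price exclusion.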
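The sign result in Eqn. (2.8) can be checked numerically. The sketch below solves the implicit definition in Eqn. (2.4) for $W(\sigma;\phi)$ by Monte Carlo under constant absolute risk aversion, where a closed form is available. It is a minimal illustration, not the dissertation's simulation: the CARA utility, the yield and futures distributions, the mean basis, the risk-aversion coefficient, and the function name `simulate_wtp` are all assumptions. Under CARA the constants $c$ and $p(\phi)$ cancel, so they are omitted.

```python
import numpy as np


def simulate_wtp(sigma, coverage=0.80, risk_aversion=0.005,
                 n_draws=20_000, seed=0):
    """W(sigma; phi) from Eqn. (2.4) with U(x) = -exp(-a x) (assumed utility).

    Under CARA: W = (1/a) * log(E[exp(-a R0)] / E[exp(-a R_sigma)]),
    where R0 is revenue without basis risk and R_sigma adds eps*sigma*y.
    """
    rng = np.random.default_rng(seed)
    half = n_draws // 2
    # Assumed distributions (illustrative only): yield in bu/acre, futures in $/bu.
    y_half = rng.normal(180.0, 30.0, half).clip(min=1.0)
    f_half = 4.0 * np.exp(rng.normal(-0.02, 0.20, half))
    e_half = rng.standard_normal(half)
    # Antithetic draws make the simulated eps exactly symmetric around zero.
    y = np.concatenate([y_half, y_half])
    f = np.concatenate([f_half, f_half])
    eps = np.concatenate([e_half, -e_half])

    mean_basis = -0.30                       # assumed B-bar in $/bu
    guarantee = coverage * 4.0 * 180.0       # K = phi * F_{De,Fe} * y^APH
    r0 = mean_basis * y + np.maximum(guarantee, f * y)
    r_sigma = r0 + sigma * eps * y           # adds the basis-risk term

    a = risk_aversion
    return np.log(np.mean(np.exp(-a * r0)) / np.mean(np.exp(-a * r_sigma))) / a


sigmas = [0.0, 0.1, 0.3, 0.5]
wtp = [simulate_wtp(s) for s in sigmas]
for s, w in zip(sigmas, wtp):
    print(f"sigma={s:.1f}  W={w:8.3f}")
```

With these assumptions, $W$ equals zero at $\sigma = 0$ and becomes more negative as $\sigma$ grows, consistent with Hypothesis 1. The magnitudes depend entirely on the assumed parameters.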
A natural question to ask is whether basis risk has the same role for yield insurance. Yield insurance indemnifies yield shortfalls at a fixed price rather than at some measure of market price. Letting the fixed price be the February price for the December maturity contract (for corn), total revenues become

$$\begin{aligned}
\tilde{R}^{\mathrm{In}} &= -c - p(\phi) + \overbrace{\tilde{P}^{c}_{\mathrm{Oc}}\,\tilde{y}}^{\text{revenue in local market}} + \overbrace{F_{\mathrm{De,Fe}}\max\left[\phi y^{APH} - \tilde{y},\ 0\right]}^{\text{indemnity based on futures prices and actual yields}} \\
&= -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}}\tilde{y} + \tilde{F}_{\mathrm{De,Oc}}\tilde{y} + F_{\mathrm{De,Fe}}\max\left[\phi y^{APH} - \tilde{y},\ 0\right] \\
&= -c - p(\phi) + \tilde{B}^{c}_{\mathrm{Oc}}\tilde{y} + (\tilde{F}_{\mathrm{De,Oc}} - F_{\mathrm{De,Fe}})\tilde{y} + F_{\mathrm{De,Fe}}\max\left[\phi y^{APH},\ \tilde{y}\right] \\
&= -c - p(\phi) + \begin{cases} \tilde{B}^{c}_{\mathrm{Oc}}\tilde{y} + \tilde{F}_{\mathrm{De,Oc}}\tilde{y}, & \text{whenever no loss;} \\ \tilde{B}^{c}_{\mathrm{Oc}}\tilde{y} + (\tilde{F}_{\mathrm{De,Oc}} - F_{\mathrm{De,Fe}})\tilde{y} + \phi F_{\mathrm{De,Fe}}\,y^{APH}, & \text{otherwise.} \end{cases}
\end{aligned} \tag{2.10}$$

Observation 1 applies, but with the addendum that a random number of futures contract payoffs, $(\tilde{F}_{\mathrm{De,Oc}} - F_{\mathrm{De,Fe}})\tilde{y}$, also enters grower returns in addition to a revenue guarantee.

Does the distaste for basis risk apply when preferences are other than those under the expected utility paradigm? As we show below, there is reason to believe so. Consider the version of prospect theory most relevant to insurance choices, third-generation prospect theory (Schmidt, Starmer and Sugden 2008), where the point of reference is a contract absent basis risk. Loss relative to the reference point will then be basis risk times yield, $\tilde{B}^{c}_{\mathrm{Oc}}(\sigma)\tilde{y} = (\bar{B} + \tilde{\varepsilon}\sigma)\tilde{y}$, as given above when $\sigma = 1$. Following a modification of Firpo et al. (2020), we suppose that the value function, $v(x)$, for relative gains/losses is differentiable on its domain of reals, $x \in \Re$, and satisfies:

(2.11)
1. $v(x)x \ge 0\ \ \forall x \in \Re$ with $v(0) = 0$ (preferences for gains over losses);
2. $v'(x) \ge 0\ \ \forall x \in \Re$ (monotonicity);
3. $-v(-x) \ge v(x)\ \ \forall x > 0$ (loss aversion).

The first condition separates the loss and gain domains, while the second states that more profit is preferred to less.
The third condition, which replaces $v'(-x) \ge v'(x)$ in Firpo et al. (2020), codifies loss aversion because a loss of a given magnitude is more detrimental than a gain of the same magnitude is beneficial. With joint distribution function $G(\tilde{\varepsilon}, \tilde{y})$, expected value is

$$W[v(\cdot); G(\tilde{\varepsilon}, \tilde{y})]\big|_{\sigma=1} = \int v(\tilde{\varepsilon}\tilde{y})\,dG(\tilde{\varepsilon}, \tilde{y}). \tag{2.12}$$

Suppose that basis is independent of yield and the distribution of basis is symmetric around zero. In light of point 3 in Eqn. (2.11), it can be readily shown that, for each yield outcome and each basis outcome, the loss in the loss domain dominates the equally probable gain in the gain domain, so that basis risk depresses expected value.

Returning to Eqn. (2.7) above, and so to the RPHPE contract, we now ask how the WTP response to basis risk is affected by coverage level. To do so we must ask how the premium is affected by coverage level. Supposing that the premium is actuarially fair, with payout $\max[\phi F_{\mathrm{De,Fe}}\,y^{APH} - \tilde{F}_{\mathrm{De,Oc}}\tilde{y},\ 0]$ and with joint distribution function $G(\tilde{F}_{\mathrm{De,Oc}}, \tilde{y})$, the premium function is

$$p(\phi) = E\left[\max\left[\phi F_{\mathrm{De,Fe}}\,y^{APH} - \tilde{F}_{\mathrm{De,Oc}}\tilde{y},\ 0\right]\right] = \int_{0}^{\phi F_{\mathrm{De,Fe}}\,y^{APH}} \left(\phi F_{\mathrm{De,Fe}}\,y^{APH} - \tilde{F}_{\mathrm{De,Oc}}\tilde{y}\right) dG(\tilde{F}_{\mathrm{De,Oc}}, \tilde{y}), \tag{2.13}$$

where the lower and upper limits of the integral, 0 and $\phi F_{\mathrm{De,Fe}}\,y^{APH}$ respectively, bound realized revenue $\tilde{F}_{\mathrm{De,Oc}}\tilde{y}$. Eqn. (2.13) indicates that the indemnity $\phi F_{\mathrm{De,Fe}}\,y^{APH} - \tilde{F}_{\mathrm{De,Oc}}\tilde{y}$ is paid by insurers to farmers when the farmer's actual revenue at futures prices, $\tilde{F}_{\mathrm{De,Oc}}\tilde{y}$, is less than the guaranteed income $\phi F_{\mathrm{De,Fe}}\,y^{APH}$. In this sense, the actuarially fair premium $p(\phi)$ equals the expected payout, and its derivative with respect to $\phi$ is:

$$p'(\phi) = F_{\mathrm{De,Fe}}\,y^{APH}\ \mathrm{prob}\left[\phi F_{\mathrm{De,Fe}}\,y^{APH} \ge \tilde{F}_{\mathrm{De,Oc}}\tilde{y}\right]. \tag{2.14}$$

Eqn. (2.14) indicates that the premium increases monotonically with the coverage level because the probability of a revenue loss is always non-negative. In the extreme case where the farmer's revenue always exceeds the guaranteed income of the revenue contract, there will be no insurance demand. A differentiation of Eqn.
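(2.6), and using identity (2.2) with $\tilde{R}^{\mathrm{net}} = \tilde{R}^{\mathrm{In}} - W(\sigma;\phi)$, generates:

$$\begin{aligned}
\frac{d^{2}W(\sigma;\phi)}{d\sigma\,d\phi} &= \frac{1}{E[U'(\tilde{R}^{\mathrm{net}})]}\frac{dE[U'(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}]}{d\phi} - \frac{E[U'(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}]}{\{E[U'(\tilde{R}^{\mathrm{net}})]\}^{2}}\frac{dE[U'(\tilde{R}^{\mathrm{net}})]}{d\phi} \\
&= \frac{1}{E[U'(\tilde{R}^{\mathrm{net}})]}\frac{dE[U'(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}]}{d\phi} - \frac{1}{E[U'(\tilde{R}^{\mathrm{net}})]}\frac{dW(\sigma;\phi)}{d\sigma}\frac{dE[U'(\tilde{R}^{\mathrm{net}})]}{d\phi} \\
&\overset{\mathrm{sign}}{=} \frac{dE[U'(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}]}{d\phi} - \frac{dW(\sigma;\phi)}{d\sigma}\frac{dE[U'(\tilde{R}^{\mathrm{net}})]}{d\phi}.
\end{aligned} \tag{2.15}$$

Now

$$\frac{d[\tilde{R}^{\mathrm{net}}]}{d\phi} = \begin{cases} F_{\mathrm{De,Fe}}\,y^{APH} - \dfrac{dW(\sigma;\phi)}{d\sigma} - p'(\phi), & \text{whenever } \phi F_{\mathrm{De,Fe}}\,y^{APH} > \tilde{F}_{\mathrm{De,Oc}}\tilde{y}; \\ \text{undefined}, & \text{whenever } \phi F_{\mathrm{De,Fe}}\,y^{APH} = \tilde{F}_{\mathrm{De,Oc}}\tilde{y}; \\ -\dfrac{dW(\sigma;\phi)}{d\sigma} - p'(\phi), & \text{otherwise.} \end{cases} \tag{2.16}$$

With $I[A]$ as an indicator function conditional on $A$ and with $\tilde{R}^{\mathrm{nor}} \equiv \tilde{F}_{\mathrm{De,Oc}}\tilde{y}/(F_{\mathrm{De,Fe}}\,y^{APH})$, Eqn. (2.15) may be written as

$$\begin{aligned}
\frac{d^{2}W(\sigma;\phi)}{d\sigma\,d\phi} &\overset{\mathrm{sign}}{=} E\left[U''(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}\left(F_{\mathrm{De,Fe}}\,y^{APH} I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma} - p'(\phi)\right)\right] \\
&\qquad - \frac{dW(\sigma;\phi)}{d\sigma} E\left[U''(\tilde{R}^{\mathrm{net}})\left(F_{\mathrm{De,Fe}}\,y^{APH} I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma} - p'(\phi)\right)\right] \\
&= F_{\mathrm{De,Fe}}\,y^{APH}\, E\left[U''(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}\left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma}\frac{1}{F_{\mathrm{De,Fe}}\,y^{APH}} - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right] \\
&\qquad - F_{\mathrm{De,Fe}}\,y^{APH}\,\frac{dW(\sigma;\phi)}{d\sigma} E\left[U''(\tilde{R}^{\mathrm{net}})\left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma}\frac{1}{F_{\mathrm{De,Fe}}\,y^{APH}} - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right] \\
&\overset{\mathrm{sign}}{=} E\left[U''(\tilde{R}^{\mathrm{net}})\tilde{\varepsilon}\tilde{y}\left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma}\frac{1}{F_{\mathrm{De,Fe}}\,y^{APH}} - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right] \\
&\qquad - \frac{dW(\sigma;\phi)}{d\sigma} E\left[U''(\tilde{R}^{\mathrm{net}})\left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \frac{dW(\sigma;\phi)}{d\sigma}\frac{1}{F_{\mathrm{De,Fe}}\,y^{APH}} - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right].
\end{aligned} \tag{2.17}$$

If farmers choose the coverage level to maximize expected utility, where any coverage level were possible, and if Constant Absolute Risk Aversion applied, $-U''(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi))/U'(\tilde{R}^{\mathrm{In}} - W(\sigma;\phi)) =$ positive constant, then the last expectation would equal zero in light of the first-order condition.$^{10}$ To see this, return to Eqn. (2.4) and solve

$$\max_{\phi}\ E\left[U\left(\bar{B}\tilde{y} + \max[\phi F_{\mathrm{De,Fe}}\,y^{APH},\ \tilde{F}_{\mathrm{De,Oc}}\tilde{y}] - c - p(\phi)\right)\right] \tag{2.18}$$

with first-order condition

$$E\left[U'(\tilde{R}^{\mathrm{In}}) \times \left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right] = 0. \tag{2.19}$$

So under optimality for coverage level choice and constant absolute risk aversion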
then Eqn. (2.17) resolves to

$$\frac{d^{2}W(\sigma;\phi)}{d\sigma\,d\phi} \overset{\mathrm{sign}}{=} E\left[U''(\tilde{R}^{\mathrm{In}})\,\tilde{\varepsilon}\tilde{y}\left(I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]\right)\right]. \tag{2.20}$$

As this expression involves multiple random variables and as two terms, $\tilde{\varepsilon}$ and $I[\phi \ge \tilde{R}^{\mathrm{nor}}] - \mathrm{prob}[\phi \ge \tilde{R}^{\mathrm{nor}}]$, are not uniform in sign, establishing a sign on the expectation is challenging.

Remark 1: The direction of a farmer's coverage level response to an increase in basis risk is undetermined. The relationship between basis risk and coverage level choice may not be robust.

10 There is strong evidence that they do not. See Du, X., et al. (2017), Rationality of choices in subsidized crop insurance markets, American Journal of Agricultural Economics 99 (3), 732-756.

Figure 2.1 Assumptions for the Effect of Basis Risk on Participation Rates.
Note: the y-axis represents the farmer's willingness to pay and RMA's premiums, both of which are measured in $/acre. The x-axis represents the farmer's coverage level choice. $p^{*}$ represents a predetermined premium. The upper (lower) curve represents farmers' willingness to pay under a low (high) basis risk condition. "In the market" and "Out of the market" indicate whether farmers will participate in the federal crop insurance program. "A" and "B" show the optimized coverage choices and corresponding WTPs under different conditions, but the two coverage choices are not necessarily the same.

Figure 2.1 explains farmers' acreage response when they face basis risk. First, we set the farmer's WTP as a quadratic function of the coverage level, implying that farmers prefer medium coverage levels over the lowest and highest levels. This set-up is consistent with observed participation rates. Second, given a coverage level, farmers' WTP decreases when basis risk increases (low $\sigma$ → high $\sigma$). It is noteworthy that the optimal coverage choices under different basis risks might not be identical. For example, the optimal choice $A$ may be on the left or right side of $B$.
Third, if the premium rate is set at $p^{*}$, the ratio of the blue area to the total area (the sum of blue and red) decreases when basis risk increases (see the bars under the x-axis). The decline in the blue area's share indicates that a farmer will choose to insure fewer acres in the market. In the extreme case where a farmer's WTP at the optimal coverage level is lower than the premium rate $p^{*}$, the farmer will eventually move out of the market and insure zero acres.

Data Description

This study focuses on participation rates for Buy-Up contracts in the U.S. Federal Crop Insurance Program. Unlike the CAT contract, Buy-Up contracts allow farmers to choose one among multiple coverage levels. The study area contains 12 Midwest and Great Plains states (IL, IN, IA, KS, MI, MN, MO, NE, ND, OH, SD, and WI). The two dependent variables of most interest in the empirical analysis are: (i) the share of insured acres in total insurable acres (e.g., planted acres), which represents the extensive margin participation rate; and (ii) the acreage-weighted coverage level choice, which represents the intensive margin participation rate. An extensive margin participation rate for yield and revenue contracts cannot be obtained directly because total insurable acres were not reported.

Our primary research question is this: does basis risk affect participation rates? One of our main tasks is to construct county-level basis risk from elevator-level daily basis data. Elevator-level data are employed because they allow various county-level basis risk measures to be constructed and compared. We follow the literature and include several control variables: land capability (the share of good land in total land), Growing Degree Days (GDD), Stress Degree Days (SDD), yield risk, and geographic location.
A general form of the model specification for both participation rates (extensive and intensive margins) is:

$$\text{Participation Rate}_{lc} = f\left(\text{Basis Risk}_{lc},\ \text{Land Capability}_{lc},\ \text{Growing Degree Days}_{lc},\ \text{Stress Degree Days}_{lc},\ \text{Yield CV}_{lc},\ \text{Geographic Location}_{c}\right), \tag{2.21}$$

where $f(\cdot)$ is a function mapping the effect of drivers on participation rates.

Table 2.1 Definition and Summary for Main Variables.

| Description | Data Source | Crop | Variable | Obs. | Mean | Std. Dev. |
|---|---|---|---|---|---|---|
| Participation Rate | | | | | | |
| Share of insured acres in total insurable acres | NASS, FSA, SOB | Corn | $e^{1}$ | 775 | 0.85 | 0.12 |
| | | Corn | $e^{2}$ | 775 | 0.82 | 0.13 |
| | | Corn | $e^{3}$ | 775 | 0.83 | 0.13 |
| | | Soybeans | $e^{1}$ | 735 | 0.82 | 0.13 |
| | | Soybeans | $e^{2}$ | 735 | 0.81 | 0.12 |
| Acreage-weighted average coverage level | SOB | Corn | $\phi_{\mathrm{yi}}$ | 775 | 0.70 | 0.06 |
| | | Corn | $\phi_{\mathrm{re}}$ | 775 | 0.76 | 0.04 |
| | | Soybeans | $\phi_{\mathrm{yi}}$ | 735 | 0.70 | 0.07 |
| | | Soybeans | $\phi_{\mathrm{re}}$ | 735 | 0.76 | 0.04 |
| Basis | | | | | | |
| Normalized basis risk | Bids Data | Corn | $\sigma$ | 747 | 0.006 | 0.005 |
| | | Soybeans | $\sigma$ | 728 | 0.004 | 0.004 |
| Elevator amount per county | Bids Data | Corn | $EleAmt$ | 775 | 5.17 | 4.27 |
| | | Soybeans | $EleAmt$ | 735 | 4.87 | 4.00 |
| Years of elevator records per county | Bids Data | Corn | $AveYear$ | 775 | 3.90 | 1.50 |
| | | Soybeans | $AveYear$ | 735 | 3.85 | 1.50 |
| Yield Risk and Stock | | | | | | |
| Coefficient of variation | NASS | Corn | $YR$ | 775 | 0.18 | 0.07 |
| | | Soybeans | $YR$ | 735 | 0.18 | 0.06 |
| Ratio of ending stock to production | NASS | Corn | $RaES$ | 775 | 0.56 | 0.06 |
| | | Soybeans | $RaES$ | 735 | 0.46 | 0.01 |
| Land Capability | | | | | | |
| Percentage of acres for Class I-II in total acres for Class I-VIII | NRI | | $LCC$ | 2,902 | 0.25 | 0.21 |
| Weather Determinant | | | | | | |
| Growing Degree Days | NOAA | | $\bar{G}$ | 1,016 | 1248.63 | 194.56 |
| Stress Degree Days | NOAA | | $\bar{S}$ | 1,016 | 20.06 | 22.95 |

Note: (1) NASS: National Agricultural Statistics Service; (2) FSA: Farm Service Agency; (3) SOB: USDA Summary of Business; (4) NRI: National Resource Inventory; (5) NOAA: National Oceanic and Atmospheric Administration. Bids Data are elevator-level daily basis data purchased from a company.

As discussed above, the effect of basis risk on the extensive margin participation rate should be negative, but its effect on the intensive margin is unclear.
The effects of production risks (i.e., weather determinants and yield risk) are expected to be opposite for the extensive and intensive margins. This happens because worsening production conditions affect farmers' trade-off between the extensive and intensive margins. For example, bad weather conditions encourage farmers to enroll in the crop insurance program, but the coverage level might be low because farmers' subjective probability of catastrophe may be high. Therefore, for the extensive (intensive) margin participation rate, the effect of GDD is expected to be negative (positive), whereas the effects of SDD and yield risk are expected to be positive (negative). Furthermore, we do not consider the recency effect of loss events because all variables in this study are aggregate measures from 2009 to 2020. The main reasons for cross-sectional rather than panel analysis are: first, basis variation within a single year might not be appropriate for measuring basis risk; second, participation rates in a region (e.g., county) remain at a relatively constant level, so yearly variation may not be enough. A summary of the main variables is provided in Table 2.1.

Measuring Participation Rates

Denote $n^{rltc}$ as the total insured acres for contract $r \in R = \mathrm{Yi} \cup \mathrm{Re}$, where Yi and Re represent the sets of yield and revenue contracts respectively, and crop $l \in L = \{\mathrm{Corn}, \mathrm{Soy}\}$ in year $t$ in county $c$. Table 2.2 reports the product categories for yield and revenue contracts. For each insurance contract type, e.g., revenue contracts, we can construct the yearly coverage level as $\phi^{ltc}_{\mathrm{Re}} = \sum_{r \in \mathrm{Re}} \phi^{rltc} \times w^{rltc}$, where $\phi^{rltc}$ is the coverage level with $r \in \mathrm{Re}$ and $w^{rltc} = n^{rltc} / \sum_{r \in \mathrm{Re}} n^{rltc}$ are the weights. In this study, all intensive margin participation rates are acreage-weighted coverage levels. Letting $T = \{2009, \ldots, 2020\}$, the intensive margin participation rate for revenue contracts is $\phi^{lc}_{\mathrm{Re}} = \sum_{t \in T} \sum_{r \in \mathrm{Re}} \phi^{rltc} \times (n^{rltc} / n^{lc})$, where $n^{lc} = \sum_{t \in T} n^{ltc}$.
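The acreage-weighted coverage level above is a weighted mean, which the following minimal sketch makes concrete. The function name and the sample records are hypothetical; the input is assumed to be a list of (coverage level, insured acres) pairs pooled over contracts and years for one crop, county, and contract type.

```python
def acreage_weighted_coverage(records):
    """Acreage-weighted coverage level for one crop-county cell.

    records: iterable of (coverage_level, insured_acres) pairs pooled over
    contracts r in Re (or Yi) and years t in T.
    """
    records = list(records)
    total_acres = sum(acres for _, acres in records)
    if total_acres <= 0:
        raise ValueError("total insured acres must be positive")
    return sum(phi * acres for phi, acres in records) / total_acres


# Hypothetical county: most acres insured at 75% coverage.
sample = [(0.70, 1000.0), (0.75, 3000.0), (0.85, 1000.0)]
phi_re = acreage_weighted_coverage(sample)
print(round(phi_re, 4))
```

Pooling acres over all years before dividing, rather than averaging yearly coverage levels, gives years with more insured acres a larger weight, which matches the double-sum expression in the text.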
It is noteworthy that both $\phi^{lc}_{\mathrm{Re}}$ and $\phi^{lc}_{\mathrm{Yi}}$ capture spatial variation only. All insurance data are from the USDA Risk Management Agency (RMA) Summary of Business (SOB).$^{11}$

Table 2.2 Information for Crop Insurance Plan.

| Plan Code | Abbreviation | Name | Contract |
|---|---|---|---|
| Period: 2008-2010 | | | |
| 12 | GRP | Group Risk Plan | Yield |
| 25 | RA | Revenue Assurance | Revenue |
| 42 | IP | Income Protection | Revenue |
| 44 | CRC | Crop Revenue Coverage | Revenue |
| 45 | IIP | Indexed Income Protection | Revenue |
| 73 | GRIP | Group Risk Income Protection | Revenue |
| 90 | APH | Actual Production History | Yield |
| Period: 2011-2020 | | | |
| 01 | YP | Yield Protection | Yield |
| 02 | RP | Revenue Protection | Revenue |
| 03 | RPHPE | Revenue Protection with Harvest Price Exclusion | Revenue |
| | GRP | Group Risk Plan | Yield |
| 04 | AYP | Area Yield Protection | Yield |
| 05 | ARP | Area Revenue Protection | Revenue |
| 06 | ARPHP | Area Revenue Protection with Harvest Price Exclusion | Revenue |

Note: see Du et al. (2017) for period 2008-2010; see USDA RMA (2022) for period 2011-2020.

Measuring acreage response is challenging because there is no direct measure of a county's eligible acres (Goodwin et al. 2004). In our case, SOB cannot be merged directly with any other data to match insured acres with total eligible (insurable) acres. Therefore, we employ data from the USDA National Agricultural Statistics Service (NASS)$^{12}$ and the Farm Service Agency (FSA)$^{13}$ for more information, such as planted, prevented, or failed acres. We may ask why both NASS and FSA are needed in this study. First, the FSA dataset only includes farm operators who participate in government programs, but the NASS survey tries to cover each farm; as such, NASS is more representative than FSA.

11 Data can be downloaded from https://www.rma.usda.gov/SummaryOfBusiness.
12 Data can be downloaded from https://quickstats.nass.usda.gov.
13 Data can be downloaded from https://www.fsa.usda.gov/news-room/efoia/electronic-reading-room/frequently-requested-information/crop-acreage-data/index.
Second, the FSA dataset has prevented and failed acres; however, NASS does not provide such information. Denote $a_{N1}, a_{N2}, a_{N3}$ as planted, harvested, and silage acres from USDA/NASS, and $a_{F1}, a_{F2}, a_{F3}$ as FSA planted, prevented, and failed acres. Extensive margin participation rates can be constructed as:

$$e^{1} = \frac{n}{\max(a_{N1}, a_{F1}) + a_{F2} + a_{F3}}; \tag{2.22a}$$

$$e^{2} = \frac{n}{\max(a_{N1}, a_{F1} + a_{F3}) + a_{F2}}; \tag{2.22b}$$

$$e^{3} = \frac{n}{\max(\underbrace{a_{N1} - a_{N3}}_{\text{remove corn silage}},\ a_{F1} + a_{F3}) + a_{F2}}. \tag{2.22c}$$

All elements in (2.22a)-(2.22c) carry the index $rltc$ to signify contract, crop, year, and county. It is noteworthy that NASS planted acres are not necessarily close to FSA planted acres, although both are labeled "planted". In some counties, NASS harvested acres are larger than FSA planted acres. Therefore, we construct $e^{1}$ and $e^{2}$ for both corn and soybeans. Another concern is that NASS records contain acres for corn silage.$^{14}$ We construct $e^{3}$ for corn grain by removing corn silage from NASS planted acres. Since acres for corn silage are non-negative, i.e., $a_{N3} \ge 0$, $e^{3}$ will be at least as large as $e^{2}$. As shown in Figure 2.2, many measures exceed the theoretical maximum value (i.e., 1.0). It is noteworthy that states outside the main production area, e.g., North Dakota and Nebraska, have more outliers (extreme values). Moving from west to east, the extensive margin has a decreasing trend, whereas the intensive margin seems to have an increasing trend (see Figures 2.3 and 2.4).

14 In general, silage is usually made from maize, sorghum, or other cereals. Corn silage is a high-quality, popular forage crop for dairy farms and beef cattle farms.

Figure 2.2 Extensive Margin Participation Rates over 2009-2020.
Notes: On each box, the central mark is the median. Denote $x_{[25]}$ and $x_{[75]}$ as the 25th and 75th percentiles; then the upper and lower adjacent lines represent $x_{[75]} + 1.5 \times IQR$ and $x_{[25]} - 1.5 \times IQR$, respectively, where $IQR = x_{[75]} - x_{[25]}$.
Dots represent other observations outside the range above.

Figure 2.3 Extensive Margin Participation Rates from West to East.
Note: Each point represents one county. For convenience, we only show $e^{1}$.

Figure 2.4 Intensive Margin Participation Rates from West to East.
Note: Each point represents one county. Only revenue contracts are included here. Yield contracts will be adopted in the empirical analysis.

Measuring Basis Risk

Recall that in the conceptual model, we define basis as $B^{c}_{\mathrm{Oc}}(\sigma) \equiv \bar{B} + \varepsilon\sigma$, where Oc indicates harvest time (October); $\bar{B}$ captures geographical heterogeneity, a county-level constant due to location; $\varepsilon$ is a random variable with $E[\varepsilon] = 0$; and $\sigma$ measures basis risk. A simple transformation shows that $\sigma_{B} \equiv \sigma \times \sigma_{\varepsilon}$, where $\sigma_{B} = \sqrt{\mathrm{var}(B^{c}_{\mathrm{Oc}}(\sigma))}$ and $\sigma_{\varepsilon} = \sqrt{\mathrm{var}(\varepsilon)}$. Without loss of generality, we normalize $\sigma_{\varepsilon} = 1$ so that $\sigma_{B} = \sigma$. Therefore, basis risk can be measured from observed cash and futures prices.

Corn December and Soybeans November futures contracts are employed for our analysis. Denote $P^{lc,ntd}_{\mathrm{Oc}}$ as the cash price at harvest time (October) for elevator $n$ on trading day $d$ for crop $l$ in county $c$ and year $t$, and $F^{l,td}$ as the futures price. The normalized basis is then $B^{lc,ntd}_{\mathrm{Oc}} = (P^{lc,ntd}_{\mathrm{Oc}} - F^{l,td}) / F^{l,td}$. The motivation for normalization is that we prefer scale-free numbers as far as possible because participation rates are measured on the interval [0,1]. Other normalization techniques can be found in the literature (see Nayak et al. 2014); however, these techniques may not be appropriate because they focus on only one input variable.

With the normalized basis $B^{lc,ntd}_{\mathrm{Oc}}$, we turn to constructing county-level basis risk. Denote $D_{t}$ as the set of trading days in year $t$, $N_{c}$ as the set of elevators, and $T_{c}$ as the set of years in county $c$.
Then the basis risk can be constructed as:

$$\text{elevator-level yearly basis:}\quad \bar{B}^{lc,nt} = \frac{1}{D_{t}} \sum_{d=1}^{D_{t}} B^{lc,ntd}_{\mathrm{Oc}}; \tag{2.23a}$$

$$\text{elevator-level yearly variation:}\quad \sigma^{lc,nt} = \left[\frac{1}{D_{t} - 1} \sum_{d=1}^{D_{t}} \left(B^{lc,ntd} - \bar{B}^{lc,nt}\right)^{2}\right]^{1/2}; \tag{2.23b}$$

$$\text{county-level basis risk:}\quad \sigma^{lc} = \frac{1}{N_{c} T_{c}} \sum_{t=1}^{T_{c}} \sum_{n=1}^{N_{c}} \sigma^{lc,nt}. \tag{2.23c}$$

We may ask whether there are alternative measures of basis risk, because the elevator-level basis can be readily aggregated to a county-level basis. We first define three bases as:

$$\text{county-level daily basis:}\quad B^{lc,td} = \frac{1}{N_{c}} \sum_{n=1}^{N_{c}} B^{lc,ntd}; \tag{2.24a}$$

$$\text{county-level yearly basis:}\quad B^{lc,t} = \frac{1}{D_{t}} \sum_{d=1}^{D_{t}} B^{lc,td}; \tag{2.24b}$$

$$\text{county-level long-term average:}\quad B^{lc} = \frac{1}{T_{c}} \sum_{t=1}^{T_{c}} B^{lc,t}. \tag{2.24c}$$

Then three alternatives for basis risk can be constructed as:

$$\sigma^{lc}_{A1} = \frac{1}{T_{c}} \sum_{t=1}^{T_{c}} \sigma^{\oplus}_{lc,t} \quad\text{with}\quad \sigma^{\oplus}_{lc,t} = \left[\frac{1}{D_{t} - 1} \sum_{d=1}^{D_{t}} \left(B^{lc,td} - B^{lc,t}\right)^{2}\right]^{1/2}; \tag{2.25a}$$

$$\sigma^{lc}_{A2} = \left[\frac{1}{T_{c} D_{t} - 1} \sum_{t=1}^{T_{c}} \sum_{d=1}^{D_{t}} \left(B^{lc,td} - B^{lc}\right)^{2}\right]^{1/2}; \tag{2.25b}$$

$$\sigma^{lc}_{A3} = \left[\frac{1}{T_{c} - 1} \sum_{t=1}^{T_{c}} \left(B^{lc,t} - B^{lc}\right)^{2}\right]^{1/2}. \tag{2.25c}$$

Figure 2.5 shows the kernel density estimates (KDE) for all four basis risks. We find that the first basis risk, $\sigma$, has the smallest mean and standard deviation among all measures. In light of the large heterogeneity in scale, it is natural to ask which basis risk is the most appropriate for our analysis. We assume that elevators in a county are homogeneous and that farmers are sensitive to the cash prices of nearby elevators. If this is the case, the average of elevator-level yearly variation might capture the true basis risk. As such, the county-level basis risk $\sigma$ in Eqn. (2.23c) will be used (red curve in Figure 2.5). The first three basis risks are highly correlated, with $\mathrm{corr}(\sigma, \sigma_{A1}) = 0.8$, $\mathrm{corr}(\sigma, \sigma_{A2}) = 0.7$, and $\mathrm{corr}(\sigma_{A1}, \sigma_{A2}) = 0.7$. However, the last one, $\sigma_{A3}$, is nearly uncorrelated with the others. Figure 2.6 shows spatial disparities of basis risk.
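The county-level measure in Eqn. (2.23a)-(2.23c) can be sketched in a few lines. The record layout, function names, and prices below are hypothetical. One assumption goes beyond the text: with an unbalanced panel, the sketch averages over the elevator-year cells that are actually observed (and have at least two trading days) rather than over a full $N_c \times T_c$ grid.

```python
from collections import defaultdict
from statistics import mean, stdev


def normalized_basis(cash_price, futures_price):
    """Normalized basis: (P - F) / F."""
    return (cash_price - futures_price) / futures_price


def county_basis_risk(observations):
    """County-level basis risk for one crop and county.

    observations: iterable of (elevator_id, year, cash_price, futures_price),
    one record per October trading day.
    """
    by_cell = defaultdict(list)
    for elevator, year, cash, futures in observations:
        by_cell[(elevator, year)].append(normalized_basis(cash, futures))
    # Eqn. (2.23b): sample standard deviation within each elevator-year cell.
    cell_sd = [stdev(values) for values in by_cell.values() if len(values) > 1]
    if not cell_sd:
        raise ValueError("need at least one elevator-year with two or more days")
    # Eqn. (2.23c): average the elevator-year standard deviations.
    return mean(cell_sd)


# Hypothetical data: elevator "A" has a moving basis, elevator "B" a constant one.
obs = [
    ("A", 2010, 3.60, 4.00), ("A", 2010, 3.80, 4.00), ("A", 2010, 4.00, 4.00),
    ("B", 2010, 3.80, 4.00), ("B", 2010, 3.80, 4.00), ("B", 2010, 3.80, 4.00),
]
sigma_c = county_basis_risk(obs)
print(round(sigma_c, 4))
```

Because variation is computed within each elevator-year before averaging, persistent level differences between elevators or years do not inflate the measure, which is one reason it is smaller than the alternatives that pool deviations around a county mean.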
We find that the basis risk for corn is more dispersed than that for soybeans, which is reasonable because the interstate basis for corn has a wider range than that for soybeans (see Figure 2.7). The scale of the basis (i.e., its absolute value) generally increases as one moves away from the main production areas (e.g., IA, IL, IN), which can be explained by various costs (e.g., storage or transportation). However, we cannot draw a conclusion about the spatial pattern of basis risk because interstate and intrastate basis risks are dispersed. As shown in Figure 2.8, a general pattern is that participation rates seem high in areas where basis risk is low, although some states in the western Great Plains (e.g., North Dakota and South Dakota) need further discussion.

Table 2.3 provides more information on the elevator-level basis data, including the number of counties, operating years, and elevator count. We find that: (i) the percentage of counties in a state with at least one elevator is highly correlated with the elevator count; the correlation coefficient between the "Percent" column of Panel A and the "Mean" column of Panel C is 0.72; and (ii) the percentage ("Percent" column of Panel A) does not seem to be correlated with operating years ("Mean" column of Panel B), because their correlation coefficient is just 0.11. Spatial disparities in elevator count and operating years are shown in Figure 2A.1.

Land Capability

Land capability data are from the National Resource Inventory (NRI). As in USDA (2015), there are eight Land Capability Classes (LCC). Specifically, Class I soils have no limitations restricting their use, whereas Class VIII soils and miscellaneous areas have limitations that preclude their use for commercial plant production and limit their use to recreation, wildlife, water supply, or esthetic purposes. As in the literature, we measure soil quality by calculating the percentage of good land in total land. We employ Class I-II as the good land as in Goodwin et al.
(2004), although Du et al. (2017) choose Class I-IV.

Figure 2.5 Kernel Density Estimation (KDE) for Four Basis Risks.
Notes: The Gaussian kernel is employed. Bandwidth is the optimal one that is selected to minimize the mean integrated square error. Distribution stands for spatial heterogeneity across 12 states.

Figure 2.6 County-Level Basis Risks from West to East.
Note: Only points between the 5th and 95th quantiles are included in this figure.

Figure 2.7 Patterns of Mean Normalized Basis for 12 States.
Note: Mean normalized basis is the mean value of all counties within a state per year. The goal here is to exhibit the heterogeneity in different states.

Figure 2.8 Basis Risk and Participation Rates.

Table 2.3 Elevator Information for 12 States.

Panel A: Number of Counties; Panel B: Operating Years Per County; Panel C: Elevator Count Per County.

| State | A: Total | A: Elevator ≥ 1 | A: Percent | B: Mean | B: Std. Dev | B: Min | B: Max | C: Mean | C: Std. Dev | C: Min | C: Max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| Illinois | 102 | 90 | 88.2% | 3.71 | 1.07 | 1.5 | 6 | 6.22 | 5.22 | 1 | 24 |
| Indiana | 92 | 70 | 76.1% | 4.84 | 1.64 | 1 | 8.5 | 2.7 | 1.74 | 1 | 8 |
| Iowa | 99 | 96 | 97.0% | 4.43 | 1.43 | 1.5 | 8.64 | 9.52 | 5.24 | 1 | 23 |
| Kansas | 105 | 100 | 95.2% | 3.46 | 1.63 | 1 | 8.33 | 5.09 | 4.09 | 1 | 23 |
| Michigan | 83 | 38 | 45.8% | 3.98 | 1.43 | 1.5 | 8 | 2.45 | 1.27 | 1 | 6 |
| Minnesota | 87 | 64 | 73.6% | 3.73 | 1.1 | 1.4 | 6.29 | 6.16 | 3.45 | 1 | 15 |
| Missouri | 115 | 58 | 50.4% | 3.58 | 1.54 | 1 | 10 | 2.45 | 1.58 | 1 | 8 |
| North Dakota | 53 | 40 | 75.5% | 3.3 | 1.18 | 1 | 5 | 3.53 | 2.63 | 1 | 13 |
| Nebraska | 93 | 81 | 87.1% | 4.05 | 1.36 | 1 | 8.2 | 6.2 | 4.2 | 1 | 20 |
| Ohio | 88 | 64 | 72.7% | 4 | 1.23 | 2 | 8.5 | 4.01 | 3.38 | 1 | 13 |
| South Dakota | 66 | 55 | 83.3% | 4.06 | 2.1 | 1 | 10 | 3.75 | 2.16 | 1 | 8 |
| Wisconsin | 72 | 56 | 77.8% | 3.34 | 1.24 | 1 | 6 | 3.34 | 2.27 | 1 | 10 |

Note: County-level mean operating years and total elevators are summarized at first. The intra-state distribution shows heterogeneities for those counties that have at least one elevator.

Weather Determinants

As in the literature (e.g., Schlenker and Roberts 2009; Xu et al.
2013), we construct two variables, Growing Degree Days (GDD) and Stress Degree Days (SDD), to represent beneficial heat and heat stress during the growing season (April to September). The formulas for GDD and SDD in year $t$ are:

$$GDD_{c,t} = \sum_{d \in \Omega_{t}} \left[0.5\left(\min\left(\max(T^{\max}_{c,d,t}, T^{l}), T^{h}\right) + \min\left(\max(T^{\min}_{c,d,t}, T^{l}), T^{h}\right)\right) - T^{l}\right] \tag{2.26a}$$

$$SDD_{c,t} = \sum_{d \in \Omega_{t}} \left[0.5\left(\max(T^{\max}_{c,d,t}, T^{k}) + \max(T^{\min}_{c,d,t}, T^{k})\right) - T^{k}\right] \tag{2.26b}$$

where $c$ is county, $d$ is day, and $\Omega_{t}$ is the set of growing season days in year $t$. The thresholds are $T^{l} = 10^{\circ}C$, $T^{h} = 30^{\circ}C$, and $T^{k} = 32.2^{\circ}C$. In this study, we use climatological normals over roughly 30 years to control for the effect of weather on coverage level choice. The period we selected is $\{1990, \ldots, 2019\}$, so that $\bar{G}_{c} = (1/30) \sum_{j=1990}^{2019} GDD_{c,j}$ and $\bar{S}_{c} = (1/30) \sum_{j=1990}^{2019} SDD_{c,j}$.

Yield Risk

We use the coefficient of variation (CV) to control for yield risk as in Goodwin et al. (2004). Let $y_{ct}$ be the actual county-level yield recorded by USDA NASS for county $c$ and year $t$. As in Deng et al. (2007), we employ the trend equation $\log(y_{ct}) = \lambda_{0} + \lambda_{1}(2020 - t) + u$, where $t \in T^{\mathrm{det}} = \{1971, \ldots, 2020\}$ and $u$ is the error term. We predict $\hat{y}_{ct}$ and $\hat{y}_{c,2020}$, and then construct the detrended county-level yield as $y^{\mathrm{det}}_{ct} = y_{ct} \times \hat{y}_{c,2020} / \hat{y}_{ct}$. Finally, yield risk is measured as $YR_{c} = \sigma_{y^{\mathrm{det}}_{c}} / \mu_{y^{\mathrm{det}}_{c}}$, where $\sigma_{y^{\mathrm{det}}_{c}} = \sqrt{\frac{1}{T^{\mathrm{det}} - 1} \sum_{t \in T^{\mathrm{det}}} \left(y^{\mathrm{det}}_{ct} - \bar{y}^{\mathrm{det}}_{c}\right)^{2}}$ and $\mu_{y^{\mathrm{det}}_{c}} = (1/T^{\mathrm{det}}) \sum_{t \in T^{\mathrm{det}}} y^{\mathrm{det}}_{ct}$.

We may ask why the time length for detrending is 50 years ($T^{\mathrm{det}} = \{1971, \ldots, 2020\}$) while that for the empirical analysis is only 12 years ($T = \{2009, \ldots, 2020\}$). First, the difference in time length does not affect our cross-sectional analysis; recall that all variables in this study show spatial variation only. Second, longer yield records can better capture the underlying yield risk because a longer period contains more information, such as technological development, disasters, and policy adjustments.
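The degree-day formulas in Eqn. (2.26a)-(2.26b) and the detrended yield CV can be sketched as follows. This is an illustrative implementation under stated assumptions, not the code used in the dissertation: function names and sample inputs are hypothetical, temperatures are assumed to be daily values in degrees Celsius, and the log-linear trend is fitted by ordinary least squares with `numpy.polyfit`.

```python
import numpy as np


def degree_days(tmax, tmin, t_low=10.0, t_high=30.0, t_stress=32.2):
    """Seasonal GDD and SDD from daily max/min temperatures (deg C)."""
    tmax = np.asarray(tmax, dtype=float)
    tmin = np.asarray(tmin, dtype=float)
    # min(max(T, T_l), T_h) is a clip to [T_l, T_h].
    gdd = np.sum(0.5 * (np.clip(tmax, t_low, t_high)
                        + np.clip(tmin, t_low, t_high)) - t_low)
    sdd = np.sum(0.5 * (np.maximum(tmax, t_stress)
                        + np.maximum(tmin, t_stress)) - t_stress)
    return float(gdd), float(sdd)


def yield_risk_cv(years, yields, base_year=2020):
    """CV of yields detrended to base_year with a log-linear trend."""
    years = np.asarray(years, dtype=float)
    yields = np.asarray(yields, dtype=float)
    x = base_year - years                        # regressor (2020 - t)
    slope, intercept = np.polyfit(x, np.log(yields), 1)
    fitted = np.exp(intercept + slope * x)       # y-hat_{ct}
    fitted_base = np.exp(intercept)              # y-hat_{c,2020}, since x = 0
    detrended = yields * fitted_base / fitted
    return float(detrended.std(ddof=1) / detrended.mean())


# Two hypothetical days: one hot day and one cool day.
gdd, sdd = degree_days(tmax=[35.0, 25.0], tmin=[20.0, 5.0])

# Hypothetical yield series: smooth growth, then the same series with shocks.
yrs = np.arange(1971, 2021)
smooth = 100.0 * np.exp(0.015 * (yrs - 1971))
shocks = 1.0 + 0.1 * np.sin(yrs)                 # deterministic "noise"
cv_smooth = yield_risk_cv(yrs, smooth)
cv_noisy = yield_risk_cv(yrs, smooth * shocks)
print(gdd, round(sdd, 2), round(cv_noisy, 3))
```

A series that lies exactly on the exponential trend has a CV of essentially zero after detrending, so the measure picks up deviations around the trend rather than growth itself.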
Ratio of Ending Stock to Production

We construct the ratio of ending stock to production as an instrumental variable. The literature concludes that: (i) the relationship between supply and demand affects the basis level in each market (e.g., Jiang 1997); and (ii) there is an inverse relationship between basis and the stocks-to-storage-capacity ratio (Karlson et al. 1993). For instance, when a massive volume of production puts pressure on storage in a market, the basis will weaken, and the farmer's revenue expectation will decrease.

To the best of our knowledge, county-level supply and demand data are hard to obtain. However, we can employ state-level data because USDA NASS provides state-level quarterly stock data (i.e., Mar, Jun, Sep, and Dec). Denote $ES^{lst}_{\mathrm{Ma}}$ as the ending stock of crop $l$ in state $s$ in March of year $t$ and $PD^{lst}$ as actual production. We construct $RaES^{ls} = \sum_{t \in T} ES^{lst}_{\mathrm{Ma}} / \sum_{t \in T} PD^{lst}$, the ratio of ending stock to production. We assume that the effect of this ratio on the basis is negative, which indicates that the larger the stocks, the weaker the basis (see Figure A2 for state-by-year variations over 2009-2020). We find that: (i) the ratios for corn are generally higher than those for soybeans; and (ii) ratios in the main production area seem higher than in other regions.

Unit Choice Data for Simulation

We employ USDA RMA's 2009 farm-level contract choice data for our simulation. This dataset contains farm-level actual yield, approved historical yield, coverage level choice, premium payment, and location information (state and county). We may ask why such dated data are still adopted in this study. The reasons are: (i) participation rates have changed little since 2009, so we assume farmers' behavior has not changed distinctly either; and (ii) there are enough observations for simulation (corn: 687,274; soybeans: 567,711). Therefore, the farmer's WTP will be implicitly estimated as in Eqn. (2.5).
Empirical Results

Estimation Method

We employ a Fractional Probit with Control Function (FPCF) for the empirical analysis. First, a Fractional Probit is more appropriate than a linear model because the participation rate is measured as a fraction. Second, potential endogeneity issues induced by omitted variables may exist. Some unobserved variables, e.g., the farmer's subjective probability of future loss, affect supply and demand in the local market, which in turn affect the participation rate and basis risk. Denote $e_{lc}$ and $\phi_{lc}$ as the extensive (the share of insured acres in total insurable acres) and intensive margin (acreage-weighted coverage choices) participation rates for crop $l$ in county $c$; $\sigma_{lc}$ as basis risk; $\mathbf{X}_{lc}$ as a vector of explanatory variables with $\mathbf{X}_{lc} = \{SQ, \bar{G}, \bar{S}, YR\}$; and $\mathbf{Z}_{lc}$ as a vector of instrumental variables with $\mathbf{Z}_{lc} = \{EleAmt, AveYear, RaES\}$. Then the model specification is:

$$\text{Control Function:}\quad \sigma_{lc} = \mathbf{Z}_{lc}\boldsymbol{\Pi} + \mathbf{X}_{lc}\boldsymbol{\beta}^{(1)} + v_{lc}; \tag{2.27a}$$

$$\text{Fractional Probit:}\quad E[\mathrm{rate}_{lc}\,|\,\sigma_{lc}, \mathbf{X}_{lc}, \eta_{lc}] = \Phi\left(\gamma\sigma_{lc} + \mathbf{X}_{lc}\boldsymbol{\beta}^{(2)} + \eta_{lc}\right), \tag{2.27b}$$

where $\mathrm{rate}_{lc} \in \mathrm{Pa} = \{e_{lc}, \phi_{lc}\}$, $E[\cdot]$ is the expectation operator, and $\eta_{lc}$ is an omitted variable with $\mathrm{cov}(\sigma_{lc}, \eta_{lc}) \ne 0$. Explanatory variables are chosen as in the literature (e.g., LaFrance et al. 2002; Goodwin et al. 2004; Du et al. 2014; Yu and Sumner 2018; Che et al. 2020). As to the instrumental variables, we assume that elevator operation and aggregate ending stock are not correlated with omitted variables such as the farmer's subjective belief.

The estimation procedure for a Fractional Probit with an Endogenous Explanatory Variable (EEV) has been discussed extensively in the literature (see Papke and Wooldridge 1996, 2008; Wooldridge 2015). If $(v_{lc}, \eta_{lc})$ is jointly normal, then a two-step control function method can be carried out as: (i) regress $\sigma_{lc}$ on $\mathbf{Z}_{lc}, \mathbf{X}_{lc}$ and obtain the residuals $\hat{v}_{lc}$; (ii) use "Probit" of $\mathrm{rate}_{lc}$ on $\sigma_{lc}, \mathbf{X}_{lc}, \hat{v}_{lc}$ to estimate scaled coefficients, say $\hat{\gamma}_{scale}, \hat{\boldsymbol{\beta}}^{(2)}_{scale}, \hat{\delta}_{scale}$. Denote $\hat{\sigma}^{2}_{\eta_{lc}}$ as the variance of the omitted variable; then $\hat{\gamma}_{scale} = \hat{\gamma} / (1 + \hat{\sigma}^{2}_{\eta_{lc}})^{1/2}$, and the same scaling applies to $\hat{\boldsymbol{\beta}}^{(2)}_{scale}$ and $\hat{\delta}_{scale}$. The coefficient on $\hat{v}_{lc}$, labeled $\hat{\delta}_{scale}$, provides a simple t test of exogeneity with null hypothesis $H_{0}: \delta = 0$.

Empirical Results

Table 2.4 reports semi-elasticities of insurance demand, which give the change in the participation rate for a proportional change in basis risk. For the extensive margin participation rates ($e^{1}, e^{2}, e^{3}$), the Fractional Probit with Control Function is an appropriate approach because all residuals are significant at the 1% level (the p-value of the residual is less than 0.01). The semi-elasticities of insurance demand are -0.300 or lower for corn and -0.769 or lower for soybeans, which implies that the share of insured acres in all insurable acres will increase by at least 3 and 7.7 percentage points for corn and soybeans, respectively, when basis risk decreases by 10%. However, for the intensive margin participation rate, the Fractional Probit with Control Function is appropriate only for the corn revenue contract, while the Fractional Probit is appropriate for the corn yield contract and for the soybean revenue and yield contracts. The semi-elasticity of insurance demand for the corn revenue contract is -0.043, implying that the coverage level will increase by 0.4 percentage points if basis risk decreases by 10%. The semi-elasticities for the soybean

Table 2.4 Semi-Elasticities for Corn and Soybeans.
Corn
Estimation Method       Fractional Probit                                 Fractional Probit with CF
Dependent Variable       e^1       e^2       e^3       φyi       φre       e^1       e^2       e^3       φyi       φre
Normalized Basis Risk    0.003     0.003     0.003    -0.001    -0.001    -0.300+   -0.330+   -0.359+   -0.032    -0.043+
                        (0.004)   (0.004)   (0.004)   (0.002)   (0.001)   (0.076)   (0.076)   (0.081)   (0.025)   (0.013)
Land Capability          0.061+    0.078+    0.067+    0.035+    0.023+    0.083+    0.102+    0.094+    0.037+    0.026+
                        (0.009)   (0.009)   (0.009)   (0.005)   (0.002)   (0.011)   (0.011)   (0.012)   (0.005)   (0.002)
Growing Degree Days     -0.261+   -0.094*   -0.205+    0.083+    0.126+   -0.317+   -0.160+   -0.270+    0.078+    0.119+
                        (0.050)   (0.050)   (0.049)   (0.023)   (0.009)   (0.050)   (0.052)   (0.049)   (0.022)   (0.009)
Stress Degree Days       0.032+    0.029+    0.026+   -0.016+   -0.020+    0.026+    0.021+    0.017+   -0.018+   -0.022+
                        (0.005)   (0.006)   (0.006)   (0.002)   (0.001)   (0.006)   (0.006)   (0.006)   (0.003)   (0.001)
Yield Risk               0.132+    0.132+    0.129+   -0.040+   -0.027+    0.183+    0.189+    0.192+   -0.036+   -0.020+
                        (0.014)   (0.017)   (0.016)   (0.005)   (0.003)   (0.021)   (0.023)   (0.023)   (0.007)   (0.004)
Residual (p-value)                                                         0.000     0.000     0.000     0.222     0.001
Obs.                     747       747       747       747       747       747       747       747       747       747

Soybeans
Estimation Method       Fractional Probit                       Fractional Probit with CF
Dependent Variable       e         e         φyi       φre       e         e         φyi       φre
Normalized Basis Risk    0.002     0.002    -0.004*   -0.002+   -0.769+   -0.807+   -0.042    -0.044
                        (0.004)   (0.004)   (0.002)   (0.001)   (0.184)   (0.183)   (0.058)   (0.029)
Land Capability          0.067+    0.066+    0.050+    0.024+   -0.016    -0.020     0.046+    0.019+
                        (0.010)   (0.010)   (0.006)   (0.002)   (0.022)   (0.022)   (0.009)   (0.004)
Growing Degree Days     -0.247+   -0.216+    0.032     0.058+    0.772+    0.854+    0.083     0.113+
                        (0.050)   (0.050)   (0.030)   (0.011)   (0.251)   (0.250)   (0.081)   (0.040)
Stress Degree Days       0.019+    0.013**  -0.010+   -0.017+   -0.114+   -0.131+   -0.017    -0.024+
                        (0.006)   (0.006)   (0.003)   (0.001)   (0.031)   (0.032)   (0.011)   (0.005)
Yield Risk               0.090+    0.087+   -0.056+   -0.021+    0.186+    0.188+   -0.051+   -0.016+
                        (0.017)   (0.017)   (0.011)   (0.003)   (0.030)   (0.030)   (0.013)   (0.004)
Residual (p-value)                                               0.000     0.000     0.501     0.143
Obs.                     728       728       728       728       728       728       728       728

Note: (1) standard errors in parentheses; (2) * p<0.10, ** p<0.05, + p<0.01; (3) the soybean panel reports two extensive-margin specifications.
yield and revenue contracts are -0.004 and -0.002, respectively, implying that the coverage levels will increase by 0.04 and 0.02 percentage points if the basis risk decreases by 10%.

Conclusion

The basis is defined as the difference between spot and futures prices. Therefore, the basis risk measures the mismatch between cash and futures markets. Because of the crop insurance contract design, the basis and the basis risk exist, which might deter farmers' insurance uptake at both the extensive and intensive margins. Understanding the effect of basis risk is crucial because crop revenue insurance, first introduced in the 1990s, has been the most popular product offered by the Federal Crop Insurance Corporation.

In this study, we first develop a conceptual model to analyze farmers' acreage response to basis risk. Given a coverage level, a farmer's willingness to pay for insurance products increases when basis risk decreases, which implies that a farmer is inclined to pay a higher premium per acre or insure more acres when the basis risk is lower. Next, we employ Fractional Probit with Control Function for the empirical analysis. Regression results show that when the basis risk decreases by 10%: (i) the share of insured acres in all insurable acres will increase by at least 3 and 7.7 percentage points for corn and soybeans, respectively; and (ii) coverage level choice will increase by 0.4 and 0.02 percentage points for the corn revenue and soybean revenue contracts, respectively.

The effect of basis risk on the participation rate might shed light on adjusting federal policies. If the revenue contract is revised to remove the basis, basis risk will decrease by 100%.
Given the current subsidy structure, the predicted shares of insured acres in total insurable acres for corn and soybeans would exceed 1 (theoretically, the maximum is 1); the coverage level choice for the corn revenue contract would increase from 0.76 to 0.803; and the coverage level choice for the soybean revenue contract would rise from 0.76 to 0.762. Besides removing the basis, policymakers might adjust the subsidy structure to be consistent with the underlying basis risk. The regression results show that the lower the basis risk, the higher the intensive margin participation rate. Therefore, the subsidy rate might be reduced in areas with low basis risk but increased in areas with high basis risk.

REFERENCES

Babcock, B.A., E.K. Choi and E. Feinerman. 1993. Risk and probability premiums for CARA utility functions. Journal of Agricultural and Resource Economics: 17-24.
Babcock, B.A. and D.A. Hennessy. 1996. Input demand under yield and revenue insurance. Amer. J. of Agr. Econ. 78(2): 416-427.
Calvin, L. 1992. Participation in the US federal crop insurance program (No. 1800). US Department of Agriculture, Economic Research Service.
Carter, M., G. Elabed, and E. Serfilippi. 2015. Behavioral economic insights on index insurance design. Agricultural Finance Review.
Che, Y., H. Feng and D.A. Hennessy. 2020. Recency effects and participation at the extensive and intensive margins in the US Federal Crop Insurance Program. The Geneva Papers on Risk and Insurance-Issues and Practice, 45(1): 52-85.
Chite, R. 1988. Federal crop insurance: background and current issues. In CRS report for Congress (USA). Congressional Research Service.
Clarke, D.J. 2016. A theory of rational demand for index insurance. American Economic Journal: Microeconomics, 8(1): 283-306.
Coble, K.H., T.O. Knight, R.D. Pope and J.R. Williams. 1997. An expected-indemnity approach to the measurement of moral hazard in crop insurance. Amer. J. of Agr. Econ. 79(1): 216-226.
Coble, K.H., T.O. Knight, B.K.
Goodwin, M.F. Miller, R.M. Rejesus and G. Duffield. 2010. A comprehensive review of the RMA APH and COMBO rating methodology: Final report. Prepared by Sumaria Systems for the Risk Management Agency.
Deng, X., B.J. Barnett and D.V. Vedenov. 2007. Is there a viable market for area-based crop insurance? Amer. J. of Agr. Econ. 89(2): 508-519.
Deng, X., B.J. Barnett, D.V. Vedenov, and J.W. West. 2007. Hedging dairy production losses using weather-based index insurance. Agricultural Economics, 36(2): 271-280.
Doherty, N.A., and A. Richter. 2002. Moral hazard, basis risk, and gap insurance. Journal of Risk and Insurance, 69(1): 9-24.
Du, X. and D.A. Hennessy. 2012. The planting real option in cash rent valuation. Applied Economics, 44(6): 765-776.
Du, X., D.A. Hennessy and H. Feng. 2014. A natural resource theory of US crop insurance contract choice. Amer. J. of Agr. Econ. 96(1): 232-252.
Du, X., H. Feng and D.A. Hennessy. 2017. Rationality of choices in subsidized crop insurance markets. Amer. J. of Agr. Econ. 99(3): 732-756.
Feng, H., X. Du and D.A. Hennessy. 2020. Depressed demand for crop insurance contracts, and a rationale based on third generation Prospect Theory. Agricultural Economics, 51(1): 59-73.
Figlewski, S. 1984. Hedging performance and basis risk in stock index futures. The Journal of Finance, 39(3): 657-669.
Firpo, S., A.F. Galvao, M. Kobus, T. Parker and P. Rosa-Dias. 2020. Loss aversion and the welfare ranking of policy interventions. arXiv preprint arXiv:2004.08468.
Gallagher, J. 2014. Learning about an infrequent event: evidence from flood insurance take-up in the United States. American Economic Journal: Applied Economics: 206-233.
Gardner, B.L. and R.A. Kramer. 1986. Experience with crop insurance programs in the United States. Pub. for Internatl. Food Policy Research Inst. by Johns Hopkins Univ. Press.
Glauber, J.W. 2004. Crop insurance reconsidered. Amer. J. of Agr. Econ. 86(5): 1179-1195.
Gollier, C. and J.W. Pratt. 1996.
Risk vulnerability and the tempering effect of background risk. Econometrica: Journal of the Econometric Society: 1109-1123.
Gollier, C. 2001. The economics of risk and time. MIT Press.
Goodwin, B.K. 1994. Premium rate determination in the federal crop insurance program: what do averages have to say about risk? Journal of Agricultural and Resource Economics: 382-395.
Goodwin, B.K. 2001. Problems with market insurance in agriculture. Amer. J. of Agr. Econ. 83(3): 643-649.
Goodwin, B.K. and A. Hungerford. 2015. Copula-Based Models of Systemic Risk in U.S. Agriculture: Implications for Crop Insurance and Reinsurance Contracts. Amer. J. of Agr. Econ. 97: 879-96.
Haushalter, G.D. 2000. Financing policy, basis risk, and corporate hedging: Evidence from oil and gas producers. The Journal of Finance, 55(1): 107-152.
Hojjati, B. and N.E. Bockstael. 1988. Modeling the demand for crop insurance. International Food Policy Research Institute.
Jiang, B. 1997. Corn and soybean basis behavior and forecasting: fundamental and alternative approaches. Iowa State University.
Joe, H. and J.J. Xu. 1996. The estimation method of inference functions for margins for multivariate models.
Karlson, N., B. Anderson, and R.P. Dahl. 1993. Cash-Futures Price Relationships: Guides to Corn Marketing (No. 1701-2016-139145).
Kousky, C. 2017. Disasters as learning experiences or disasters as policy opportunities? Examining flood insurance purchases after hurricanes. Risk Analysis, 37(3): 517-530.
Meyer, J. 1987. Two-moment decision models and expected utility maximization. The American Economic Review: 421-430.
Meyer, J., and R.H. Rasche. 1992. Sufficient conditions for expected utility to imply mean-standard deviation rankings: empirical evidence concerning the location and scale condition. The Economic Journal, 102(410): 91-106.
Naik, G. and R.M. Leuthold. 1991. A Note on the Factors Affecting Corn Basis Relationships. Journal of Agricultural and Applied Economics, 23(1): 147-153.
Nayak, S.
C., B.B. Misra, and H.S. Behera. 2014. Impact of data normalization on stock index forecasting. International Journal of Computer Information Systems and Industrial Management Applications, 6(2014): 257-269.
O'Donoghue, E. 2014. The effects of premium subsidies on demand for crop insurance. USDA-ERS economic research report, (169).
Papke, L.E. and J.M. Wooldridge. 1996. Econometric methods for fractional response variables with an application to 401 (k) plan participation rates. Journal of Applied Econometrics, 11(6): 619-632.
Papke, L.E. and J.M. Wooldridge. 2008. Panel data methods for fractional response variables with an application to test pass rates. Journal of Econometrics, 145(1-2): 121-133.
Ramirez, O.A. and J.S. Shonkwiler. 2017. A probabilistic model of the crop insurance purchase decision. Journal of Agricultural and Resource Economics: 10-26.
Rosch, S. 2021. Federal Crop Insurance: A Primer. Report No. R46686.
Rosa, I. 2018a. Federal crop insurance: Program overview for the 115th congress. Report R45193.
Rosa, I. 2018b. Farm bill primer: Federal crop insurance. Congressional Research Service.
Sandmo, A. 1971. On the theory of the competitive firm under price uncertainty. The American Economic Review, 61(1): 65-73.
Schmidt, U., C. Starmer and R. Sugden. 2008. Third-generation prospect theory. Journal of Risk and Uncertainty, 36(3): 203-223.
Schlenker, W., and M.J. Roberts. 2009. Nonlinear temperature effects indicate severe damages to US crop yields under climate change. Proceedings of the National Academy of Sciences, 106(37): 15594-15598.
Sheather, S.J. and M.C. Jones. 1991. A reliable data-based bandwidth selection method for kernel density estimation. Journal of the Royal Statistical Society: Series B (Methodological), 53(3): 683-690.
Skees, J.R. 1987. Future research needs on federal multiple peril crop insurance (No. 2096-2018-3251).
Smith, V.H. and B.K. Goodwin. 1996. Crop insurance, moral hazard, and agricultural chemical use.
Amer. J. of Agr. Econ. 78(2): 428-438.
Stein, D. 2018. Dynamics of demand for rainfall index insurance: evidence from a commercial product in India. The World Bank Economic Review, 32(3): 692-708.
Turvey, C.G., A. Weersink, and S.H.C. Chiang. 2006. Pricing Weather Insurance with a Random Strike Price: The Ontario Ice-Wine Harvest. Amer. J. of Agr. Econ. 88(August): 696-709.
U.S. Department of Agriculture. 2015. Summary Report: 2012 National Resources Inventory, Natural Resources Conservation Service, Washington, DC, and Center for Survey Statistics and Methodology, Iowa State University, Ames, Iowa. http://www.nrcs.usda.gov/technical/nri/12summary
US Risk Management Agency. 2022. Crop Insurance Handbook. FCIC 18010-1.
US GAO. 2014. Considerations in Reducing Federal Premium Subsidies. Report GAO-14-700. US Gov. Account. Off., Washington DC.
US GAO. 2015. In areas with higher crop production risks, costs are greater, and premiums may not cover expected losses. Report GAO-15-215. US Gov. Account. Off., Washington DC.
Wooldridge, J.M. 2015. Control function methods in applied econometrics. Journal of Human Resources, 50(2): 420-445.
Xu, Z., D.A. Hennessy, K. Sardana and G. Moschini. 2013. The realized yield effect of genetically engineered crops: US maize and soybean. Crop Science, 53(3): 735-745.
Yan, J. 2007. Enjoy the joy of copulas: with a package copula. Journal of Statistical Software, 21(4): 1-21.
Yu, J. and D.A. Sumner. 2018. Effects of subsidized crop insurance on crop choices. Agricultural Economics, 49(4): 533-545.
Zulauf, C., J. Coppess, G. Schnitkey and N. Paulson. 2018. Premium Subsidy and Insured US Acres: Differential Impact by Crop. farmdoc daily, 8.

APPENDIX A: Monte Carlo Simulation

1. Estimation for Farmer's Willingness to Pay

We employ Monte Carlo simulation combined with the Gaussian Copula Method to measure farmers' WTP ($/acre) at all coverage levels.
Given a coverage level, a high WTP indicates that a farmer would like to pay a high premium per acre or insure more acres. The main hypothesis we test is that farmers' WTP decreases when basis risk increases (H1).

First, we use the copula method to draw a unit-level subsample. As in the literature (Yan 2007; Du and Hennessy 2012; Goodwin and Hungerford 2015), we assume that a dependence structure exists among yield, spot, and futures prices. Our motivation is to estimate crop-specific farmers' WTP without considering other effects such as crop rotation. Therefore, the three-dimensional Multivariate Gaussian Copula (MGC) is used, so the variance-covariance matrix has dimension 3 × 3 (see more details in Section 2 of this appendix). Denote \(y_{st}\), \(P_{st}\) and \(F_{st}\) as yield, cash price and futures price, with \(s \in \{IL, IN, \ldots, WI\}\) and \(t \in T^{det} = \{1971, \ldots, 2020\}\). Yield and cash price data are from USDA NASS; futures price data are from the Chicago Board of Trade (CBOT). Three main steps are used in the simulation:

(i) Assume the yield model specification \(y_{st} = \beta_0 + \beta_1 (2020 - t) + \varepsilon_{st}\), where \(\varepsilon_{st}\) is the random variation. After estimating \(\hat{\beta}_0\), \(\hat{\beta}_1\), and \(\hat{\varepsilon}_{st}\) via OLS, we normalize yield variation by \(\tilde{\varepsilon}_{st} = y_{st} / \hat{y}_{st}\), where \(\hat{y}_{st} = y_{st} - \hat{\varepsilon}_{st}\). Denote by \(\bar{\varepsilon}_s\) and \(\sigma_{\varepsilon}\), respectively, the mean and standard deviation of \(\{\tilde{\varepsilon}_{st}\}_{t \in T^{det}}\). Then the upper support of \(\tilde{\varepsilon}_{st}\) can be constructed as \(\tilde{\varepsilon}_{s,max} = \bar{\varepsilon}_s + 3\sigma_{\varepsilon}\), and the lower support \(\tilde{\varepsilon}_{s,min}\) can be imposed as 0. As such, a standard beta random variable \(\tilde{\xi}\) can be constructed as \(\tilde{\xi}_{st} = (\tilde{\varepsilon}_{st} - \tilde{\varepsilon}_{s,min}) / (\tilde{\varepsilon}_{s,max} - \tilde{\varepsilon}_{s,min})\), with \(\tilde{\xi}_{st} = 1\) whenever \(\tilde{\varepsilon}_{st} > \tilde{\varepsilon}_{s,max}\);

(ii) assume the differences of log prices in two consecutive years, \(\tilde{\zeta}_{st} = \log(P_{s,t}) - \log(P_{s,t-1})\) and \(\tilde{\eta}_{st} = \log(F_{s,t}) - \log(F_{s,t-1})\), to be normally distributed; and

(iii) with \(\tilde{\xi}_{st}\), \(\tilde{\zeta}_{st}\) and \(\tilde{\eta}_{st}\), we draw 5,000 records from the 2008 unit-level NASS/RMA yield dataset.
The five input variables are the state-level spot price, the futures price in October, the futures price in February, actual yield and approved historical yield (APH).

Second, we employ the Constant Absolute Risk Aversion (CARA) utility function \(u(w) = 1 - e^{-Aw}\), where \(w\) is the farmer's income and \(A\) is the risk-aversion coefficient. The farmer's income is set at $400/acre, which corresponds to a price and yield for corn (soybeans) of $4/bushel ($10/bushel) and 100 bushels/acre (40 bushels/acre). We then assume that a farmer encounters a normal year in which the farmer's gamble size \(h\) is $200. Following Eqn. (4) in Babcock et al. (1993), we obtain the risk premium \(\theta\), which reflects how much money a farmer would give up to avoid the potential risk. Babcock and Hennessy (1996) use risk premiums of 20% and 40%. In our case, we set the risk premium to 0.20, so the farmer's WTP is $40/acre, which is consistent with the mean premium farmers paid for revenue contracts over 2009-2020. For robustness checks, we also set three other risk premiums, 0.10, 0.06 and 0.02, which correspond to a farmer's WTP of $20, $12 and $4. As such, we employ four risk-aversion coefficients (0.008, 0.004, 0.002 and 0.0008) corresponding to the four risk premiums 0.20, 0.10, 0.06 and 0.02.

Figure 2A.3 shows farmers' WTP when the risk-aversion coefficient \(A\) is set to 0.004. The y-axis labels follow the format "rank of basis risk - state name - rank of extensive margin participation rate". Ranks run from "01" to "12", where "01" and "12" represent the lowest and highest values, respectively. In our case, the ideal result is "01-state name-12" ("12-state name-01"), which means that the state with the lowest (highest) basis risk has the highest (lowest) extensive margin participation rate. The x-axis reports farmers' WTP in dollars per acre. The ideal result is that, given a coverage level, a farmer's WTP decreases as the rank of basis risk increases.
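Step (i) of the simulation, detrending yields and rescaling the normalized variation to the unit interval, can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the code used in the study: the yield series is made up, the function name `normalize_yield` is hypothetical, and the population standard deviation is an assumption because the text does not say which estimator is used.

```python
from statistics import mean, pstdev


def normalize_yield(years, yields, base_year=2020):
    """Detrend yields on (base_year - t) by OLS and rescale y / y_hat to [0, 1]."""
    x = [base_year - t for t in years]          # trend regressor (2020 - t)
    x_bar, y_bar = mean(x), mean(yields)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, yields))
    b1 = sxy / sxx                              # OLS slope
    b0 = y_bar - b1 * x_bar                     # OLS intercept
    fitted = [b0 + b1 * xi for xi in x]         # y_hat = y - residual
    ratio = [yi / fi for yi, fi in zip(yields, fitted)]  # normalized variation
    upper = mean(ratio) + 3 * pstdev(ratio)     # upper support: mean + 3 sd
    lower = 0.0                                 # lower support imposed as 0
    # Rescale to the unit interval and cap at 1 above the upper support.
    xi_tilde = [min((r - lower) / (upper - lower), 1.0) for r in ratio]
    return b0, b1, xi_tilde


# Made-up state-level yields (bushels/acre), for illustration only.
years = list(range(2011, 2021))
yields = [150, 158, 149, 171, 168, 175, 177, 176, 167, 172]
b0, b1, xi_tilde = normalize_yield(years, yields)
```

Because the regressor is (2020 - t), a rising yield trend shows up as a negative slope, and the intercept is the fitted yield in 2020.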
Results in Figure 2A.3 do not seem to support our argument robustly. Figures 2A.4 and 2A.5 show farmers' WTP for all four risk-aversion coefficients for all states. We find: (i) farmers' WTP increases as the coverage level increases; (ii) given a coverage level, farmers' WTP curves at the four risk-aversion coefficients may intersect; and (iii) given a coverage level and a risk-aversion coefficient, differences in farmers' WTP across states do not seem distinct. Reasons for the failure of the simulation may include: (i) risk-aversion coefficients are heterogeneous across the 12 states; (ii) county-level simulation should be employed rather than state-level simulation; and (iii) basis risk cannot be effectively isolated because there are other confounding factors such as yield risk or the subsidy effect. Hence, a regression analysis is needed to further identify the effect of basis risk on participation rates.

2. Copula Method

Yield, cash price and futures price are three continuous random variables in our simulation, denoted \(X_Y, X_C, X_F\). Suppose the joint distribution function is \(F(x_Y, x_C, x_F)\) and the marginal distribution functions are \(F_Y, F_C, F_F\); then a 3-copula \(C(\cdot)\) satisfies:

(A2.1) \(F(x_Y, x_C, x_F) = C(F_Y(x_Y), F_C(x_C), F_F(x_F))\) for \((x_Y, x_C, x_F) \in (-\infty, +\infty)^3\),

where \(F_i(x_i) = u_i \in [0,1]\) with \(i \in \{Y, C, F\}\). Let \(F_i^{-1}\) be the inverse distribution function with \(F_i^{-1}(u_i) = \sup\{x_i \mid F_i(x_i) \le u_i\}\); the Multivariate Gaussian Copula (MGC) can be constructed as

(A2.2) \(C^{MGC}(u_Y, u_C, u_F; R) = \Phi(F_Y^{-1}(u_Y), F_C^{-1}(u_C), F_F^{-1}(u_F); R)\),

where \(R\) is a symmetric, positive definite matrix with

(A2.3) \(R = \begin{bmatrix} 1 & \rho_{CY} & \rho_{FY} \\ \rho_{CY} & 1 & \rho_{FC} \\ \rho_{FY} & \rho_{FC} & 1 \end{bmatrix}\),

where \(\rho_m\) with \(m \in \{CY, FY, FC\}\) is the dispersion parameter. Let \(\beta\) and \(\rho\) be the vectors of marginal distribution parameters and copula dispersion parameters; then the parameter vector to be estimated is \(\theta = (\beta^T, \rho^T)^T\).
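To make the role of the dispersion matrix \(R\) in (A2.3) concrete, the following sketch draws dependent uniform variates from a trivariate Gaussian copula using a Cholesky factor of \(R\). It is a minimal illustration under stated assumptions: the entries of `R` are hypothetical, not estimates from this study, and the final mapping through the inverse marginal distributions is omitted because it depends on the fitted marginals.

```python
import random
from math import sqrt
from statistics import NormalDist


def cholesky(R):
    """Lower-triangular L with L L^T = R, for a symmetric positive definite R."""
    n = len(R)
    L = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1):
            s = sum(L[i][k] * L[j][k] for k in range(j))
            if i == j:
                L[i][j] = sqrt(R[i][i] - s)
            else:
                L[i][j] = (R[i][j] - s) / L[j][j]
    return L


def gaussian_copula_draws(R, n, seed=0):
    """Return n draws (u_Y, u_C, u_F) on [0, 1]^3 with Gaussian dependence R."""
    rng = random.Random(seed)
    L = cholesky(R)
    std_normal = NormalDist()
    draws = []
    for _ in range(n):
        x = [rng.gauss(0.0, 1.0) for _ in R]              # independent N(0, 1)
        v = [sum(L[i][k] * x[k] for k in range(i + 1))    # correlated normals
             for i in range(len(R))]
        draws.append([std_normal.cdf(vi) for vi in v])    # map to uniforms
    return draws


# Hypothetical dispersion matrix, ordered (yield, cash price, futures price).
R = [[1.0, -0.3, -0.2],
     [-0.3, 1.0, 0.9],
     [-0.2, 0.9, 1.0]]
u = gaussian_copula_draws(R, 5000)
```

Each row of `u` can then be passed through the inverse marginal distribution functions to obtain one simulated yield, cash price and futures price realization.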
Given \(N\) observations, the corresponding log-likelihood function can be specified as (Yan 2007):

(A2.4) \( l(\theta) = \sum_{i=1}^{N} \log c\{F_Y(X_{Y,i}; \beta), F_C(X_{C,i}; \beta), F_F(X_{F,i}; \beta); \rho\} + \sum_{i=1}^{N} \sum_{j \in \{Y,C,F\}} \log f_j(X_{j,i}; \beta) \),

where \( c(u_Y, u_C, u_F) = f[F_Y^{-1}(u_Y), F_C^{-1}(u_C), F_F^{-1}(u_F)] \big/ \prod_{j \in \{Y,C,F\}} f_j[F_j^{-1}(u_j)] \) is the copula density function, while \(f(\cdot)\) and \(f_j(\cdot)\) represent the multivariate and the univariate marginal density functions, respectively. The ML estimator of \(\theta\) is \(\hat{\theta}_{ML} = \arg\max_{\theta \in \Theta} l(\theta)\). We apply the two-step estimation method called Inference Functions for Margins (IFM) as in Joe and Xu (1996):

(A2.5a) \( \hat{\beta}_{IFM} = \arg\max_{\beta} \sum_{i=1}^{N} \sum_{j \in \{Y,C,F\}} \log f_j(X_{j,i}; \beta) \)

(A2.5b) \( \hat{\rho}_{IFM} = \arg\max_{\rho} \sum_{i=1}^{N} \log c\{F_Y(X_{Y,i}; \hat{\beta}_{IFM}), F_C(X_{C,i}; \hat{\beta}_{IFM}), F_F(X_{F,i}; \hat{\beta}_{IFM}); \rho\} \)

The first step (A2.5a) consists of an ML estimation for each marginal distribution \(j \in \{Y, C, F\}\):

(A2.5c) \( \hat{\beta}_{j,IFM} = \arg\max_{\beta_j} \sum_{i=1}^{N} \log f_j(X_{j,i}; \beta_j) \)

3. Monte Carlo Method

The simulation procedure follows Du and Hennessy (2012):
(a) Simulate 3 independent random variates \(x = (x_Y, x_C, x_F)^T\) from the standard normal distribution \(N(0,1)\).
(b) Generate \(v = Ax\), where \(A\) is the Cholesky factor of the estimated MGC dispersion matrix \(\hat{R}\); then \(v = (v_Y, v_C, v_F)^T\).
(c) Set \(u_j = \Phi(v_j)\), \(j \in \{Y, C, F\}\), where \(\Phi\) denotes the univariate \(N(0,1)\) distribution function.
(d) Set \(\tilde{x}_j = F_j^{-1}(u_j)\), \(j \in \{Y, C, F\}\), where \(F_j^{-1}\) denotes the inverse of the marginal cumulative distribution function; then a realization is \(\tilde{x} = (\tilde{x}_Y, \tilde{x}_C, \tilde{x}_F)^T\).
We repeat this procedure to obtain 5,000 realizations for each state \(s \in \{IL, IN, \ldots, WI\}\).

APPENDIX B: Supplemental Figures and Tables

Figure 2A.1 Elevator Count and Operating Years.
Note: Elevator Year = \((1/N_c) \sum_{i=1}^{N_c} year_{i,c}\), where \(year_{i,c}\) is the number of years for elevator \(i\) in county \(c\) and \(N_c\) is the elevator count. Both variables represent the elevator history in each county.

Figure 2A.2 Ratio of Ending Stocks to Production.
Note: The ratio of ending stock to production here is a state-by-year variable. The distribution in a state shows state-level temporal variation over 2009-2020.

Figure 2A.3 Willingness-to-Pay for A = 0.004 (i.e., risk premium = 0.1).
Note: The numbers before and after a state's name are the ranks of basis risk and the extensive margin participation rate. "01" and "12" indicate the lowest and highest values, respectively.

Figure 2A.4 Farmers' Willingness-to-Pay for Corn.

Figure 2A.5 Farmers' Willingness-to-Pay for Soybeans.

CHAPTER 3 Extensive and Intensive Margins of Irrigation Water Demand in the Great Lakes Region

Abstract

The literature has explored irrigation water issues in arid regions; however, few studies investigate irrigation water usage in relatively water-rich areas such as the Great Lakes. Although water conservation policies have been implemented in recent years, there has been an upward trend in irrigation water demand from 2003 to 2018, including irrigated acres, number of pumped wells, and average depth to water. We employ firm-level data from the USDA Farm and Ranch Irrigation Survey (FRIS) and Irrigation and Water Management Survey (IWMS) to examine which factors affect farmers' irrigation water usage. The findings are: (i) price elasticities of irrigation water usage vary significantly with model specification and water cost measure; (ii) water usage at both the extensive (irrigated acres) and intensive (water application per acre) margins is input price inelastic; and (iii) price elasticities of water usage are homogeneous across crops but heterogeneous across states. Our estimates of the price elasticities of water usage can be used to calculate the cost of reducing irrigation water use through water pricing, which might shed light on long-term water sustainability.

Introduction

The literature has explored irrigation water issues in the arid regions (U.S.
western states), including water conservation (Neigri and Hanchar 1989; Huffaker et al. 1998), the economic impacts of climate change (Deschênes and Greenstone 2007), the effect of electricity subsidies on groundwater extraction (Badiani and Jessoe 2013), the impact of irrigation technology on the reduction of groundwater extraction (Pfeiffer and Lin 2014), precision irrigation (Adeyemi et al. 2017), and price elasticities for irrigation water (Howitt et al. 1980; Moore et al. 1994; Hooker and Alexander 1998; Hendricks and Peterson 2012; Mieno and Brozović 2017). However, few studies have investigated irrigation water demand in a relatively water-rich area such as the Great Lakes region.

In 2008, the Great Lakes-St. Lawrence River Basin Water Resources Compact became law, requiring the eight U.S. states (IL, IN, MI, MN, NY, OH, PA and WI) to work together to protect water resources in this region. Under this inter-state Compact, all eight states have implemented regulations for future water management (Lautenberger and Norris 2016). Although the goal of the rules is water conservation, there has been an upward trend in irrigation water demand from 2003 to 2018, including irrigated acres, the number of pumped wells, and the average depth to water (see Table 3.1). Investigating why farmers increase water demand is crucial because depressed water levels might require costly treatment of groundwater supplied from a deep aquifer.

In this study, we investigate irrigation water usage using farm-level survey data from the USDA National Agricultural Statistics Service (NASS) covering the Great Lakes region.
Our motivations are as follows: (i) increasing water demand may raise concerns about water scarcity; (ii) current policies to restrict water usage do not seem to be efficient, so it is necessary to examine the own-price elasticity of irrigation water demand, since it is the critical parameter in determining the impacts of changes in water-related policies (Hendricks and Peterson 2012); (iii) the literature commonly adopts marginal irrigation cost (MIC) (Casewell and Zilberman 1986; Gonzalez-Alvarez et al. 2006; Mieno and Brozović 2017), and many recent studies use average energy cost (AEC) (Ito 2014; Kornelis and Norris 2020), but no research investigates which water cost is appropriate in the Great Lakes region; and (iv) it remains unclear which underlying factors affect farmers' groundwater extraction behavior.

Table 3.1 Summary of Water Usage in the Great Lakes Region.

             Acres Irrigated (thousands)        Acre-feet Water Applied (thousands)
State        2003    2008    2013    2018       2003    2008    2013    2018
Illinois      375     457     542     566        237     231     368     333
Indiana       276     404     455     583        132     199     235     248
Michigan      433     532     588     827        218     298     313     458
Minnesota     435     504     517     555        294     334     324     249
Ohio           14      19      42      39         16       9      26      17
Wisconsin     392     396     473     518        357     322     388     294

             Number of Pumped Wells             Average Depth to Water (feet)
State        2003    2008    2013    2018       2003    2008    2013    2018
Illinois    3,204   3,857   5,252   5,095        111     117     117     129
Indiana     2,318   2,746   4,133   5,056         83      85      95      96
Michigan    4,031   4,097   7,554   9,691        121     117     105     121
Minnesota   3,797   4,312   5,385   5,905        130     122     131     136
Ohio          975     215   1,064   1,203        123     135     110     103
Wisconsin   3,002   3,192   6,169   5,166        155     147     149     166

Data Source: USDA Farm and Ranch Irrigation Survey (2003, 2008, 2013); Irrigation and Water Management Survey (2018).

There is increasing research interest in water withdrawal and induced water scarcity in water-abundant regions. Grannemann et al.
(2000) note that the available freshwater supply is more than adequate to meet the Great Lakes region's agricultural, industrial, residential, and ecological needs. However, some findings and anecdotes presage potential water scarcity in this region. De Loe and Kreutzwiser (2000) note that climate change is likely to increase the frequency of low water levels in the Great Lakes region. Gronewold and Stow (2014) report that federal agencies in the United States and Canada documented the lowest water levels ever recorded on Lakes Michigan and Huron in January 2013. For the Kalamazoo River watershed in Southwest Michigan, Mubako et al. (2013) note that irrigation water withdrawals in water-abundant regions can lead to surface water scarcity due to seasonal and spatial concentration during low-flow summer months. Kornelis and Norris (2020) investigate price elasticities and climatic determinants of irrigation water demand in the Great Lakes region.

We build on Kornelis and Norris (2020) and contribute to three strands of the literature. First, our analysis provides evidence on which kind of water cost is appropriate in the Great Lakes region. Marginal irrigation cost is commonly adopted in studies that analyze water demand in the U.S. western states. However, many studies argue that consumers respond to average rather than marginal costs because of the complexity of electricity pricing schedules (Foster and Beattie 1979; Foster and Beattie 1981; Borenstein 2009; Ito 2014). Kornelis and Norris (2020) adopt average energy cost directly but do not report how marginal irrigation cost performs in the empirical analysis. Since the energy sources used in the western states and the Great Lakes region are the same (e.g., electricity, diesel, and natural gas), we ask whether marginal irrigation cost is a good predictor of irrigation water usage.
Second, we contribute to estimating price elasticity in the Great Lakes region with multiple model specifications. Price elasticity depends on how water cost is constructed and which model specification is adopted. Our analysis provides evidence that policymakers may overestimate price elasticities if they employ only the Tobit Model, which may lead to ineffective policies. We compare price elasticities from the Heckman Selection Model and the Tobit Model for the extensive (irrigated acres/irrigated share) and intensive (water application per acre) margins of irrigation water usage (see, for example, Moore et al. 1994; Mullen et al. 2009). Based on the estimated price elasticities, we further calculate the cost of reducing irrigation water usage through water pricing.

Third, our analysis provides estimates of price elasticity under different tolerances of measurement error. The first measurement error is the misreporting of energy expenditure and water application, which generates extremely high or low water costs (outliers). Kornelis and Norris (2020) address the first measurement error by dropping firms with average energy costs above the 95th percentile or below the 5th percentile. The second measurement error arises in the extensive and intensive margins of water usage. Farmers report both total and crop-specific values for irrigated acres and water application in the questionnaire, but the total and the sum of the crop-specific values may be inconsistent. Kornelis and Norris (2020) address the second measurement error by keeping only observations for which the two values differ by less than 5 percent. We find that the extensive margin is measured more precisely than the intensive margin, because irrigated acres can be measured more readily than water application. To obtain robust estimates, we adopt several tolerances of measurement error in the empirical analysis.
The third measurement error arises when we impute marginal irrigation cost: as in Mieno and Brozović (2017), omitting the pressure head in the engineering equation underestimates marginal irrigation cost. Therefore, we first impute Total Dynamic Head (TDH) and then drop outliers.

The rest of this paper is organized as follows. Section 2 reports the data and variables employed in this study. Empirical results are in Section 3, and Section 4 concludes.

Data and Main Variables

Six states (IL, IN, MI, MN, OH, WI) were selected as our target area. The most important data used in this study are from the USDA Farm and Ranch Irrigation Survey (FRIS: 2003/2008/2013) and the Irrigation and Water Management Survey (IWMS: 2018), each of which follows the latest Census of Agriculture (COA: 2002/2007/2012/2017). Specifically, all farms that irrigate can be identified in the COA, and a sample was drawn with a targeted size. Farms differ across surveys, so our sample is not a panel dataset.

Table 3.2 Summary of Sample Size.

                          2018      2013      2008§     2003
Panel A: National Level
Targeted sample size      35,000    35,000    35,000    NA
Final sample size         34,783    34,966    33,085    25,014
  certainty                1,340     2,095     2,738     1,823
  noncertainty            33,443    32,871    30,347    23,191
Panel B: Great Lakes Region
Illinois                     329       913       599       569
Indiana                      359       886       585       580
Michigan                     541     1,103       613       584
Minnesota                    412       873       685       695
Ohio                         382       697       241       314
Wisconsin                    403       905       513       513

Note: "NA" indicates a missing value. "§" means that the sample design in 2008 contains two parts: (1) for the general sample, the targeted sample size is 25,000 and the final sample size is 23,089; (2) for the horticulture sample, the targeted sample size is 10,000 and the final sample size is 9,996. Hence, in Panel B for 2008, state-level values represent farms from the final general sample of 23,089 farms.

FRIS/IWMS issued a targeted sample size every survey year, e.g., 35,000 in 2018 (first column of Table 3.2).
All targeted farms were selected according to a stratification strategy: (i) a certainty stratum (selected with probability 1); and (ii) the remaining strata (selected with probability less than 1). Table 3.2 reports the final sample and the number of farms in the Great Lakes region. Notably, the number of sampled farms in each Great Lakes state in 2018 fell to around 50% of its 2013 level (see Panel B of Table 3.2). However, the number of farms in some western states, e.g., CA, CO, NE, and TX, increased significantly (see Figure 3A.2). FRIS/IWMS did not clearly explain why sample sizes in the Great Lakes region changed. In the empirical part, we employ the weights provided by USDA FRIS/IWMS to estimate price elasticities.

Sample Selection

There were 13,465 farms in total across the 4 survey years. We dropped farms: (i) without planted acres, because there is no irrigation on these farms; and (ii) without crop-specific records, because these records are needed to address measurement errors. In the end, 7,936 farms remain for our analysis.

Kornelis and Norris (2020) compare total water usage with the sum of crop-specific water usage and then select farms for which the difference is less than 5 percent. Following a similar method, we compare the total and the sum of crop-specific values for the extensive (irrigated acres) and intensive (water application per acre) margins, respectively. Let 𝑇𝐸𝑖 denote the total irrigated acres for farm 𝑖 and 𝑇𝐼𝑖 denote the total water use (acre-inch). Let 𝐶𝐸𝑖 and 𝐶𝐼𝑖 denote the sum of crop-specific irrigated acres and water usage, respectively. We then construct two indexes, 𝐷𝐸𝑖 = |𝐶𝐸𝑖 ⁄𝑇𝐸𝑖 − 1| and 𝐷𝐼𝑖 = |𝐶𝐼𝑖 ⁄𝑇𝐼𝑖 − 1|, for our subsample selection. These two indexes measure the relative difference between the total value and the sum of crop-specific values.
For example, 𝐷𝐸𝑖 = 0 means that farm 𝑖 reported irrigated acres consistently; 𝐷𝐸𝑖 = 0.1 means that the sum of crop-specific irrigated acres is higher or lower than the total irrigated acres by 10 percent. We refer to 𝐷𝐸 and 𝐷𝐼 as the tolerances for extensive and intensive margin water usage, respectively. Note that the total water usage record for each farm covers groundwater, on-farm surface water, and off-farm water.

Table 3.3 reports the number of observations under 4 error tolerance criteria: 0, 0.01, 0.05, and 0.1. We find that the extensive margin (irrigated acres) is measured more accurately than the intensive margin (water application per acre). For example, 6,837 farms satisfy the condition “𝐷𝐸 = 0”, accounting for 86.2% of the sample; however, those satisfying the condition “𝐷𝐼 = 0” account for only 29.1%. For each error tolerance, more farms satisfy the extensive margin condition than the intensive margin condition. One possible explanation is that water application is measured imprecisely because of a lack of tools such as flow meters. We conduct the empirical analysis on the subsamples from all four tolerances to avoid biased estimates.

Table 3.3 Summary for Different Precision Conditions.
Irrigated Acres            Water Application          Obs. of Intersection
Condition      Obs.        Condition      Obs.
𝐷𝐸 = 0         6,837       𝐷𝐼 = 0         2,309       2,281
𝐷𝐸 ≤ 0.01      6,916       𝐷𝐼 ≤ 0.01      2,795       2,753
𝐷𝐸 ≤ 0.05      7,063       𝐷𝐼 ≤ 0.05      3,903       3,827
𝐷𝐸 ≤ 0.1       7,177       𝐷𝐼 ≤ 0.1       4,695       4,605
Total          7,936
Note: 0, 0.01, 0.05 and 0.10 represent error tolerances. For example, “0” indicates that each farm in this sub-sample has a perfect match between the total value and the sum of crop-specific values. Water application is measured in acre-inch/acre. “Obs. of Intersection” counts farms that satisfy both conditions at the same time. For example, 2,281 (first row, last column) is the number of farms that report exactly matching records for both the extensive and intensive margins.
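The subsample selection described above can be illustrated with a minimal Python sketch. It assumes each farm record carries the reported totals and the sums of the crop-specific values; the field names and the three example farms are hypothetical and are not taken from the FRIS/IWMS data.

```python
def tolerance_index(crop_specific_sum, total):
    """Return |crop_specific_sum / total - 1|, the relative gap between the two reports."""
    if total <= 0:
        raise ValueError("total must be positive")
    return abs(crop_specific_sum / total - 1.0)


def within_tolerance(farm, tol):
    """True when both the extensive (DE) and intensive (DI) indexes are <= tol."""
    de = tolerance_index(farm["crop_acres_sum"], farm["total_acres"])
    di = tolerance_index(farm["crop_water_sum"], farm["total_water"])
    return de <= tol and di <= tol


# Hypothetical farm records (illustrative values, not survey data).
farms = [
    {"id": "A", "total_acres": 100.0, "crop_acres_sum": 100.0,
     "total_water": 600.0, "crop_water_sum": 600.0},
    {"id": "B", "total_acres": 200.0, "crop_acres_sum": 196.0,
     "total_water": 1000.0, "crop_water_sum": 1080.0},
    {"id": "C", "total_acres": 150.0, "crop_acres_sum": 150.0,
     "total_water": 900.0, "crop_water_sum": 1200.0},
]

# The four tolerances used in Table 3.3.
for tol in (0.0, 0.01, 0.05, 0.1):
    kept = [f["id"] for f in farms if within_tolerance(f, tol)]
    print(tol, kept)
```

In this sketch, a farm enters the "intersection" subsample only when both indexes pass, which mirrors the last column of Table 3.3. Values that sit exactly on a tolerance boundary can be affected by floating-point rounding, so a real implementation might compare against the tolerance plus a small epsilon.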
Water Use - Extensive and Intensive Margins

This study analyzes two types of water use: the extensive margin (irrigated acres) and the intensive margin (water application per acre). The extensive margin reflects cropland allocation decisions, and the intensive margin reflects short-run water application decisions (Moore et al. 1994).

Figure 3.1 shows state-by-year irrigated acres for five selected states. Ohio is omitted because its irrigated acres are much smaller than those of other states. Panel A of Figure 3.1 shows that irrigated and non-irrigated acres have an increasing trend in each state. The crop-specific irrigated acres have a similar pattern (see Panel B of Figure 3.1). In this study, we select corn and soybeans as the targeted crops for two reasons: (i) both crops account for a large percentage of the total irrigated acres; and (ii) far more farms plant corn and soybeans than other crops (see Table 3A.1).

Figure 3.1 Irrigated Acres by Crop, State and Year.
Data Source: FRIS (2003, 2008, 2013) and IWMS (2018) summary report.
Note: Ohio is not included due to the small value of irrigated acres, but it is included in the empirical analysis.

Figure 3.2 Water Application by Crop, State, and Year.
Data Source: Farm-level records in FRIS (2003, 2008, 2013) and IWMS (2018).
Note: The middle point indicates the mean value of farm-level water applications in each (state, year) pair. The range indicates 1 standard deviation. This sample includes all observations with an error tolerance of 0.1.

Figure 3.2 shows state-by-year water applications (acre-inch per acre) with a tolerance of 0.1. Other error tolerances are employed in the empirical analysis. Unlike irrigated acres, water applications vary across years. There are two interesting points: (1) from 2003 to 2013, water usage has an increasing trend in nearly all states except Illinois and Ohio; (2) between 2013 and 2018, a decreasing trend exists except for Michigan and Ohio.
We may argue that: (i) Illinois improved irrigation efficiency or faced high energy costs in 2008; and (ii) irrigation efficiency has improved considerably since 2013.

Irrigation Water Cost

In the literature, two measures of irrigation water cost are used for estimating price elasticities. The first is the marginal irrigation cost calculated from an engineering equation; the second is the average energy cost calculated from energy expenditure and water usage. Kornelis and Norris (2020) adopt the average cost approach by assuming that the average water cost approximates the marginal cost when there are no significant changes in energy prices during the irrigation season. Farmers report energy expenses and irrigation parameters, such as well depth and pumping capacity, in the FRIS/IWMS. Therefore, we are motivated to compare the average energy cost and the imputed marginal irrigation cost because: (i) marginal irrigation cost is commonly used in studies of U.S. western states (Moore et al. 1994; Gonzalez-Alvarez et al. 2006; Mullen et al. 2009; Hendricks and Peterson 2012; Pfeiffer and Lin 2014); and (ii) price elasticities based on different irrigation water costs may help policymakers better understand the potential consequences of a tax structure.

Let 𝐴𝐸𝐶𝑖 denote the average energy cost for farm 𝑖; 𝑇𝑜𝑡𝐸𝑥𝑝𝑒𝑛𝑖 denote total energy expenditure; and 𝑇𝐼𝑖 denote the total water usage defined above. The formula for average energy cost is 𝐴𝐸𝐶𝑖 = 𝑇𝑜𝑡𝐸𝑥𝑝𝑒𝑛𝑖 ⁄𝑇𝐼𝑖 , measured in dollars per acre-inch.

Now we turn to constructing marginal irrigation costs based on Static Water Level and Total Dynamic Head, respectively. Why are two irrigation costs imputed? Mieno and Brozović (2017) note that the total dynamic head contains static water level, draw-down, and pressure head, so omitting components of the Total Dynamic Head will induce an underestimation of marginal cost.
Let $MIC^{S}$ denote the marginal irrigation cost based on static water level; $MIC^{T}$ denote the marginal irrigation cost based on Total Dynamic Head; $SW$ denote the static water level; $\eta$ denote the pumping efficiency; and $EP$ denote the energy price. $WHP$ denotes water horsepower (i.e., size of engine Horse-Power), and $GPM$ is pumping capacity measured in gallons/min. As in Rogers and Alam (2006), we employ $TDH = WHP \times 3{,}960 / GPM$. Then the two imputed marginal irrigation costs are

(3.1a) $MIC^{S} = SW \times \eta \times EP$,
(3.1b) $MIC^{T} = (WHP \times 3{,}960 / GPM) \times \eta \times EP$.

Table 3.4 The 100% Nebraska Performance Criteria (NPC) for Pumping Plant.
Fuel Type                   Fuel Unit   NPC for Pumping Plants   Fuel Units for Lifting 1 acre-foot of water 1 foot
Electricity                 kWh         0.885 whp-hr/kWh         1.551
Natural Gas (925 BTU/cf)    mcf         61.7 whp-hr/MCF          0.0223
Diesel                      gal         12.50 whp-hr/gal         0.1098
Propane                     gal         6.89 whp-hr/gal          0.1993
Note: kWh represents the kilowatt hour. whp-hr represents water horsepower hours. 1 mcf = 1,032 cubic feet. gal represents gallon.

Table 3.5 Average Electricity and Diesel Prices (May-September).
        Electricity (cents/kWh)                          Diesel     GGE         Diesel
Year    IL      IN      MI      MN      OH      WI       ($/gal)    (kWh/gal)   (cents/kWh)
2003    8.08    6.44    8.11    7.49    7.94    7.93     2.36       37.95       6.22
2008    6.49    4.77    5.96    5.2     5.39    5.77     3.72       37.95       9.8
2013    5.53    6.14    7.23    6.51    5.74    6.9      3.54       37.95       9.34
2018    6.66    7.3     7.09    7.67    6.99    7.43     3.14       37.95       8.27
Note: All price values are adjusted to 2018 USD with the irrigation-season (May to September) average Consumer Price Index (CPI) for all urban consumers. The CPI data are from the U.S. Bureau of Labor Statistics. GGE represents Gasoline Gallon Equivalent.

Table 3.6 Farms by Energy Source and Error Tolerance.
Energy Source        Obs. With Only One   All Obs.   Percentage (All Obs./Total)
No Condition (No constraint; Total=7,936)
Electricity          3,720                5,971      75.2
Natural Gas          16                   64         0.8
LP Gas, Propane      39                   220        2.8
Diesel Fuel          1,448                3,667      46.2
Gasoline             170                  264        3.3
Condition 1 (𝐷𝐸 = 0; 𝐷𝐼 = 0; Total=2,281)
Electricity          1,043                1,604      70.3
Natural Gas          7                    13         0.6
LP Gas, Propane      14                   71         3.1
Diesel Fuel          533                  1,099      48.2
Gasoline             34                   50         2.2
Condition 2 (𝐷𝐸 ≤ 0.01; 𝐷𝐼 ≤ 0.01; Total=2,753)
Electricity          1,243                1,951      70.9
Natural Gas          10                   23         0.8
LP Gas, Propane      15                   88         3.2
Diesel Fuel          642                  1,350      49.0
Gasoline             39                   59         2.1
Condition 3 (𝐷𝐸 ≤ 0.05; 𝐷𝐼 ≤ 0.05; Total=3,827)
Electricity          1,720                2,830      73.9
Natural Gas          14                   41         1.1
LP Gas, Propane      20                   119        3.1
Diesel Fuel          805                  1,914      50.0
Gasoline             46                   76         2.0
Condition 4 (𝐷𝐸 ≤ 0.1; 𝐷𝐼 ≤ 0.1; Total=4,605)
Electricity          2,070                3,476      75.5
Natural Gas          15                   45         1.0
LP Gas, Propane      24                   140        3.0
Diesel Fuel          916                  2,321      50.4
Gasoline             47                   82         1.8

The irrigation parameters, i.e., $SW$, $WHP$, and $GPM$, can be obtained from the farm-level FRIS/IWMS datasets. Pumping efficiency is taken from the Nebraska Performance Criteria (NPC) for Pumping Plants at 100% efficiency (see Table 3.4), which is commonly used in the literature. State-by-year energy price data are from the U.S. Energy Information Administration (EIA) (see Table 3.5), all adjusted to 2018 dollars. The electricity price, measured in cents/kWh, is the monthly average retail price in the industrial sector. The diesel price is the weekly Midwest No. 2 diesel retail price, measured in $/gal. We impute the marginal irrigation cost for farms that: (i) have at least one well; and (ii) use electricity, diesel, or both. The reason is that farms using other energy sources account for a low percentage of the sample (see Table 3.6). Note that both imputed irrigation water costs are costs of groundwater extraction; surface water is not considered.

Figure 3.3 Kernel Density Estimation (KDE) for Three Water Costs.
Note: The kernel function is Gaussian.
Bandwidth is calculated by minimizing the mean integrated squared error. Observations with values larger than 20 are excluded.

Figure 3.3 shows the kernel density estimates (KDE) for three irrigation water costs: $AEC$, $MIC^{T}$, and $MIC^{S}$. We find that: (i) the densities of $AEC$ and $MIC^{T}$ are flatter than that of $MIC^{S}$, indicating that irrigation water cost based on static water level is underestimated; (ii) the difference between the distributions of $AEC$ and $MIC^{T}$ is not distinct, although $MIC^{T}$ seems more concentrated around its mode. Figure 3.4 compares $AEC$ and $MIC^{T}$ by state and year. The difference between the two water costs is again not distinct. Since there are farm-level observations for both water costs, we use a paired t-test to examine whether the mean difference between $AEC$ and $MIC^{T}$ differs from 0. The null hypothesis is $E[AEC_{i} - MIC^{T}_{i}] = 0$, where $i$ indexes farms and $E[\cdot]$ is the expectation operator. The t-statistic is 12.15 with 5,972 degrees of freedom, and the corresponding two-tailed p-value is below 0.01. Therefore, the mean of $AEC$ is statistically different from that of $MIC^{T}$, although their distributions are very close.

Figure 3.4 Irrigation Water Costs by State and Year.
Note: The middle point indicates the mean value of farm-level water costs in each (state, year) pair. The range indicates 1 standard deviation.

Crop Price

High crop prices are expected to incentivize irrigation. As profit maximizers, farmers might irrigate more acres or increase water application (ac-in/ac) when prices are high. Since the farmer's decision depends on the crop prices expected before planting time, we employ state-level February price received data published by the USDA National Agricultural Statistics Service (NASS). All data are adjusted to 2018 dollars with the annual average Consumer Price Index (CPI) for urban consumers from the U.S. Bureau of Labor Statistics.
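The paired comparison of the two water costs reported above can be sketched in Python as follows. This is a minimal illustration using only the standard library: it computes the paired t-statistic and its degrees of freedom, and leaves the p-value to a statistics package. The function name and the four pairs of cost values are hypothetical and are not drawn from the survey data.

```python
import math
import statistics


def paired_t_statistic(x, y):
    """Return (t, df) for H0: mean(x_i - y_i) = 0, using paired observations."""
    if len(x) != len(y):
        raise ValueError("x and y must have the same length")
    if len(x) < 2:
        raise ValueError("at least two pairs are required")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_d = statistics.fmean(diffs)
    sd_d = statistics.stdev(diffs)  # sample standard deviation (n - 1 divisor)
    if sd_d == 0:
        raise ValueError("differences have zero variance")
    t = mean_d / (sd_d / math.sqrt(n))
    return t, n - 1


# Hypothetical farm-level costs in $/ac-in (illustrative only).
aec = [4.0, 5.0, 3.0, 6.0]      # average energy cost
mic_t = [3.0, 4.5, 3.5, 4.0]    # marginal cost based on Total Dynamic Head

t_stat, df = paired_t_statistic(aec, mic_t)
print(round(t_stat, 4), df)
```

With 5,972 degrees of freedom, as in the text, the t distribution is practically indistinguishable from the standard normal, so a statistic of 12.15 lies far in the tail.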
Weather Determinants

In the literature, many studies have investigated the effects of climate determinants on cropland allocation (Moore et al. 1994; Schlenker and Roberts 2006; Schlenker and Roberts 2009), water quality (Fausey et al. 1995), water management (De Loe and Kreutzwiser 2000), agricultural production (Deschênes and Greenstone 2007), irrigation technology adoption (De Loe and Kreutzwiser 2000; Pfeiffer and Lin 2014), and irrigation water demand (Hendricks and Peterson 2012; Kornelis and Norris 2020).

Temperature data employed in this study are from the National Oceanic and Atmospheric Administration (NOAA). As in the literature (e.g., Xu et al. 2013; Che et al. 2020), we construct two variables, Growing Degree Days (GDD) and Stress Degree Days (SDD), to represent beneficial heat and heat stress during the growing season (April to September), respectively. The formulas for GDD and SDD in year $t$ are:

(3.2a) $GDD_{c,t} = \sum_{d \in \Omega_t} \left[ 0.5 \left( \min(\max(T^{max}_{c,d,t}, T^{l}), T^{h}) + \min(\max(T^{min}_{c,d,t}, T^{l}), T^{h}) \right) - T^{l} \right]$
(3.2b) $SDD_{c,t} = \sum_{d \in \Omega_t} \left[ 0.5 \left( \max(T^{max}_{c,d,t}, T^{k}) + \max(T^{min}_{c,d,t}, T^{k}) \right) - T^{k} \right]$

where $c$ is county; $d$ is day; $T^{max}_{c,d,t}$ and $T^{min}_{c,d,t}$ are the daily maximum and minimum temperatures; and $\Omega_t$ is the set of growing season days in year $t$. The thresholds are $T^{l} = 10°C$ (lower bound), $T^{h} = 30°C$ (upper bound) and $T^{k} = 32.2°C$. Let $\bar{G}_c = (1/14) \sum_{j=1989}^{2002} GDD_{c,j}$ and $\bar{S}_c = (1/14) \sum_{j=1989}^{2002} SDD_{c,j}$ denote the climatological normals before 2003. Two variables can then be constructed to measure climate variability:

(3.3a) $GD_{c,t} = GDD_{c,t} - \bar{G}_c$
(3.3b) $SD_{c,t} = SDD_{c,t} - \bar{S}_c$

Land Capability

Land capability data are from the National Resources Inventory (NRI). There are eight Land Capability Classes (LCC), where Class I represents the best land and Class VIII represents the worst land. Therefore, land capability can be measured as the share of the first two Classes in all Classes (acres in Class I-II/total acres in Class I-VIII).
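The degree-day variables in equations (3.2a)-(3.3b) can be illustrated with a short Python sketch. It assumes daily maximum and minimum temperatures in degrees Celsius for one county and one growing season; the function names, the three example days, and the baseline values are hypothetical and serve only to show the arithmetic.

```python
T_LOW = 10.0     # T^l, lower bound for beneficial heat (deg C)
T_HIGH = 30.0    # T^h, upper bound for beneficial heat (deg C)
T_STRESS = 32.2  # T^k, heat-stress threshold (deg C)


def clip(value, lower, upper):
    """Bound value to the interval [lower, upper]."""
    return min(max(value, lower), upper)


def growing_degree_days(daily_temps):
    """Equation (3.2a): sum over days of the clipped mean temperature above T_LOW."""
    return sum(
        0.5 * (clip(tmax, T_LOW, T_HIGH) + clip(tmin, T_LOW, T_HIGH)) - T_LOW
        for tmax, tmin in daily_temps
    )


def stress_degree_days(daily_temps):
    """Equation (3.2b): sum over days of mean exceedance above T_STRESS."""
    return sum(
        0.5 * (max(tmax, T_STRESS) + max(tmin, T_STRESS)) - T_STRESS
        for tmax, tmin in daily_temps
    )


def deviation_from_normal(value, baseline_values):
    """Equations (3.3a)-(3.3b): value minus the mean over the baseline years."""
    return value - sum(baseline_values) / len(baseline_values)


# Hypothetical (tmax, tmin) pairs for three days, in deg C.
days = [(25.0, 12.0), (35.0, 20.0), (8.0, 2.0)]
gdd = growing_degree_days(days)
sdd = stress_degree_days(days)
# Hypothetical baseline GDD values standing in for the pre-2003 normal.
gd = deviation_from_normal(gdd, [20.0, 22.0, 24.0])
print(gdd, round(sdd, 2), gd)
```

Clipping each daily temperature before averaging means a day colder than the lower threshold contributes zero rather than a negative amount, and heat above the upper threshold is counted only through the stress measure.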
Farm Size

Hendricks and Peterson (2012) note that the short-run response of water use to price is smaller than the long-run response because irrigators can respond to price changes by adjusting irrigation capital. We select two indexes, the value of sales and total acres, to control for farm size. The effect of farm size on the extensive margin (irrigated acres) is expected to be positive because irrigation helps increase yields. On the other hand, the effect of farm size on the intensive margin (water application per acre) is expected to be negative because larger farms are inclined to improve irrigation efficiency by adopting better equipment. Considering the potential multicollinearity of the two variables, we employ total acres as the primary control for farm size and use the sales value for the robustness check.

Empirical Analysis

In this empirical part, the dependent variables of most interest are the extensive margin (irrigated acres) and the intensive margin (water application per acre). The primary independent variables are average energy cost and imputed marginal irrigation cost. We estimate state-level price elasticities of irrigation water usage with different model specifications.

Extensive Margins

Table 3.7 shows that 2,676 farms report no irrigated corn acres and 4,423 farms report no irrigated soybean acres, accounting for 33.7% and 55.7% of the sample, respectively. Because the data are censored, we start from the Tobit model and then employ a generalized Tobit model, the Cragg hurdle regression.

Table 3.7 Summary of Main Variables.
Variable                                   Obs.     Mean       Std. Dev.
I. Dependent Variable
Total Irrigated Acres                      7,936    606        1,142
Total Water Application (ac-in/ac)         7,818    6.90       4.56
Crop-Specific Irrigation Acres
  Corn (≥ 0)                               7,936    290.37     574.56
  Corn (> 0)                               5,260    438.10     658.30
  Soybean (≥ 0)                            7,936    106.36     252.94
  Soybean (> 0)                            3,513    240.27     335.22
Crop-Specific Water Application (ac-in/ac)
  Corn (≥ 0)                               7,936    4.55       4.58
  Corn (> 0)                               5,260    6.87       3.98
  Soybean (≥ 0)                            7,936    2.77       4.22
  Soybean (> 0)                            3,513    6.26       4.29
II. Water Irrigation Cost ($/ac-in)
Average Cost                               7,434    4.44       3.70
Marginal Cost                              6,282    3.77       2.70
III. Control Variable
Crop Price (Feb, $/bu)
  Corn                                     7,936    4.74       1.30
  Soybean                                  7,936    10.89      1.77
Weather Determinant
  GD                                       7,063    -12.59     116.99
  SD                                       7,063    -2.27      4.78
Total Planted Acres                        7,936    1466.92    2154.68
Land Capability                            7,932    0.40       0.21

Let $n^{*}_{itl}$ denote the latent irrigated acres for farm $i$ for crop $l$ in year $t$; $\mathrm{Cost}_{it}$ denote irrigation water costs, including average energy cost and imputed marginal irrigation cost; $P_{stl,\mathrm{Feb}}$ denote the crop price in state $s$ in February (the projected price before planting time); $\mathbf{Z}_{it}$ denote a set of explanatory variables including county-level weather determinants (GD and SD), county-level land capability and farm-level total acres; and $\varepsilon_{it}$ denote the error term with $\varepsilon_{it} \sim N[0, \sigma^{2}_{\varepsilon}]$. Then the Tobit model specification is:

(3.4) $n^{*}_{itl} = \alpha_0 + \alpha_1 P_{stl,\mathrm{Feb}} + \alpha_2 \mathrm{Cost}_{it} + \mathbf{Z}_{it}' \tilde{\boldsymbol{\alpha}} + \varepsilon_{it}$, with $n_{itl} = \begin{cases} n^{*}_{itl}, & \text{if } n^{*}_{itl} > 0 \\ 0, & \text{if } n^{*}_{itl} \le 0 \end{cases}$

For the Cragg Hurdle (Double-Hurdle) model, we let $s^{*}_{itl}$ denote a latent selection variable for farm $i$ for crop $l$ in year $t$; $u_{1,itl}$ and $u_{2,itl}$ are error terms with $\begin{pmatrix} u_{1,itl} \\ u_{2,itl} \end{pmatrix} \sim N\left[ \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \begin{pmatrix} 1 & 0 \\ 0 & \sigma^{2} \end{pmatrix} \right]$. Then the hurdle model specification is:

(3.5a) first hurdle: $s^{*}_{itl} = \gamma_0 + \gamma_1 P_{stl,\mathrm{Feb}} + \mathbf{Z}_{it}' \boldsymbol{\gamma} + u_{1,itl}$, with $s_{itl} = \begin{cases} 1, & \text{if } s^{*}_{itl} > 0 \\ 0, & \text{if } s^{*}_{itl} \le 0 \end{cases}$
(3.5b) second hurdle: $n^{*}_{itl} = \beta_0 + \beta_1 \mathrm{Cost}_{it} + \mathbf{Z}_{it}' \boldsymbol{\beta} + u_{2,itl}$,

where $\mathbf{Z}_{it}$ is the same as in the Tobit model.
Eqns. (3.5a) and (3.5b) generate the probit estimator (first hurdle) and the Tobit estimator (second hurdle), respectively. In Eqn. (3.5a), we assume that farmers' irrigation decisions (whether to irrigate) depend on the projected crop prices. Notably, the correlation of the error terms in the two hurdle equations is assumed to be 0.

Intensive Margin

As discussed, misreporting for the intensive margin (water application) is not uncommon (see Table 3.3). As noted in the literature, the OLS estimator remains unbiased, though less precise, when the dependent variable is measured with classical error (Bound and Krueger 1991; Bound et al. 1994; Hausman 2001). Therefore, we employ OLS estimation for water application per acre $w_{itl}$ for farm $i$ for crop $l$ in year $t$. The reduced form equation is

(3.6) $w_{itl} = \kappa_0 + \kappa_1 \mathrm{Cost}_{it} + \kappa_2 P_{stl,\mathrm{Feb}} + \tilde{\mathbf{Z}}_{it}' \boldsymbol{\kappa} + c_i + \gamma_s + \kappa_t + u_{itl}, \quad w_{itl} > 0$

where $c_i$ captures unobserved farm-level heterogeneity, and $\gamma_s$ and $\kappa_t$ are state and year effects. The OLS estimator is unbiased if $\mathrm{cov}(\mathrm{Cost}_{it}, c_i) = 0$, which is the exogeneity condition. However, the assumption that the pumping cost is exogenous to the farmer may be too restrictive because some farm-level characteristics, such as well depth, do affect pumping costs (see, for example, Hendricks and Peterson 2012). For robustness checks, we also provide regression results from the Tobit model.

Empirical Results

In the sample selection, we consider farmers' misreporting of irrigated acres and water application. Another measurement error comes from farmers' misreporting of energy expenditure. Kornelis and Norris (2020) dropped the farms above the 95th percentile and below the 5th percentile of the average energy cost of water. In our data, the 5th and 95th percentiles of average energy cost are $0.53 and $25, respectively; those of imputed marginal irrigation cost based on Total Dynamic Head are $1.11 and $19.1. Therefore, we employ observations with average energy cost and marginal irrigation cost between the 5th and 95th percentiles for our analysis, and then use other percentiles for the robustness checks.
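The percentile trimming described above can be sketched in Python as follows. This is a minimal illustration rather than the procedure actually used in the estimation: the function name is hypothetical, the input is an artificial list of costs, and the "inclusive" interpolation rule is an assumption, since percentile definitions differ across software.

```python
import statistics


def trim_by_percentile(costs, lower_pct=5, upper_pct=95):
    """Keep values between the lower and upper percentiles (bounds included).

    Returns the retained values and the (lower, upper) cut points.
    """
    # quantiles(n=100) returns the 99 cut points for percentiles 1..99.
    cuts = statistics.quantiles(costs, n=100, method="inclusive")
    lower, upper = cuts[lower_pct - 1], cuts[upper_pct - 1]
    kept = [c for c in costs if lower <= c <= upper]
    return kept, (lower, upper)


# Artificial water costs 1.0, 2.0, ..., 101.0 (illustrative only).
costs = [float(v) for v in range(1, 102)]
kept, (p5, p95) = trim_by_percentile(costs)
print(p5, p95, len(kept))

# A wider window, as one might use in a robustness check.
kept_wide, (p1, p99) = trim_by_percentile(costs, lower_pct=1, upper_pct=99)
print(p1, p99, len(kept_wide))
```

Passing different values of `lower_pct` and `upper_pct` reproduces the idea of re-estimating on alternative percentile windows for the robustness checks.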
Using the FRIS/IWMS survey weights, we first estimate the Tobit model for the sum of crop-specific irrigated acres and Pooled OLS for total water application. Table 3.8 reports regression results for all four error tolerances (0, 0.01, 0.05, 0.1) and for the two water costs while controlling for other determinants. We find that: (i) elasticities based on AEC are generally a little larger in magnitude than those based on MIC, for both the extensive and intensive margins; (ii) AEC performs better than MIC for irrigated acres (see the F-value of the regression), but the opposite holds for water application (see the R² of the regression); and (iii) estimation results are consistent when the error tolerance changes.

Now we turn to crop-specific price elasticities. For the extensive margin of irrigation water usage for corn, as in Table 3.9, the price elasticities based on MIC are smaller in magnitude than those based on AEC. Underestimated price elasticities based on MIC might induce a high tax structure when the policymakers' target for the reduction of irrigated acres is pre-determined. The price elasticities from the Cragg Hurdle model (around -0.163) are roughly half those from the Tobit model (about -0.29), which implies that policies based on Tobit estimates might overestimate the resulting water reduction.

The upper panel of Figure 3.5 shows the price elasticities for the extensive margin by state, crop, and model specification based on average energy cost. The findings are: (i) price elasticities in OH are higher than those in other states; (ii) for corn, price elasticities from the Tobit model are higher than those from the Cragg Hurdle model in IN, MI, MN, and WI, but the opposite holds in IL and OH; and (iii) for soybean, price elasticities from the Tobit model are higher than those from the Cragg Hurdle model in all states except OH.
Kornelis and Norris (2020) report that the price elasticities of water application (ac-in/ac) for corn and soybeans are -0.3 and -0.3, respectively, consistent with our results (see Table 3.10). However, we find that interstate heterogeneity exists. As in the lower panel of Figure 3.5, OH has the largest (least negative) price elasticity (-0.08), and MI has the smallest (-0.28), which implies that water reduction will differ among states if a uniform tax structure on average energy cost is imposed.

The Effect of Potential Tax Structure on Water Reduction

Estimated price elasticities can be used to predict water reduction under different tax structures. Price elasticities for the extensive margin (irrigated acres) are from the Cragg Hurdle model, and those for the intensive margin (water application) are from Pooled OLS estimation. We consider a hypothetical scenario with a 10% tax on average energy cost. All else being equal (e.g., equipment count and irrigation efficiency), a tax on average energy cost is equivalent to a tax on energy price. The irrigated acres, water application, and total water usage in 2018 are employed as the benchmark. Total water usage is the product of irrigated acres and water application. Table 3.11 reports that: (i) average water reductions for corn and soybean are 4.2% and 4.1%, respectively; (ii) the change for corn is more stable than that for soybean, because changes in total water usage for corn are around 4% in all states except OH, while those for soybean range from 2.6% to 8.3%. The 10% tax on average energy cost is selected arbitrarily. If policymakers choose a higher tax, e.g., 20%, average water reductions for corn and soybean will be 7.2% and 8.1%, respectively.

Table 3.8 Price Elasticities for Extensive and Intensive Margins.
Dependent Variable: Total Irrigated Acres
Estimation Method: Tobit (lower bound=1)
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.053***  -0.053***  -0.051***  -0.049***
Marginal Irrigation Cost                                                -0.039***  -0.039***  -0.038***  -0.037***
Land Capability             -0.06***   -0.06***   -0.06***   -0.06***   -0.07***   -0.08***   -0.08***   -0.08***
Total Acres (,000)          0.24***    0.24***    0.25***    0.25***    0.26***    0.27***    0.27***    0.27***
GD                          -0.0005    -0.0005    -0.0006    -0.0006    -0.002     -0.0002    -0.0003    -0.0003
SD                          0.008***   0.007***   0.007**    0.007**    0.008***   0.007**    0.007**    0.007**
Obs.                        5,410      5,476      5,578      5,666      4,588      4,648      4,745      4,820
Population Size             23,306     23,537     24,093     24,653     17,870     18,110     18,596     19,073
F-value of Regression       40.72      40.85      42.05      43.88      17.95      18.07      19.29      20.16

Dependent Variable: Water Application (ac-in/ac)
Estimation Method: Pooled OLS
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.26***   -0.26***   -0.23***   -0.23***
Marginal Irrigation Cost                                                -0.21***   -0.19***   -0.18***   -0.17***
Land Capability             0.03       0.005      -0.06      -0.07*     -0.33***   -0.38***   -0.51***   -0.49***
Total Acres (,000)          -0.03***   -0.03***   -0.03***   -0.02***   0.65***    0.65***    0.81***    0.78***
GD                          -0.003     -0.001     -0.0001    -0.0001    0.05**     0.04**     0.01*      0.004**
SD                          -0.03**    -0.02**    -0.014**   -0.014**   -0.0002    -0.01      -0.001     0.006
Obs.                        1,803      2,186      3,097      3,748      1,515      1,855      2,661      3,252
Population Size             7,308      8,802      11,970     14,488     5,419      6,629      9,285      11,434
R² of Regression            0.17       0.17       0.15       0.15       0.23       0.28       0.35       0.33
Note: *, ** and *** represent 10%, 5% and 1% significance levels, respectively. All regressions include state fixed effects controlling for macro-level shocks and year fixed effects controlling for yearly differences.

Table 3.9 Price Elasticities of Extensive Margins for Corn.
Dependent Variable: Irrigated Acres
Crop: Corn
Estimation Method: Tobit (lower bound=0)
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.293***  -0.297***  -0.29***   -0.29***
Marginal Irrigation Cost                                                -0.178***  -0.179***  -0.179***  -0.171***
Land Capability             0.17***    0.17***    0.17***    0.18***    0.17***    0.17***    0.17***    0.17***
Total Acres (,000)          0.49***    0.49***    0.51***    0.51***    0.50***    0.50***    0.53***    0.53***
GD                          -0.002     -0.002     -0.002     -0.002     0.0002     0.0003     -0.0002    -0.0002
SD                          0.02**     0.02**     0.02**     0.02**     0.02**     0.02**     0.02**     0.02**
Obs.                        5,410      5,476      5,578      5,666      4,588      4,648      4,745      4,820
F-value                     44.4       44.97      44.7       43.82      32.36      32.8       33.98      34.52

Estimation Method: Cragg Hurdle (lower bound=0)
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.164***  -0.164***  -0.163***  -0.162***
Marginal Irrigation Cost                                                -0.117***  -0.125***  -0.132***  -0.132***
Land Capability             0.19***    0.19***    0.19***    0.19***    0.20***    0.20***    0.20***    0.20***
Total Acres (,000)          0.51***    0.51***    0.51***    0.51***    0.5***     0.5***     0.5***     0.5***
GD                          0.009***   0.009**    0.009***   0.009***   0.01**     0.01**     0.01**     0.01**
SD                          0.03**     0.03***    0.03***    0.03***    0.02**     0.02**     0.02**     0.02**
Obs.                        5,410      5,476      5,578      5,666      4,588      4,648      4,746      4,820
Pseudo R²                   0.034      0.034      0.034      0.034      0.032      0.032      0.032      0.032
p-value of lnsigma          0.000      0.000      0.000      0.000      0.000      0.000      0.000      0.000
Note: *, ** and *** represent 10%, 5% and 1% significance levels, respectively. Linearized standard errors are in parentheses.

Table 3.10 Price Elasticities of Intensive Margins for Corn.
Dependent Variable: Water Application (ac-in/ac)
Crop: Corn
Estimation Method: Tobit (lower bound=0)
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.27***   -0.28***   -0.27***   -0.27***
Marginal Irrigation Cost                                                -0.10***   -0.11***   -0.056     -0.051
Land Capability             -0.11**    -0.13**    -0.16**    -0.16**    -0.1*      -0.17***   -0.21***   -0.19***
Total Acres (,000)          -0.003     -0.009     -0.001     -0.004     -0.016     -0.024     -0.011     -0.01
GD                          0.011*     0.005*     0.002*     -0.001*    0.0103     0.002      0.0049     0.00007
SD                          -0.03***   -0.03***   -0.02***   -0.02***   -0.05***   -0.014     -0.19*     -0.02*
Obs.                        1,328      1,621      2,345      2,888      1,196      1,467      2,143      2,647
F-value                     19.46      17.38      24.89      30.16      3.03       3.34       5.89       4.73

Estimation Method: Pooled OLS
Tolerance                   0          0.01       0.05       0.1        0          0.01       0.05       0.1
Average Energy Cost         -0.27***   -0.29***   -0.27***   -0.27***
Marginal Irrigation Cost                                                -0.11***   -0.12***   -0.06      -0.06
Land Capability             -0.11**    -0.13***   -0.16***   -0.16***   -0.1*      -0.18***   -0.23***   -0.21***
Total Acres (,000)          -0.003     -0.01      -0.002     -0.004     -0.017     -0.025     -0.012     -0.01
GD                          0.01*      0.01       0.002      -0.001     0.01       0.003      0.005      0.0001
SD                          -0.04***   -0.03***   -0.02***   -0.02***   -0.05***   -0.02      -0.02*     -0.02*
Obs.                        1,328      1,621      2,345      2,888      1,196      1,467      2,143      2,647
R²                          0.26       0.24       0.22       0.21       0.08       0.07       0.05       0.04
Note: *, ** and *** represent 10%, 5% and 1% significance levels, respectively. Linearized standard errors are in parentheses.

Figure 3.5 Price Elasticities for Extensive and Intensive Margins.
Note: Water cost used is average energy cost. Standard errors are estimated by Taylor linearization with weights from USDA NASS. The range of each estimate represents the 95% confidence interval. Price elasticities are estimated with other variables at their mean values.

Table 3.11 Predicted Water Reduction with Estimated Price Elasticities based on Average Energy Cost.
                                      IL        IN        MI        MN        OH        WI
Corn
Year 2018
  Irrigated Acres (,000)              313.4     317.2     378.7     280.4     8.0       142.9
  Water Application (ac-in/ac)        7.2       4.8       6.0       6.0       8.4       6.0
  Total Water Usage (,000 ac-in)      2256.5    1522.6    2272.2    1682.4    67.2      857.4
10% Tax for Average Energy Cost
  Irrigated Acres (,000)              308.5     312.3     372.3     275.8     7.4       140.8
  Water Application (ac-in/ac)        7.0       4.7       5.8       5.8       8.3       5.9
  Total Water Usage (,000 ac-in)      2160.4    1461.8    2170.6    1613.4    61.5      823.7
Change of Total Water Usage (%)       -4.3      -4.0      -4.5      -4.1      -8.5      -3.9
Soybean
Year 2018
  Irrigated Acres (,000)              184.4     175.3     192.5     104.2     4.7       73.8
  Water Application (ac-in/ac)        6.0       6.0       6.0       4.8       2.4       4.8
  Total Water Usage (,000 ac-in)      1106.4    1051.8    1155.0    500.2     11.3      354.2
10% Tax for Average Energy Cost
  Irrigated Acres (,000)              179.4     174.5     188.8     103.2     4.4       72.5
  Water Application (ac-in/ac)        5.8       5.9       5.9       4.7       2.4       4.7
  Total Water Usage (,000 ac-in)      1045.5    1024.9    1105.4    482.7     10.3      338.2
Change of Total Water Usage (%)       -5.5      -2.6      -4.3      -3.5      -8.3      -4.5
Data Source: USDA Farm and Ranch Irrigation Survey (2003, 2008, 2013); Irrigation and Water Management Survey (2018). Estimated price elasticities are from the empirical analysis.

Conclusion

Few studies focus on the potential water scarcity issue in a water-rich region. As such, more research is needed to better understand irrigation water issues in the Great Lakes region. By employing USDA FRIS/IWMS data from 2003 to 2018, we investigate what kind of water cost farmers respond to, estimate multiple price elasticities of irrigation water usage, and predict the effect of a potential tax on irrigation water reduction.
Our findings are as follows: (i) AEC and MIC perform similarly for all-crop measures of the extensive and intensive margins, although price elasticities based on MIC are slightly smaller; (ii) for crop-specific values, price elasticities based on MIC might be underestimated and are sensitive to the error tolerance; (iii) if policymakers imposed a 10% tax on average energy cost, total water usage would decrease by about 4% for both corn and soybean; and (iv) if average energy cost increased by 20% due to a tax policy, total water usage would decrease by 7.2% and 8.1% for corn and soybean, respectively.

Our findings have three broader implications: (i) price elasticities in previous studies of U.S. western states might be underestimated because of the adoption of a uniform irrigation pumping efficiency; (ii) model specification affects the estimated price elasticity of extensive margin water usage; and (iii) price elasticities are heterogeneous across states, so a uniform tax structure will induce different water reductions across states, all else equal.

REFERENCES

Baum, C.F. 2008. Modeling proportions. Stata Journal 8: 299–303.
Borenstein, S. 2009. To what electricity price do consumers respond? Residential demand elasticity under increasing-block pricing. Preliminary Draft April, 30, 95.
Bound, J., and A.B. Krueger. 1991. The extent of measurement error in longitudinal earnings data: Do two wrongs make a right? Journal of Labor Economics, 9(1): 1-24.
Bound, J., C. Brown, G.J. Duncan, and W.L. Rodgers. 1994. Evidence on the validity of cross-sectional and longitudinal labor market data. Journal of Labor Economics, 12(3): 345-368.
Burton, I. 1996. The growth of adaptation capacity: practice and policy. In Adapting to Climate Change: An International Perspective, eds. J.B. Smith, N. Bhatti, G.V. Menzhulin, R. Benioff, M. Campos, B. Jallow, F. Rijsberman, M.I. Budyko, and R.K. Dixon, 55-67, Springer, New York.
Caswell, M. F., and D. Zilberman. 1986. The effects of well depth and land quality on the choice of irrigation technology. American Journal of Agricultural Economics, 68(4): 798-811.
Che, Y., H. Feng, and D.A. Hennessy. 2020. Recency effects and participation at the extensive and intensive margins in the US Federal Crop Insurance Program. The Geneva Papers on Risk and Insurance-Issues and Practice, 45(1): 52-85.
Chiu, Y. W., B. Walseth, and S. Suh. 2009. Water embodied in bioethanol in the United States. Environmental Science and Technology, 43(8): 2688-92.
Condon, N., H. Klemick, and A. Wolverton. 2015. Impacts of ethanol policy on corn prices: A review and meta-analysis of recent evidence. Food Policy, 51: 63-73.
Croley, T.I. II. 1991. CCC GCM 2xCO2 hydrological impacts on the Great Lakes. Working Committee 3, International Joint Commission Levels Reference Study.
De Loe, R. C., and R.D. Kreutzwiser. 2000. Climate variability, climate change and water resource management in the Great Lakes. Climatic Change, 45(1): 163-179.
Deschênes, O., and M. Greenstone. 2007. The economic impacts of climate change: evidence from agricultural output and random fluctuations in weather. American Economic Review, 97(1): 354-385.
Döll, P. 2002. Impact of climate change and variability on irrigation requirements: a global perspective. Climatic Change, 54(3): 269-293.
Foster, H. S., and B.R. Beattie. 1979. Urban residential demand for water in the United States. Land Economics, 55(1): 43-58.
Foster, H. S., and B.R. Beattie. 1981. Urban residential demand for water in the United States: reply. Land Economics, 57(2): 257-265.
Gonzalez-Alvarez, Y., A.G. Keeler, and J.D. Mullen. 2006. Farm-level irrigation and the marginal cost of water use: Evidence from Georgia. Journal of Environmental Management, 80(4): 311-317.
Grannemann, N. G., et al. 2000. The importance of ground water in the Great Lakes Region (No. 2000-4008). US Geological Survey.
Hausman, J. 2001. Mismeasured variables in econometric analysis: problems from the right and problems from the left. Journal of Economic Perspectives, 15(4): 57-67.
Hendricks, N.P., and J.M. Peterson. 2012. Fixed effects estimation of the intensive and extensive margins of irrigation water demand. Journal of Agricultural and Resource Economics, 37(1): 1–19.
Hendricks, N. P. 2018. Potential benefits from innovations to reduce heat and water stress in agriculture. Journal of the Association of Environmental and Resource Economists, 5(3): 545-576.
Hrozencik, R. A., D.T. Manning, J.F. Suter, and C. Goemans. 2022. Impacts of block-rate energy pricing on groundwater demand in irrigated agriculture. American Journal of Agricultural Economics, 104(1): 404-427.
Ito, K. 2014. Do consumers respond to marginal or average price? Evidence from nonlinear electricity pricing. American Economic Review, 104(2): 537-63.
Kornelis, A., and P. Norris. 2020. Irrigation water demand: Price elasticities and climatic determinants in the Great Lakes Region. Agricultural and Resource Economics Review, 49(3): 437-464.
Lautenberger, M. C., and P.E. Norris. 2016. Private rights, public interests and water use conflicts: evolving water law and policy in Michigan. Water Policy, 18(4): 903-917.
Mieno, T., and N. Brozović. 2016. Price elasticity of groundwater demand: attenuation and amplification bias due to incomplete information. American Journal of Agricultural Economics, https://doi.org/10.1093/ajae/aaw089.
Moore, M. R., N.R. Gollehon, and M.B. Carey. 1994. Multicrop production decisions in western irrigated agriculture: the role of water price. American Journal of Agricultural Economics, 76(4): 859-874.
Mubako, S. T., B.L. Ruddell, and A.S. Mayer. 2013. Relationship between water withdrawals and freshwater ecosystem water scarcity quantified at multiple scales for a Great Lakes watershed. Journal of Water Resources Planning and Management, 139(6): 671-681.
Nataraj, S., and W.M. Hanemann. 2011. Does marginal price matter? A regression discontinuity approach to estimating water demand. Journal of Environmental Economics and Management, 61(2): 198-212.
Papke, L. E., and J.M. Wooldridge. 1996. Econometric methods for fractional response variables with an application to 401(k) plan participation rates. Journal of Applied Econometrics, 11(6): 619-632.
Papke, L. E., and J.M. Wooldridge. 2008. Panel data methods for fractional response variables with an application to test pass rates. Journal of Econometrics, 145(1-2): 121-133.
Pfeiffer, L., and C.Y.C. Lin. 2014. Does efficient irrigation technology lead to reduced groundwater extraction? Empirical evidence. Journal of Environmental Economics and Management, 67(2): 189-208.
Rogers, D. H. 2006. Pumping plant efficiency, fuel options and costs. In Proceedings for 2006 Central Plains irrigation conference, Colby, Kansas, February 21-22. Colorado State University. Libraries.
Rogers, D., and M. Alam. 2006. Comparing irrigation energy costs. Irrigation Management Series MF-2360, Kansas State University. Available online at http://www.ksre.ksu.edu/library/ageng2/mf2360.pdf.
Schlenker, W., and M.J. Roberts. 2009. Nonlinear temperature effects indicate severe damages to US crop yields under climate change. Proceedings of the National Academy of Sciences, 106(37): 15594-15598.
USDA, N. 2018. Irrigation and Water Management Survey. Summary available at: https://www.nass.usda.gov/Publications/Highlights/2019/2017Census_Irrigation_and_WaterManagement.pdf.
Wooldridge, J. M. 2005. Simple solutions to the initial conditions problem in dynamic, nonlinear panel data models with unobserved heterogeneity. Journal of Applied Econometrics, 20(1): 39-54.
Wooldridge, J. M. 2010. Econometric Analysis of Cross Section and Panel Data, second ed. MIT Press, Cambridge, MA.
Wooldridge, J. M. 2019. Correlated random effects models with unbalanced panels. Journal of Econometrics, 211(1): 137-150.
Xu, Z., D.A. Hennessy, K. Sardana, and G. Moschini. 2013. The realized yield effect of genetically engineered crops: US maize and soybean. Crop Science, 53(3): 735-745.

APPENDIX

Table 3A.1 Crop-Specific Farm Amounts (N=7,936).

Crop                               Obs.     Crop                            Obs.
01 Corn for grain or seed          5,260    07 Wheat for grain or seed      347
02 Soybeans for beans              3,513    08 Beans, dry edible            267
03 Alfalfa and alfalfa mixtures    750      09 Sorghum for grain or seed    21
04 Potatoes                        679      10 Rice                         6
05 All berries                     661      11 Cotton                       3
06 Corn for silage or greenchop    512      12 Peanut                       0

Note: The final sample contains 7,936 observations. The 12 crop categories shown are ranked by number of farms.

Figure 3A.1 Water Usage by Source in 2013 and 2018.
Note: Data are from the 2013 FRIS and 2018 IWMS summary reports.

Figure 3A.2 Percentage of State-Level Sample Size over 2003-2018.
Note: The yellow curve indicates the targeted area in this study. For example, “MI 1.6” means that Michigan farms account for about 1.6% of the national sample (=541/34,783). Source: USDA FRIS/IWMS.