UNDERSTANDING TRANSIENT TECHNOLOGY USE AMONG SMALLHOLDER FARMERS IN AFRICA

By

Maolong Chen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

Agricultural, Food, and Resource Economics - Doctor of Philosophy

2018

ABSTRACT

UNDERSTANDING TRANSIENT TECHNOLOGY USE AMONG SMALLHOLDER FARMERS IN AFRICA

By

Maolong Chen

The objective of this dissertation is to study African smallholder farmers’ transient technology use. Transient use refers to the situation where farmers switch back and forth between two or more technologies. To better understand transient technology use, I first specify a dynamic theoretical model to investigate farmers’ optimal decision rules given the availability of modern and traditional technologies under a range of productivity and market scenarios. The model is then calibrated and solved using a dynamic programming algorithm. Numerical results show that expected profitability and the costs of switching between technologies are the two main driving forces behind the patterns of transient technology use.

Next I turn to econometric insights about transient technology use in Africa. The sample data used in this dissertation are an irregularly spaced four-wave panel data set, which makes all existing traditional discrete choice estimators inconsistent for dynamic panel estimation. Therefore, before conducting the empirical analysis, I develop and evaluate the performance of three possible estimators (gap-dummy approach, linear probability model, and indirect inference method) for discrete choice dynamic panel data models with irregular spacing. Monte Carlo simulations reveal that traditional estimators generate downward bias in estimates of the state dependence parameter. Adding gap dummies that indicate whether a panel period is irregularly spaced can potentially reduce the bias. The other two estimators, linear probability model and indirect inference, fail to reduce the bias from irregular spacing effectively in our simulations.

The final task is to undertake an empirical analysis of Kenyan smallholder farmers’ decision to use hybrid maize seed. The gap dummy approach is applied to reduce the bias from the irregular spacing problem. Our findings provide empirical evidence that hybrid maize seed use is a dynamic process with a high degree of state dependence. Transient use nevertheless occurs regularly: switching is influenced by the expected relative profitability of hybrid and traditional varieties, but the choice is also highly state dependent, consistent with the existence of switching costs and/or learning-by-doing effects.

Copyright by
MAOLONG CHEN
2018

I dedicate this dissertation to my dear daughter, Chloe N. Chen.

ACKNOWLEDGEMENTS

I would like to express my special thanks and gratitude to my major professor, Robert J. Myers, for every invaluable lesson I have learnt from him. He guided me through the whole PhD program, encouraged me to be an independent researcher, and gave me the freedom to do whatever I wanted and to try every possibility I imagined. I also gratefully acknowledge the other members of my research committee for their valuable feedback and inspiration. My thanks go to Joseph Herriges, who provided me with programming suggestions, to Jeffrey Wooldridge, who helped me with econometric problems, and to Thomas Jayne, who offered me empirical advice.

I would also like to thank a group of people at MSU for offering me a great deal of help in study, research, and life. Thanks to David L. Ortega for leading me into the academic world and sharing his precious experience and insights. Thanks to Songqing Jin and Hongya Chen for their support and help during my graduate studies.
My sincere thanks also go to the graduate students, faculty, and staff in AFRE and all my friends at MSU for all the great times that we have shared.

I am deeply thankful to my family for their love, understanding, and sacrifices. Without them, I would never have made it happen. Thank you to my father, mother, father-in-law, and mother-in-law for their constant encouragement and trust. Last, I want to give my special thanks to my wife, Chaoran Hu, for everything she has done for me.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION
CHAPTER 1. UNDERSTANDING TRANSIENT TECHNOLOGY USE AMONG SMALLHOLDER FARMERS IN AFRICA: A DYNAMIC PROGRAMMING APPROACH
  1.1. Introduction
  1.2. Background
  1.3. Conceptual Model
  1.4. Numerical Model and Parameterization
    1.4.1. Modeling Maize Price Expectations
    1.4.2. Modeling Maize Yield Expectations
    1.4.3. Modeling Production and Switching Costs
  1.5. Numerical Results
    1.5.1. Value Function and Decision Rules
    1.5.2. Fluctuations in Relative Profitability of Hybrid vs. Traditional Seeds
    1.5.3. Scenario Analysis
      1.5.3.1. Alternative Hybrid Yield Scenarios
      1.5.3.2. Alternative Hybrid Yield Variance Scenarios
      1.5.3.3. Alternative Switching Cost Scenarios
      1.5.3.4. Alternative Price Yield Correlations and Price Variance
      1.5.3.5. Learning and Reductions in Switching Costs
  1.6. Conclusions
  REFERENCES
CHAPTER 2. ESTIMATING DYNAMIC DISCRETE CHOICE PANEL MODELS USING IRREGULARLY SPACED DATA
  2.1. Introduction
  2.2. Dynamic Panel Data Models with Continuous Dependent Variables
    2.2.1. Regularly Spaced Data
    2.2.2. Irregularly Spaced Data
  2.3. Discrete Choice Dynamic Panel Data Models
    2.3.1. Regularly Spaced Data
    2.3.2. Irregularly Spaced Data
  2.4. Alternative Estimation Approaches for Discrete Choice DPD models under Irregular Spacing
    2.4.1. Correlated-Random-Effects Probit (CRE-P)
    2.4.2. Correlated-Random-Effects Probit with Gap Dummies (CRE-PGD)
    2.4.3. Linear Probability Model Estimator Using a US Spacing Structure (LPM-US)
    2.4.4. Indirect Inference Approach (IIA)
  2.5. Evaluating Estimator Performance
    2.5.1. Monte Carlo Experiments
    2.5.2. Monte Carlo Results
  2.6. Conclusions
  REFERENCES
CHAPTER 3. TRANSIENT USE OF HYBRID MAIZE: IRREGULARLY SPACED DYNAMIC PANEL EVIDENCE FROM KENYA
  3.1. Introduction
  3.2. Data and Descriptive Statistics
  3.3. Conceptual Model
  3.4. Empirical Implementation
    3.4.1. Modeling Maize Price Expectations
    3.4.2. Modeling Fertilizer Use and Fertilizer Cost Differential Expectations
    3.4.3. Modeling Yield Differential Expectations
  3.5. Results
    3.5.1. Determinants of Hybrid Adoption
    3.5.2. Static vs. Dynamic Estimation
    3.5.3. Irregular Spacing Effects
  3.6. Conclusion
  APPENDIX
  REFERENCES

LIST OF TABLES

Table 1-1 Possible transitions across hybrid/non-hybrid use
Table 1-2 Proportion of households by adoption history category
Table 1-3 Unit root and stationarity tests for monthly maize price
Table 1-4 VAR lag order selection criteria for first differenced maize price
Table 1-5 Regression results for monthly maize prices
Table 1-6 LM test for autoregressive conditional heteroscedasticity in monthly maize prices
Table 1-7 Unit root and stationarity tests for aggregate maize yield
Table 1-8 Regression results for aggregate maize yields
Table 1-9 LM test for autoregressive conditional heteroscedasticity in aggregate maize yields
Table 1-10 Household-Level Yield Parameterization Results
Table 1-11 Baseline dynamic programming parameterization
Table 1-12 Number of switching, and adoption/disadoption duration over 1000 periods
Table 1-13 Parameterization for the learning model
Table 2-1 Monte Carlo results for regularly spaced CRE-P (Estimates of Coefficients)
Table 2-2 Monte Carlo results for regularly spaced CRE-P (Estimates of APEs)
Table 2-3 Results of experiment 2 for irregular spacing evaluation with exogenous x_it (Estimates of Coefficients)
Table 2-4 Results of experiment 2 for irregular spacing evaluation with exogenous x_it (Estimates of APEs)
Table 2-5 Results of experiment 2 for irregular spacing evaluation with endogenous x_it (Estimates of Coefficients)
Table 2-6 Results of experiment 2 for irregular spacing evaluation with endogenous x_it (Estimates of APEs)
Table 2-7 Results of experiment 3 for different state dependence of y_it with exogenous x_it (Estimates of Coefficients)
Table 2-8 Results of experiment 3 for different state dependence of y_it with exogenous x_it (Estimates of APEs)
Table 2-9 Results of experiment 3 for different state dependence of y_it with endogenous x_it (Estimates of Coefficients)
Table 2-10 Results of experiment 3 for different state dependence of y_it with endogenous x_it (Estimates of APEs)
Table 2-11 Results of experiment 4 for different persistence of x_it and with exogenous x_it (Estimates of Coefficients)
Table 2-12 Results of experiment 4 for different persistence of x_it and with exogenous x_it (Estimates of APEs)
Table 2-13 Results of experiment 4 for different persistence of x_it and with endogenous x_it (Estimates of Coefficients)
Table 2-14 Results of experiment 4 for different persistence of x_it and with endogenous x_it (Estimates of APEs)
Table 3-1 Possible transitions across hybrid/non-hybrid use
Table 3-2 Proportion of households by adoption history category
Table 3-3 Maize production summary statistics by adoption pattern
Table 3-4 Income information by adoption pattern
Table 3-5 Market infrastructure statistics by adoption pattern
Table 3-6 Household’s expectations of maize selling and buying prices
Table 3-7 Household’s expectations of fertilizer use for hybrid and traditional seeds
Table 3-8 Household’s yield differential expectations between hybrids and traditional seeds
Table 3-9 Main estimates of hybrid adoption models (coefficients)
Table 3-10 Main estimates of hybrid adoption models (average partial effects)
Table 3-11 Predicted price elasticities of hybrid adoption
Table 3A-1 Robustness check for modeling price expectation with different lag length
Table 3A-2 Estimates of Maize Price Expectations
Table 3A-3 Estimates of fertilizer use expectations
Table 3A-4 Estimates of yield response model
Table 3A-5 Full estimates of hybrid adoption model

LIST OF FIGURES

Figure 1-1 Conditional value functions for adopters and non-adopters
Figure 1-2 Optimal adoption rules under alternative switching costs
Figure 1-3 Expected paths of yield and profit differentials between hybrid and traditional seeds
Figure 1-4 Expected path of profit differential under different production cost differentials
Figure 1-5 Expected path of profit differential under different maize price
Figure 1-6 Adoption rate under alternative hybrid yield scenario
Figure 1-7 Different hybrid yield variance scenarios
Figure 1-8 Adoption rate under alternative switching cost scenarios given high profitability of hybrids
Figure 1-9 Adoption rate under alternative switching cost scenarios given low profitability of hybrids
Figure 1-10 Adoption rate under different price-yield differential correlation
Figure 1-11 Adoption rate under different price variance
Figure 1-12 Influences on the adoption process when switching costs are decreasing in hybrid use

INTRODUCTION

Development and adoption of new crop varieties and other technological improvements have led to massive increases in agricultural productivity in many parts of the world. However, the progress of technology development and adoption in Africa remains slow, due partly to the high rate of disadoption and switching back and forth between modern and traditional technologies. This dissertation aims to study this transient technology use in a dynamic context.

The first chapter investigates the patterns of transient technology use through a dynamic programming approach. A dynamic conceptual model is developed to explain transient use, and the model is then calibrated and solved using a dynamic programming algorithm. Numerical results show that relative profitability, yield uncertainty, and switching costs are important influences on the pattern of adoption and disadoption. Switching costs discourage households from both entering and exiting modern technology use, and the profitability of modern technologies determines whether the switching cost will encourage or discourage long-run adoption.

The second chapter attempts to improve estimation of dynamic panel data discrete choice models with irregular spacing. The panel data used in this dissertation consist of four irregularly spaced waves, making all commonly used dynamic discrete choice panel data estimators inconsistent. Thus, before conducting the empirical analysis of Kenyan hybrid maize adoption, I first develop three estimators for discrete choice dynamic panel data models with irregular spacing, and evaluate the performance of these estimators using Monte Carlo methods. Monte Carlo simulations reveal that traditional estimators (Correlated-Random-Effects Probit) generally produce downward bias in estimates of the state dependence parameter in the dynamic model because of missing lagged dependent variables. Adding dummies to indicate whether a period is irregularly spaced can potentially reduce the bias, but the effectiveness depends on the panel structure: having at least two consecutively observed periods enables the dummy variable approach to account for irregular spacing effects more effectively. Simulation results also show that, in most scenarios, the estimates of the contemporaneous effects of covariates in dynamic panels are unbiased or have only small bias.

The third chapter explores the determinants of transient technology use in Africa. A four-wave panel data set from the Tegemeo Agricultural Monitoring and Policy Analysis (TAMPA) Project between Tegemeo Institute at Egerton University, Kenya and Michigan State University is used to investigate Kenyan smallholder farmers’ decision to use hybrid maize seed. The panel is irregularly spaced, so the approach developed in Chapter 2 is used to reduce the bias from irregular spacing. Our findings provide empirical evidence to support the findings from Chapter 1 that transient seed technology use in Africa is determined by both profitability and adoption persistence (either switching costs or learning effects). On the one hand, fluctuations in maize and fertilizer prices can reverse the relative profitability of hybrids versus traditional seeds and lead households to switch back and forth between hybrid and traditional varieties. On the other hand, adoption persistence pushes households to persist with their recent seed use choice, despite the apparent profitability of changing. These two effects jointly determine the patterns and rate of adoption of hybrid maize seed in Africa.

Taken together, these chapters shed light on transient technology use in Africa. The first chapter provides a deeper understanding of transient technology use. The second chapter improves the econometric approaches to estimating irregularly spaced dynamic discrete choice panel models. The third chapter provides empirical evidence to support the importance of the key factors influencing transient technology use that were identified in the dynamic programming model of Chapter 1.

CHAPTER 1. UNDERSTANDING TRANSIENT TECHNOLOGY USE AMONG SMALLHOLDER FARMERS IN AFRICA: A DYNAMIC PROGRAMMING APPROACH

1.1. Introduction

Development and adoption of new crop varieties and other technological improvements have led to massive increases in agricultural productivity in many parts of the world. However, productivity gains in Africa have been disappointing (Mwangi 1996; Duflo, Kremer, and Robinson 2008). Given apparent land scarcity and low land fertility in Africa, many view intensive agriculture based on modern technologies as crucial for Africa to reach its development potential (De Groote et al. 2002; Lee 2005; Pannell and Vanclay 2011). A number of policies have been implemented to encourage the adoption of new technologies and modern inputs throughout Africa, including direct input subsidies (primarily fertilizer), government-facilitated provision of input credit, and centralized control of input procurement and distribution (Ouma et al. 2002). Even with these initiatives, however, the progress of technology development and adoption in Africa remains slow (Spencer 1996; Moser and Barrett 2003; Dercon and Christiaensen 2011).
There is considerable existing research on technology adoption (Byerlee 1994; Mwangi 1996; Zeller, Diagne, and Mataya 1998; Sunding and Zilberman 2001; Doss 2006; Suri 2011). This research has focused on explaining technology adoption based on farmer characteristics, farmer information, expected profitability, risk, the existence of marketing and transportation infrastructure, and the availability of credit and liquidity for seed and fertilizer purchases. For example, Mwangi (1996) identified liquidity constraints as one of the key factors affecting the adoption decision, especially when farmers with little cash are planting under high risk. Byerlee et al. (1994) comment that “the profitability of using technologies is highly site-specific, depending on land pressure, agro-climatic variables, fertilizer costs, and farm-gate crop prices”. Suri (2011) provides empirical evidence of heterogeneous profitability of technologies to explain low adoption rates in some areas of Africa.

Most of the existing research on technology adoption assumes that adoption is a one-time decision so that, once adopted, a new technology will continue to be used until a better one becomes available. There has been some work on technology adoption in a dynamic context that allows for learning effects and the option to delay adoption (e.g. Foster and Rosenzweig 1995; Conley and Udry 2010). Even in this framework, however, the decision to adopt is still typically viewed as a one-time decision.[1] This is at odds with what we observe in some technology adoption environments where farmers switch back and forth between two or more technologies. This is particularly true for hybrid seed use in Africa, where panel data sets reveal individual farmers commonly switching back and forth between modern varieties and traditional local varieties (Ouma et al. 2002; Tura et al. 2010). We provide descriptive data below that support these observations for maize production in Kenya and Zambia. We term this technology switching behavior “transient technology use”; it has been little studied to date.

[1] There is also a small literature on technology disadoption, but again disadoption is viewed as a one-time decision.

Because technology switching behavior is clearly a dynamic process, this paper develops a dynamic switching model to explain and study transient technology use. Dynamic switching models study optimal sequential choice patterns among a potential set of activities. They have been applied in various areas of economics, such as labor participation, brand choice, asset replacement, and industrial organization (Wolpin 1984; Eckstein and Wolpin 1989a; Hyslop 1999; Rust 1989; Dixit 1989; Das and Das 1997; Ackerberg 2003; Kim 2006; Das, Roberts, and Tybout 2007). One important feature of this type of model is that switching from one choice to another is costly, and the existence of switching costs creates an incentive to wait, which significantly influences the optimal decision path. For example, Eckstein and Wolpin (1989) developed a dynamic labor force participation model to study married women’s decisions on whether or not to work in each period over a finite horizon. The dynamics arise from the effect of work experience on wages, and thus on future work decisions. Dixit (1989) investigated a firm’s decision to enter and exit production when facing a random-walk output price and sunk costs. The resulting optimal decision rules depend on switching costs and the sunk cost of investment, explaining the phenomenon of hysteresis.

In the model developed in this chapter, transient technology use is driven by the relative profitability of different technologies, the costs of switching between them, and a learning process that reduces switching costs as experience with new technologies grows.
The switching costs and learning effect introduce a degree of irreversibility into technology adoption choices, but do not restrict adoption to be fully irreversible, as is implicitly assumed in much of the existing technology adoption literature. The conceptual model is then calibrated and solved numerically using a dynamic programming algorithm. Simulations of the model illustrate how changes in switching costs, relative profitability, and the learning process can lead to different patterns and durations of transient technology use. The paper contributes several new insights into the process of transient technology adoption and the factors that cause it, and suggests potential policy levers to discourage persistent disadoption cycles.

1.2. Background

The model in this paper is motivated by, and calibrated to, farmer data from the TAMPA Project between Tegemeo Institute at Egerton University, Kenya and Michigan State University. It is a four-wave household-level panel survey (2000, 2004, 2007, 2010), representative of rural maize-growing areas in Kenya. The sample has 1207 households tracked in all four waves. Table 1-1 lists all the possible four-period transitions of hybrid seed use, and the corresponding number of the 1207 households that fall into each transition category. Table 1-2 then classifies the households according to their adoption history (never adopted, always adopted, adopted and continued, adopted and disadopted, and transient use). Two observations are worth noting. First, while over 90% of households used hybrids in at least one sample year, almost 23% of the sample subsequently disadopted them. Second, almost 15% of the sample displayed transient use (switching back and forth between hybrids and traditional varieties).
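As an illustration of how four-wave seed-use histories can be grouped into the adoption-history categories named above, the following minimal Python sketch classifies a binary history by counting switches. The rule used here is an assumption made for illustration only: the exact mapping from transitions to categories is defined in Tables 1-1 and 1-2, which are not reproduced in this section, and may differ (for example, in how a single adopt-then-disadopt sequence is treated). The function name `classify_history` is hypothetical.

```python
from collections import Counter
from itertools import product


def classify_history(history):
    """Classify a sequence of 0/1 seed choices (1 = hybrid) across survey waves.

    Assumed rule (illustrative, not taken from Table 1-1):
      no switch            -> "never adopted" or "always adopted"
      one switch, 0 -> 1   -> "adopted and continued"
      one switch, 1 -> 0   -> "adopted and disadopted"
      two or more switches -> "transient use"
    """
    if not any(history):
        return "never adopted"
    if all(history):
        return "always adopted"
    # Collect every wave-to-wave change in seed choice.
    switches = [(a, b) for a, b in zip(history, history[1:]) if a != b]
    if len(switches) == 1:
        if switches[0] == (0, 1):
            return "adopted and continued"
        return "adopted and disadopted"
    return "transient use"


# Tally the categories over all 16 possible four-wave histories.
category_counts = Counter(
    classify_history(h) for h in product((0, 1), repeat=4)
)
for category, count in sorted(category_counts.items()):
    print(f"{category}: {count} possible histories")
```

Applied to household data, the same function could be mapped over each household's four observed choices to obtain category shares comparable in spirit to Table 1-2, subject to the caveat that the classification rule is assumed.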
These data show that transient use of hybrid seeds is an important phenomenon in Kenya and suggest that transient technology use may be important in other technology adoption contexts as well.

Of course, transient technology use may occur simply because the relative returns from using the alternative technologies fluctuate over time, and the costs of switching between them are minimal. In most technology environments, however, it is not costless to switch technologies. In addition to the financial investment required, production processes and practices may have to be adjusted and an investment has to be made in learning how to use the new technology, at least until some experience has been gained. This suggests that transient technology use is a dynamic process and we need a dynamic model to characterize it.

1.3. Conceptual Model

Consider a farmer with two maize seed technologies available: hybrid and traditional seeds. If the hybrid variety is used, realized profits per acre are given by $\pi_t^H = p_t y_t^H - c_t^H$, where $p$ is maize price, $y$ is maize yield, $c$ is cost of production per acre, and superscript $H$ indicates hybrid. Similarly, if the traditional variety is used, realized profits per acre are given by $\pi_t^T = p_t y_t^T - c_t^T$, where the superscript $T$ indicates traditional variety seed. We assume maize output price is the same irrespective of whether maize is produced from hybrid or traditional seed, and that price and yield are uncertain at planting time when the seed technology choice has to be made. We keep other resource allocation decisions in the background by assuming that, once a seed choice has been made, production practices and other input use are set to recommended levels for that seed technology (i.e., seed is the only explicit choice variable).

In addition to production costs, there are costs of switching from one seed type to the other.
The cost of switching from traditional to hybrid seeds includes costs of searching for and establishing a relationship with hybrid vendors, screening to ensure seed quality, and investing in learning about differences in recommended production practices. The cost of switching from hybrid to traditional seeds includes the cost of adjusting back to traditional production practices, re-acquainting with traditional farming practices, learning about changes to soil quality brought on by hybrid production practices, etc. It is logical for the cost of switching from traditional to hybrid seeds to be higher than the cost of switching from hybrid to traditional seeds, and for switching costs to be decreasing in the number of times hybrids have been used in the past (a learning effect). Per acre switching costs are therefore denoted by $s_t^{T\to H}(n_t)$ for switching from traditional to hybrids and $s_t^{H\to T}(n_t)$ for switching from hybrids to traditional varieties, where $n_t$ is the number of times hybrids have been used in the past.

We assume the farmer is risk-neutral (or can insure risks) and chooses traditional or hybrid seed to maximize the discounted sum of expected lifetime profits over an infinite horizon:²

$$\max_{\{d_t\}} \; E_{-1} \sum_{t=0}^{\infty} \beta^t \left\{ d_t \left[ \pi_t^H - s_t^{T\to H}(n_t)\,(d_t - d_{t-1}) \right] + (1 - d_t) \left[ \pi_t^T - s_t^{H\to T}(n_t)\,(d_{t-1} - d_t) \right] \right\} \tag{1}$$

subject to $n_{t+1} = n_t + d_t$, where $d_t$ is a binary decision variable with $d_t = 1$ indicating hybrid seed is chosen and $d_t = 0$ indicating traditional seed is chosen. Switching costs are incurred only when $d_t \neq d_{t-1}$ (i.e., the technology is switched). The model is solved using dynamic programming.
The relevant value function takes the form,

$$v_t(d_{t-1}) = \max\left\{ v_t^H(d_{t-1}),\; v_t^T(d_{t-1}) \right\} \tag{2}$$

where $v_t^H(d_{t-1})$ and $v_t^T(d_{t-1})$ are the conditional value functions for hybrid and traditional seed use given by,

$$v_t^H(d_{t-1}) = E_{t-1}(\pi_t^H) - s_t^{T\to H}(n_t)\,(1 - d_{t-1}) + \beta E_{t-1} v_{t+1}(1), \tag{3a}$$

$$v_t^T(d_{t-1}) = E_{t-1}(\pi_t^T) - s_t^{H\to T}(n_t)\,d_{t-1} + \beta E_{t-1} v_{t+1}(0), \tag{3b}$$

where $v_{t+1}(d_t)$ is the discounted value of future profits from choosing hybrid ($d_t = 1$) or traditional ($d_t = 0$) seed today, assuming optimal seed choices are made in the future.

² Most households in the Kenya data chose to plant only one type of seed in each season so profit is normalized to a per acre basis and the seed decision is assumed to be binary.

There are two cases to consider. First, suppose the traditional variety was used last planting period ($d_{t-1} = 0$). Then the switch to hybrids will occur if,

$$E_{t-1}\pi_t^H - E_{t-1}\pi_t^T > s_t^{T\to H}(n_t) + \beta\left[ E_{t-1} v_{t+1}(0) - E_{t-1} v_{t+1}(1) \right], \tag{4}$$

otherwise, traditional seeds will continue to be used. Without switching costs the right-hand side of (4) is zero and the decision rule reduces to the simple static condition that the switch to hybrids occurs if the expected current production profits under hybrids exceed expected current production profits from using traditional seeds. With switching costs, however, the difference in expected current production profit must exceed a premium composed of two parts. The first part is the (always positive) switching costs. The second part is the discounted expected future profit premium from sticking with the traditional seeds today. The second part may be positive or negative, depending on the expected future profitability of hybrids compared to traditional varieties, and on the expected magnitude and frequency of future switching costs. If the premium is positive we may observe the farmer continuing to use traditional varieties, even when the current expected return from switching to hybrids is positive.
The model is therefore capable of explaining the often-claimed-to-be-observed phenomenon of non-adoption even when adoption should increase current profits. The reason is essentially that non-adoption now eliminates the cost of switching back to traditional varieties at some point in the future.

Second, suppose the hybrid variety was used last planting period ($d_{t-1} = 1$). Then the switch to traditional varieties will occur if:

$$E_{t-1}\pi_t^T - E_{t-1}\pi_t^H > s_t^{H\to T}(n_t) + \beta\left[ E_{t-1} v_{t+1}(1) - E_{t-1} v_{t+1}(0) \right], \tag{5}$$

otherwise, hybrids will continue to be used. With no switching costs the rule again collapses to the simple static result that whichever seed type is expected to provide the most current production profit is used. With switching costs, however, the difference in expected current production profit from switching to traditional varieties must exceed a premium composed of (positive) switching costs and a (positive or negative) discounted expected future profit premium from sticking with the hybrids today. If the premium is negative we may observe the farmer switching back to traditional varieties, even when the current expected profits from using traditional seeds are lower than the current expected profits from sticking with hybrids. The model is therefore capable of explaining disadoption, even when continuing to use hybrids would be expected to generate increased current profits. The reason is essentially that disadopting now reduces the costs of having to switch back to traditional varieties in the future.

A number of results emerge from this conceptual model. First, because of switching costs the history of adoption decisions has an important influence on current adoption choice (current seed choice is conditioned on past practice). However, if switching costs decline as more experience is gained with hybrids (learning effect), then dependence on the history of past seed use will also decline.
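The two switching conditions can be read as a single rule that depends on last period's choice. The sketch below is only an illustration of that logic: the function name and arguments are hypothetical, and the expected continuation values are taken as given inputs rather than solved for.

```python
def choose_seed(d_prev, exp_profit_h, exp_profit_t,
                switch_cost_th, switch_cost_ht,
                beta, exp_value_next_h, exp_value_next_t):
    """Return 1 (hybrid) or 0 (traditional) following conditions (4) and (5).

    exp_value_next_h / exp_value_next_t are the expected continuation values
    after choosing hybrid / traditional seed today (assumed known here).
    """
    if d_prev == 0:
        # Condition (4): switch to hybrids only if the current profit gain
        # exceeds the switching cost plus the continuation-value premium.
        premium = switch_cost_th + beta * (exp_value_next_t - exp_value_next_h)
        return 1 if exp_profit_h - exp_profit_t > premium else 0
    # Condition (5): switch back to traditional seed only if its current
    # profit gain exceeds the corresponding premium.
    premium = switch_cost_ht + beta * (exp_value_next_h - exp_value_next_t)
    return 0 if exp_profit_t - exp_profit_h > premium else 1


# Same profit gap, different history: the previous choice is retained.
common = dict(exp_profit_h=5.5, exp_profit_t=5.0,
              switch_cost_th=1.4, switch_cost_ht=1.2,
              beta=0.95, exp_value_next_h=0.0, exp_value_next_t=0.0)
print(choose_seed(0, **common), choose_seed(1, **common))
```

With equal continuation values and a profit gap smaller than either switching cost, both a previous adopter and a previous non-adopter keep their current seed type, which is the state dependence the text describes.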
Second, the relative yields and costs from using hybrid versus traditional seeds will continue to play a major role in hybrid adoption and disadoption, because these will have a major impact on current and future profitability from adoption. Hence, the stochastic processes driving prices, yields, and costs, as well as the magnitude and dynamics of switching costs, will have a major impact on the prevalence of transient technology use. Third, there will be a band of inaction (waiting) in the optimal seed use rule. If traditional varieties (hybrids) are being used and the returns from adoption (disadoption) get high enough, adoption (disadoption) will occur. However, there will also be a band of inaction where returns that are not too far apart will lead to maintaining the status quo (continuing to use the current technology), despite the fact that switching may lead to higher expected current production profits.

1.4. Numerical Model and Parameterization

The empirical analysis focuses on a representative farm household’s maize seed decision in Kenya. We parameterize and solve the conceptual model numerically to highlight a number of important implications of switching costs and learning for transient technology use. A numerical model requires information on: (1) maize price expectations; (2) different yield expectations for hybrid and traditional seeds; (3) production costs for each seed type; and (4) the costs of switching between seed types. Calibration of each of these model components is discussed in turn. We start with a simplified model where switching costs are assumed to be constant, and then extend the model to allow switching costs to decrease with experience to examine how learning might influence the transient hybrid adoption process.

1.4.1. Modeling Maize Price Expectations

Maize price expectations are estimated from a univariate time series model of monthly Kenyan wholesale maize prices in Nairobi³ from 2000 to 2010 in Kenyan Shillings per kilogram.⁴ There were five missing prices (February-June 2005). These missing prices were interpolated using cubic spline interpolation and 2005 data from Mombasa (the second largest city of Kenya in population and for which complete data was available for 2005).

³ Nairobi is one of the major maize markets in Kenya and therefore has the longest and most reliable monthly data.
⁴ Source: Republic of Kenya, Ministry of Agriculture, Market Research and Information.

Unit root and stationarity tests on the Nairobi monthly maize price data are reported in Table 1-3. The null hypothesis of a unit root fails to be rejected at the 10% significance level based on both the Dickey-Fuller (DF) and Phillips-Perron (PP) tests. The null hypothesis of stationarity is rejected at the 1% level based on the Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test. Therefore, the price data are differenced and results of optimal lag order selection tests for first differenced maize price are shown in Table 1-4. The evidence suggests a simple random walk process for the monthly maize price:

$$p_t = p_{t-1} + \varepsilon_{pt}, \tag{6}$$

where $\varepsilon_{pt} \sim N(0, \sigma_p^2)$ is a random monthly price shock. We also tested for a time trend and seasonality by adding monthly dummies. The estimation results show no evidence of a time trend and little seasonality in price movements (Table 1-5). There is some evidence supporting a seasonal price jump in May but the effect is small and no seasonality is assumed as a simplification to keep the dynamic programming model as tractable as possible. The null hypothesis of no autoregressive conditional heteroskedasticity in the errors fails to be rejected at the 10% significance level based on Lagrange multiplier (LM) tests (Breusch and Pagan 1980) provided in Table 1-6. Therefore, the results suggest no significant heteroscedasticity in the price distribution.
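A useful property of the random walk in (6) is that the expected price several months ahead equals the current price, while the forecast variance grows linearly with the horizon. The simulation below is a minimal sketch of that property, not the estimation procedure used here. It assumes the rescaled price level of 1.5 and the monthly shock variance of 0.02 reported in the baseline parameterization, and a five-month horizon chosen purely for illustration.

```python
import random
import statistics


def harvest_price(p_plant, months_ahead, shock_var, rng):
    """Simulate one harvest price under p_t = p_{t-1} + eps_t."""
    price = p_plant
    for _ in range(months_ahead):
        price += rng.gauss(0.0, shock_var ** 0.5)
    return price


rng = random.Random(0)  # fixed seed so the run is reproducible
draws = [harvest_price(1.5, 5, 0.02, rng) for _ in range(20000)]

mean_price = statistics.fmean(draws)     # should sit close to the planting price
var_price = statistics.pvariance(draws)  # should sit close to 5 * 0.02
print(f"mean harvest price: {mean_price:.3f}, variance: {var_price:.3f}")
```

Because the shocks have mean zero, no information beyond the planting-time price helps predict the harvest price in this specification.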
We use equation (6) to represent monthly maize price movements but harvest price expectations are formed at planting, which may be several months prior to the harvest. The main harvest season for Kenyan maize production is from January to March while planting occurs in October. Thus, the time gap between planting and harvesting is four to six months. Denoting the planting month (October) as $p_{1t}$ then $p_{6t}$ would be the last harvest month when the majority of sales are occurring. Consistent price expectations would then be formed using:

$$E(p_{6t} \mid p_{1t}) = p_{1t}, \tag{7}$$

so the October (planting) price is the conditional expectation.

1.4.2. Modeling Maize Yield Expectations

Detailed time series data on individual farm maize yields for Kenya are not available. However, annual aggregate maize yield data for all Kenya are available from 1961 to 2014 from FAOSTAT in kilograms per acre. Our procedure is to estimate a model for the aggregate maize yield data and then make appropriate adjustments to the model to estimate farm-level maize yield distributions for both traditional and hybrid maize seeds.

The first step is to estimate a time series model for the aggregate maize yield data. Stationarity tests for aggregate maize yields provided mixed results (see Table 1-7). The null hypothesis of a unit root is rejected at the 5% level based on the Dickey-Fuller and Phillips-Perron tests. However, the null hypothesis of stationarity is also rejected at the 1% level based on the Kwiatkowski-Phillips-Schmidt-Shin test. Nevertheless, since most yield data have been found to be (trend) stationary, and a unit root in yields seems unlikely a priori, stationarity is assumed. The aggregate maize yield model is then specified as:

$$y_t^A = y^A + \beta_1 (y_{t-1}^A - y^A) + \varepsilon_{yt}^A, \tag{8a}$$

where the superscript $A$ denotes the aggregate yield; $\beta_1$ characterizes the speed of mean reversion; $y^A$ is the long-run mean; and $\varepsilon_{yt}^A \sim N(0, \sigma_{Ay}^2)$ is a random shock.
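The mean-reverting form in (8a) implies that a yield deviation from the long-run mean is expected to shrink by the factor $\beta_1$ each period. A short sketch of the implied multi-period forecast follows. The numbers are illustrative only: the mean-reversion parameter 0.66 and a long-run mean of 5.53 are taken from the household-level parameterization table, and the function name is hypothetical.

```python
def ar1_forecast(y_now, long_run_mean, beta1, horizon):
    """Expected yield `horizon` periods ahead under the AR(1) in (8a)."""
    return long_run_mean + beta1 ** horizon * (y_now - long_run_mean)


# A yield two units below the long-run mean closes 34% of the gap each year.
for horizon in range(4):
    print(horizon, round(ar1_forecast(3.53, 5.53, 0.66, horizon), 3))
```

The smaller $\beta_1$ is, the faster a good or bad yield year stops mattering for expected future yields.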
Because this model appears to have residual autocorrelation when fitted to the data, we also investigate the possibility of allowing an MA(1) error process. However, the simple AR(1) model in (8a) fits the data well, is parsimonious, and tractable for inclusion in the numerical dynamic programming model, so this is the stochastic aggregate yield process assumed for the dynamic programming model. No strong evidence of a significant trend was found in the aggregate maize yield results. This may appear somewhat surprising but is consistent with observations that there has been very little maize yield growth in Kenya for many decades (Nyoro, Ayieko, and Muyanga 2007). Estimation results for the aggregate Kenyan maize yield data are provided in Table 1-8. The hypothesis of no autoregressive conditional heteroskedasticity fails to be rejected at the 10% significance level based on the LM test (see Table 1-9). Therefore, heteroscedasticity does not need to be accounted for in the aggregate yield distribution.

The aggregate maize yield model is then calibrated to farm level maize yield distributions for hybrid and traditional seeds using the Kenyan farm-level panel data. Farm-level yields are assumed to follow similar processes as aggregate yield but with different means and variances:

$$y_t^H = y^H + \beta_1 (y_{t-1}^H - y^H) + \varepsilon_{yt}^H, \tag{8b}$$

$$y_t^T = y^T + \beta_1 (y_{t-1}^T - y^T) + \varepsilon_{yt}^T, \tag{8c}$$

where $y^H$ and $y^T$ are the long-run average yields of hybrid and traditional seeds, respectively; $\varepsilon_{yt}^H \sim N(0, \sigma_{Hy}^2)$ and $\varepsilon_{yt}^T \sim N(0, \sigma_{Ty}^2)$ are random shocks to hybrid and traditional yields, and yield shocks are allowed to be correlated with price shocks with correlation coefficient $\rho$.⁵ To estimate the variances for household-level hybrid and traditional maize shocks, we first use equation (8a) to estimate the variance of the aggregate yield shock.
This estimate is then scaled to approximate the conditional variance of the hybrid and traditional yields at the individual household level by multiplying by five and two, respectively (Just and Weninger 1999).⁶

To estimate the long-run means for household-level hybrid and traditional maize seeds, we first employ a propensity score matching (PSM) method to predict the counterfactual yield for every household assuming they used the other seed type.⁷ The PSM method is implemented as follows. First, a logit model is used to predict each household’s adoption decision given the household’s demographic and farm characteristics. The probability of adopting hybrids is used as the propensity score. The validity of PSM relies on the conditional independence assumption (CIA) and overlap assumption. Only covariates that are either fixed or measured before participating are selected to ensure the CIA, and the overlap assumption test is passed (Caliendo and Kopeinig 2008). Second, we use nearest-neighbor matching (NNM) to generate weights and predict counterfactual yields. We select five nearest neighbors (closest in terms of propensity score) that used the other seed type and generate weights based on the distance of propensity scores between the treated farm and neighbors. Counterfactual yields are generated by the weighted sum of neighbors’ yields that used the other seed type. We then calculate the average yields for hybrid and traditional seeds, respectively, among all households using both actual and counterfactual yields and the results are given in Table 1-10.

⁵ We assume both yield shocks have the same correlation with price in the base model but this is relaxed in sensitivity analysis.
⁶ The root-mean-square error of the aggregate yield estimation is 0.79, thus the estimated aggregate yield shock variance is 0.62. The estimated hybrid yield shock variance is 0.62*5=3.10, and the estimated traditional yield shock variance is 0.62*2=1.24.
⁷ We only observe one type of yield at a time for most households in the sample.

1.4.3. Modeling Production and Switching Costs

Production costs consist of seed, fertilizer, labor, and land preparation costs and differ by seed type. In the dynamic programming model production costs are assumed to be constants, $c^H$ and $c^T$, for production with hybrid and traditional seeds, respectively. We use panel data to estimate average costs for Kenyan maize production in Kenyan Shillings per acre. Using a procedure similar to the one applied to estimate average yields at the household level, we predict counterfactual production costs for every household assuming they used the other seed type. Counterfactual production costs are estimated as the weighted sum of nearest neighbors’ production costs that used the other seed type.⁸ We then calculate the average production costs for hybrid and traditional seeds, respectively, among all households using both actual and estimated counterfactual production costs and the results are provided in Table 1-11.

Per acre switching costs are defined as a sum of transaction costs and learning costs, assumed to be constants, $s_t^{T\to H}$ and $s_t^{H\to T}$, in the base model but allowed to be different depending on the direction of the switch. Renkow, Hallstrom, and Karanja (2004) defined the sum of searching costs, bargaining costs, and screening and monitoring costs as fixed transaction costs and found that the magnitude of fixed transaction costs for Kenyan semi-subsistence households is equivalent to approximately 15% of the market price. Using this estimate, we set the constant switching cost⁹ from traditional to hybrid seeds at 1.4,¹⁰ and the constant switching cost from hybrid to traditional at 1.2. From these results, the base numerical dynamic programming model was parameterized as shown in Table 1-11.

⁸ The same weights described above for predicting counterfactual yields are used.
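The nearest-neighbor step used for counterfactual yields and costs can be sketched as follows. The text states only that weights are based on propensity-score distance, so the inverse-distance weighting used here is an assumption. The function name, the small `eps` guard, and the example data are hypothetical.

```python
def counterfactual_outcome(score, donors, k=5, eps=1e-9):
    """Predict a household's outcome under the seed type it did not use.

    score  : the household's propensity score
    donors : list of (propensity_score, outcome) pairs for households that
             used the other seed type
    k      : number of nearest neighbors (five in the text)
    Weights are inverse-distance in propensity score (an assumed scheme).
    """
    nearest = sorted(donors, key=lambda d: abs(d[0] - score))[:k]
    weights = [1.0 / (abs(ps - score) + eps) for ps, _ in nearest]
    total = sum(weights)
    return sum(w * outcome for w, (_, outcome) in zip(weights, nearest)) / total


# Hypothetical donors: (propensity score, observed yield).
donors = [(0.42, 5.1), (0.47, 5.6), (0.51, 5.9), (0.55, 6.2),
          (0.60, 5.4), (0.95, 9.0)]
print(round(counterfactual_outcome(0.50, donors), 2))
```

Closer neighbors in propensity score receive more weight, and donors outside the five nearest (the 0.95 household above) have no influence on the prediction.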
The base parameterizations were used to simulate the expected paths of yield and profit differentials between hybrid and traditional seeds, and then changed in various ways (as discussed below) and the model re-solved to illustrate various effects. The numerical model was solved using DPSOLVE in the CompEcon Toolbox programmed in Matlab (Miranda and Fackler 2002). The family of basis functions we use is the Chebychev polynomial basis.

1.5. Numerical Results

1.5.1. Value Function and Decision Rules

Figure 1-1 graphs the conditional value functions for current adopters and non-adopters as a function of current expected prices and yield differentials between hybrid and traditional seeds under the base parametrization. Two observations stand out. First, both conditional value functions are increasing in current expected price and the yield differential, holding other state variables constant. This indicates that higher prices and yield differentials increase the discounted profit stream for both current adopters and non-adopters alike (since current non-adopters still benefit from the option to adopt in the future). Second, the value function differential between adopters and non-adopters is increasing in the expected current yield differential, showing that the higher the current expected yield differential, the more likely current adoption is the dominant strategy.

⁹ Given the sample mean of maize price is approximately 1.50 ksh/kg, and average hybrid yield is 5.53 kg/acre, the average transaction cost would be 1.50*5.53*0.15=1.24. Adding in an approximation for learning costs we set the switching cost to hybrid at 1.4. The switching cost to traditional seed is calibrated in the same way.
¹⁰ Note that the magnitudes of price and costs are both divided by 10 in dynamic programming (the average of maize price is approximately 15 ksh/kg in the sample).
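To make the solution step concrete, the following is a deliberately simplified value-iteration sketch in Python. It is not the DPSOLVE/Chebychev collocation setup used for the reported results. It collapses price and yields into a single state, the expected profit differential between hybrid and traditional seed, normalizes traditional profit to zero, and discretizes an AR(1) process on a grid. The discount factor 0.95, the shock standard deviation, and the grid bounds are assumptions. Only the switching costs (1.4 and 1.2) and the persistence value 0.66 are borrowed from the baseline parameterization.

```python
import math


def solve_switching_model(s_th=1.4, s_ht=1.2, beta=0.95, rho=0.66,
                          shock_sd=1.0, n_grid=40, tol=1e-6):
    """Value iteration for a stylized two-technology switching problem.

    State: (profit differential x, last period's choice d_prev).
    Returns the grid for x and policy[i][d_prev] in {0, 1} (1 = hybrid).
    """
    lo, hi = -3.0, 3.0
    grid = [lo + (hi - lo) * i / (n_grid - 1) for i in range(n_grid)]

    # Crude discretization of x' = rho * x + shock: Gaussian weights on the
    # grid, normalized so each row sums to one.
    trans = []
    for x in grid:
        w = [math.exp(-0.5 * ((x2 - rho * x) / shock_sd) ** 2) for x2 in grid]
        total = sum(w)
        trans.append([wi / total for wi in w])

    v = [[0.0, 0.0] for _ in grid]  # v[i][d_prev]
    while True:
        # Expected continuation value after choosing traditional / hybrid today.
        ev_t = [sum(p * v[j][0] for j, p in enumerate(row)) for row in trans]
        ev_h = [sum(p * v[j][1] for j, p in enumerate(row)) for row in trans]
        new_v, policy = [], []
        for i, x in enumerate(grid):
            values, choices = [], []
            for d_prev in (0, 1):
                v_h = x - s_th * (1 - d_prev) + beta * ev_h[i]
                v_t = -s_ht * d_prev + beta * ev_t[i]
                values.append(max(v_h, v_t))
                choices.append(1 if v_h > v_t else 0)
            new_v.append(values)
            policy.append(choices)
        gap = max(abs(a - b) for old, new in zip(v, new_v)
                  for a, b in zip(old, new))
        v = new_v
        if gap < tol:
            return grid, policy


grid, policy = solve_switching_model()
# States where non-adopters stay out but adopters stay in: the waiting region.
inaction = [x for x, (out, inn) in zip(grid, policy) if out != inn]
print(f"band of inaction: {min(inaction):.2f} to {max(inaction):.2f}")
```

Setting both switching costs to zero in this sketch removes the waiting region, so the choice depends only on the sign of the current profit differential.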
Figure 1-2 shows optimal seed use rules under higher and lower switching costs as a function of current expected price and yield differentials. The optimal adoption rule takes the form of a pair of threshold lines indicating the boundary between using hybrids and using traditional varieties. If the current expected price and yield differential are high enough, hybrids will always be used. Similarly, if the current expected price and yield differential are low enough, traditional seeds will always be used. At intermediate levels of current price and yield differentials the decision is to wait and continue using the existing technology (whatever it is). The waiting area is due to switching costs that slow down adjustment to changing relative profitability of hybrids versus traditional seeds. As the switching cost is lowered, the waiting area shrinks and it is optimal for households to switch more often in response to changing relative profitability.

1.5.2. Fluctuations in Relative Profitability of Hybrid vs. Traditional Seeds

The optimal seed use rules imply that transient technology use is encouraged by fluctuations in relative profitability of hybrid versus traditional seeds, and discouraged by switching costs. Given that the relative profitability is determined by the hybrid-traditional expected yield and cost differentials, along with the maize selling price, changes in these factors will determine the pattern of seed use switching. Maize hybrids were introduced into Africa to improve agricultural productivity, but it has long been recognized that the potential of hybrids is only realized under intensive input management (Ojiem, Ransom, and Wakhonya 1996; Coulter et al. 2010; Macharia et al. 2010; Omondi, Norton, and Ashilenje 2014). Therefore, hybrids may not always lead to higher yields. Furthermore, even if hybrid maize is always more productive, higher productivity does not always translate into higher profitability.
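A small worked example shows how a yield advantage can coexist with a profit disadvantage. It evaluates the profit differential at the long-run mean yields (5.53 and 4.28) and constant production costs (2.92 and 1.19) from the baseline parameterization, in the rescaled units used there. Evaluating at the means and ignoring switching costs is a simplification made here, and the function names are hypothetical.

```python
def profit_differential(price, yield_h, yield_t, cost_h, cost_t):
    """Hybrid minus traditional production profit per acre."""
    return price * (yield_h - yield_t) - (cost_h - cost_t)


def break_even_price(yield_h, yield_t, cost_h, cost_t):
    """Price at which the hybrid yield premium just covers its extra cost."""
    return (cost_h - cost_t) / (yield_h - yield_t)


base = dict(yield_h=5.53, yield_t=4.28, cost_h=2.92, cost_t=1.19)
p_star = break_even_price(**base)
print(f"break-even price: {p_star:.3f}")
for price in (1.2, 1.5):
    print(price, round(profit_differential(price, **base), 3))
```

At the sample-mean price of about 1.5 the hybrid profit advantage is small, so modest price declines or cost increases are enough to reverse its sign even though hybrids always yield more in this example.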
Hybrid maize may require higher production costs than traditional varieties, which can be particularly problematic if farm households are credit constrained. Fluctuating maize selling price may also impact relative profitability. To illustrate the potential influence of these factors we conducted Monte Carlo seed choice simulations over 1000 production periods with 200 replications. Figure 1-3 shows the expected paths of yield and profit differentials between hybrid and traditional seeds, starting from a negative value because hybrid is a newly introduced variety and optimal productivity will not be achieved until after several trials. In the long run, the expected yield differential between hybrid and traditional seeds is always positive, but the expected profit differential flips several times, indicating that hybrids are consistently more productive but could also be less profitable for farmers in some periods.

To further show that hybrids could be both more productive and less profitable, we vary the levels of production cost differentials and maize prices and hold other variables constant. The corresponding expected paths of profit differential are presented in Figures 1-4 and 1-5. Although hybrids are more productive, a higher production cost differential or a lower maize selling price could significantly lower the hybrid’s profitability and lead to more frequent flips of the relative profit between the two technologies. This happens when the premium generated from the surplus yield of hybrids declines due to a lower maize price, and then fails to offset the additional production costs of hybrids.

1.5.3. Scenario Analysis

In the scenario analysis, we run Monte Carlo simulations under different parameterizations to evaluate how different factors influence the transitory nature of the adoption process. For each scenario, the simulation is run for 1000 periods with 200 replications.
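Summary statistics such as the number of switches and the duration of adoption and disadoption spells can be computed from a simulated sequence of 0/1 seed choices. The helper below is a sketch under two assumptions of its own: that duration means the average length of uninterrupted spells, and that the initial spell counts like any other. Neither definition is spelled out in the text.

```python
from itertools import groupby


def spell_summary(path):
    """Summarize a 0/1 adoption path (1 = hybrid, 0 = traditional).

    Returns (number of switches, mean adoption spell length,
    mean disadoption spell length). A spell is a run of identical choices.
    """
    spells = [(state, sum(1 for _ in run)) for state, run in groupby(path)]
    switches = max(len(spells) - 1, 0)
    adopt = [length for state, length in spells if state == 1]
    disadopt = [length for state, length in spells if state == 0]

    def mean(xs):
        return sum(xs) / len(xs) if xs else 0.0

    return switches, mean(adopt), mean(disadopt)


# Two traditional seasons, three hybrid, one traditional, one hybrid.
print(spell_summary([0, 0, 1, 1, 1, 0, 1]))
```

Averaging these statistics across replications gives one summary row per scenario.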
We assume households have never adopted hybrids before at period zero, so the path of the adoption starts from traditional seeds at the beginning of the simulation.

1.5.3.1. Alternative Hybrid Yield Scenarios

Figure 1-6 illustrates the expected path of adoption under different expected yield differential scenarios. Holding other variables constant, a higher level of expected yield differential between hybrid and traditional seeds encourages households to adopt hybrid seeds over time and achieve a higher expected adoption rate in the long run. However, notice that there are still episodes of disadoption, even in the high yield differential scenario. Table 1-12 shows that a higher current yield differential also lengthens the duration of adoption, and shortens the duration of disadoption, but has little effect on the number of switches. This occurs because higher yield differentials decrease the role of switching costs. Hence, with a higher yield differential the household optimally extends the adoption duration and shortens the disadoption duration. In terms of the number of switches these two effects then approximately cancel to leave little change in the total number of switches. Hence, unless the expected yield differential becomes very large, we still get transient technology use. It is just that the duration of adoptions is higher and the duration of disadoptions is lower.

1.5.3.2. Alternative Hybrid Yield Variance Scenarios

Figure 1-7 illustrates how the transient technology use process evolves under different yield variance scenarios. A higher hybrid yield variance implies higher probability of extreme hybrid yield realizations, causing more frequent breakthroughs of the threshold boundaries of the optimal decision rule, leading to more switches between hybrids and traditional seeds, and eventually incurring higher total costs of switching back and forth over the period.
Thus, even in a risk neutral scenario, a higher hybrid yield variance gives rise to a slightly lower adoption rate in the long run, shorter durations of both adoption and disadoption, and more switches, all of which are shown in Table 1-12.

1.5.3.3. Alternative Switching Cost Scenarios

In this scenario, we adjust the size of the switching costs. Figures 1-8 and 1-9 illustrate how different levels of switching costs influence the adoption process, given high and low expected yield differentials, respectively. In general, higher switching costs discourage earlier use of hybrids and switching is less frequent. This can be seen from the second panel of Table 1-12 where higher switching costs give rise to fewer switches between hybrids and traditional seeds, and longer durations of both adoption and disadoption. This is because switching costs play a key role in preventing entry into the hybrid seed market. While the switching cost influences the speed of adoption, it does not solely determine the level of adoption. Figures 1-8 and 1-9 show that the switching cost effect on adoption rate varies for different levels of profitability. When the profitability of hybrids is relatively high, higher switching costs will maintain farmers in hybrids, and the adoption path converges more slowly but to a higher expected level. When the profitability of hybrids is relatively low, higher switching costs will cause farmers to delay adoption, and the adoption path converges more slowly and to a lower level. This implies a complex relationship between switching costs, profitability, and the adoption process.

1.5.3.4. Alternative Price Yield Correlations and Price Variance

Figure 1-10 shows how the adoption rate varies when the correlation between price and yield changes. Figure 1-11 shows how the adoption rate changes under different price variances. Under positive price/yield correlation and higher price volatility there is little effect on the adoption path.
This can be seen from the third panel of Table 1-12 which shows that changing price variability or the price-yield correlation has little impact on adoption duration or the frequency of switching. This can be attributed to the fact that when price is the same for maize produced from either hybrid or traditional seeds, uncertainties solely from price have little effect on farmers’ optimal seed decision in a risk neutral scenario.

1.5.3.5. Learning and Reductions in Switching Costs

In this section the model is extended to allow switching costs to be decreasing in experience with hybrid use. The objective is to examine how learning effects can potentially influence the adoption process. Instead of assuming constant switching costs these costs are now allowed to change over time according to,

$$s_t^{T\to H}(n_t) = \lambda^{T\to H} + \begin{cases} \gamma_0 & \text{if } n_t = 0 \\ \gamma_1 & \text{if } n_t = 1 \\ \;\vdots & \\ 0 & \text{if } n_t \ge 6 \end{cases}, \tag{9a}$$

$$s_t^{H\to T}(n_t) = \lambda^{H\to T} + \begin{cases} \gamma_0 & \text{if } n_t = 0 \\ \gamma_1 & \text{if } n_t = 1 \\ \;\vdots & \\ 0 & \text{if } n_t \ge 6 \end{cases}, \tag{9b}$$

where $\gamma_0 > \gamma_1 > \cdots > 0$. This allows for switching costs to start high and decline with experience using hybrids, up to 5 instances of hybrid use. After five instances switching costs remain fixed at their lowest level. The fixed lower levels after learning ($\lambda^{T\to H}$ and $\lambda^{H\to T}$) are allowed to be different depending on the direction of the switch. The parameterization for the learning model is given in Table 1-13.

Similar to previous scenarios, Monte Carlo simulations were used to evaluate how different factors influence the adoption process. As switching costs will be constant after 5 instances of hybrid use, steady states in this scenario are equivalent to those under constant switching costs. Hence, we only run each simulation for 50 periods with 200 replications, and concentrate on evaluating how learning that reduces switching costs affects adoption and transient hybrid use in the early stages of the process.
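The learning schedule can be expressed as a fixed post-learning level plus an experience-dependent premium that vanishes once enough hybrid seasons have accumulated. The sketch below follows that structure. The premium values are hypothetical placeholders chosen only to be strictly decreasing, not the Table 1-13 parameterization, and the function and variable names are illustrative.

```python
def switching_cost(n_hybrid_uses, floor, learning_premiums):
    """Switching cost after `n_hybrid_uses` past seasons of hybrid use.

    floor             : fixed cost level that remains after learning
    learning_premiums : decreasing extra costs for n = 0, 1, 2, ...;
                        the premium is zero once the list is exhausted
    """
    if n_hybrid_uses < len(learning_premiums):
        return floor + learning_premiums[n_hybrid_uses]
    return floor


# Hypothetical premiums for n = 0..5 (illustrative values only).
PREMIUMS = (1.0, 0.8, 0.6, 0.4, 0.2, 0.1)

schedule = [switching_cost(n, 1.4, PREMIUMS) for n in range(9)]
print([round(cost, 2) for cost in schedule])
```

Because experience only accumulates, the schedule never rises again, so early adoption spells permanently lower the cost of later switches in both directions.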
We vary the levels of yield differential, hybrid yield variance, correlation between price and yield, as well as price uncertainty, based on the same parameterizations used when assuming constant switching costs. Figure 1-12 illustrates how learning that leads to reductions in switching costs affects early adoption. The path of switching costs is presented on the left, while the adoption rate path is presented on the right. Holding other variables constant, a higher level of expected yield differential and hybrid yield uncertainty, and positive correlation between price and yields are found to encourage households to adopt hybrids at an earlier stage. This is because learning and reductions in switching costs strengthen the role of relative profitability of hybrids in influencing the adoption process, triggering more adoption at earlier periods, pushing the adoption process to converge faster, and leading to more switches over time in response to the profitability fluctuations. These findings provide another perspective to explain the phenomenon of transient hybrid use: occasionally high expected hybrid yield (high expected revenue of producing with hybrids) encourages adoption at an earlier stage. Then, the accumulation of experience with hybrids reduces the cost of switching between both seeds and gives rise to more switching or transient use in the future as their relative profitability fluctuates.

1.6. Conclusions

This paper investigates complex dynamic patterns of adoption and disadoption of new technologies that are sometimes observed in practice. In contrast to much of the literature on technology adoption, this research abandons the assumption that adopting modern varieties is irreversible, and instead develops a dynamic switching model to study farmers’ non-adoption, disadoption, and switching back and forth between two seed technologies.
The model is solved using a numerical dynamic programming algorithm, and optimal decision rules imply that transient technology use is encouraged by switches in relative profitability between alternative seed technologies and discouraged by switching costs. Evidence that hybrids could be both more productive and less profitable was provided using Monte Carlo simulations, which leads to transient technology use due to the fluctuations in relative profitability. Focusing on the role of relative profitability and switching costs, the effects of various factors influencing adoption, disadoption, and transient technology use are modeled and explained. In particular, the profitability of adoption, the variance of hybrid yield shocks, and the size of switching costs are all found to be significant factors influencing the pattern of transient technology use. In long-run equilibrium, high hybrid yield variance encourages more switching and therefore causes higher switching costs, lowering the long-run adoption rate. Profitability and switching costs jointly determine the level of adoption in the long run. Switching costs play a role in preventing households from both entering and exiting the hybrid seed market, and the profitability of hybrids determines if the switching cost will maintain or exclude households from using the new technology. Learning effects that reduce switching costs when experience is gained using hybrids provide an additional perspective: the accumulation of hybrid use experience reduces switching costs and encourages households to manage their seed choice based more on relative profitability among technologies, rather than switching costs.
Therefore, from the perspective of maximizing total expected profits, policy could pay more attention to reducing and overcoming these switching costs as they play a role in preventing farmers from choosing higher yielding hybrids, especially at an early stage of experience when productivity increases of hybrids have not been fully demonstrated. Once the profitability advantage of hybrids has been recognized, higher switching costs help keep farmers using hybrids.

Table 1-1 Possible transitions across hybrid/non-hybrid use

Hybrid Use Transitions      No.     Fraction of Sample (%)
(2000 2004 2007 2010)               (N=1207 Households)
N N N N                      99      8.20
N N N H                      70      5.80
N N H H                      67      5.55
N N H N                      21      1.74
N H H H                      53      4.39
N H H N                       9      0.75
N H N H                      14      1.16
N H N N                      10      0.83
H H H H                     643     53.27
H H H N                      13      1.08
H H N N                       9      0.75
H N N N                      34      2.82
H H N H                      27      2.24
H N H H                      79      6.55
H N H N                      18      1.49
H N N H                      41      3.40

Note: “H” denotes the use of hybrid seed and “N” denotes the use of non-hybrid seed.

Table 1-2 Proportion of households by adoption history category

                                        No. of Households    Proportion of the Sample (%)
Total                                   1207                 100
1. Never Adopted                          99                   8.20
2. Adopted at least once                1108                  91.80
   2.1 Always Adopted                    643                  53.27
   2.2 Adopted and continued             190                  15.74
   2.3 Adopted and then Disadopted        96                   7.95
   2.4 Transient use (back and forth)    179                  14.83

Table 1-3 Unit root and stationarity tests for monthly maize price

Variable               DF statistic    PP statistic    KPSS statistic
Monthly maize price    -1.542          -1.554          0.868***

Note: ***, **, and * denote rejection at the 1%, 5%, and 10% significance levels, respectively.
Table 1-4 VAR lag order selection criteria for first differenced maize price

| Lag | LL | LR | p-value | FPE | AIC | HQIC | SBIC |
|---|---|---|---|---|---|---|---|
| 0 | 57.154 | NA | NA | 0.024# | -0.898# | -0.889# | -0.876# |
| 1 | 57.289 | 0.270 | 0.603 | 0.024 | -0.885 | -0.866 | -0.839 |
| 2 | 58.205 | 1.832 | 0.176 | 0.024 | -0.883 | -0.856 | -0.815 |
| 3 | 58.205 | 0.000 | 0.987 | 0.025 | -0.867 | -0.831 | -0.777 |
| 4 | 58.298 | 0.186 | 0.667 | 0.025 | -0.853 | -0.807 | -0.740 |
| 5 | 58.813 | 1.029 | 0.310 | 0.025 | -0.845 | -0.790 | -0.709 |
| 6 | 60.351 | 3.077 | 0.079 | 0.025 | -0.854 | -0.789 | -0.695 |

Note: Criteria are likelihood ratio (LR), final prediction error (FPE), Akaike information criterion (AIC), Hannan and Quinn information criterion (HQIC), and Schwarz’s Bayesian information criterion (SBIC). # denotes the optimal lag selection.

Table 1-5 Regression results for monthly maize prices

| Variables | Regression with time trend and seasonality | Regression for random walk process |
|---|---|---|
| Trend | 0.001 (0.001) | |
| Lagged first difference of maize price | 0.027 (0.093) | |
| Dummy for January | 0.040 (0.066) | |
| Dummy for February | -0.058 (0.066) | |
| Dummy for March | -0.003 (0.064) | |
| Dummy for April | 0.013 (0.013) | |
| Dummy for May | 0.136** (0.064) | |
| Dummy for June | 0.020 (0.065) | |
| Dummy for July | -0.022 (0.064) | |
| Dummy for August | -0.050 (0.064) | |
| Dummy for September | -0.079 (0.064) | |
| Dummy for October | -0.004 (0.065) | |
| Dummy for November | 0.001 (0.064) | |
| Constant | -0.003 (0.052) | 0.003 (0.013) |
| Observations | 129 | 130 |
| R-squared | 0.123 | 0.000 |
| Adj R-squared | 0.025 | 0.000 |
| F statistic | 1.250 | 0.000 |

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 1-6 LM test for autoregressive conditional heteroscedasticity in monthly maize prices

| Squared errors lag | chi-square | p-value |
|---|---|---|
| 1 | 1.552 | 0.213 |
| 2 | 1.655 | 0.198 |
| 3 | 1.650 | 0.199 |
| 4 | 1.687 | 0.194 |
| 5 | 2.135 | 0.144 |

Table 1-7 Unit root and stationarity tests for aggregate maize yield

| Variable | DF statistic | PP statistic | KPSS statistic |
|---|---|---|---|
| Aggregate maize yield | -3.305** | -3.100** | 0.536*** |

Note: ***, **, and * denote rejection at the 1%, 5%, and 10% significance levels, respectively.
Table 1-8 Regression results for aggregate maize yields

| Variables | Regression with time trend | Regression without time trend |
|---|---|---|
| lagged maize yield | 0.500*** (0.123) | 0.657*** (0.104) |
| year | 0.018** (0.008) | |
| constant | -33.421** (16.262) | 2.139*** (0.647) |
| Observations | 53 | 53 |
| R-squared | 0.489 | 0.440 |
| Adj R-squared | 0.468 | 0.429 |
| F statistic | 23.88 | 40.01 |

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 1-9 LM test for autoregressive conditional heteroscedasticity in aggregate maize yields

| Squared errors lag | chi-square | p-value |
|---|---|---|
| 1 | 0.247 | 0.619 |
| 2 | 0.330 | 0.566 |
| 3 | 0.349 | 0.555 |
| 4 | 0.239 | 0.625 |
| 5 | 0.334 | 0.564 |

Table 1-10 Household-Level Yield Parameterization Results

| Description | Value |
|---|---|
| Long-run mean of hybrid yield | 5.53 |
| Long-run mean of traditional yield | 4.28 |
| Mean reversion parameter | 0.66 |
| Hybrid yield shock variance | 3.10 |
| Traditional yield shock variance | 1.24 |

Table 1-11 Baseline dynamic programming parameterization

| Description | Base value |
|---|---|
| Monthly price shock variance | 0.02 |
| Long-run mean of hybrid yield | 5.53 |
| Long-run mean of traditional yield | 4.28 |
| Yield mean reversion parameter | 0.66 |
| Hybrid yield shock variance | 3.10 |
| Traditional yield shock variance | 1.24 |
| Price-yield correlation | 0 |
| Constant hybrid production cost | 2.92 |
| Constant traditional production cost | 1.19 |
| Constant switching cost from traditional to hybrid | 1.4 |
| Constant switching cost from hybrid to traditional | 1.2 |

Table 1-12 Number of switches and adoption/disadoption duration over 1000 periods

| Scenario | Sub-scenario | Level | Number of switches | Adoption duration | Disadoption duration |
|---|---|---|---|---|---|
| Hybrid yield | Yield differential | low | 185 | 18.80 | 37.83 |
| Hybrid yield | Yield differential | mid | 196 | 26.76 | 25.55 |
| Hybrid yield | Yield differential | high | 176 | 38.06 | 18.35 |
| Hybrid yield | Hybrid yield uncertainty | low | 166 | 32.03 | 27.70 |
| Hybrid yield | Hybrid yield uncertainty | mid | 196 | 26.76 | 25.55 |
| Hybrid yield | Hybrid yield uncertainty | high | 213 | 26.45 | 21.44 |
| Switching cost | Switching cost (high profitability) | low | 196 | 26.76 | 25.55 |
| Switching cost | Switching cost (high profitability) | mid | 6.76 | 447 | 112 |
| Switching cost | Switching cost (high profitability) | high | 1.36 | 667 | 298 |
| Switching cost | Switching cost (low profitability) | low | 174 | 19.33 | 42.04 |
| Switching cost | Switching cost (low profitability) | mid | 7.08 | 134 | 412 |
| Switching cost | Switching cost (low profitability) | high | 0.71 | 110 | 831 |
| Revenue variability | Price/yield correlation | positive | 165 | 29.44 | 33.60 |
| Revenue variability | Price/yield correlation | zero | 157 | 29.85 | 34.04 |
| Revenue variability | Price/yield correlation | negative | 164 | 28.37 | 31.95 |
| Revenue variability | Price uncertainty | low | 194 | 27.17 | 24.64 |
| Revenue variability | Price uncertainty | mid | 196 | 26.76 | 25.55 |
| Revenue variability | Price uncertainty | high | 198 | 26.58 | 24.49 |

Table 1-13 Parameterization for the learning model

| Description | Base value |
|---|---|
| Long-run cost of switching to hybrid seed | 0.4 |
| Long-run cost of switching to traditional seed | 0.2 |
| Additional cost for first switch | 1 |
| Additional cost for second switch | 0.8 |
| Additional cost for third switch | 0.6 |
| Additional cost for fourth switch | 0.4 |
| Additional cost for fifth switch | 0.2 |
| Additional cost for sixth and more switches | 0 |

Figure 1-1 Conditional value functions for adopters and non-adopters
Figure 1-2 Optimal adoption rules under alternative switching costs
Figure 1-3 Expected paths of yield and profit differentials between hybrid and traditional seeds
Figure 1-4 Expected path of profit differential under different production cost differentials
Figure 1-5 Expected path of profit differential under different maize price
Figure 1-6 Adoption rate under alternative hybrid yield scenario
Figure 1-7 Different hybrid yield variance scenarios
Figure 1-8 Adoption rate under alternative switching cost scenarios given high profitability of hybrids
Figure 1-9 Adoption rate under alternative switching cost scenarios given low profitability of hybrids
Figure 1-10 Adoption rate under different price-yield differential correlation
Figure 1-11 Adoption rate under different price variance
Figure 1-12 Influences on the adoption process when switching costs are decreasing in hybrid use

CHAPTER 2. ESTIMATING DYNAMIC DISCRETE CHOICE PANEL MODELS USING IRREGULARLY SPACED DATA

2.1. Introduction

Irregular spacing refers to the situation where the unit period of data is not equal to the observation interval (Millimet and McDonough 2013). This situation occurs frequently in panel data sets from developing countries where the time and expense required for data collection often preclude data collection in every observation period. For example, the third chapter in this dissertation uses a four-wave panel data set on Kenyan smallholder farmers with data collected in 2000, 2004, 2007, and 2010. The unit period for this data set is one (crop) year, but the observation intervals are either three or four years. Thus, panel model applications using this data set feature irregular spacing. Irregular spacing occurs in many other panel data sets as well, including some data from developed countries (Millimet and McDonough 2013).

As Millimet and McDonough (2013) have shown, all commonly used dynamic panel data (DPD) estimators are inconsistent if the data is irregularly spaced. These authors have studied the irregular spacing problem in situations where the dependent variable is continuous and arrived at a number of results and conclusions to improve inference in this environment.
However, many applications of DPD involve discrete choice models, particularly in the developing country context where discrete technology choices are an important focus of study. To our knowledge, estimation of dynamic discrete choice models under irregular spacing has yet to be addressed in the literature. Hence, the objective of this chapter is to investigate the inference problem in discrete choice DPD models with irregular spacing. We propose a number of alternative approaches to improving inference in discrete choice DPD models, and investigate their performance via Monte Carlo simulation. The Monte Carlo study provides important results on how to best deal with the irregular spacing problem in different application environments.

The remainder of this chapter is organized as follows. In section 2, we first review the literature on conventional methods for estimating continuous dependent variable DPD models with regularly spaced data. Then we discuss the irregular spacing problem, maintaining the focus on continuous dependent variable DPD models, and review the existing literature on estimation approaches that have been developed so far. In section 3, we turn to discrete choice DPD models and review the traditional approaches to identification and estimation with regularly spaced data. We then show specifically how traditional discrete choice DPD estimators become inconsistent or infeasible in the presence of irregular spacing. Although this result is not surprising, it has not appeared in the literature to date. In section 4, a number of alternative estimators for discrete choice DPD models for irregularly spaced data are proposed. Although our proposed estimators are somewhat related to the irregular spacing estimators for continuous dependent variable DPD models that have already appeared in the literature (Sasaki and Xin 2014), the discrete choice environment has some additional complications that need to be addressed.
Section 5 outlines the Monte Carlo experiments used to compare the finite sample performance of our proposed discrete choice DPD estimators, and discusses the results and implications of the experiments. Conclusions on the advantages and disadvantages of alternative approaches to estimation and inference in dynamic discrete choice panel models using irregularly spaced data are provided in section 6.

2.2. Dynamic Panel Data Models with Continuous Dependent Variables

2.2.1. Regularly Spaced Data

Regularly spaced continuous dependent variable DPD models were first studied by Balestra and Nerlove (1966) and have been applied in many contexts. Lagged values of the dependent variable are incorporated as a covariate to account for the feedback from the current state to future states. Heckman (1981a) termed such persistence ‘true’ or ‘structural’ state dependence. However, the observed persistence could also result from permanent unobserved heterogeneity across individuals, which might be viewed as ‘spurious’ state dependence. Thus, DPD model users also want to control for unobserved heterogeneity to distinguish these two sources of persistence.

The inclusion of both non-strictly exogenous covariates (the lagged dependent variable) and unobserved heterogeneity in DPD models invalidates many estimation methods (Wooldridge 2010). To see this, consider the following continuous dependent variable DPD model:

$$y_{it} = \rho y_{i,t-1} + x_{it}\beta + c_i + \varepsilon_{it}, \tag{1}$$

where $y_{it}$ is the continuous dependent variable for individual $i$ in period $t$, $\rho$ is the state dependence parameter on the lagged dependent variable, $x_{it}$ is a vector of covariates with corresponding parameter vector $\beta$, $c_i$ is the individual-specific unobserved effect, and $\varepsilon_{it}$ is the idiosyncratic error term. Due to the correlation between $y_{i,t-1}$ and $c_i$, the least squares estimator is inconsistent, irrespective of whether the unobserved effects are treated as random or fixed effects.
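As a rough numerical illustration of this inconsistency, the following Python sketch simulates model (1) without covariates and compares the pooled least squares slope with and without unobserved heterogeneity. The function names, sample sizes, and parameter values are illustrative assumptions and are not taken from the dissertation. Under these assumed values (a true state dependence parameter of 0.5, unit error variance, and unit variance of the unobserved effect), the probability limit of the pooled slope is roughly 0.875 rather than 0.5.

```python
import numpy as np


def simulate_panel(n=2000, t_periods=8, rho=0.5, sigma_c=1.0, seed=0):
    """Simulate y_it = rho * y_{i,t-1} + c_i + e_it with e_it ~ N(0, 1)."""
    rng = np.random.default_rng(seed)
    c = rng.normal(0.0, sigma_c, size=n)
    y = np.empty((n, t_periods + 1))
    # Draw y_i0 from the stationary distribution conditional on c_i.
    y[:, 0] = c / (1.0 - rho) + rng.normal(0.0, np.sqrt(1.0 / (1.0 - rho**2)), size=n)
    for t in range(1, t_periods + 1):
        y[:, t] = rho * y[:, t - 1] + c + rng.normal(size=n)
    return y


def pooled_ols_slope(y):
    """Slope from regressing y_it on a constant and y_{i,t-1}, pooling all i and t."""
    lagged = y[:, :-1].ravel()
    current = y[:, 1:].ravel()
    design = np.column_stack([np.ones_like(lagged), lagged])
    coef, *_ = np.linalg.lstsq(design, current, rcond=None)
    return coef[1]


rho_with_c = pooled_ols_slope(simulate_panel(sigma_c=1.0))
rho_without_c = pooled_ols_slope(simulate_panel(sigma_c=0.0))
print(f"true rho = 0.5, pooled OLS with c_i: {rho_with_c:.3f}, without c_i: {rho_without_c:.3f}")
```

In this sketch the upward bias appears only when the unobserved effect has positive variance, which is the ‘spurious’ state dependence discussed above.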
The “Within Groups” estimator eliminates this source of inconsistency by transforming the equation using time averages to eliminate the effects of the unobserved heterogeneity. However, for panels where the number of time periods is small, this transformation induces another non-negligible correlation between the transformed lagged dependent variable,

$$y_{i,t-1} - \frac{1}{T-1}\left(y_{i1} + \cdots + y_{it} + \cdots + y_{i,T-1}\right),$$

and the transformed error term,

$$\varepsilon_{it} - \frac{1}{T-1}\left(\varepsilon_{i2} + \cdots + \varepsilon_{i,t-1} + \cdots + \varepsilon_{iT}\right).$$

Thus, the “Within Groups” estimator is also inconsistent.

Anderson and Hsiao (1982) generated a consistent estimator by using first-differencing along with IV estimation. While eliminating the unobserved effects by first differencing induces correlation between the differenced lagged dependent variable, $\Delta y_{i,t-1}$, and the differenced error, $\Delta\varepsilon_{it}$, using the lagged level $y_{i,t-2}$ as an instrument for $\Delta y_{i,t-1}$ provides consistent estimation. The validity of this approach relies on the assumptions that $\varepsilon_{it}$ is serially uncorrelated and the initial conditions are predetermined. Extending this approach, Arellano and Bond (1991) obtained asymptotically efficient estimators by using additional moment conditions in a Generalized Method of Moments approach. The Arellano and Bond approach is currently the standard method for estimating regularly spaced DPD models with continuous dependent variables.

2.2.2. Irregularly Spaced Data

The presence of irregularly spaced data makes all commonly used DPD estimators for continuous dependent variable models inconsistent for the following three reasons (Millimet and McDonough 2013). First, typical transformations fail to eliminate the observation-specific unobserved heterogeneity due to its time-varying factor structure. Second, the coefficient on the lagged dependent variable depends on the ‘gap’ structure (the number of missing periods between the observed irregularly spaced data). Third, covariates and the idiosyncratic errors are contained in the error term, which causes endogeneity problems.
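The second reason can be seen in a small simulation. The Python sketch below is a minimal illustration under assumed values: it simulates a pure AR(1) process with no covariates and no unobserved heterogeneity, keeps only waves spaced like the 2000, 2004, 2007, and 2010 survey rounds, and regresses each observed wave on the previous observed wave. The function names and parameter values are hypothetical.

```python
import numpy as np


def simulate_ar1(n, last_period, rho, seed=0):
    """Simulate y_it = rho * y_{i,t-1} + e_it for t = 1..last_period (no c_i, no x_it)."""
    rng = np.random.default_rng(seed)
    y = np.empty((n, last_period + 1))
    # Start from the stationary distribution so that all periods are comparable.
    y[:, 0] = rng.normal(0.0, np.sqrt(1.0 / (1.0 - rho**2)), size=n)
    for t in range(1, last_period + 1):
        y[:, t] = rho * y[:, t - 1] + rng.normal(size=n)
    return y


def slopes_between_waves(y, observed_periods):
    """OLS slope of each observed wave on the previous observed wave, with its gap."""
    results = []
    for prev, cur in zip(observed_periods, observed_periods[1:]):
        slope = np.polyfit(y[:, prev], y[:, cur], 1)[0]  # polyfit returns [slope, intercept]
        results.append((cur - prev, slope))
    return results


rho = 0.8
waves = [0, 4, 7, 10]  # mimics survey years 2000, 2004, 2007, 2010
y = simulate_ar1(n=20000, last_period=10, rho=rho)
for gap, slope in slopes_between_waves(y, waves):
    print(f"gap = {gap}: slope = {slope:.3f}, rho**gap = {rho**gap:.3f}")
```

In this setting the slope on the previously observed value tracks the state dependence parameter raised to the power of the gap, so a single coefficient cannot be read as the one-period parameter when gaps differ across waves.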
To understand how irregular spacing affects finite sample performance and consistency, consider the continuous dependent variable DPD model in equation (1).[11] Given irregular spacing, observed periods are not consecutive and there are missing periods between at least some of the observed periods ($y_{i,t-1}$ is unobserved for some $y_{it}$). To eliminate the unobserved $y_{i,t-1}$, Millimet and McDonough (2013) repeatedly substitute equation (1) to generate an autoregressive equation between two consecutively observed periods:

$$y_{it} = \rho^{s} y_{i,t-s} + \sum_{j=0}^{s-1}\rho^{j} x_{i,t-j}\beta + \sum_{j=0}^{s-1}\rho^{j} c_i + \sum_{j=0}^{s-1}\rho^{j}\varepsilon_{i,t-j}, \tag{2}$$

where $s$ is the number of unit periods separating two consecutively observed periods (so that $s-1$ periods are missing between them). If we use $m = 0, 1, 2, \ldots, M$ to index the observed periods, we can re-write the above equation as:

$$y_{im} = \rho^{g_m} y_{i,m-1} + \sum_{j=0}^{g_m-1}\rho^{j} x_{i,t(m)-j}\beta + \sum_{j=0}^{g_m-1}\rho^{j} c_i + \sum_{j=0}^{g_m-1}\rho^{j}\varepsilon_{i,t(m)-j}, \tag{3}$$

where $g_m$ is the ‘gap size’, the number of unit periods between observed periods $m-1$ and $m$, and $t(m)$ is the actual time period that observed period $m$ stands for. Then we can transform equation (3) as:

$$y_{im} = \rho^{g_m} y_{i,m-1} + x_{im}\beta + \left[\sum_{j=1}^{g_m-1}\rho^{j} x_{i,t(m)-j}\beta + \frac{1-\rho^{g_m}}{1-\rho}\, c_i + \sum_{j=0}^{g_m-1}\rho^{j}\varepsilon_{i,t(m)-j}\right], \tag{4}$$

where all the terms in the square brackets are unobserved in the missing periods. The correlations between observed covariates and unobserved ones in the square brackets will lead to biased and inconsistent estimators (omitted variable bias). To address the irregular spacing problem, correlations between covariates and unobserved terms in the square brackets in equation (4) need to be properly accounted for.

[11] We use the same notation as Millimet and McDonough (2013) for the following illustrations.

Millimet and McDonough (2013) suggest two main approaches to handling the correlations between covariates and the unobserved terms. First, they suggest a Mundlak (1978) correlated-random-effects (CRE) type estimator which specifies $c_i$ as a function of $x_i$, and uses $x_{i,m-1}$ as an IV for $y_{i,m-1}$ to deal with the correlation between $y_{i,m-1}$ and the random effects.
Second, they suggest using a quasi-differencing approach to eliminate $c_i$, and using $x_{i,m-1}$ as an IV for $y_{i,m-1}$ to deal with the correlations between $y_{i,m-1}$ and $x_{i,m-2}$.[12] However, the validity of $x_{i,m-1}$ as an IV for $y_{i,m-1}$ relies on the assumptions of strict exogeneity ($E[x_{it}\varepsilon_{is}] = 0 \;\forall s, t$) and no serial correlation of $x_i$, which could be very restrictive.

Sasaki and Xin (2014) suggest another way to deal with irregular spacing using a transformation approach to identify and estimate parameters of fixed-effect continuous dependent variable DPD models. The idea behind this approach is to use available information to predict the missing data. The approach assumes predeterminedness[13] and weak stationarity[14] of $y_{it}$ and $x_{it}$. Under these assumptions previously observed data can be used as multipliers to transform the regression equation. When stationarity holds, the covariance between unobserved variables can be predicted by the observed ones, which identifies the parameters in the regression model.

[12] $x_{i,m-2}$ and covariates prior to period $m-1$ are included in the regression because of the quasi-differencing.
[13] $E_i[y_{it}\varepsilon_{is}] = 0$ and $E_i[x_{it}\varepsilon_{is}] = 0$ whenever $s > t$.
[14] Variances of $y_{it}$, $x_{it}$, and covariances between $y_{it}$ and $y_{is}$, $x_{it}$ and $x_{is}$, and $y_{it}$ and $x_{is}$ are all time invariant.

To illustrate Sasaki and Xin’s approach, consider the continuous dependent variable DPD model (1). Taking the difference of the dynamic model between two observed periods ($m_1$ and $m_2$) to difference out the unobserved heterogeneity:

$$y_{i m_1} - y_{i m_2} = \rho\,(y_{i,m_1-1} - y_{i,m_2-1}) + \beta\,(x_{i m_1} - x_{i m_2}) + (\varepsilon_{i m_1} - \varepsilon_{i m_2}). \tag{5}$$

Multiplying both sides by $y_{i,m_2-1-\tau}$ and $x_{i,m_2-1-\tau}$ from another observed period, where $\tau+1$ is the gap between two observed periods in the panel, and taking expectations $E_i$ across individuals yields:

$$E_i\left[y_{i,m_2-1-\tau}(y_{i m_1} - y_{i m_2})\right] = \rho\, E_i\left[y_{i,m_2-1-\tau}(y_{i,m_1-1} - y_{i,m_2-1})\right] + \beta\, E_i\left[y_{i,m_2-1-\tau}(x_{i m_1} - x_{i m_2})\right] + E_i\left[y_{i,m_2-1-\tau}(\varepsilon_{i m_1} - \varepsilon_{i m_2})\right] \tag{6a}$$

and

$$E_i\left[x_{i,m_2-1-\tau}(y_{i m_1} - y_{i m_2})\right] = \rho\, E_i\left[x_{i,m_2-1-\tau}(y_{i,m_1-1} - y_{i,m_2-1})\right] + \beta\, E_i\left[x_{i,m_2-1-\tau}(x_{i m_1} - x_{i m_2})\right] + E_i\left[x_{i,m_2-1-\tau}(\varepsilon_{i m_1} - \varepsilon_{i m_2})\right]. \tag{6b}$$

Expressions for $(\rho, \beta)$ are obtained as a function of variances and covariances among $y_{im}$, $x_{im}$, and $\varepsilon_{im}$ in equations (6a) and (6b). Then, based on the assumptions of weak stationarity and predeterminedness, all the cross-sectional moments in (6a) and (6b) are approximated by sample variances and covariances, and the structural parameters $(\rho, \beta)$ are identified.

It is important to note, however, that the Sasaki and Xin (2014) approach is only practical for certain panel structures. Because the approach uses previous information on covariates to predict unobserved covariates, it requires a strict structure on how the data is spaced (i.e., the spacing structure of the panel). Sasaki and Xin define two spacing structures, UK and US spacing, that satisfy the identification requirements.[15] Both spacing structures require at least three waves of data, and must have adequate variation in gaps between each wave of the panel. The spacing structure is US spacing if the panel satisfies $\mathcal{T}(1) \neq \emptyset$, $\mathcal{T}(\Delta t) \neq \emptyset$, and $\mathcal{T}(\Delta t + 1) \neq \emptyset$, for some gap $\Delta t \in \mathbb{N}$, where $\mathcal{T}(\Delta t)$ indicates the observed periods that have $\Delta t$ gaps between them.[16]

[15] Because this approach has been shown to perform better, and be more robust for US spacing (see Sasaki and Xin, 2014), we only evaluate the performance of the US spacing structure in the Monte Carlo experiments below.
[16] For example, $\mathcal{T} = \{1, 2, 4\}$ is US spacing, where $\Delta t = 2$ and $\mathcal{T}(\Delta t) = \{2, 4\}$.

Given that the above approaches rely on restrictive assumptions on either serially uncorrelated covariates or the panel structure, and given that the focus thus far has been on continuous dependent variable models, there is a need for further investigation of discrete choice DPD models with unstructured irregular spacing.

2.3. Discrete Choice Dynamic Panel Data Models

2.3.1. Regularly Spaced Data

While the econometric literature on continuous DPD models has been well established, the identification and estimation of discrete choice DPD models (even under regular spacing) remain tenuous. Addressing the correlation between the lagged dependent variable and unobserved heterogeneity, as well as the initial condition problem, is more difficult in discrete choice models. To illustrate this, suppose we have the following dynamic binary response panel data model:

$$y_{it} = 1[y_{it}^{*} > 0] = 1[\rho y_{i,t-1} + x_{it}\beta + c_i + \varepsilon_{it} > 0], \tag{7a}$$

$$P(y_{i0} = 1 \mid x_{it}, c_i) = p_0(x_{it}, c_i), \tag{7b}$$

$$P(y_{it} = 1 \mid x_{it}, c_i, y_{i0}, \ldots, y_{i,t-1}) = F(\rho y_{i,t-1} + x_{it}\beta + c_i), \tag{7c}$$

where $1[\cdot]$ is an indicator function which equals 1 if the enclosed statement is true and 0 otherwise; $y_{it}^{*}$ is the latent variable that guides the binary decision; $p_0$ is the initial probability of $y_{i0} = 1$; and $F(\cdot)$ is the distribution function of the idiosyncratic error term, $\varepsilon_{it}$.

Identification and estimation of this model rely on several assumptions (Chay and Hyslop 1998). First, the lagged dependent variable and other observable covariates must be jointly exogenous conditional on the individual effects. Second, the form of the conditional mean of the latent variable, $y_{it}^{*}$, must be correctly specified. Third, the idiosyncratic error term, $\varepsilon_{it}$, must be serially uncorrelated over time. Finally, the functional form of the distribution of $\varepsilon_{it}$ has to be specified. Unlike in continuous dependent variable models, traditional transformations are not capable of eliminating the unobserved heterogeneity.[17] Therefore, the conventional approach of transformation and then GMM will not work in discrete choice models such as (7).
Nevertheless, it is still possible to identify discrete choice DPD models that allow for fixed unobserved heterogeneity, but only under some restrictive conditions. Chamberlain (1985) showed that if the idiosyncratic errors follow an i.i.d. logistic distribution, then a proper conditioning statement can ‘condition out’ the unobserved heterogeneity and the initial conditions. To see this, consider the following logit binary response model with a lagged dependent variable and unobserved heterogeneity, but no other exogenous covariates:

$$P(y_{i0} = 1 \mid x_{it}, c_i) = p_0(c_i), \tag{8a}$$

$$P(y_{it} = 1 \mid x_{it}, c_i, y_{i0}, \ldots, y_{i,t-1}) = \frac{\exp(\rho y_{i,t-1} + c_i)}{1 + \exp(\rho y_{i,t-1} + c_i)}. \tag{8b}$$

[17] Traditional transformations (such as first differencing) require linearity, which can only be maintained in discrete choice models by imposing the linear probability assumption, which puts unpalatable restrictions on the heterogeneity distribution (Wooldridge 2010).

It follows that:

$$P(y_{i0} = d_{i0}, y_{i1} = 1, y_{i2} = 0, y_{i3} = d_{i3} \mid c_i, y_{i1} + y_{i2} = 1) = \frac{\exp[\rho(d_{i0} - d_{i3})]}{1 + \exp[\rho(d_{i0} - d_{i3})]}, \tag{9}$$

where $d_{i0}$ and $d_{i3} \in \{0, 1\}$.[18] The parameter $\rho$ is identified as long as the joint probability in (9) is independent of $c_i$. This approach does not require parametric assumptions about the conditional distribution of the unobserved effects and initial conditions. Using the same idea, Honore and Kyriazidou (2000) added one more restriction, $x_{i2} = x_{i3}$, on the subsample selection in addition to the above conditional statement,[19] and developed a conditional logit fixed effects estimator for the dynamic logit model in the presence of other strictly exogenous explanatory variables.

While the two approaches can generate consistent estimators, they still have a drawback: the estimation only utilizes a subsample in which the decisions made by individuals are consistent with the conditioning statement. Because the information from other observations is omitted, such approaches are likely to understate the amount of true state dependence (Chay and Hyslop 1998).
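The way the conditioning statement removes the unobserved effect in (9) can be verified numerically. The Python sketch below is an illustration under assumptions rather than part of the original analysis: it computes the probability of the sequence $(d_{i0}, 1, 0, d_{i3})$ relative to the two sequences with $y_{i1} + y_{i2} = 1$ and the same end points, using the logit model (8a)-(8b), and compares it with the closed form in (9). The function names and the parameter values are hypothetical.

```python
import math


def logistic(z):
    """Logistic distribution function."""
    return 1.0 / (1.0 + math.exp(-z))


def path_prob(path, rho, c, p0):
    """P(y_0, ..., y_3 | c) under (8a)-(8b), where p0 = P(y_0 = 1 | c)."""
    prob = p0 if path[0] == 1 else 1.0 - p0
    for prev, cur in zip(path, path[1:]):
        p_one = logistic(rho * prev + c)
        prob *= p_one if cur == 1 else 1.0 - p_one
    return prob


def conditional_prob(d0, d3, rho, c, p0):
    """Probability of (d0, 1, 0, d3) given that the path is (d0, 1, 0, d3) or (d0, 0, 1, d3)."""
    switch_down = path_prob((d0, 1, 0, d3), rho, c, p0)
    switch_up = path_prob((d0, 0, 1, d3), rho, c, p0)
    return switch_down / (switch_down + switch_up)


def closed_form(d0, d3, rho):
    """Right-hand side of equation (9)."""
    z = rho * (d0 - d3)
    return math.exp(z) / (1.0 + math.exp(z))


rho = 0.9
for c in (-2.0, 0.0, 1.5):  # very different values of the unobserved effect
    value = conditional_prob(d0=1, d3=0, rho=rho, c=c, p0=0.3)
    print(f"c = {c:+.1f}: conditional probability = {value:.6f}, closed form = {closed_form(1, 0, rho):.6f}")
```

Both the unobserved effect and the initial probability cancel in the ratio, which is why neither needs to be modeled, and also why only individuals who switch between periods 1 and 2 contribute to estimation.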
Because failure to utilize the full sample makes the fixed effects approach less desirable, a consistent estimator which can also be applied to a full sample would be helpful. A correctly specified random effects approach satisfies this requirement. Furthermore, the random effects approach can be used under a wide variety of assumptions, while the fixed effects approach can only be used when the idiosyncratic errors are logistically distributed.[20]

[18] The conditional probability in (9) is on the subsample in which individuals make different choices in periods 1 and 2 (either ‘1’ or ‘0’, with $y_{i1} + y_{i2} = 1$), and it is the conditional probability of individuals choosing ‘1’ in period 1 and ‘0’ in period 2, then making either choice in periods 0 and 3.
[19] The conditional probability becomes $P(y_{i0} = d_{i0}, y_{i1} = 1, y_{i2} = 0, y_{i3} = d_{i3} \mid c_i, y_{i1} + y_{i2} = 1, x_{i2} = x_{i3})$ in Honore and Kyriazidou (2000).
[20] The closed forms of conditional probabilities in Chamberlain (1985) and Honore and Kyriazidou (2000) can only be characterized under the logit assumption.

The random effects approach requires specification of the initial conditions and the conditional distribution of the unobserved heterogeneity. Hsiao (1986) summarized three alternative assumptions about the initial conditions. The simplest but most naïve assumes that the initial conditions, $y_{i0}$ (or the pre-sample history of the process), are strictly exogenous and nonrandom. This implies that the initial state $y_{i0}$ is independent of the unobserved effects and can be ignored in the estimation. A more realistic assumption is to allow the initial conditions to be random. There are two main approaches to specifying the distribution of the initial condition. One assumes that the dynamic process is in equilibrium at the beginning of the sample period, and thus the distribution of the initial condition is a steady state distribution. However, this assumption is unlikely to hold if any determinants of the decision are time-varying.
The other one, proposed by Heckman (1981b), is to approximate the initial condition for the dynamic discrete model. The main step in this approach is specifying the initial state as a function of covariates in the dynamic model, and then approximating the probability of the initial state by a probit model. This approach overcomes the difficulty of finding the conditional distribution of the initial condition.

Once the distribution of the idiosyncratic errors and the conditional distribution of the initial condition are correctly specified, the conditional density of all the observations on individual $i$ is generated as:[21]

$$f(y_{i1}, y_{i2}, \ldots, y_{iT} \mid y_{i0} = y_0, x_i = x, c_i = c) = \prod_{t=1}^{T} f_t(y_t \mid y_{t-1}, x_t, c), \tag{10}$$

$$f(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i = x, c_i = c) = f(y_{i1}, y_{i2}, \ldots, y_{iT} \mid y_{i0} = y_0, x_i = x, c_i = c) \cdot f(y_{i0} \mid x_i = x, c_i = c). \tag{11}$$

[21] The two assumptions of the correct dynamic specification and strict exogeneity of covariates are made for this generalization. See Wooldridge (2005) for details.

To obtain the density $f(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i = x)$, one can specify the conditional distribution of the unobserved effects $f(c \mid x)$ and integrate them out:

$$f(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i = x) = \int_{-\infty}^{\infty} f(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i = x, c_i = c) \cdot f(c \mid x)\, dc. \tag{12}$$

Maximizing the sum of $\log f(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i = x)$ across individuals generates consistent Maximum Likelihood Estimators (MLE).

The random effects approach assumes that the unobserved effects are uncorrelated with covariates and the conditional distribution of the unobserved effects is $f(c \mid x) = f(c)$, which could be restrictive. To relax this assumption, Chamberlain (1979) and Mundlak (1978) developed the correlated-random-effects approach which allows correlation between $c$ and $x$, and specifies the conditional distribution of $c$ as a function of $x$ or the average of $x$.[22] In the following discussions, we utilize this assumption and focus on the correlated-random-effects approach, sometimes comparing it to fixed effects.
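The integration step in (12) is usually carried out numerically. As a minimal sketch, the Python code below evaluates one individual's likelihood contribution by Gauss-Hermite quadrature. It rests on several simplifying assumptions that are ours and not the dissertation's: a probit link, a single scalar covariate, an initial condition treated as exogenous (the naive case above), and a pure random effect $c \sim N(0, \sigma_c^2)$ independent of the covariate. All function and argument names are hypothetical.

```python
import math

import numpy as np


def norm_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def conditional_likelihood(y, x, rho, beta, c):
    """Product over t = 1..T of f_t(y_t | y_{t-1}, x_t, c) for a probit, as in (10).

    y[0] is the initial state and x[0] is not used.
    """
    lik = 1.0
    for t in range(1, len(y)):
        index = rho * y[t - 1] + beta * x[t] + c
        lik *= norm_cdf(index if y[t] == 1 else -index)
    return lik


def integrated_likelihood(y, x, rho, beta, sigma_c, n_nodes=40):
    """Integrate c out of the conditional likelihood, assuming c ~ N(0, sigma_c^2).

    Uses the change of variable c = sqrt(2) * sigma_c * z so that the normal
    density matches the Gauss-Hermite weight function exp(-z^2).
    """
    nodes, weights = np.polynomial.hermite.hermgauss(n_nodes)
    total = 0.0
    for z, w in zip(nodes, weights):
        total += w * conditional_likelihood(y, x, rho, beta, math.sqrt(2.0) * sigma_c * z)
    return total / math.sqrt(math.pi)


y_example = [1, 1, 0, 1]            # initial state followed by three decisions
x_example = [0.0, 0.3, -0.5, 1.2]   # hypothetical covariate values
print(integrated_likelihood(y_example, x_example, rho=0.7, beta=0.4, sigma_c=1.0))
```

A correlated-random-effects version would replace the mean of $c$ with a function of the covariates (and, following Wooldridge (2005), of the initial state) before integrating; the quadrature step itself would be unchanged.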
The correlated-random-effects approach relies on correctly specifying the conditional distribution of the initial condition to find the density of $(y_{i0}, y_{i1}, \ldots, y_{iT} \mid x_i)$. Wooldridge (2005) proposed a simple approach to finding this density by specifying an auxiliary distribution of the unobserved effects conditional on the initial state and model covariates, $f(c \mid y_{i0}, x)$. The resulting conditional MLE estimators are $\sqrt{N}$-consistent and asymptotically normal under regularity conditions.

22 Mundlak (1978) specified c as a function of the time average of x, while Chamberlain (1979) incorporated x in all periods to allow more generality.

2.3.2. Irregularly Spaced Data

To our knowledge, there is no existing literature discussing the irregular spacing problem in discrete choice DPD models. In this section, we show that commonly used discrete choice DPD estimators are either infeasible or inconsistent when data are irregularly spaced.

We have outlined two main approaches to estimating regularly spaced discrete choice DPD models: fixed effects and correlated-random-effects. The main idea behind the fixed effects approach is to 'condition out' the unobserved effects and the initial condition under a proper conditioning statement, which requires that the decision choices follow a specific order. When the panel data are not consecutively observed (irregularly spaced), we are unable to select the subsample satisfying the conditioning statement. Therefore, the fixed effects approach is not feasible in the presence of irregularly spaced panel data.

The main idea behind the correlated-random-effects approach is to define the log-likelihood function by assuming initial conditions and the conditional distribution of the unobserved heterogeneity.
Suppose we have a four-wave irregularly spaced panel data set indexed by $(t_0, t_1, t_2, t_3)$.23 The correlated-random-effects approach requires identification of $f(y_{t_0}, y_{t_1}, y_{t_2}, y_{t_3} \mid y_{-1}, c)$ for the specification of the log-likelihood function24, where $y_{-1}$ is the choice state prior to the sample period, and c is the unobserved heterogeneity. Following Wooldridge (2005), we assume that the dynamics are correctly specified and transform the joint density function as:

$$ f(y_{t_0}, y_{t_1}, y_{t_2}, y_{t_3} \mid y_{-1}, c) = f(y_{t_3} \mid y_{t_2}, c)\, f(y_{t_2} \mid y_{t_1}, c)\, f(y_{t_1} \mid y_{t_0}, c)\, f(y_{t_0} \mid y_{-1}, c). \tag{13} $$

23 Recall that $t_0, t_1, t_2, t_3$ are irregularly spaced with missing periods between each wave.

24 As the unobserved heterogeneity will be integrated out and covariates do not affect the following deduction, we omit both of them in this illustration and focus on the identification of the joint density.

Then consider the following transformation for the conditional density:

$$
\begin{aligned}
f(y_{t_3} \mid y_{t_2}, c)
&= f(y_{t_3},\, y^{*}_{t_2+1}=1 \mid y_{t_2}, c) + f(y_{t_3},\, y^{*}_{t_2+1}=0 \mid y_{t_2}, c) \\
&= f(y_{t_3} \mid y^{*}_{t_2+1}=1, c)\, P(y^{*}_{t_2+1}=1 \mid y_{t_2}, c) + f(y_{t_3} \mid y^{*}_{t_2+1}=0, c)\, P(y^{*}_{t_2+1}=0 \mid y_{t_2}, c) \\
&= \sum_{j_{t_2+1}=0}^{1} \sum_{j_{t_2+2}=0}^{1} f(y_{t_3} \mid y^{*}_{t_2+2}=j_{t_2+2}, c)\, P(y^{*}_{t_2+2}=j_{t_2+2} \mid y^{*}_{t_2+1}=j_{t_2+1}, c)\, P(y^{*}_{t_2+1}=j_{t_2+1} \mid y_{t_2}, c) \\
&= \cdots \\
&= \sum_{j_{t_3-1}=0}^{1} \sum_{j_{t_3-2}=0}^{1} \cdots \sum_{j_{t_2+1}=0}^{1} \Big\{ f(y_{t_3} \mid y^{*}_{t_3-1}=j_{t_3-1}, c)\, P(y^{*}_{t_3-1}=j_{t_3-1} \mid y^{*}_{t_3-2}=j_{t_3-2}, c) \cdots P(y^{*}_{t_2+1}=j_{t_2+1} \mid y_{t_2}, c) \Big\},
\end{aligned}
\tag{14}
$$

where $y^{*}_{t_3-1}$ denotes the decision in period $t_3-1$, which is not observed, and so forth. The conditional density $f(y_{t_3} \mid y_{t_2}, c)$ is specified as a sum of conditional probabilities over all possible outcome combinations from period $t_2$ to $t_3$, so that all possible outcomes of decisions in unobserved periods are explicitly accounted for. Other conditional density functions can be transformed in the same way.

After specifying the index model as $P(y_t=1 \mid y_{t-1}, c) = F(\gamma y_{t-1} + x_t\beta + c)$, the transformed conditional densities in equation (14) and the log-likelihood function are explicitly specified, and the parameters $(\gamma, \beta)$ are identified as long as $x_t$ is observed in the missing periods. However, if covariates are not observed in the missing periods (which will usually be the case), the above approach is not feasible. Obviously, the likelihood function constructed above is different from the one constructed by assuming $f(y_{t_0}, y_{t_1}, y_{t_2}, y_{t_3} \mid y_{-1}, c) = f(y_{t_0}, y_{t_0+1}, y_{t_0+2}, y_{t_0+3} \mid y_{-1}, c)$. In other words, if we ignore irregular spacing and apply the random effects approach by assuming the periods are consecutive, the estimates are inconsistent.

2.4. Alternative Estimation Approaches for Discrete Choice DPD models under Irregular Spacing

To derive alternative estimation approaches to discrete choice DPD models under irregular spacing, consider a dynamic probit model with unobserved heterogeneity:

$$ y_{it} = 1[y^{*}_{it} > 0] = 1(\gamma y_{i,t-1} + x_{it}\beta + c_i + \varepsilon_{it} > 0), \quad \varepsilon_{it} \sim N(0,1), \tag{15a} $$

$$ P(y_{it}=1 \mid x_{it}, c_i, y_{i0}, \ldots, y_{i,t-1}) = \Phi(\gamma y_{i,t-1} + x_{it}\beta + c_i). \tag{15b} $$

A number of estimators could be used.

2.4.1. Correlated-Random-Effects Probit (CRE-P)

The CRE-P estimator, ignoring irregular spacing issues, assumes unobserved heterogeneity is a function of the initial $y_{i0}$, the initial $x_{i0}$, and the average of the covariates $\bar{x}_i$. Dynamic probit is applied to the estimating equation:

$$ P(y_{im}=1 \mid x_{it}, a_i, y_{i0}, \ldots, y_{i,m-1}) = \Phi(\gamma_0 y_{i,m-1} + x_{im}\beta_0 + c_i), \tag{16a} $$

$$ c_i = \alpha_0 + \alpha_1 y_{i0} + \alpha_2 \bar{x}_i + \alpha_3 x_{i0} + a_i, \quad a_i \sim N(0,1). \tag{16b} $$

Recall that m indexes observed periods in the panel. Because the CRE-P estimator is consistent for regularly spaced panels, using CRE-P and simply ignoring the irregular spacing is one way to proceed.
While this estimator would be inconsistent with irregularly spaced data, it is straightforward to apply with existing econometric software and would certainly be the simplest way to proceed. Hence, we evaluate its performance under irregular spacing using Monte Carlo methods.

2.4.2. Correlated-Random-Effects Probit with Gap Dummies (CRE-PGD)

The CRE-PGD estimator applies dynamic probit along with dummy variables indicating whether there is a gap between two observed waves of the panel:

$$
\begin{aligned}
P(y_{im}=1 \mid x_{it}, a_i, y_{i0}, \ldots, y_{i,m-1})
&= \Phi(\gamma_0 y_{i,m-1} + \gamma_1 D_{m-1} + \gamma_2 y_{i,m-1} D_{m-1} + x_{im}\beta_0 + \beta_1 x_{im} D_{m-1} + c_i) \\
&= \Phi[\gamma_0 y_{i,m-1} + (\gamma_1 + \gamma_2 D_{m-1})\, y_{i,m-1} + (\beta_0 + \beta_1 D_{m-1})\, x_{im} + c_i],
\end{aligned}
\tag{17a}
$$

$$ c_i = \alpha_0 + \alpha_1 y_{i0} + \alpha_2 \bar{x}_i + \alpha_3 x_{i0} + a_i, \quad a_i \sim N(0,1), \tag{17b} $$

where $D_{m-1}$ equals 1 if there is a more-than-one-year gap between waves m and m-1, and equals 0 otherwise. Incorporating gap dummies to indicate whether the observations belong to regularly spaced or irregularly spaced waves25 allows us to compare these two groups of data, which could potentially account for the effects of irregular spacing on estimation. To see this, note that incorporating the gap dummy $D_{m-1}$ in equation (17a) allows us to separate the state dependence into two parts: $\gamma_1 + \gamma_2$ is the state dependence of irregularly spaced data, and $\gamma_1$ is the state dependence of regularly spaced data.

2.4.3. Linear Probability Model Estimator Using a US Spacing Structure (LPM-US)

Another alternative is the linear approach based on the gap structure developed by Sasaki and Xin (2014). It has been shown that this approach can identify parameters in fixed-effect continuous DPD models under certain conditions. Because this approach requires linearity, however, it can only be applied to the linear probability model approximation of the dynamic probit model. Hence, there may be a trade-off between the error introduced by the linear probability approximation and the ability of the Sasaki and Xin (2014) approach to accommodate the irregular spacing.
We will use Monte Carlo simulation to evaluate this trade-off and the performance of this estimator.

2.4.4. Indirect Inference Approach (IIA)

Given that all existing discrete choice DPD estimators are inconsistent under irregular spacing, it is worthwhile considering whether we could correct the bias resulting from irregular spacing using indirect inference. This is the fourth estimation approach we evaluate. Everaert and Pozzi (2007) developed an iterative bootstrap procedure to correct the bias of the least squares dummy variable estimator in dynamic panel models with moderate T. The idea is to use the biased estimates of the true population parameters as the baseline and search over the parameter space to reduce the bias using a simulation process. Unbiased estimates are obtained when simulated data sets generated from searching over the parameter space recover the original baseline estimates.

25 In terms of dynamic models, observations belonging to a regularly spaced wave means there is only a one-year gap between $y_{im}$ and $y_{i,m-1}$, and vice versa.

This approach is very similar to the indirect inference method, which was first introduced by Smith (1993) and Gourieroux, Monfort and Renault (1993), and has been found to be useful when the moments and the likelihood function of the true model are hard to define. Gouriéroux, Phillips and Yu (2010) have shown that the indirect inference approach can correct the bias of fixed effects estimation with dynamic panel data sets that occurs due to the incidental parameter problem. While the approach proposed by Gouriéroux, Phillips and Yu (2010) is only applicable in a linear model, Bruins et al. (2015) developed a generalized indirect inference procedure for discrete choice models, which is applicable in our case. The simulation-based method they developed is fast, robust, and nearly as efficient as maximum likelihood when maximum likelihood is consistent.
The next question is whether such a method can generate unbiased and consistent estimators when maximum likelihood is inconsistent (our case). We illustrate the application of the IIA below.

The basic idea of indirect inference is to compare the observed data with data simulated under alternative parameter values in the parameter space through a descriptive statistical (auxiliary) model. The simulated data are generated by the structural model (the original estimation model) but using alternative parameter values. Given a binary dependent variable, we need a continuous one to make the simulation and iteration easier to compute. Thus, we employ a latent utility model as follows. Suppose the index model and the latent utility model, respectively, are:

$$ P(y_{it}=1 \mid y_{i,t-1}) = \Phi(\gamma y_{i,t-1} + x_{it}\beta + c_i), \tag{19a} $$

$$ u_{it} = \gamma y_{i,t-1} + x_{it}\beta + c_i + \varepsilon_{it}. \tag{19b} $$

Without loss of generality, we set $u_{i0}=0$ if $y_{i0}=0$.26 Then the IIA is as follows:

1. Estimate model (19a) via CRE-P to obtain the inconsistent estimator $\hat{\theta} = (\hat{\gamma}, \hat{\beta})$.
2. Set $u_{i0}^{j}$, randomly draw $\varepsilon_{it}$ from its distribution, and generate J sets of simulated $u_{it}^{j}$ through $u_{it}^{j} = \gamma u_{i,t-1}^{j} + x_{it}\beta + a_i + \varepsilon_{it}$.
3. Estimate the actual data set $(y_{it}, x_{it})$ and the J simulated data sets $(u_{it}^{j}, x_{it})$ via a linear probability model27 and obtain the auxiliary parameters $\hat{\pi}$ and J sets of $\tilde{\pi}^{j}(\theta)$.
4. The indirect inference estimator is defined as $\hat{\theta}^{II} = \arg\min_{\theta} \left\lVert \hat{\pi} - \frac{1}{J} \sum_{j=1}^{J} \tilde{\pi}^{j}(\theta) \right\rVert$.28

26 One can set $y_{i0}=0$ as the initial value for the dynamic model. Otherwise there is an "initial condition problem" at this step.

2.5. Evaluating Estimator Performance

2.5.1. Monte Carlo Experiments

To investigate how irregular spacing affects the performance of traditional estimators and to compare the performance of the alternative estimators that have been discussed, we conduct Monte Carlo experiments. The general structure for the data generating process (DGP) used in the Monte Carlo simulations is:

$$ y_{is} \sim \text{Bernoulli}(0.5), \quad \text{for } s=-24, \tag{20a} $$
$$ y_{it} = 1\{\gamma y_{i,t-1} + x_{it}\beta + c_i + \varepsilon_{it} \ge 0\}, \quad \varepsilon_{it} \sim N(0,1),^{29} \quad i=1,\ldots,N;\; t=-23,\ldots,T \tag{20b} $$

$$ c_i \sim \begin{cases} N(0,1) & \text{exogenous } x_{it} \\ N(0.5\,x_{s}, 1) & \text{endogenous } x_{it} \end{cases} \tag{20c} $$

$$ x_{it} \begin{cases} \sim N(0,1) & \text{independent normal} \\ \sim \chi^{2}_{1} - 1 & \text{independent skewed} \\ = \rho x_{i,t-1} + \nu_{it} & \text{AR(1) process} \end{cases} \tag{20d} $$

We utilize the basic Monte Carlo design in Rabe-Hesketh and Skrondal (2013). The process starts at $s=-24$, where the initial $y_{is}$ follows a Bernoulli distribution. The subsequent $y_{it}$ are generated using (20b) and are assumed to be potentially observable only after time t=0. The unobserved heterogeneity follows one of two normal distributions, which determines whether the covariate $x_{it}$ is exogenous or endogenous. We specify three different DGPs for $x_{it}$: independent normal, independent skewed, and first-order autoregressive (AR(1)) processes.

27 It is not necessary to use a linear probability model as the auxiliary model, and other models could potentially perform better. For the discrete choice DPD model with irregular spacing, however, there is not a good alternative to using the linear probability model.

28 The distance metric can be chosen from Wald, Likelihood Ratio, and Lagrange Multiplier metrics. An example of an iteration procedure to optimize the criterion function is a quasi-Newton routine.

Four experiments are conducted to investigate estimator performance under irregular spacing. In the first experiment, the performance of CRE-P in estimating a regularly spaced panel is evaluated. This provides a baseline for the simulations because CRE-P is the standard estimation approach for discrete choice DPD models with regularly spaced data. In the second experiment, some observations are assumed missing to impose irregular spacing, and the performance of the CRE-P, CRE-PGD, LPM-US, and IIA estimators is compared under a number of alternative assumptions about the nature of the covariate and its relationship with unobserved heterogeneity.
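The data generating process in (20a)-(20d) can be sketched in a few lines for the exogenous case with an AR(1) covariate. The code below is only an illustration and is not the implementation used for the reported experiments: the function name, the starting value of x at s = -24, and the unit-variance AR(1) innovation are assumptions made for the sketch. Only the periods listed in `waves` are retained, which mimics an irregularly spaced panel.

```python
import random


def simulate_panel(n=1000, waves=(1, 5, 8, 11), gamma=0.5, beta=0.5,
                   rho=0.5, start=-24, seed=0):
    """Simulate a dynamic probit panel and keep only the observed waves.

    Returns a list with one entry per individual; each entry is a list of
    (individual, period, y, x) tuples for the periods in `waves`.
    """
    rng = random.Random(seed)
    keep = set(waves)
    last = max(waves)
    panel = []
    for i in range(n):
        c = rng.gauss(0.0, 1.0)                # exogenous case in (20c)
        y = 1 if rng.random() < 0.5 else 0     # initial state, (20a)
        x = rng.gauss(0.0, 1.0)                # assumed start value for the covariate
        rows = []
        for t in range(start + 1, last + 1):
            # AR(1) covariate in (20d); N(0,1) innovation is an assumption
            x = rho * x + rng.gauss(0.0, 1.0)
            # outcome equation (20b)
            index = gamma * y + beta * x + c + rng.gauss(0.0, 1.0)
            y = 1 if index >= 0 else 0
            if t in keep:
                rows.append((i, t, y, x))
        panel.append(rows)
    return panel
```

Calling `simulate_panel(waves=(1, 2, 5, 8))` gives the second spacing pattern, and `waves=tuple(range(1, 11))` gives a regularly spaced panel with ten periods for the baseline experiment.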
In the third experiment, the covariate $x_{it}$ is assumed to follow an AR(1) process and the state dependence parameter ($\gamma$) is varied to evaluate the effects of different $\gamma$ on estimator performance. In the fourth experiment, we again assume serial correlation in the covariate $x_{it}$ but vary the persistence parameter ($\rho$) of the AR(1) process for $x_{it}$ to evaluate how different $\rho$ affects the performance of the estimators in an irregularly spaced panel data model. For experiments 1, 2, and 4, the parameters are set at $\gamma=0.5$ and $\beta=0.5$. For experiments 1, 2, and 3, the persistence parameter of $x_{it}$ is set at $\rho=0.5$. In all cases, we set N = 1000 and perform 100 repetitions for each experiment using GAUSS and the Maxlik library. For each repetition, we run 300 iterations to integrate out random errors in unobserved heterogeneity in the probit model. The estimated average relative bias in percentage and the root mean square error (RMSE) are the two main indicators30 used to evaluate the performance of each estimator.

29 We have conducted experiments in the heteroskedastic case by allowing the variance of $\varepsilon_{it}$ to change over time, and the results do not change much.

2.5.2. Monte Carlo Results

Results from the first experiment provide a baseline by evaluating the performance of CRE-P in estimating a regularly spaced panel under alternative assumptions about the covariate and unobserved heterogeneity, for three different values of T (3, 4, and 10). Later these results will be compared to results from alternative estimators when irregular spacing is incorporated. The results reported in Table 2-1 show that CRE-P performs well in estimating coefficients in regularly spaced DPD models (as expected). In most scenarios, the bias of coefficient estimates is not significantly different from zero using one-sample t tests at the 5% level (only DGP2, which features a skewed covariate with T=3, shows evidence of bias).
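The two performance indicators and the one-sample t test referred to above can be computed across Monte Carlo repetitions as follows. The definitions are the conventional ones and are an assumption here, since the text does not spell them out; the function names are hypothetical.

```python
import math


def relative_bias_pct(estimates, true_value):
    """Average relative bias in percent: 100 * (mean estimate - truth) / truth."""
    mean_est = sum(estimates) / len(estimates)
    return 100.0 * (mean_est - true_value) / true_value


def rmse(estimates, true_value):
    """Root mean square error of the estimates around the true value."""
    return math.sqrt(sum((e - true_value) ** 2 for e in estimates) / len(estimates))


def bias_t_stat(estimates, true_value):
    """One-sample t statistic for the null hypothesis of zero bias."""
    n = len(estimates)
    mean_est = sum(estimates) / n
    var = sum((e - mean_est) ** 2 for e in estimates) / (n - 1)
    return (mean_est - true_value) / math.sqrt(var / n)
```

With 100 repetitions, `estimates` would hold the 100 estimates of one parameter, and the t statistic would be compared with the critical value for 99 degrees of freedom at the 5% level.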
The results in Table 2-1 are consistent with the theoretical fact that CRE-P estimates are unbiased and consistent under regular spacing. However, in Table 2-2, the estimates of average partial effects (APE) of CRE-P in regularly spaced DPD models are mostly found to be biased, even though the bias is not extremely large. The bias of the APE estimates could partially result from biased estimates of the unobserved heterogeneity.

30 Estimates of both coefficients and average partial effects are presented in the following results section.

In the second experiment, the performance of CRE-P, CRE-PGD, LPM-US, and IIA is evaluated under two different irregularly spaced panels (Pattern 1: t = 1, 5, 8, 11; Pattern 2: t = 1, 2, 5, 8). The patterns of irregular spacing are chosen based on the following reasoning: (1) the sample data utilized in Chapter 3 of this dissertation were collected in 2000, 2004, 2007, and 2010, which is consistent with panel structure Pattern 1; (2) Pattern 2 is the US spacing panel to which the LPM-US estimator can be applied; and (3) Pattern 2 has two consecutively observed waves, allowing us to use gap dummies to deal with the irregular spacing using the CRE-P estimator (CRE-PGD).

Monte Carlo results for the four estimators are presented in Table 2-3 to Table 2-6. Tables 2-3 and 2-4 feature exogenous $x_{it}$, while Tables 2-5 and 2-6 show results for endogenous $x_{it}$. Results for estimates of coefficients and APEs are very similar, although the bias of the APE estimates is slightly larger than that of the coefficient estimates. In terms of estimating the state dependence parameter $\gamma$, CRE-P generally produces downward biased estimates under both irregular spacing patterns 1 and 2. Due to the missing periods, CRE-P uses the lagged observed period instead of the actual value lagged one period to predict the state dependence of $y_{it}$. The gap between two observed periods diminishes the estimated state dependence of $y_{it}$, leading to the downward bias of the estimates.
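The attenuation described above can be illustrated with a small calculation. Holding the covariates and the unobserved effect fixed (a simplifying assumption made only for this sketch), the dependence of the outcome k periods ahead on the current state is obtained by summing over the unobserved intermediate states, in the spirit of (14). The function names are hypothetical.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def k_step_prob(y_start, k, gamma, c=0.0):
    """P(y_{t+k} = 1 | y_t = y_start, c) for a probit with index gamma*y + c.

    Each pass sums over the two possible values of the previous
    (unobserved) state, weighting by their probabilities.
    """
    p = float(y_start)  # probability of being in state 1 at the current step
    for _ in range(k):
        p = p * norm_cdf(gamma + c) + (1.0 - p) * norm_cdf(c)
    return p


def k_step_state_dependence(k, gamma, c=0.0):
    """Difference in P(y_{t+k} = 1) between starting in state 1 and state 0."""
    return k_step_prob(1, k, gamma, c) - k_step_prob(0, k, gamma, c)
```

Under these assumptions the k-period difference equals the one-period difference raised to the power k, so the dependence recovered from waves three or four periods apart is much weaker than the one-period state dependence, which is in line with the downward bias reported for CRE-P.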
Also, comparing pattern 1 and pattern 2, adding a consecutively observed period to the panel improves the performance of CRE-P considerably (average relative bias is decreased by about 50%). This indicates that having at least two consecutively observed periods will reduce the bias of CRE-P estimates under irregular spacing. As for CRE-PGD, the estimates in pattern 1 do not change much because no two periods are consecutively observed, so only dummies indicating whether the gap is three or four years can be incorporated into the CRE-PGD estimator. Thus, the effect of irregular spacing is not fully taken into account by adding gap dummies in this case. However, in pattern 2, where there are two consecutively observed periods, the CRE-PGD estimator performs much better (average relative bias is decreased by about 85%). This confirms the importance of having at least two consecutively observed periods in irregularly spaced panels, which significantly improves the performance of both CRE-P and CRE-PGD.31

The estimates of $\beta$ are not severely biased in most cases for either CRE-P or CRE-PGD, and in many cases the bias is not significantly different from zero. This could be because the bias from irregular spacing manifests itself mainly in the dynamics of the panel model (the incorporation of unobserved lagged $y_{it}$ in some periods). Thus, the estimates of the contemporaneous effect of $x_{it}$ are not affected much by the unobserved lagged $y_{it}$.

The LPM-US and IIA estimators do not perform well in discrete choice DPD models (see results in Table 2-3 to Table 2-6). Even though the average relative bias in some cases is not significantly different from 0 using one-sample t tests, the high RMSE suggests that LPM-US is not very efficient in estimating discrete choice DPD models under irregular spacing. The IIA does not significantly reduce the bias of CRE-P, and its performance in estimating $\beta$ is even worse than that of CRE-P.
One potential explanation for the poor performance of the IIA approach is that the only option for an auxiliary model in step 3 of the IIA is LPM-US. However, LPM-US itself performs poorly in estimating irregularly spaced discrete choice DPD models. Therefore, further refinement of the IIA approach could focus on developing more efficient auxiliary models in step 3.

31 Having two consecutively observed periods improves the performance of CRE-P, and adding gap dummies can additionally improve the performance. Thus, it is still worthwhile using CRE-PGD in panels with two consecutively observed periods.

In the third experiment, $\gamma$ is varied from 0.1 to 0.9 to evaluate how the magnitude of the state dependence of $y_{it}$ affects the performance of CRE-P and CRE-PGD.32 According to Table 2-7 to Table 2-10, the performance of the estimators is similar to what we observe in experiment 2. One thing worth noting is that the magnitude of the bias in estimating $\gamma$ tends to follow an inverted-U shape: the bias is smaller when $\gamma$ is low or high, and larger when $\gamma$ is intermediate. This may be because when $\gamma$ is low, the dynamics of the model are relatively minor and thus the effect of irregular spacing is small. When $\gamma$ is high, the higher state dependence of $y_{it}$ means smaller variation in the $y_{it}$ sequence (more similarity between $y_{i,m-1}$ and $y_{i,t-1}$), and thus smaller bias in estimating $\gamma$ from using $y_{i,m-1}$ as the proxy for $y_{i,t-1}$.33

32 As the performance of LPM-US and IIA is weak, we focus on investigating the patterns of CRE-P and CRE-PGD in the following experiments. Also, note that in this experiment, the covariate $x_{it}$ follows an AR(1) process in all cases.

33 In this illustration, given irregular spacing, $y_{im}$ (also $y_{it}$) and $y_{i,m-1}$ are observed, and $y_{i,t-1}$ lies between $y_{it}$ and $y_{i,m-1}$ and is unobserved. In the dynamic regression model, we use $y_{i,m-1}$ as the proxy for $y_{i,t-1}$ to estimate the state dependence parameter $\gamma$.

In the fourth experiment, the covariate $x_{it}$ follows an AR(1) process with the AR coefficient $\rho$ varied from 0.1 to 0.9 to evaluate how the degree of covariate persistence affects the performance of the estimators when the panel is irregularly spaced. Results are presented in Table 2-11 to Table 2-14. Generally, the results of experiment 4 are consistent with the results from experiment 2. One thing worth noting is that higher $\rho$ produces smaller bias in estimating $\gamma$. A potential explanation is that, according to (14), the log-likelihood function can be correctly specified if no other covariates are incorporated. Higher $\rho$ reduces the variation of $x_{it}$, which in turn decreases the role of $x_{it}$ in the identification of the log-likelihood function (and thus the parameters).

2.6. Conclusions

This study investigates the irregular spacing problem in the estimation of dynamic panel discrete choice models. Comparing simulations of both regularly and irregularly spaced panels, we illustrate how irregular spacing affects the performance of existing discrete choice DPD estimators. Generally, CRE-P produces downward biased estimates of the state dependence of the dynamic model, due to the diminishing effect of using earlier observed periods as the proxy for the unobserved one-lagged period. This finding suggests that any empirical work ignoring irregular spacing would underestimate the true state dependence in discrete choice DPD models.

Adding gap dummies to the CRE-P approach could potentially reduce the bias. However, the effectiveness of CRE-PGD relies on the panel structure, and simulation results suggest it is crucial to have at least two consecutively observed periods in the panel for CRE-PGD to account for irregular spacing effectively.
This is an important finding because even if panel data cannot be collected every period due to budget constraints, making sure there are at least two consecutive periods of data collection will increase the effectiveness of CRE-PGD as well as the traditional CRE-P estimator.

The estimates of the parameters of the covariates are unbiased in most cases. Even though in some cases the estimates are still biased, the magnitude of the bias is not severe. This indicates that the bias from estimating discrete choice DPD models under irregular spacing comes mainly from the dynamics. Thus, irregular spacing may not be a major problem in estimating contemporaneous effects of covariates in dynamic panels.

Also, the persistence of both the dependent variable and the covariate has been varied in experiments to characterize alternative patterns of how irregular spacing affects estimation. We have found that higher persistence, corresponding to smaller variation of outcomes, tends to reduce the bias from irregular spacing. These patterns of irregular spacing effects should be useful when empirical researchers predict and interpret their estimation results.

In addition to traditional estimators, we also propose two new estimators to address irregular spacing issues in discrete choice DPD models. LPM-US, which has been found to work effectively in continuous DPD models, performs poorly in discrete choice DPD models. Indirect inference also fails to reduce the bias of irregular spacing effectively in our simulations. Further refinements of the IIA approach could focus on developing more efficient auxiliary models to compare the observed and simulated data sets.

Exogenous DGP1: independent normal DGP2: independent skewed DGP3: Autoregressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregressive one Endogenous Time Periods 3 4 10 Time Periods ! !
Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.15 0.06 0.07 0.10 0.04 0.04 0.04 0.02 0.02 5.05 11.11 6.78 0.21 -0.74 2.32 0.89 0.24 1.50 0.58 0.30 0.03 -0.55 -1.52 0.56 0.45 0.47 -0.47 0.15 0.18 0.20 0.10 0.11 0.09 0.04 0.04 0.04 " " Table 2-1 Monte Carlo results for regularly spaced CRE-P (Estimates of Coefficients) DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one Note: underlined estimates are significantly different from 0 at the 5% significance level. Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.06 0.07 0.07 0.03 0.05 0.04 0.02 0.02 0.02 2.35 11.17 5.38 -0.91 0.47 2.73 0.61 0.61 1.03 0.70 0.57 1.04 -0.28 -1.15 0.83 0.47 0.47 -0.47 0.15 0.18 0.19 0.10 0.11 0.09 0.04 0.04 0.04 10 3 4 68 " Average Relative Bias (%) -17.71 -20.77 -17.96 -17.55 -20.36 -15.90 -15.42 -17.59 -15.24 " Average Relative Bias (%) -10.46 -16.74 -3.27 -12.45 -16.65 -2.90 -13.41 -16.48 -10.27 RMSE 0.03 0.04 0.03 0.03 0.04 0.03 0.03 0.03 0.03 RMSE 0.02 0.03 0.02 0.02 0.03 0.01 0.02 0.03 0.02 Table 2-2 Monte Carlo results for regularly spaced CRE-P (Estimates of APEs) Exogenous DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one Endogenous Time Periods 3 4 10 Time Periods ! Average Relative Bias (%) RMSE 0.05 0.06 0.06 0.04 0.05 0.03 0.03 0.03 0.02 -13.43 -11.41 -11.01 -16.75 -19.52 -14.20 -15.04 -17.80 -13.55 ! 
DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: Autoregeressive one Note: underlined estimates are significantly different from 0 at the 5% significance level. Average Relative Bias (%) RMSE 0.05 0.05 0.05 0.03 0.04 0.03 0.02 0.03 0.02 -8.17 -7.03 2.63 -12.85 -15.04 -0.84 -13.31 -16.44 -8.91 10 3 4 69 Table 2-3 Results of experiment 2 for irregular spacing evaluation with exogenous xit (Estimates of Coefficients) Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.49 0.49 0.48 0.27 0.25 0.26 CRE-P CRE- PGD 0.49 0.48 0.47 0.14 0.14 0.14 CRE-P CRE- PGD 0.05 0.05 0.05 0.07 0.08 0.06 0.04 0.05 0.04 0.05 0.05 0.04 LPM- US NA NA NA 5.42 0.53 0.08 LPM- US NA NA NA 23.84 1.99 0.06 II NA NA NA 0.27 0.26 0.25 II NA NA NA 0.22 0.24 0.21 ! " DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one CRE-P CRE- PGD -93.13 -96.06 -95.30 -91.58 -90.46 -94.83 -15.93 -49.24 -45.98 -14.81 -13.96 -47.50 LPM- US NA NA NA 291.53 18.55 -8.62 II NA NA NA -50.85 -49.19 -44.81 Panel Structure 1,5,8,11 1,2,5,8 Panel Structure II NA DGP1: independent normal NA DGP2: independent skewed NA DGP3: autoregeressive one -43.58 DGP1: independent normal -46.72 DGP2: independent skewed -41.05 DGP3: autoregeressive one Note: underlined estimates are significantly different from 0 at the 5% significance level. 
CRE-P CRE- PGD -2.08 -1.52 -2.25 -1.38 4.07 3.63 5.42 -3.57 -5.02 2.28 1.11 -0.46 LPM- US NA NA NA -714.47 -116.51 -29.47 1,5,8,11 1,2,5,8 70 Table 2-4 Results of experiment 2 for irregular spacing evaluation with exogenous xit (Estimates of APEs) Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.16 0.17 0.16 0.10 0.10 0.09 CRE-P CRE- PGD 0.16 0.17 0.15 0.06 0.06 0.06 CRE-P CRE- PGD 0.03 0.03 0.02 0.03 0.04 0.03 0.03 0.03 0.02 0.03 0.04 0.02 LPM- US NA NA NA 5.42 0.53 0.08 LPM- US NA NA NA -714.47 -116.51 -29.47 II NA NA NA 0.27 0.26 0.25 II NA NA NA -43.58 -46.72 -41.05 ! " DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one CRE-P CRE- PGD -93.63 -96.41 -95.82 -92.56 -91.31 -95.32 -29.00 -55.99 -54.29 -29.67 -26.55 -53.94 LPM- US NA NA NA 291.53 18.55 -8.62 II NA NA NA -50.85 -49.19 -44.81 Panel Structure 1,5,8,11 1,2,5,8 Panel Structure II NA DGP1: independent normal NA DGP2: independent skewed NA DGP3: autoregeressive one -43.58 DGP1: independent normal -46.72 DGP2: independent skewed -41.05 DGP3: autoregeressive one Note: underlined estimates are significantly different from 0 at the 5% significance level. CRE-P CRE- PGD -15.34 -14.81 -17.39 -16.59 -8.90 -9.32 -11.14 -16.84 -19.94 -15.70 -13.78 -13.06 LPM- US NA NA NA -714.47 -116.51 -29.47 1,5,8,11 1,2,5,8 71 Table 2-5 Results of experiment 2 for irregular spacing evaluation with endogenous xit (Estimates of Coefficients) Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.49 0.48 0.47 0.27 0.25 0.26 CRE-P CRE- PGD 0.50 0.49 0.47 0.16 0.15 0.16 CRE-P CRE- PGD 0.05 0.05 0.05 0.06 0.08 0.05 0.04 0.05 0.04 0.04 0.05 0.04 LPM- US NA NA NA 1.33 2.48 0.08 LPM- US NA NA NA 11.19 24.19 0.05 II NA NA NA 0.26 0.26 0.24 II NA NA NA 0.22 0.24 0.21 ! 
" DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one Panel Structure 1,5,8,11 1,2,5,8 Panel CRE-P CRE- PGD -95.27 -95.04 -94.24 -93.89 -90.31 -92.66 -16.95 -50.12 -45.66 -11.45 -16.67 -48.96 LPM- US NA NA NA 62.09 126.49 4.34 II NA NA NA -49.06 -47.21 -44.56 Structure II NA DGP1: independent normal NA DGP2: independent skewed NA DGP3: autoregeressive one -43.73 DGP1: independent normal -46.78 DGP2: independent skewed DGP3: autoregeressive one -40.96 Note: underlined estimates are significantly different from 0 at the 5% significance level. CRE-P CRE- PGD -2.64 -3.10 -1.38 -1.18 3.87 3.17 1.68 -5.11 -3.93 5.30 0.23 -0.54 LPM- US NA NA NA 375.32 -1351.09 -23.94 1,5,8,11 1,2,5,8 72 Table 2-6 Results of experiment 2 for irregular spacing evaluation with endogenous xit (Estimates of APEs) Average Relative Bias (%) RMSE Average Relative Bias (%) RMSE 0.15 0.16 0.14 0.09 0.09 0.08 CRE-P CRE- PGD 0.16 0.16 0.14 0.06 0.06 0.05 CRE-P CRE- PGD 0.02 0.03 0.01 0.02 0.03 0.02 0.02 0.03 0.01 0.02 0.03 0.01 LPM- US NA NA NA 1.33 2.48 0.08 LPM- US NA NA NA 11.19 24.19 0.05 II NA NA NA 0.26 0.26 0.24 II NA NA NA 0.22 0.24 0.21 ! " DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one DGP1: independent normal DGP2: independent skewed DGP3: autoregeressive one Panel Structure 1,5,8,11 1,2,5,8 Panel CRE-P CRE- PGD -95.33 -95.27 -94.66 -94.13 -90.51 -92.81 -25.19 -53.88 -52.09 -24.11 -22.06 -50.89 LPM- US NA NA NA 62.09 126.49 4.34 II NA NA NA -49.06 -47.21 -44.56 Structure II NA DGP1: independent normal NA DGP2: independent skewed NA DGP3: autoregeressive one -43.73 DGP1: independent normal -46.78 DGP2: independent skewed -40.96 DGP3: autoregeressive one Note: underlined estimates are significantly different from 0 at the 5% significance level. 
CRE-P CRE-PGD -10.25 -10.51 -13.51 -13.48 -2.55 -3.16 -8.47 -12.65 -15.80 -9.91 -6.54 -4.92 LPM-US NA NA NA 375.32 -1351.09 -23.94 1,5,8,11 1,2,5,8

Table 2-7 Results of experiment 3 for different state dependence of y_it with exogenous x_it (Estimates of Coefficients)
Bias = average relative bias (%); x = coefficient on x_it; y lag = state dependence parameter.

| Panel structure | State dependence | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | 0.45 | 0.93 | 0.04 | 0.04 | -91.58 | -85.00 | 0.13 | 0.16 |
| 1,5,8,11 | 0.3 | 3.83 | 4.04 | 0.04 | 0.04 | -94.33 | -92.37 | 0.30 | 0.30 |
| 1,5,8,11 | 0.5 | 4.45 | 4.28 | 0.05 | 0.05 | -92.86 | -89.91 | 0.48 | 0.47 |
| 1,5,8,11 | 0.7 | 4.49 | 4.52 | 0.04 | 0.05 | -91.20 | -87.81 | 0.65 | 0.63 |
| 1,5,8,11 | 0.9 | 2.31 | 2.70 | 0.04 | 0.04 | -84.30 | -80.19 | 0.77 | 0.74 |
| 1,2,5,8 | 0.1 | -0.46 | -0.65 | 0.04 | 0.06 | -39.89 | -8.10 | 0.11 | 0.12 |
| 1,2,5,8 | 0.3 | 1.48 | 2.75 | 0.03 | 0.06 | -50.42 | -17.82 | 0.18 | 0.12 |
| 1,2,5,8 | 0.5 | 0.43 | 2.93 | 0.04 | 0.06 | -46.69 | -14.30 | 0.26 | 0.15 |
| 1,2,5,8 | 0.7 | -1.36 | 3.03 | 0.03 | 0.06 | -42.76 | -8.93 | 0.32 | 0.15 |
| 1,2,5,8 | 0.9 | -4.51 | 0.20 | 0.04 | 0.06 | -40.09 | -8.70 | 0.38 | 0.16 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

Table 2-8 Results of experiment 3 for different state dependence of y_it with exogenous x_it (Estimates of APEs)
Bias = average relative bias (%); x = APE of x_it; y lag = APE of lagged y_it.

| Panel structure | State dependence | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -11.04 | -10.73 | 0.02 | 0.02 | -92.26 | -85.81 | 0.04 | 0.05 |
| 1,5,8,11 | 0.3 | -9.48 | -9.39 | 0.02 | 0.02 | -94.83 | -92.80 | 0.10 | 0.10 |
| 1,5,8,11 | 0.5 | -9.17 | -9.34 | 0.02 | 0.02 | -93.49 | -90.71 | 0.16 | 0.15 |
| 1,5,8,11 | 0.7 | -8.95 | -8.79 | 0.02 | 0.02 | -92.15 | -89.09 | 0.20 | 0.20 |
| 1,5,8,11 | 0.9 | -9.16 | -8.57 | 0.02 | 0.02 | -85.86 | -81.99 | 0.22 | 0.22 |
| 1,2,5,8 | 0.1 | -11.89 | -12.50 | 0.02 | 0.03 | -46.28 | -18.76 | 0.04 | 0.04 |
| 1,2,5,8 | 0.3 | -11.73 | -11.80 | 0.02 | 0.03 | -56.57 | -29.23 | 0.06 | 0.04 |
| 1,2,5,8 | 0.5 | -12.81 | -12.70 | 0.02 | 0.03 | -53.39 | -27.07 | 0.09 | 0.06 |
| 1,2,5,8 | 0.7 | -13.67 | -13.34 | 0.02 | 0.03 | -49.68 | -23.27 | 0.11 | 0.06 |
| 1,2,5,8 | 0.9 | -14.31 | -14.82 | 0.02 | 0.03 | -46.08 | -22.35 | 0.12 | 0.07 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

Table 2-9 Results of experiment 3 for different state dependence of y_it with endogenous x_it (Estimates of Coefficients)
Bias = average relative bias (%); x = coefficient on x_it; y lag = state dependence parameter.

| Panel structure | State dependence | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -0.06 | 0.08 | 0.04 | 0.05 | -72.46 | -80.31 | 0.12 | 0.15 |
| 1,5,8,11 | 0.3 | 3.68 | 3.37 | 0.04 | 0.05 | -89.07 | -83.37 | 0.29 | 0.29 |
| 1,5,8,11 | 0.5 | 3.37 | 3.36 | 0.04 | 0.04 | -91.08 | -87.57 | 0.47 | 0.45 |
| 1,5,8,11 | 0.7 | 3.57 | 2.95 | 0.04 | 0.05 | -88.94 | -84.53 | 0.63 | 0.61 |
| 1,5,8,11 | 0.9 | 4.05 | 3.45 | 0.05 | 0.05 | -87.20 | -80.67 | 0.79 | 0.74 |
| 1,2,5,8 | 0.1 | -0.89 | -0.84 | 0.04 | 0.05 | -41.51 | -6.83 | 0.11 | 0.12 |
| 1,2,5,8 | 0.3 | 1.30 | 2.78 | 0.03 | 0.05 | -45.05 | -14.77 | 0.17 | 0.12 |
| 1,2,5,8 | 0.5 | -0.71 | 0.70 | 0.04 | 0.06 | -44.09 | -9.53 | 0.24 | 0.15 |
| 1,2,5,8 | 0.7 | -1.97 | 2.03 | 0.04 | 0.06 | -43.36 | -11.46 | 0.32 | 0.16 |
| 1,2,5,8 | 0.9 | -2.77 | 2.88 | 0.04 | 0.06 | -41.64 | -11.14 | 0.39 | 0.17 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

Table 2-10 Results of experiment 3 for different state dependence of y_it with endogenous x_it (Estimates of APEs)
Bias = average relative bias (%); x = APE of x_it; y lag = APE of lagged y_it.

| Panel structure | State dependence | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -3.39 | -3.49 | 0.01 | 0.01 | -72.68 | -79.87 | 0.04 | 0.04 |
| 1,5,8,11 | 0.3 | -1.62 | -1.91 | 0.01 | 0.01 | -89.24 | -83.63 | 0.09 | 0.09 |
| 1,5,8,11 | 0.5 | -2.74 | -2.71 | 0.01 | 0.01 | -91.32 | -87.91 | 0.14 | 0.13 |
| 1,5,8,11 | 0.7 | -2.43 | -2.78 | 0.01 | 0.01 | -89.41 | -85.06 | 0.18 | 0.17 |
| 1,5,8,11 | 0.9 | -2.16 | -2.10 | 0.01 | 0.01 | -87.83 | -81.37 | 0.21 | 0.20 |
| 1,2,5,8 | 0.1 | -3.05 | -3.54 | 0.01 | 0.01 | -42.16 | -9.09 | 0.03 | 0.04 |
| 1,2,5,8 | 0.3 | -2.52 | -2.37 | 0.01 | 0.01 | -46.84 | -18.95 | 0.05 | 0.04 |
| 1,2,5,8 | 0.5 | -5.22 | -6.39 | 0.01 | 0.02 | -46.47 | -15.94 | 0.07 | 0.04 |
| 1,2,5,8 | 0.7 | -5.70 | -5.50 | 0.01 | 0.02 | -45.29 | -17.94 | 0.09 | 0.05 |
| 1,2,5,8 | 0.9 | -5.61 | -5.27 | 0.01 | 0.01 | -43.18 | -18.17 | 0.10 | 0.05 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

Table 2-11 Results of experiment 4 for different persistence of x_it and with exogenous x_it (Estimates of Coefficients)
Bias = average relative bias (%); x = coefficient on x_it; y lag = state dependence parameter.

| Panel structure | Persistence of x_it (ρ) | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -1.69 | -0.85 | 0.04 | 0.05 | -97.07 | -93.00 | 0.50 | 0.49 |
| 1,5,8,11 | 0.3 | 2.47 | 2.99 | 0.04 | 0.04 | -96.01 | -93.60 | 0.49 | 0.48 |
| 1,5,8,11 | 0.5 | 4.45 | 4.18 | 0.05 | 0.05 | -92.86 | -89.50 | 0.48 | 0.47 |
| 1,5,8,11 | 0.7 | 6.74 | 6.69 | 0.05 | 0.05 | -89.74 | -85.01 | 0.46 | 0.44 |
| 1,5,8,11 | 0.9 | 5.28 | 5.06 | 0.04 | 0.04 | -79.92 | -73.03 | 0.41 | 0.39 |
| 1,2,5,8 | 0.1 | -4.18 | 1.60 | 0.04 | 0.07 | -45.19 | -9.63 | 0.25 | 0.14 |
| 1,2,5,8 | 0.3 | -0.51 | 3.96 | 0.04 | 0.06 | -49.07 | -14.70 | 0.26 | 0.14 |
| 1,2,5,8 | 0.5 | 0.43 | 2.93 | 0.04 | 0.06 | -46.69 | -14.30 | 0.26 | 0.15 |
| 1,2,5,8 | 0.7 | 1.32 | 1.65 | 0.03 | 0.05 | -44.61 | -12.50 | 0.25 | 0.14 |
| 1,2,5,8 | 0.9 | -1.11 | -2.79 | 0.03 | 0.04 | -39.71 | -10.32 | 0.23 | 0.15 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.
Table 2-12 Results of experiment 4 for different persistence of x_it and with exogenous x_it (Estimates of APEs)
Bias = average relative bias (%); x = APE of x_it; y lag = APE of lagged y_it.

| Panel structure | Persistence of x_it (ρ) | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -14.98 | -14.27 | 0.03 | 0.03 | -97.33 | -93.61 | 0.17 | 0.16 |
| 1,5,8,11 | 0.3 | -11.45 | -11.07 | 0.02 | 0.02 | -96.44 | -94.20 | 0.16 | 0.16 |
| 1,5,8,11 | 0.5 | -9.17 | -9.38 | 0.02 | 0.02 | -93.49 | -90.37 | 0.16 | 0.15 |
| 1,5,8,11 | 0.7 | -6.01 | -5.91 | 0.01 | 0.01 | -90.74 | -86.39 | 0.14 | 0.14 |
| 1,5,8,11 | 0.9 | -2.74 | -2.62 | 0.01 | 0.01 | -81.17 | -74.32 | 0.10 | 0.10 |
| 1,2,5,8 | 0.1 | -17.25 | -14.36 | 0.03 | 0.03 | -52.49 | -23.72 | 0.09 | 0.05 |
| 1,2,5,8 | 0.3 | -14.19 | -12.51 | 0.03 | 0.03 | -55.88 | -28.09 | 0.10 | 0.06 |
| 1,2,5,8 | 0.5 | -12.81 | -12.70 | 0.02 | 0.03 | -53.39 | -27.07 | 0.09 | 0.06 |
| 1,2,5,8 | 0.7 | -10.81 | -12.76 | 0.02 | 0.02 | -50.93 | -24.67 | 0.08 | 0.05 |
| 1,2,5,8 | 0.9 | -7.72 | -11.94 | 0.01 | 0.02 | -43.46 | -18.93 | 0.06 | 0.04 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

Table 2-13 Results of experiment 4 for different persistence of x_it and with endogenous x_it (Estimates of Coefficients)
Bias = average relative bias (%); x = coefficient on x_it; y lag = state dependence parameter.

| Panel structure | Persistence of x_it (ρ) | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -2.05 | -1.60 | 0.04 | 0.05 | -95.38 | -96.82 | 0.49 | 0.50 |
| 1,5,8,11 | 0.3 | 1.64 | 1.40 | 0.04 | 0.04 | -94.43 | -91.43 | 0.48 | 0.48 |
| 1,5,8,11 | 0.5 | 3.37 | 3.26 | 0.04 | 0.04 | -91.08 | -86.77 | 0.47 | 0.45 |
| 1,5,8,11 | 0.7 | 4.95 | 4.94 | 0.04 | 0.05 | -86.21 | -82.46 | 0.44 | 0.43 |
| 1,5,8,11 | 0.9 | 5.64 | 5.10 | 0.05 | 0.05 | -85.90 | -76.73 | 0.45 | 0.42 |
| 1,2,5,8 | 0.1 | -4.56 | 2.18 | 0.05 | 0.06 | -47.61 | -10.28 | 0.26 | 0.15 |
| 1,2,5,8 | 0.3 | -1.39 | 4.02 | 0.04 | 0.06 | -47.56 | -14.27 | 0.26 | 0.14 |
| 1,2,5,8 | 0.5 | -0.71 | 0.70 | 0.04 | 0.06 | -44.09 | -9.53 | 0.24 | 0.15 |
| 1,2,5,8 | 0.7 | -0.03 | -0.98 | 0.04 | 0.05 | -45.54 | -14.68 | 0.25 | 0.16 |
| 1,2,5,8 | 0.9 | -0.57 | -2.82 | 0.03 | 0.05 | -47.00 | -19.21 | 0.27 | 0.20 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.
Table 2-14 Results of experiment 4 for different persistence of x_it and with endogenous x_it (Estimates of APEs)
Bias = average relative bias (%); x = APE of x_it; y lag = APE of lagged y_it.

| Panel structure | Persistence of x_it (ρ) | x: bias CRE-P | x: bias CRE-PGD | x: RMSE CRE-P | x: RMSE CRE-PGD | y lag: bias CRE-P | y lag: bias CRE-PGD | y lag: RMSE CRE-P | y lag: RMSE CRE-PGD |
|---|---|---|---|---|---|---|---|---|---|
| 1,5,8,11 | 0.1 | -9.64 | -9.54 | 0.02 | 0.02 | -95.51 | -96.70 | 0.15 | 0.16 |
| 1,5,8,11 | 0.3 | -6.02 | -6.28 | 0.01 | 0.02 | -94.59 | -91.59 | 0.15 | 0.15 |
| 1,5,8,11 | 0.5 | -2.74 | -2.71 | 0.01 | 0.01 | -91.32 | -87.13 | 0.14 | 0.13 |
| 1,5,8,11 | 0.7 | 4.63 | 4.71 | 0.01 | 0.01 | -86.01 | -82.15 | 0.11 | 0.11 |
| 1,5,8,11 | 0.9 | 16.46 | 16.55 | 0.01 | 0.02 | -83.70 | -72.80 | 0.07 | 0.07 |
| 1,2,5,8 | 0.1 | -11.84 | -8.13 | 0.02 | 0.02 | -51.39 | -19.26 | 0.09 | 0.05 |
| 1,2,5,8 | 0.3 | -8.03 | -5.37 | 0.02 | 0.02 | -50.88 | -21.95 | 0.08 | 0.05 |
| 1,2,5,8 | 0.5 | -5.22 | -6.39 | 0.01 | 0.02 | -46.47 | -15.94 | 0.07 | 0.04 |
| 1,2,5,8 | 0.7 | 2.32 | -1.22 | 0.01 | 0.01 | -43.90 | -14.71 | 0.06 | 0.04 |
| 1,2,5,8 | 0.9 | 13.70 | 7.34 | 0.01 | 0.01 | -38.44 | -10.37 | 0.04 | 0.03 |

Note: underlined estimates are significantly different from 0 at the 5% significance level.

REFERENCES

Anderson, Theodore Wilbur, and Cheng Hsiao. 1982. “Formulation and Estimation of Dynamic Models Using Panel Data.” Journal of Econometrics 18 (1): 47–82.

Arellano, Manuel, and Stephen Bond. 1991. “Some Tests of Specification for Panel Data: Monte Carlo Evidence and an Application to Employment Equations.” The Review of Economic Studies 58 (2): 277–297.

Balestra, Pietro, and Marc Nerlove. 1966. “Pooling Cross Section and Time Series Data in the Estimation of a Dynamic Model: The Demand for Natural Gas.” Econometrica: Journal of the Econometric Society, 585–612.

Bruins, Marianne, James A. Duffy, Michael P. Keane, and Anthony A. Smith Jr. 2015. “Generalized Indirect Inference for Discrete Choice Models.” arXiv Preprint arXiv:1507.06115.

Chamberlain, Gary. 1979. Analysis of Covariance with Qualitative Data. National Bureau of Economic Research, Cambridge, Mass., USA.

Chamberlain, Gary. 1985. “Heterogeneity, Duration Dependence and Omitted Variable Bias.” In Longitudinal Analysis of Labor Market Data. Cambridge University Press, New York.

Chay, Kenneth Y., and Dean Hyslop. 1998. Identification and Estimation of Dynamic Binary Response Panel Data Models: Empirical Evidence Using Alternative Approaches. 5. Center for Labor Economics, University of California, Berkeley.

Everaert, Gerdie, and Lorenzo Pozzi. 2007. “Bootstrap-Based Bias Correction for Dynamic Panels.” Journal of Economic Dynamics and Control 31 (4): 1160–1184.

Gourieroux, Christian, Alain Monfort, and Eric Renault. 1993. “Indirect Inference.” Journal of Applied Econometrics 8 (S1): S85–S118.

Gouriéroux, Christian, Peter CB Phillips, and Jun Yu. 2010. “Indirect Inference for Dynamic Panel Models.” Journal of Econometrics 157 (1): 68–77.

Heckman, James J. 1981a. “The Incidental Parameters Problem and the Problem of Initial Conditions in Estimating a Discrete Time-Discrete Data Stochastic Process.”

Heckman, James J. 1981b. “The Incidental Parameters Problem and the Problem of Initial Conditions in Estimating a Discrete Time-Discrete Data Stochastic Process.”

Honoré, Bo E., and Ekaterini Kyriazidou. 2000. “Panel Data Discrete Choice Models with Lagged Dependent Variables.” Econometrica 68 (4): 839–74.

Hsiao, C. 1986. Analysis of Panel Data. Cambridge University Press.

Millimet, Daniel L., and Ian K. McDonough. 2013. “Dynamic Panel Data Models with Irregular Spacing: With Applications to Early Childhood Development.”

Mundlak, Yair. 1978. “On the Pooling of Time Series and Cross Section Data.” Econometrica: Journal of the Econometric Society, 69–85.

Rabe-Hesketh, Sophia, and Anders Skrondal. 2013. “Avoiding Biased Versions of Wooldridge’s Simple Solution to the Initial Conditions Problem.” Economics Letters 120 (2): 346–49.

Sasaki, Yuya, and Yi Xin. 2014. “Unequal Spacing in Dynamic Panel Data: Identification and Estimation.”

Smith, Anthony A. 1993. “Estimating Nonlinear Time-Series Models Using Simulated Vector Autoregressions.” Journal of Applied Econometrics 8 (S1): S63–S84.

Wooldridge, Jeffrey M. 2005. “Simple Solutions to the Initial Conditions Problem in Dynamic, Nonlinear Panel Data Models with Unobserved Heterogeneity.” Journal of Applied Econometrics 20 (1): 39–54.

Wooldridge, Jeffrey M. 2010. Econometric Analysis of Cross Section and Panel Data. MIT Press.

CHAPTER 3. TRANSIENT USE OF HYBRID MAIZE: IRREGULARLY SPACED DYNAMIC PANEL EVIDENCE FROM KENYA

3.1. Introduction

Intensive agriculture based on adoption of modern technologies has been viewed as crucial for Africa to reach its development potential, given apparent land scarcity and low land fertility (De Groote et al. 2002; Lee 2005; Pannell and Vanclay 2011). However, the progress of technology development and adoption in Africa remains slow, even after experience with a number of incentive programs, such as fertilizer subsidies, government-facilitated provision of input credit, and centralized control of input procurement (Spencer 1996; Ouma et al. 2002; Moser and Barrett 2003; Dercon and Christiaensen 2011).

Considerable research has been conducted on the causes and consequences of low adoption of technologies like hybrid maize and fertilizer in Africa. This literature has focused on explaining technology adoption based on farmer/farm characteristics and agricultural production information, and has identified a number of constraints to adoption such as low expected profitability, risk aversion, lack of marketing and transportation infrastructure, and low availability of credit and liquidity for seed and fertilizer purchases (Byerlee 1994; Mwangi 1996; Zeller, Diagne, and Mataya 1998; Sunding and Zilberman 2001; Doss 2006; Suri 2011).

However, most of the existing research on technology adoption is conducted in a static framework, assuming that adoption is a one-time decision so that, once adopted, a new technology will continue to be used until a better one becomes available.
This is at odds with what we observe in some technology adoption environments where farmers switch back and forth between two or more technologies. This is particularly true for hybrid seed use in Africa, where panel data sets reveal that individual farmers commonly switch back and forth between modern varieties and traditional local varieties (Ouma et al. 2002; Tura et al. 2010). We provide some descriptive data below that support these observations for maize production in Kenya and Zambia. We term this technology switching behavior “transient technology use”; it has been little studied to date.

Therefore, the objective of this chapter is to investigate the phenomenon of transient hybrid maize use in Africa. As one of the first attempts to study transient hybrid use, we first develop a dynamic theoretical model to characterize household switching behavior between hybrid maize and traditional maize varieties. Then we apply a dynamic binary response model to panel data from Kenya to investigate the determinants of transient hybrid maize use in this environment. Given that the panel data are irregularly spaced, the persistence of hybrid adoption would be underestimated by traditional dynamic panel estimators (see Chapter 2). Thus, we apply a gap dummy approach to deal with the irregular spacing. Simulation results in Chapter 2 suggest that, for an irregularly spaced panel, the gap dummy estimator should reduce the downward bias relative to standard correlated random effects probit estimation and so improve inference.

The remainder of this chapter is structured as follows. Section 3.2 describes the data and presents descriptive statistics. Section 3.3 outlines the conceptual model. Section 3.4 illustrates the empirical implementation of the correlated random effects gap dummy estimator. Section 3.5 presents and compares results from both traditional correlated random effects probit estimation and the gap-dummy estimator suggested here. Section 3.6 concludes.

3.2. Data and Descriptive Statistics

The household panel data used in this chapter were collected as a joint project between the Tegemeo Institute at Egerton University, Kenya and Michigan State University. It is a four-wave household level panel survey (2000, 2004, 2007, 2010), representative of rural maize-growing areas in Kenya. The surveys collect demographic and socio-economic characteristics, input use, crop and livestock production data, and off-farm activities, assets and income. Maize is the main staple crop in Kenya and is planted in both main and short seasons. We restrict the sample to households planting maize in the main seasons of all four panel waves. The total number of households used is 1207.

Table 3-1 lists all possible four period seed type transitions, and the corresponding number of households that fall into each transition category. Table 3-2 then classifies the households according to their adoption history (never adopted, always adopted, adopted and continued, adopted and disadopted, and transient use). While over 90% of households adopted hybrids at least once, almost 23% of the sample subsequently disadopted them. Transient use of hybrid seeds accounts for about 15% of the sample, indicating that transient use of hybrid seeds is an important phenomenon in Kenya and suggesting that transient technology use may be important in other technology adoption contexts as well.

Household descriptive statistics are presented in Table 3-3, dividing households into groups by hybrid adoption patterns (full sample, never adopted, always adopted, and transient use). Three observations are worth noting. First, households who always use hybrids have the highest average maize yield, while households who never used hybrids have the lowest. Second, households who adopt and use hybrids are more likely to use fertilizer and to use it in higher amounts.
Third, hybrid adopters are more likely to be net maize sellers, even though the average maize selling price does not vary much across seed use categories.

Household income information is provided by seed use pattern in Table 3-4. Along with the highest yields, households adopting hybrids also have the highest asset value and highest income, irrespective of whether it is from crop, livestock, or off-farm income. Transient technology users' asset value was the lowest at the beginning of the sample period, but surpasses that of households who never adopt hybrids in later periods.

Market infrastructure information, including household’s distance to hybrid and fertilizer sellers, as well as distance to motorable and tarmac road, is compared across hybrid use groups in Table 3-5. Households who always adopt hybrids have the shortest distance to hybrids and fertilizer, and to motorable and tarmac road, suggesting that their transportation costs are the lowest.

3.3. Conceptual Model

The conceptual model for this chapter is similar to the model in Chapter 1, except that farmers’ decisions on fertilizer are included explicitly. Suppose a household maximizes the expected discounted sum of lifetime net returns over an infinite horizon:

$$\max \; \mathrm{E} \sum_{t=0}^{\infty} \delta^{t} \pi_{t} \tag{1}$$

where $\delta$ is the household’s discount factor, reflecting its rate of time preference, and $\pi_t$ is net return per acre from maize production. Each period the household chooses whether to use hybrid seed, whether to use fertilizer, and the amount of fertilizer to use.
The net return per acre from maize production depends on seed and fertilizer choices according to:

$$\pi_t(d_t, f_t, x_t; Z_t) = d_t\left[\Pi_t^{H}(f_t, x_t; Z_t) - C_t^{T \to H}(d_t - d_{t-1})\right] + (1 - d_t)\left[\Pi_t^{T}(f_t, x_t; Z_t) - C_t^{H \to T}(d_{t-1} - d_t)\right] \tag{2}$$

where $d_t$ is a binary decision variable with $d_t = 1$ indicating hybrid seed is chosen and $d_t = 0$ indicating traditional seed is chosen; $f_t$ is a binary decision variable with $f_t = 1$ indicating fertilizer is used and $f_t = 0$ indicating fertilizer is not used; $x_t$ is the amount of fertilizer used; $Z_t$ is a vector of household and market characteristics that can influence the net return from maize production; superscript H (T) denotes hybrids (traditional seeds); $\Pi_t^{H}$ ($\Pi_t^{T}$) is per acre profits from maize production with hybrids (traditional seeds), which depends on the fertilizer participation decision and the amount of fertilizer used; and per acre switching costs are denoted by $C_t^{T \to H}$ for switching from traditional to hybrids and $C_t^{H \to T}$ for switching from hybrids to traditional varieties. Switching cost is incurred only when $d_t \neq d_{t-1}$ (i.e. the seed technology is switched).

The dynamic programming solution to the household’s optimization problem is characterized by the value function:

$$V(d_{t-1}, Z_t) = \max_{\{d_t, f_t, x_t\}} \mathrm{E}_t\left\{\pi_t(d_t, f_t, x_t; Z_t) + \delta V(d_t, Z_{t+1})\right\} \tag{3}$$

subject to transition equations for the state variables $Z_t$. To facilitate the empirical implementation, we break the decision problem into three hurdles: whether to use hybrids, whether to use fertilizer, and how much fertilizer to use.

In the third and final stage the household chooses the amount of fertilizer use conditional on seed and fertilizer participation decisions.
This is a purely static problem defined as:

$$V_3(d_{t-1}, d_t, f_t, Z_t) = \max_{x_t} \mathrm{E}_t\, \pi_t(d_t, f_t, x_t; Z_t) \tag{4}$$

[34] Subscript of V denotes the stage of the production decision making.

The solution to this problem takes the form of a set of conditional (on technology and fertilizer participation decisions) fertilizer demand equations:

$$\ln x_t = \begin{cases} g^{H}(d_{t-1}, Z_t) & \text{if } d_t = 1 \text{ and } f_t = 1 \\ g^{T}(d_{t-1}, Z_t) & \text{if } d_t = 0 \text{ and } f_t = 1 \end{cases} \tag{5a}$$

and

$$x_t = 0 \quad \text{if } f_t = 0 \tag{5b}$$

Moving backwards, in the second stage the household chooses whether to participate in fertilizer use conditional on the seed decision. Conditional on hybrids being chosen, the second-stage decision of whether to use fertilizer is characterized by:

$$V_2(d_t = 1, d_{t-1}, Z_t) = \max\left\{ V_2(d_t = 1, f_t = 1, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 1, f_t = 1, Z_{t+1}),\; V_2(d_t = 1, f_t = 0, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 1, f_t = 0, Z_{t+1}) \right\} \tag{6}$$

The sure current payment that would have to be made to the household at time t when no fertilizer is used in order to make the household indifferent to using the optimal fertilizer allocation or not is given by an $S_t^{fH}$ that satisfies:

$$V_2(d_t = 1, f_t = 1, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 1, f_t = 1, Z_{t+1}) \equiv V_2(d_t = 1, f_t = 0, d_{t-1}, Z_t, S_t^{fH}) + \delta \mathrm{E}_t V_2(d_t = 1, f_t = 0, Z_{t+1}) \tag{7}$$

[35] Having $S_t^{fH}$ in the value function does not affect the optimal decisions, but increases the optimal value: $V_2(d_{t-1}, Z_t, S_t^{fH}) = \max_{\{d_t, f_t, x_t\}} \mathrm{E}_t\{\pi_t(d_t, f_t, x_t; Z_t) + S_t^{fH} + \delta V_2(d_t, Z_{t+1})\}$.

Conditional on hybrid use the optimal fertilizer participation decision is given by:

$$f_t \mid (d_t = 1) = 1(S_t^{fH} > 0) \tag{8}$$

where $1(\cdot)$ is the indicator function, so that the left-hand side equals 1 if hybrids and fertilizer are both used and 0 otherwise.

Similarly, conditional on traditional seeds being used the second-stage decision of whether to use fertilizer is characterized by:

$$V_2(d_t = 0, d_{t-1}, Z_t) = \max\left\{ V_2(d_t = 0, f_t = 1, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 0, f_t = 1, Z_{t+1}),\; V_2(d_t = 0, f_t = 0, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 0, f_t = 0, Z_{t+1}) \right\} \tag{9}$$

The sure current payment that would have to be made to the household at time t when no fertilizer is used in order to make the household indifferent to using the optimal fertilizer allocation or not is given by an $S_t^{fT}$ that satisfies:

$$V_2(d_t = 0, f_t = 1, d_{t-1}, Z_t) + \delta \mathrm{E}_t V_2(d_t = 0, f_t = 1, Z_{t+1}) \equiv V_2(d_t = 0, f_t = 0, d_{t-1}, Z_t, S_t^{fT}) + \delta \mathrm{E}_t V_2(d_t = 0, f_t = 0, Z_{t+1}) \tag{10}$$

Conditional on traditional seed use the optimal fertilizer participation decision is given by:

$$f_t \mid (d_t = 0) = 1(S_t^{fT} > 0) \tag{11}$$

with $f_t \mid (d_t = 0) = 1$ indicating fertilizer use conditional on traditional seed use.

Continuing backwards in time the first stage decision of whether to use hybrids is characterized by:

$$V_1(d_{t-1}, Z_t) = \max\left\{ V_1(d_t = 1, d_{t-1}, Z_t),\; V_1(d_t = 0, d_{t-1}, Z_t) \right\} \tag{12}$$

and the sure current payment that would have to be made at time t when hybrids are not used in order to make the household indifferent to the seed choice is given by an $S_t^{d}$ that satisfies:

$$V_1(d_t = 1, d_{t-1}, Z_t) \equiv V_1(d_t = 0, d_{t-1}, Z_t, S_t^{d}) \tag{13}$$

The optimal seed decision is therefore:

$$d_t = 1(S_t^{d} > 0) \tag{14}$$

with $d_t = 1$ indicating a hybrid is used.

The complete production decisions consist of the seed adoption rule (14), the two conditional fertilizer participation rules (8) and (11), and the fertilizer intensity rules (5a) and (5b).

3.4. Empirical Implementation

To investigate transient seed technology use, we focus on seed decisions in the first stage.
Since the sample used in this chapter was collected in 2000, 2004, 2007, and 2010, the data are irregularly spaced and this needs to be accounted for during estimation. We will employ the single-equation (gap-dummy) approach proposed in Chapter 2 to reduce the downward bias from irregular spacing. Therefore, instead of using a system of equations to analyze both seed and fertilizer decisions simultaneously, we use a single adoption model for hybrid seed participation conditional on household’s expectations of fertilizer allocation for each type of seed.

The empirical implementation is in the context of irregularly spaced panel data. We define the explanatory variable vector $X_{it} = (d_{it-1}, f_{it}^{*}, x_{it}^{*}, Z_{it})$, where $i = 1, 2, \ldots, N$ indexes households and $t = 1, 2, \ldots, T$ indexes years. Taking linear stochastic approximations for $S_{it}^{d}$ in the above conceptual model gives rise to the specification of the empirical model:

$$d_{it} \mid f_{it}^{*}, x_{it}^{*} = 1(X_{it}\beta + \varepsilon_{it} > 0) \tag{15}$$

Without loss of generality, the variances of the $\varepsilon$ errors can be normalized to 1, and the errors are assumed to be normally and independently distributed. The log-likelihood function for the seed adoption model is specified as:

$$\ln L = \sum_{d_{it}=1} \ln \Phi(X_{it}\beta) + \sum_{d_{it}=0} \ln\left[1 - \Phi(X_{it}\beta)\right] \tag{16}$$

where $\Phi(\cdot)$ is the cumulative distribution function of the standard normal distribution.

Four econometric issues complicate estimation of equation (15) with the panel data set used in this research. First, there may be omitted household characteristics (unobserved heterogeneity). Given the binary dependent variable, correlated-random-effects probit (CRE-P) would be the standard approach to addressing the unobserved heterogeneity. CRE-P restricts the correlation between the unobserved heterogeneity and other covariates (Mundlak 1978; Chamberlain 1985). Second, the inclusion of the lagged dependent variable (seed adoption decision) leads to the initial conditions problem (Wooldridge 2005).
Wooldridge (2005) suggested addressing this problem by using the value of the dependent variable in the first wave as the ‘initial condition’ correction factor. Following Wooldridge’s method and the CRE-P approach, we specify the auxiliary conditional distribution of the unobserved heterogeneity as a function of the first-wave value of the dependent variable and the time averages of the time-varying covariates, and apply the correlated-random-effects probit model. Third, incorporating fertilizer decisions in the seed adoption model would cause an endogeneity problem. To avoid this, the household’s fertilizer use expectations are modeled and incorporated in the model instead of actual fertilizer use. Fourth, we have an irregular spacing problem. Therefore, the CRE-P estimators of the seed adoption model will be corrected through the gap-dummy approach developed in Chapter 2.

3.4.1. Modeling Maize Price Expectations

The explanatory variable vector $X_{it}$ includes household expectations of maize selling price at the harvest time, conditional on information available at planting time. As discussed in Chapter 1, profitability and switching costs are the two main driving forces leading to household seed adoption decisions in each production period. Given that hybrid seed is generally more productive, maize selling price will influence the value of the premium attributable to the additional yield gain from hybrids over traditional seeds. Thus, one of the goals in this chapter is to evaluate how the household’s subjective maize selling price assessment affects its seed adoption decision. One thing worth noting from Table 3-3 is that not every Kenyan maize farmer is a net maize seller: some of them consume more than they produce. For net maize buyers, the relevant maize price is the buying price rather than the selling price.
In this case, adopting hybrids can reduce a net maize buyer’s cost of buying additional maize, which is influenced by the maize buying price. Therefore, to model household’s maize price expectations, we distinguish households by net seller and net buyer and estimate net sellers’ expectations of maize selling prices and net buyers’ expectations of maize buying prices.

To model household price expectations, let $p_{it}$ be the maize selling price (or buying price [36]) received by household i at the harvest time t and $P_t$ be a vector of relevant wholesale maize selling prices [37] observable at the planting time and useful for predicting harvest prices. Also, let $Z_{it}$ be the vector of household characteristics observed at planting time that can influence price expectations. Then the reduced form model for maize price $p_{it}$ used to predict price expectations for each household takes the panel form:

$$p_{it} = \alpha_i + P_t \beta + Z_{it} \gamma + \eta_{it} \tag{17}$$

where $\eta_{it} \sim N(0, \sigma_{it})$. Estimating equation (17) via correlated-random-effects regression to account for unobserved heterogeneity allows estimating price expectations $p_{it}^{e}$ conditional on available information at the planting time, as:

$$p_{it}^{e} = \alpha_i + P_t \beta + Z_{it} \gamma \tag{18}$$

[36] Since household’s maize buying price is not collected in the survey, we use district market prices as proxies of maize buying price.

[37] We conducted a robustness check to evaluate the optimal lag length for prices. The evaluated lags range from 5 to 12 and all results are robust. Thus, we choose to use 12 lagged prices, assuming farmers build price expectations based on the prices observed since the last planting time. The result of the robustness check is reported in Table 3A-1.

We separate the sample according to whether households are net sellers or net buyers and run two separate regressions to predict household’s expectations of maize selling (buying) prices. The estimation results of the fixed-effects regression are presented in Table 3A-2.
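The prediction step in equations (17) and (18) can be sketched as follows. This is only an illustration under stated assumptions, not the dissertation's estimation code: the household effect is approximated with a Mundlak-style device (household means of the time-varying characteristics added to a pooled OLS regression), and the function name `cre_price_expectations` and its arguments are hypothetical.

```python
import numpy as np

def cre_price_expectations(price, wholesale, hh_chars, hh_ids):
    """Return fitted prices from a pooled OLS regression augmented with
    household means of the characteristics (an assumed Mundlak-style
    approximation to the household effect in equation (17))."""
    price = np.asarray(price, dtype=float)
    wholesale = np.asarray(wholesale, dtype=float)
    hh_chars = np.asarray(hh_chars, dtype=float)
    hh_ids = np.asarray(hh_ids)

    # Map each observation to its household and average the
    # characteristics within household.
    _, inverse = np.unique(hh_ids, return_inverse=True)
    counts = np.bincount(inverse)
    means = np.vstack([
        np.bincount(inverse, weights=hh_chars[:, k]) / counts
        for k in range(hh_chars.shape[1])
    ]).T[inverse]

    # Design matrix: constant, wholesale prices, characteristics, means.
    design = np.column_stack([np.ones(len(price)), wholesale, hh_chars, means])
    coef, *_ = np.linalg.lstsq(design, price, rcond=None)

    # The expectation is the fitted value, i.e. the model without the error.
    return design @ coef
```

In the chapter's setting this would be run once on the net-seller subsample and once on the net-buyer subsample, with `wholesale` holding the lagged wholesale prices observed at planting time.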
Average maize selling and buying prices and price expectations are presented in Table 3-6.

3.4.2. Modeling Fertilizer Use and Fertilizer Cost Differential Expectations

According to the conceptual model, household’s fertilizer decisions are made differently for different seed technologies. Thus, we model household’s expectations of fertilizer use for both hybrid and traditional seeds conditional on information available at the planting time. Let $x_{it}^{H}$ and $x_{it}^{T}$ be the amount of fertilizer household i uses for hybrids and traditional seeds, respectively, at time t. Recall that superscript H (T) denotes hybrids (traditional seeds). The amount of fertilizer use $x_{it}^{H(T)} \geq 0$ has a corner at zero. The reduced form model for fertilizer use takes the form:

$$x_{it}^{H} = \max\left(x_{it-1}\delta^{H} + P_t \beta^{H} + Z_{it}\gamma^{H} + c_i + u_{it}^{H},\; 0\right) \tag{19a}$$

$$x_{it}^{T} = \max\left(x_{it-1}\delta^{T} + P_t \beta^{T} + Z_{it}\gamma^{T} + c_i + u_{it}^{T},\; 0\right) \tag{19b}$$

where $x_{it-1}$ is the amount of fertilizer used in the last period; $P_t$ is a vector of relevant price information available at the planting time [38]; $Z_{it}$ is the vector of household characteristics observed at the planting time that can influence fertilizer use decisions [39]; $c_i$ is household’s unobserved heterogeneity correlated with fertilizer use; $u_{it}^{H} \sim N(0, \sigma^{H})$ and $u_{it}^{T} \sim N(0, \sigma^{T})$. Notice that the parameters in equations (19a) and (19b) are different, indicating that the fertilizer use decisions may differ depending on seed choices.

[38] We define $P_t$ in a very general way in equations (19) and (20), but only relevant price variables will enter relevant equations. We could impose relevant exclusion restrictions in each equation.

[39] Last period hybrid use decision is incorporated to indicate which type of seed fertilizer was allocated to.

Given the corner-solution nature of fertilizer use and the existence of unobserved heterogeneity and a lagged dependent variable, correlated-random-effects tobit will be applied to estimate equations (19a) and (19b). We separate the sample according to whether hybrids or traditional seed is used and run two separate tobits on each subsample.
Household’s fertilizer use expectation for hybrid or traditional seed is given below and regression results are presented in Table 3A-3:

$$x_{it}^{*H} = \max\left(x_{it-1}\delta^{H} + P_t \beta^{H} + Z_{it}\gamma^{H} + c_i,\; 0\right) \tag{20a}$$

$$x_{it}^{*T} = \max\left(x_{it-1}\delta^{T} + P_t \beta^{T} + Z_{it}\gamma^{T} + c_i,\; 0\right) \tag{20b}$$

Because most households only plant one type of maize seed in each production period, we can only generate a household’s fertilizer use expectation for one type of seed. To predict the counterfactual fertilizer use expectation for the other type of seed, we employ propensity score matching (PSM) and nearest-neighbor matching (NNM) methods. The matching methods are implemented by first using a logit model to predict each household’s seed adoption decision given household demographic variables and farm characteristics. The probability of adopting hybrids (participation treatment) is used as the propensity score. The validity of PSM relies on the conditional independence assumption (CIA) and the overlap assumption. Only covariates that are either fixed or measured before participation are selected to ensure the CIA, and the overlap assumption test is passed (Caliendo and Kopeinig 2008).

Next we use NNM to generate weights and predict counterfactual fertilizer use expectations. We select the five nearest neighbors (closest in terms of propensity score) and generate weights based on the distance in propensity scores between the treated farmer and the neighbors. Counterfactual fertilizer use expectations are generated as the weighted sum of neighbors’ fertilizer use expectations. Average fertilizer use expectations for hybrids and traditional seeds are presented in Table 3-7. Table 3-7 shows that households expect to allocate more fertilizer to hybrids than to traditional seeds, and that the amount of expected fertilizer use increases over time.
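The nearest-neighbor step described above can be sketched as below. This is a minimal illustration rather than the author's implementation: the text says only that weights are "based on the distance of propensity scores", so inverse-distance weights are an assumption here, and the function name `counterfactual_fertilizer` and the small constant `eps` are hypothetical.

```python
import numpy as np

def counterfactual_fertilizer(ps_target, ps_donors, x_donors, k=5, eps=1e-8):
    """Weighted average of the k donors closest in propensity score.

    ps_target : propensity score of the household needing a counterfactual
    ps_donors : propensity scores of households observed with the other seed
    x_donors  : those households' fertilizer use expectations
    Weights are inverse distances (an assumed weighting scheme); eps avoids
    division by zero when a donor has an identical score.
    """
    ps_donors = np.asarray(ps_donors, dtype=float)
    x_donors = np.asarray(x_donors, dtype=float)

    dist = np.abs(ps_donors - ps_target)
    nearest = np.argsort(dist)[:k]          # indices of the k closest donors
    weights = 1.0 / (dist[nearest] + eps)
    weights /= weights.sum()                # normalize so weights sum to one
    return float(weights @ x_donors[nearest])
```

For a household observed with traditional seed, `ps_donors` and `x_donors` would come from hybrid users, and vice versa.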
To evaluate the effects of fertilizer use expectations on seed adoption decisions, we specify two more profitability factors, fertilizer cost differentials and yield differentials, and substitute fertilizer use expectations to generate fertilizer cost differential expectations and yield differential expectations, respectively. Specifically, the fertilizer cost differential expectation is defined as the difference between expected fertilizer use for hybrid seed and expected fertilizer use for traditional seed, multiplied by the fertilizer price. With this definition, we only evaluate the effect of the cost differential from fertilizer on seed adoption. Yield differential expectations are modeled in the next section.

3.4.3. Modeling Yield Differential Expectations

As shown in Chapter 1, profitability and switching costs are the two main drivers influencing the adoption of hybrids, and profitability is jointly determined by maize price and the yield differential between hybrids and traditional seeds. Therefore, the effect of yield differential expectations on seed adoption decisions is of great interest.

To model yield differential expectations, let $y_{it}^{H}$ and $y_{it}^{T}$ be the yield from hybrids and traditional seeds received by household i at time t. The yield response models for hybrid and traditional seeds take the form:

$$y_{it}^{H} = x_{it}^{H}\theta_1^{H} + (x_{it}^{H})^{2}\theta_2^{H} + Z_{it}\Theta^{H} + e_{it}^{H} \tag{21a}$$

$$y_{it}^{T} = x_{it}^{T}\theta_1^{T} + (x_{it}^{T})^{2}\theta_2^{T} + Z_{it}\Theta^{T} + e_{it}^{T} \tag{21b}$$

where $x_{it}^{H}$ and $x_{it}^{T}$ are the amount of fertilizer household i uses for hybrid and traditional seeds, respectively; $Z_{it}$ is a vector of relevant household characteristics that affect yield responses; and $e_{it}^{H} \sim N(0, \sigma_{ei}^{H})$ and $e_{it}^{T} \sim N(0, \sigma_{ei}^{T})$. Notice that the parameters in equations (21a) and (21b) are different, indicating that the yield responses may differ depending on seed choice.
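A quadratic yield response of the form in (21a) and (21b) can be fitted and evaluated at expected fertilizer use as sketched below. This is an illustrative sketch only: the function names `fit_yield_response` and `predict_yield` are hypothetical, and the characteristics matrix is assumed to contain a constant column, which the equations do not state explicitly.

```python
import numpy as np

def fit_yield_response(fert, chars, yields):
    """OLS fit of yield on fertilizer, fertilizer squared, and characteristics.
    Returns the stacked coefficients (theta_1, theta_2, Theta)."""
    fert = np.asarray(fert, dtype=float)
    design = np.column_stack([fert, fert ** 2, np.asarray(chars, dtype=float)])
    coef, *_ = np.linalg.lstsq(design, np.asarray(yields, dtype=float), rcond=None)
    return coef

def predict_yield(coef, fert, chars):
    """Evaluate the fitted response at a given fertilizer amount, for example
    the expected fertilizer use in place of observed use."""
    fert = np.asarray(fert, dtype=float)
    design = np.column_stack([fert, fert ** 2, np.asarray(chars, dtype=float)])
    return design @ coef
```

With one fit per seed subsample, a yield differential expectation would be `predict_yield(coef_h, x_star_h, chars) - predict_yield(coef_t, x_star_t, chars)`, where `coef_h`, `coef_t`, `x_star_h`, and `x_star_t` are placeholder names for the two coefficient vectors and the two fertilizer use expectations.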
As most households obtain yields for only one type of seed, we separate the sample based on whether hybrid or traditional seed is used and run separate OLS regressions of equations (21a) and (21b) on each subsample.40 Substituting households’ fertilizer use expectations for the endogenous fertilizer use in the estimated equations yields each household’s expectations for hybrid maize yield and traditional maize yield, $Y_{it}^{He}$ and $Y_{it}^{Te}$, respectively:

$$Y_{it}^{He} = F_{it}^{*H}\,\beta_{1}^{H} + \left(F_{it}^{*H}\right)^{2}\beta_{2}^{H} + X_{it}\,\Theta^{H} \tag{22a}$$

$$Y_{it}^{Te} = F_{it}^{*T}\,\beta_{1}^{T} + \left(F_{it}^{*T}\right)^{2}\beta_{2}^{T} + X_{it}\,\Theta^{T} \tag{22b}$$

and the yield differential expectation is given by $Y_{it}^{He} - Y_{it}^{Te}$. The results of the yield response regressions are presented in Table 3A-4, and the predicted yield differential expectations for each sample year are given in Table 3-8.

40 In this approach, we assume that the yield response is homogeneous among households. Thus, the yield expectation is mainly determined by the household’s expectation of fertilizer use.

3.5. Results

To study transient seed adoption, we apply a binary response model to investigate factors affecting households’ hybrid seed adoption decisions. For comparative purposes, we estimate three versions of the model: (1) a conventional static model, which does not allow for state dependence; (2) a dynamic discrete choice model ignoring irregular spacing (CRE-P); and (3) the CRE-P estimator with gap dummies to account for the irregular spacing. For each estimator we consider two specifications: (1) a basic specification, which incorporates basic household characteristics and infrastructure variables; and (2) an “expectations” specification, which also includes the expectation variables derived in Section 3.4. The initial conditions problem for the CRE-P estimators is addressed by including the first-period outcome as the correction factor. We use bootstrapping to estimate standard errors. The main coefficient estimates and average partial effects are presented in Table 3-9 and Table 3-10, respectively, and the full estimates are given in Table 3A-5.
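The bootstrap mentioned above can be illustrated with a short sketch. The text does not describe the resampling scheme, so resampling whole households with replacement (keeping each household's waves together) is an assumption, and `bootstrap_se` and the toy adoption histories are hypothetical.

```python
import random
import statistics


def bootstrap_se(households, statistic, n_reps=200, seed=0):
    """Bootstrap standard error of `statistic`, resampling whole households.

    households: list of per-household records (e.g. adoption histories).
    statistic: function mapping a list of households to a number.
    Resampling at the household level is an assumption for illustration.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    n = len(households)
    draws = []
    for _ in range(n_reps):
        sample = [households[rng.randrange(n)] for _ in range(n)]
        draws.append(statistic(sample))
    return statistics.stdev(draws)


def adoption_share(sample):
    """Share of household-wave observations in which hybrid seed is used."""
    flat = [y for history in sample for y in history]
    return sum(flat) / len(flat)


# Toy histories (1 = hybrid, 0 = traditional) for the 2004, 2007, 2010 waves.
toy = [(1, 1, 1), (0, 0, 0), (1, 0, 1), (0, 1, 1), (1, 1, 0)]
se = bootstrap_se(toy, adoption_share)
```

In practice the statistic would be a model coefficient or average partial effect re-estimated on each resample rather than a simple share.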
Because the profitability expectation variables are strongly statistically significant irrespective of the estimator used, we focus on results from the models that include them.

3.5.1. Determinants of Hybrid Adoption

The contemporaneous effect of an increase in maize price expectations on the probability of adopting hybrids is positive and significant at the 1% level under all estimators. This is consistent with the idea that a higher maize price enlarges the premium that hybrids earn over traditional seeds and in turn encourages the adoption of hybrids. The effects of yield differential expectations are also significantly positive at the 1% level in the static and dynamic models, indicating that the productivity advantage of hybrids is another key driver of the adoption process (the irregular spacing model is discussed in Section 3.5.3). The effects of the fertilizer cost differential are significantly negative, at the 1% level in the static and dynamic models, indicating that a higher expected input cost for hybrids discourages their adoption.

The effects of household head education, household size, and infrastructure variables are also estimated, and the results are broadly robust across the static and dynamic models. Specifically, the education level of the household head significantly increases the probability of adopting hybrids, implying that education would significantly improve the adoption of new technologies. The effect of the distance to the nearest motorable/tarmac road is significantly negative, indicating that transportation costs slow down the adoption of hybrids. The negative effect of household size indicates that larger households are less likely to adopt hybrids.
This is contrary to our expectation, as the increased availability of family labor should increase the probability of using the more labor-intensive production plan, hybrids.41

41 One potential explanation is that we did not account for labor supply in this regression due to data constraints; therefore, potential omitted variable bias contaminates the estimates of household size effects.

3.5.2. Static vs. Dynamic Estimation

An estimate of the state dependence of hybrid adoption is obtained from all four dynamic models. In most specifications, the estimates of the state dependence are around 0.4 and significantly different from zero at the 1% significance level.42 This indicates that households that adopted hybrids in the last period are more likely to keep using hybrids in the next period, and households that used traditional seeds in the last period are more likely to continue using traditional seeds. The estimated persistence of hybrid adoption demonstrates the existence of learning effects from using hybrids, switching costs between the two seed technologies, or both. With learning, households adopt hybrids if the current loss is less than the future gain from an additional trial of hybrids (Foster and Rosenzweig 1995). With switching costs, households are motivated to continue using the existing variety (whatever it is) because adjustment is costly (Dixit 1989).

Also, the estimates for the profitability factors, as well as for the other covariates that have significant effects on hybrid adoption decisions, are larger (in absolute value) in static models than in dynamic models. This is because static models assume that the hybrid adoption decision responds immediately to changes in relevant exogenous shocks, while dynamic models allow for adjustment costs and for current adoption decisions to be based not only on current economic conditions but also on expected future outcomes.
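To make the meaning of state dependence concrete, the sketch below evaluates a probit adoption probability with and without hybrid use in the previous period. It is a simplified illustration under stated assumptions: the index values are hypothetical, the correlated random effects terms are folded into the index, and the function names are not from the source. The coefficient 0.426 is the lagged-adoption estimate from the irregular spacing profitability model in Table 3-9.

```python
import math


def norm_cdf(z):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def adoption_prob(index_without_lag, rho, used_hybrid_last_period):
    """P(adopt hybrid) = Phi(index + rho * lagged adoption)."""
    return norm_cdf(index_without_lag + rho * used_hybrid_last_period)


def state_dependence_effect(index_values, rho):
    """Average change in adoption probability from switching the lag from 0 to 1."""
    diffs = [adoption_prob(v, rho, 1) - adoption_prob(v, rho, 0)
             for v in index_values]
    return sum(diffs) / len(diffs)


# Hypothetical index values for five households (covariates times coefficients).
hypothetical_index = [-1.0, -0.5, 0.0, 0.5, 1.0]
effect = state_dependence_effect(hypothetical_index, rho=0.426)
```

A positive `rho` raises the probability of repeating last period's choice for every household, which is the persistence the text attributes to learning effects and switching costs.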
The comparison between static and dynamic models illustrates the importance of using dynamic approaches to investigate the hybrid adoption process. First, the identification of state dependence in dynamic models implies the existence of learning effects and/or switching costs. This suggests policies should be designed to reduce or overcome the costs of switching from traditional varieties to hybrids in order to improve hybrid adoption. For example, institutional and physical infrastructure (transportation, distribution, contract design, etc.) could be improved to ensure sufficient and low-cost access to hybrids and to reduce new adopters’ costs of establishing relationships with hybrid vendors (Barrett 2008). Also, education and training services could be implemented to reduce new hybrid users’ costs of screening seed quality and learning to achieve optimal productivity. On the other hand, policies could be developed to increase the costs of switching back from hybrids to traditional varieties, to encourage new hybrid adopters to at least experiment with the new varieties for several periods.43 For example, encouragement and subsidies could be offered to farmers who keep using hybrids for at least some period.

42 The state dependence is only about 0.2, significant at the 10% level, in the dynamic profitability model; this is corrected later in the irregular spacing model.

Second, while the observed transient hybrid use suggests that hybrid adoption is clearly a dynamic process, research conducted in a static context tends to ignore the long-run effects and exaggerate the short-run effects of the determinants of hybrid adoption. For example, we generate short-run and long-run price elasticities of adoption from all three models and present them in Table 3-11. The estimated short-run price elasticity is about 0.34 in the static model, but is reduced to 0.21 in the dynamic model (accounting for irregular spacing), where the estimated long-run price elasticity is 0.36.
Similarly, in Table 3-10, the average partial effect of yield expectation on adoption is about 0.14 in the short run and about 0.24 in the long run. These results could explain why we do not appear to get the adoption response we expect from higher maize prices or incentive programs: the short-run effect of higher maize prices or yield advantages on hybrid adoption is overestimated because farmers need time to respond to changes in price or yield expectations.

43 Even though the costs of switching back may reduce non-adopters’ incentives to switch into the hybrid market in the first place, increasing switching-back costs could help keep current adopters in the hybrid market, and it will work more effectively when combined with decreasing switching-in costs.

3.5.3. Irregular Spacing Effects

In the irregular spacing models, we incorporate the gap dummy as well as its interactions with other covariates to account for irregular spacing effects. The estimate of the state dependence increases from 0.19 to 0.43 in the preferred specification (with profitability variables) when irregular spacing is accounted for. This result is consistent with the finding in Chapter 2 that ignoring irregular spacing underestimates the state dependence of the dynamic model because of the missing lagged periods. Hence, incorporating gap dummies appears to reduce the downward bias.

Properly estimating the state dependence of hybrid adoption enables the correct prediction of (expected) price elasticities of adoption and of the effects of other relevant adoption determinants in both the short run and the long run. From Table 3-11, the estimated short-run price elasticity is about 0.35 if irregular spacing is ignored and falls to 0.21 when the gap dummies are included.
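The link between short-run and long-run elasticities can be illustrated numerically. The text does not state how the long-run figures were computed; the sketch below assumes the standard partial-adjustment relation, long-run = short-run / (1 - state dependence), which happens to reproduce the values reported in Table 3-11 when the lagged-adoption coefficients from Table 3-9 are used. The function name is hypothetical.

```python
def long_run_elasticity(short_run, state_dependence):
    """Long-run effect implied by a short-run effect under partial adjustment.

    Assumes long_run = short_run / (1 - rho); this formula is an assumption,
    not one stated in the source.
    """
    if not 0.0 <= state_dependence < 1.0:
        raise ValueError("state dependence must lie in [0, 1)")
    return short_run / (1.0 - state_dependence)


# Short-run elasticities from Table 3-11, state dependence from Table 3-9.
dynamic_lr = long_run_elasticity(0.353, 0.194)    # dynamic model, ignoring spacing
irregular_lr = long_run_elasticity(0.206, 0.426)  # irregular spacing model
```

Under this relation, a larger state dependence estimate widens the gap between short-run and long-run responses, which is why correcting the downward bias in state dependence matters for both elasticities.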
The overestimated short-run price elasticity from ignoring irregular spacing implies that the importance of the maize price for hybrid adoption in the short run could be overstated when irregular spacing is not properly accounted for. Also, the estimated long-run price elasticity is 0.44 in the CRE-P model ignoring irregular spacing and 0.36 in the irregular spacing model, indicating that long-run effects could be overestimated as well.

The estimates of the effects of other covariates do not change much after the gap dummies are included, which is consistent with the finding in Chapter 2 that irregular spacing does not much affect the significance or magnitude of the estimated contemporaneous effects.44 One thing worth noting is that the significance and even the sign of the estimate for yield differential expectations change substantially after incorporating gap dummies. This is because the dummy we incorporate equals 1 if the period gap is four years and 0 otherwise, distinguishing two groups of data (2004, and 2007/2010).45 The coefficient on the yield differential therefore applies to the 2007 and 2010 waves, and the sum of the coefficients on the yield differential and on its interaction with the gap dummy is the estimate for the 2004 wave. The significant positive effect of the interaction term indicates that the effect of the yield differential on hybrid adoption is significantly positive in 2004 and not significant in 2007 and 2010. In the static and dynamic models, the estimates show that the effects of the yield differential are positive in general. We would be able to fully account for the irregular spacing effects and predict the contemporaneous effects of the yield differential if we had at least two consecutively observed periods in the panel. However, given the sample data in this research, the best we can do is to distinguish the effects by data spacing group.

44 We know the true parameterization of the model in Chapter 2, but we do not in Chapter 3. Therefore, in Chapter 3, we can only evaluate the effect of irregular spacing by comparing the estimates with and without accounting for irregular spacing. This comparison is not sufficient to prove that irregular spacing does not affect the estimates of contemporaneous effects here.

45 Given that our sample data are from 2000, 2004, 2007, and 2010, the period gap in 2004 is four years (2004-2000), and the period gap in 2007 and 2010 is three years (2007-2004 and 2010-2007).

3.6. Conclusion

This paper provides new information on transient seed technology use in Africa. We employ a dynamic conceptual model to explain transient use, and apply the model empirically to a four-wave panel data set to obtain quantitative estimates of the determinants of transient hybrid maize use in Kenya. Profitability effects are identified as important factors in the adoption process. Specifically, the effects of maize price expectations and yield differential expectations on hybrid adoption are significantly positive, and the effects of fertilizer cost differential expectations are significantly negative. The significantly positive effects of maize price expectations imply that Kenyan maize farmers perceive the increase (decrease) in the profitability of adopting hybrids that follows from a higher (lower) maize selling price, which encourages (discourages) hybrid adoption. In addition to price effects, the other factors that affect hybrid adoption include the perceived yield advantage of hybrids, the education of household heads, and distance to a motorable road.

More importantly for the contribution of this paper, the persistence of hybrid adoption is estimated in the dynamic specifications, and irregular spacing is corrected for using the gap dummy approach.
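The gap dummy construction and the interpretation of its interaction terms can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, the survey years are those listed in footnote 45, and the two coefficients are the yield differential and interaction estimates from the irregular spacing profitability column of Table 3-9.

```python
def gap_dummies(wave_years, irregular_gap=4):
    """Map each follow-up wave to 1 if its gap from the previous wave is irregular."""
    gaps = [later - earlier for earlier, later in zip(wave_years, wave_years[1:])]
    return {year: int(gap == irregular_gap)
            for year, gap in zip(wave_years[1:], gaps)}


def wave_effect(coef, interaction_coef, gap_dummy):
    """Wave-specific index effect of a covariate interacted with the gap dummy."""
    return coef + interaction_coef * gap_dummy


dummies = gap_dummies([2000, 2004, 2007, 2010])

# Yield differential (-0.591) and its gap-dummy interaction (1.040), Table 3-9.
effect_by_wave = {year: wave_effect(-0.591, 1.040, d) for year, d in dummies.items()}
```

Only the 2004 wave follows a four-year gap, so its yield differential effect is the sum of the two coefficients, while the 2007 and 2010 waves carry the base coefficient alone.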
The finding of statistically significant adoption persistence implies the existence of either switching costs or learning effects in the process of hybrid adoption, demonstrating the importance of using dynamic approaches to study transient technology use. These results are quite different from those obtained using static models or dynamic models that do not account for irregular spacing. Analysis conducted in a dynamic context with properly estimated persistence makes it possible to correctly distinguish the short-run and long-run effects of the adoption determinants, improving the richness of empirical insights.

Our findings provide empirical evidence that transient hybrid seed use in Kenya is determined by both the profitability of hybrids and adoption persistence (either switching costs or learning effects). On the one hand, fluctuations in maize selling prices and fertilizer prices reverse the profitability of using hybrids and lead households to switch back and forth between hybrid and traditional varieties. On the other hand, adoption persistence pushes farmers to stay with their current seed (whatever it is). Therefore, to expand adoption of modern inputs in Africa, policy could pay more attention to enhancing the expected profitability of adopting hybrids (such as stabilizing the maize selling price, encouraging hybrid productivity improvement research and training, and implementing fertilizer subsidies) and overcoming the costs of switching from traditional to new varieties (programs that reduce the costs of searching and establishing relationships with seed providers, screening seed quality, and learning about differences in recommended production practices).

Table 3-1 Possible transitions across hybrid/non-hybrid use

| Hybrid Use Transitions (2000 2004 2007 2010) | No. | Fraction of Sample (%) (N=1207 Households) |
|---|---|---|
| N N N N | 99 | 8.20 |
| N N N H | 70 | 5.80 |
| N N H H | 67 | 5.55 |
| N N H N | 21 | 1.74 |
| N H H H | 53 | 4.39 |
| N H H N | 9 | 0.75 |
| N H N H | 14 | 1.16 |
| N H N N | 10 | 0.83 |
| H H H H | 643 | 53.27 |
| H H H N | 13 | 1.08 |
| H H N N | 9 | 0.75 |
| H N N N | 34 | 2.82 |
| H H N H | 27 | 2.24 |
| H N H H | 79 | 6.55 |
| H N H N | 18 | 1.49 |
| H N N H | 41 | 3.40 |

Note: “H” denotes the use of hybrid seed and “N” denotes the use of non-hybrid seed.

Table 3-2 Proportion of households by adoption history category

| Category | No. of Households | Proportion of the Sample (%) |
|---|---|---|
| Total | 1207 | 100 |
| 1. Never Adopted | 99 | 8.20 |
| 2. Adopted at least once | 1108 | 91.80 |
| 2.1 Always Adopted | 643 | 53.27 |
| 2.2 Adopted and continued | 190 | 15.74 |
| 2.3 Adopted and then Disadopted | 96 | 7.95 |
| 2.4 Transient use (back and forth) | 179 | 14.83 |

Table 3-3 Maize production summary statistics by adoption pattern

| Variable | Year | Full sample | Never adopted | Always adopted | Transient use |
|---|---|---|---|---|---|
| # of household | | 1207 | 120 | 617 | 196 |
| net maize seller (share) | 2000 | 0.37 | 0.24 | 0.50 | 0.22 |
| | 2004 | 0.43 | 0.19 | 0.58 | 0.29 |
| | 2007 | 0.44 | 0.20 | 0.56 | 0.30 |
| | 2010 | 0.31 | 0.24 | 0.38 | 0.22 |
| | average | 0.39 | 0.22 | 0.51 | 0.26 |
| yield (kg/acre) | 2000 | 678 | 316 | 960 | 432 |
| | 2004 | 698 | 293 | 963 | 495 |
| | 2007 | 784 | 376 | 1030 | 547 |
| | 2010 | 622 | 349 | 766 | 458 |
| | average | 695 | 333 | 929 | 483 |
| maize selling price (ksh/kg) | 2000 | $1,064 | $939 | $1,107 | $933 |
| | 2004 | $1,132 | $886 | $1,160 | $1,071 |
| | 2007 | $1,054 | $1,008 | $1,065 | $1,096 |
| | 2010 | $1,824 | $1,747 | $1,865 | $1,909 |
| | average | $1,234 | $1,169 | $1,254 | $1,232 |
| fertilizer use (share) | 2000 | 0.58 | 0.11 | 0.82 | 0.48 |
| | 2004 | 0.61 | 0.18 | 0.82 | 0.47 |
| | 2007 | 0.65 | 0.25 | 0.85 | 0.49 |
| | 2010 | 0.66 | 0.24 | 0.84 | 0.53 |
| | average | 0.63 | 0.20 | 0.83 | 0.49 |
| fertilizer use (kg/acre) | 2000 | 63.39 | 29.28 | 72.41 | 41.14 |
| | 2004 | 60.78 | 14.69 | 70.86 | 45.94 |
| | 2007 | 59.84 | 21.88 | 71.71 | 40.09 |
| | 2010 | 65.74 | 30.61 | 75.29 | 55.96 |
| | average | 62.44 | 23.91 | 72.58 | 45.99 |

Table 3-4 Income information by adoption pattern

| Variable | Year | Full sample | Never adopted | Always adopted | Transient use |
|---|---|---|---|---|---|
| # of household | | 1207 | 120 | 617 | 196 |
| total income (ksh) | 2000 | $151,344 | $85,828 | $191,610 | $118,798 |
| | 2004 | $167,586 | $101,137 | $208,385 | $142,385 |
| | 2007 | $198,740 | $120,747 | $248,467 | $162,427 |
| | 2010 | $281,897 | $143,123 | $363,756 | $223,128 |
| | average | $199,892 | $112,709 | $253,055 | $161,685 |
| crop income (ksh) | 2000 | $77,488 | $48,345 | $101,455 | $53,840 |
| | 2004 | $66,470 | $30,582 | $90,810 | $46,953 |
| | 2007 | $72,341 | $43,723 | $90,909 | $52,194 |
| | 2010 | $106,644 | $45,390 | $145,617 | $70,571 |
| | average | $80,735 | $42,010 | $107,198 | $55,890 |
| livestock income (ksh) | 2000 | $19,212 | $4,796 | $27,935 | $11,971 |
| | 2004 | $27,582 | $12,838 | $39,665 | $18,733 |
| | 2007 | $22,922 | $7,132 | $34,534 | $11,631 |
| | 2010 | $31,216 | $7,577 | $46,567 | $19,976 |
| | average | $25,233 | $8,086 | $37,175 | $15,578 |
| off farm income (ksh) | 2000 | $54,645 | $32,687 | $62,221 | $52,988 |
| | 2004 | $73,534 | $57,717 | $77,910 | $76,698 |
| | 2007 | $78,308 | $58,146 | $84,532 | $85,582 |
| | 2010 | $113,010 | $86,119 | $124,219 | $113,246 |
| | average | $79,874 | $58,667 | $87,220 | $82,129 |
| asset (ksh) | 2000 | $139,327 | $110,900 | $177,308 | $97,060 |
| | 2004 | $170,416 | $122,252 | $224,053 | $103,551 |
| | 2007 | $220,716 | $160,808 | $265,821 | $177,714 |
| | 2010 | $273,673 | $147,039 | $347,158 | $208,388 |
| | average | $201,033 | $135,250 | $253,585 | $146,678 |

Table 3-5 Market infrastructure statistics by adoption pattern

| Variable | Year | Full sample | Never adopted | Always adopted | Transient use |
|---|---|---|---|---|---|
| # of household | | 1207 | 120 | 617 | 196 |
| distance to fertilizer seller (km) | 2000 | 5.46 | 12.23 | 3.38 | 6.02 |
| | 2004 | 3.60 | 6.59 | 2.61 | 4.11 |
| | 2007 | 3.09 | 4.43 | 2.91 | 2.65 |
| | 2010 | 3.67 | 3.93 | 3.77 | 3.38 |
| | average | 3.96 | 6.81 | 3.17 | 4.04 |
| distance to hybrid seller (km) | 2000 | n/a | n/a | n/a | n/a |
| | 2004 | 3.33 | 6.02 | 2.57 | 3.38 |
| | 2007 | 3.14 | 3.80 | 3.06 | 2.87 |
| | 2010 | 4.25 | 4.80 | 4.16 | 4.02 |
| | average | 3.57 | 4.87 | 3.26 | 3.42 |
| distance to a motorable road (km) | 2000 | 1.28 | 1.29 | 1.38 | 1.15 |
| | 2004 | 1.05 | 1.47 | 0.80 | 1.24 |
| | 2007 | 0.52 | 0.53 | 0.51 | 0.56 |
| | 2010 | 0.43 | 0.60 | 0.33 | 0.57 |
| | average | 0.82 | 0.97 | 0.75 | 0.88 |
| distance to a tarmac road (km) | 2000 | 7.58 | 11.58 | 6.79 | 6.53 |
| | 2004 | 7.52 | 11.38 | 6.65 | 6.71 |
| | 2007 | 7.56 | 11.39 | 6.64 | 6.79 |
| | 2010 | 7.04 | 9.51 | 6.21 | 6.60 |
| | average | 7.43 | 10.97 | 6.57 | 6.66 |

Table 3-6 Household’s expectations of maize selling and buying prices

| | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| Selling price | 1,871 | 1295 | 412 | 500 | 4000 |
| Selling price expectation | 1,871 | 1291 | 306 | 870 | 2198 |
| Buying price | 2,957 | 1573 | 357 | 1000 | 2222 |
| Buying price expectation | 2,957 | 1573 | 337 | 1001 | 2313 |

Table 3-7 Household’s expectations of fertilizer use for hybrid and traditional seeds

| | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| Traditional 2004 | 1,207 | 15.46 | 11.24 | 0.00 | 108.17 |
| Hybrid 2004 | 1,207 | 38.59 | 22.84 | 0.14 | 150.70 |
| Traditional 2007 | 1,207 | 17.01 | 11.70 | 0.00 | 95.13 |
| Hybrid 2007 | 1,207 | 39.39 | 24.12 | 0.11 | 114.82 |
| Traditional 2010 | 1,207 | 20.32 | 12.79 | 0.00 | 117.03 |
| Hybrid 2010 | 1,207 | 42.21 | 24.66 | 0.14 | 118.94 |

Table 3-8 Household’s yield differential expectations between hybrids and traditional seeds

| | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| 2004 | 1,207 | 245 | 283 | -706 | 2377 |
| 2007 | 1,207 | 354 | 259 | -337 | 1292 |
| 2010 | 1,207 | 260 | 205 | -555 | 1544 |

Table 3-9 Main estimates of hybrid adoption models (coefficients)

| Variable | Static basic | Static profit | Dynamic basic | Dynamic profit | Irregular Spacing basic | Irregular Spacing profit |
|---|---|---|---|---|---|---|
| Used hybrid previous period | | | 0.400*** (0.114) | 0.194 (0.150) | 0.471*** (0.137) | 0.426*** (0.151) |
| Profitability factors | | | | | | |
| Price expectation | | 0.887*** (0.098) | | 0.833*** (0.093) | | 0.455*** (0.122) |
| Fertilizer cost differential | | -0.291*** (0.078) | | -0.280*** (0.101) | | -0.211** (0.092) |
| Yield differential | | 0.820*** (0.287) | | 0.768*** (0.293) | | -0.591 (0.402) |
| Yield differential*G-Dummy | | | | | | 1.040* (0.619) |
| Household characteristics | | | | | | |
| Age of head in years | 0.021*** (0.008) | 0.008 (0.006) | 0.018*** (0.007) | 0.007 (0.007) | 0.005 (0.007) | -0.005 (0.008) |
| Education of head | 0.066*** (0.018) | 0.053*** (0.016) | 0.062*** (0.016) | 0.050*** (0.015) | 0.050*** (0.019) | 0.038** (0.019) |
| Landholding size | -0.002 (0.018) | 0.023 (0.018) | 0.002 (0.016) | 0.025 (0.019) | 0.017 (0.020) | 0.039* (0.023) |
| Real value of assets | 0.037* (0.021) | 0.023 (0.019) | 0.039** (0.018) | 0.026 (0.022) | 0.013 (0.015) | 0.003 (0.022) |
| Household size | -0.067*** (0.025) | -0.051*** (0.019) | -0.054** (0.023) | -0.044** (0.019) | -0.031* (0.017) | -0.007 (0.021) |
| Transportation costs | | | | | | |
| Distance to fertilizer seller | -0.016 (0.010) | -0.018 (0.013) | -0.017 (0.011) | -0.017 (0.014) | 0.006 (0.021) | -0.007 (0.019) |
| Distance to hybrid seller | 0.006 (0.010) | -0.002 (0.011) | 0.005 (0.012) | -0.003 (0.013) | 0.006 (0.016) | 0.005 (0.015) |
| Distance to motorable road | -0.145*** (0.034) | -0.104*** (0.033) | -0.138*** (0.031) | -0.109*** (0.036) | -0.036 (0.057) | 0.037 (0.053) |
| Distance to tarmac road | -0.016** (0.007) | -0.012** (0.006) | -0.008* (0.005) | -0.006 (0.006) | -0.012** (0.006) | -0.003 (0.007) |
| Constant | -0.977** (0.420) | 3.176** (1.299) | -0.996*** (0.348) | 1.726* (1.016) | -1.153*** (0.323) | 1.525** (0.758) |
| Observations | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 |
| Number of hhid | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 |
| chi2 | 6.385 | 129.5 | 128.3 | 18.00 | 5.821 | 6.107 |

Note1: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Note2: G-Dummy stands for gap dummy.

Table 3-10 Main estimates of hybrid adoption models (average partial effects)

| Variable | Static basic | Static profit | Dynamic basic | Dynamic profit | Irregular Spacing basic | Irregular Spacing profit |
|---|---|---|---|---|---|---|
| Used hybrid previous period | | | 0.079*** (0.024) | 0.035 (0.028) | 0.090*** (0.028) | 0.077*** (0.030) |
| Profitability factors | | | | | | |
| Price expectation | | 0.154*** (0.016) | | 0.148*** (0.017) | | 0.082*** (0.021) |
| Fertilizer cost differential | | -0.051*** (0.014) | | -0.050*** (0.018) | | -0.038** (0.017) |
| Yield differential | | 0.143*** (0.050) | | 0.137** (0.053) | | -0.106 (0.071) |
| Yield differential*G-Dummy | | | | | | 0.187* (0.111) |
| Household characteristics | | | | | | |
| Age of head in years | 0.004*** (0.002) | 0.001 (0.001) | 0.004*** (0.001) | 0.001 (0.001) | 0.001 (0.001) | -0.001 (0.001) |
| Education of head | 0.012*** (0.003) | 0.009*** (0.003) | 0.012*** (0.003) | 0.009*** (0.003) | 0.010*** (0.004) | 0.007** (0.003) |
| Landholding size | -0.000 (0.003) | 0.004 (0.003) | 0.000 (0.003) | 0.004 (0.003) | 0.003 (0.004) | 0.007* (0.004) |
| Real value of assets | 0.007* (0.004) | 0.004 (0.003) | 0.008** (0.003) | 0.005 (0.004) | 0.003 (0.003) | 0.001 (0.004) |
| Household size | -0.012*** (0.004) | -0.009*** (0.003) | -0.011** (0.004) | -0.008** (0.003) | -0.006* (0.003) | -0.001 (0.004) |
| Transportation costs | | | | | | |
| Distance to fertilizer seller | -0.003 (0.002) | -0.003 (0.002) | -0.003 (0.002) | -0.003 (0.003) | 0.001 (0.004) | -0.001 (0.003) |
| Distance to hybrid seller | 0.001 (0.002) | -0.000 (0.002) | 0.001 (0.002) | -0.000 (0.002) | 0.001 (0.003) | 0.001 (0.003) |
| Distance to motorable road | -0.027*** (0.007) | -0.018*** (0.006) | -0.027*** (0.006) | -0.019*** (0.006) | -0.007 (0.011) | 0.007 (0.009) |
| Distance to tarmac road | -0.003** (0.001) | -0.002** (0.001) | -0.002* (0.001) | -0.001 (0.001) | -0.002** (0.001) | -0.001 (0.001) |
| Observations | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 |
| Number of hhid | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 |

Note1: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1
Note2: G-Dummy stands for gap dummy.

Table 3-11 Predicted price elasticities of hybrid adoption

| Model | Short-run Elasticities | Standard Error | Long-run Elasticities | Standard Error |
|---|---|---|---|---|
| Static | 0.335 | 0.041 | NA | NA |
| Dynamic | 0.353 | 0.045 | 0.438 | 0.070 |
| Irregular spacing | 0.206 | 0.062 | 0.360 | 0.187 |

APPENDIX

Table 3A-1 Robustness check for modeling price expectation with different lag length

| # of price lags | 12 | 11 | 10 | 9 | 8 | 7 | 6 | 5 |
|---|---|---|---|---|---|---|---|---|
| Used hybrid previous period | 0.194* (0.115) | 0.197* (0.115) | 0.195* (0.115) | 0.199* (0.115) | 0.198* (0.115) | 0.198* (0.115) | 0.203* (0.114) | 0.204* (0.114) |
| Used hybrid in 2000 | 0.855*** (0.120) | 0.854*** (0.120) | 0.858*** (0.121) | 0.859*** (0.121) | 0.856*** (0.121) | 0.856*** (0.121) | 0.851*** (0.120) | 0.850*** (0.120) |
| Fertilizer cost differential | -0.280*** (0.094) | -0.279*** (0.094) | -0.286*** (0.094) | -0.288*** (0.095) | -0.292*** (0.095) | -0.293*** (0.095) | -0.300*** (0.095) | -0.298*** (0.095) |
| Price expectation | 0.833*** (0.108) | 0.827*** (0.107) | 0.858*** (0.108) | 0.897*** (0.109) | 0.898*** (0.110) | 0.900*** (0.111) | 0.907*** (0.112) | 0.902*** (0.112) |
| Yield differential | 0.768*** (0.257) | 0.770*** (0.256) | 0.782*** (0.257) | 0.770*** (0.256) | 0.766*** (0.256) | 0.762*** (0.257) | 0.750*** (0.256) | 0.750*** (0.255) |
| Age of head in years | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) | 0.007 (0.007) |
| Education of head | 0.050*** (0.017) | 0.050*** (0.017) | 0.049*** (0.017) | 0.049*** (0.017) | 0.048*** (0.017) | 0.048*** (0.017) | 0.048*** (0.017) | 0.048*** (0.017) |
| Distance to fertilizer seller | -0.017 (0.011) | -0.017 (0.011) | -0.018 (0.011) | -0.019* (0.011) | -0.019* (0.011) | -0.019* (0.011) | -0.019* (0.011) | -0.019* (0.011) |
| Distance to hybrid seller | -0.003 (0.011) | -0.003 (0.011) | -0.004 (0.011) | -0.003 (0.011) | -0.003 (0.011) | -0.003 (0.011) | -0.003 (0.011) | -0.003 (0.011) |
| Distance to motorable road | -0.109*** (0.031) | -0.109*** (0.031) | -0.108*** (0.031) | -0.107*** (0.031) | -0.107*** (0.031) | -0.106*** (0.031) | -0.105*** (0.031) | -0.105*** (0.031) |
| Distance to tarmac road | -0.006 (0.006) | -0.006 (0.006) | -0.006 (0.006) | -0.005 (0.006) | -0.005 (0.006) | -0.005 (0.006) | -0.005 (0.006) | -0.005 (0.006) |
| Zone dummy one | 0.225 (0.218) | 0.231 (0.218) | 0.235 (0.219) | 0.246 (0.220) | 0.234 (0.220) | 0.237 (0.220) | 0.219 (0.219) | 0.221 (0.218) |
| Zone dummy two | 0.161 (0.229) | 0.165 (0.229) | 0.165 (0.230) | 0.161 (0.231) | 0.159 (0.231) | 0.159 (0.231) | 0.148 (0.229) | 0.148 (0.228) |
| Zone dummy three | 1.307*** (0.327) | 1.315*** (0.327) | 1.326*** (0.328) | 1.320*** (0.329) | 1.305*** (0.328) | 1.306*** (0.328) | 1.292*** (0.327) | 1.294*** (0.326) |
| Zone dummy four | 2.246*** (0.441) | 2.249*** (0.440) | 2.281*** (0.442) | 2.294*** (0.444) | 2.279*** (0.443) | 2.281*** (0.444) | 2.269*** (0.441) | 2.271*** (0.441) |
| Zone dummy five | 0.810*** (0.290) | 0.816*** (0.290) | 0.813*** (0.291) | 0.807*** (0.292) | 0.800*** (0.291) | 0.800*** (0.292) | 0.784*** (0.290) | 0.784*** (0.290) |
| Zone dummy six | 1.013*** (0.278) | 1.016*** (0.278) | 1.029*** (0.279) | 1.041*** (0.279) | 1.032*** (0.279) | 1.034*** (0.279) | 1.019*** (0.278) | 1.021*** (0.278) |
| Zone dummy seven | 1.010*** (0.332) | 1.018*** (0.332) | 1.056*** (0.333) | 1.088*** (0.337) | 1.059*** (0.334) | 1.065*** (0.334) | 1.046*** (0.334) | 1.050*** (0.332) |
| Landholding size | 0.025 (0.017) | 0.025 (0.017) | 0.025 (0.017) | 0.026 (0.017) | 0.027 (0.017) | 0.027 (0.017) | 0.029* (0.017) | 0.028 (0.017) |
| Real value of assets | 0.026 (0.019) | 0.026 (0.019) | 0.026 (0.019) | 0.024 (0.019) | 0.025 (0.019) | 0.025 (0.019) | 0.025 (0.019) | 0.026 (0.019) |
| Household size | -0.044** (0.021) | -0.044** (0.021) | -0.044** (0.022) | -0.043** (0.022) | -0.043** (0.022) | -0.043** (0.022) | -0.043** (0.022) | -0.043** (0.022) |
| Time-average of head age | -0.016** (0.008) | -0.016** (0.008) | -0.016** (0.008) | -0.016** (0.008) | -0.016** (0.008) | -0.016** (0.008) | -0.015* (0.008) | -0.015** (0.008) |
| Time-average of head education | -0.019 (0.020) | -0.019 (0.020) | -0.018 (0.020) | -0.017 (0.020) | -0.017 (0.020) | -0.017 (0.020) | -0.017 (0.020) | -0.017 (0.020) |
| Time-average of household size | 0.082*** (0.029) | 0.082*** (0.029) | 0.083*** (0.029) | 0.082*** (0.029) | 0.082*** (0.029) | 0.082*** (0.029) | 0.082*** (0.029) | 0.082*** (0.029) |
| Time-average of distance to fertilizer seller | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) | -0.023 (0.021) |
| Time-average of distance to hybrid seller | -0.011 (0.020) | -0.010 (0.020) | -0.010 (0.020) | -0.011 (0.020) | -0.011 (0.020) | -0.011 (0.020) | -0.010 (0.020) | -0.011 (0.020) |
| Time-average of distance to motorable road | 0.190*** (0.066) | 0.190*** (0.065) | 0.192*** (0.066) | 0.191*** (0.066) | 0.192*** (0.066) | 0.192*** (0.066) | 0.192*** (0.066) | 0.191*** (0.066) |
| Time-average of landholding size | 0.080*** (0.026) | 0.081*** (0.026) | 0.081*** (0.026) | 0.080*** (0.026) | 0.080*** (0.026) | 0.080*** (0.026) | 0.078*** (0.026) | 0.078*** (0.026) |
| Time-average of real value of assets | -0.027 (0.026) | -0.027 (0.026) | -0.027 (0.026) | -0.025 (0.026) | -0.025 (0.026) | -0.025 (0.026) | -0.026 (0.026) | -0.026 (0.026) |
| Time-average of fertilizer cost differential expectation | 0.965*** (0.176) | 0.962*** (0.175) | 0.977*** (0.175) | 0.973*** (0.176) | 0.981*** (0.176) | 0.983*** (0.176) | 0.991*** (0.175) | 0.988*** (0.175) |
| Time-average of maize price expectation | -2.644*** (0.541) | -2.603*** (0.538) | -2.599*** (0.536) | -2.633*** (0.544) | -2.737*** (0.540) | -2.722*** (0.540) | -2.754*** (0.544) | -2.724*** (0.541) |
| Time-average of yield differential expectation | -3.381*** (0.879) | -3.369*** (0.875) | -3.419*** (0.877) | -3.397*** (0.883) | -3.422*** (0.883) | -3.414*** (0.882) | -3.420*** (0.878) | -3.410*** (0.876) |
| Constant | 1.726* (0.929) | 1.665* (0.924) | 1.596* (0.923) | 1.574* (0.935) | 1.746* (0.928) | 1.718* (0.929) | 1.765* (0.931) | 1.725* (0.924) |
| Observations | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 | 3,621 |
| Number of hhid | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 | 1,207 |
| chi2 | 608.2 | 606.3 | 607.0 | 605.5 | 605.7 | 605.6 | 610.3 | 611.0 |

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 3A-2 Estimates of Maize Price Expectations

| Variable | Net seller | Net buyer |
|---|---|---|
| District price lagged one month | 0.196 (0.147) | 1.134*** (0.043) |
| District price lagged two months | -0.103 (0.217) | -1.347*** (0.053) |
| District price lagged three months | 0.377** (0.183) | 0.857*** (0.051) |
| District price lagged four months | -0.820*** (0.220) | -0.016 (0.054) |
| District price lagged five months | 1.449*** (0.496) | 0.482*** (0.181) |
| District price lagged six months | -2.140*** (0.432) | -1.846*** (0.091) |
| District price lagged seven months | 1.585*** (0.368) | 0.213** (0.091) |
| District price lagged eight months | -0.660 (0.440) | 1.075*** (0.181) |
| District price lagged nine months | 0.074 (0.322) | -0.884*** (0.126) |
| District price lagged ten months | 0.139 (0.378) | 0.438*** (0.155) |
| District price lagged eleven months | 0.282 (0.224) | 1.048*** (0.074) |
| District price lagged twelve months | 0.165 (0.191) | -0.388*** (0.080) |
| Age of head in years | -0.834 (1.753) | -0.858** (0.389) |
| Education of head | 2.011 (1.866) | -0.177*** (0.047) |
| Landholding size | 0.075 (0.554) | 0.103 (1.114) |
| Real value of assets | -3.905 (3.050) | -0.193 (1.119) |
| Household size | -0.992 (4.029) | 1.981 (1.253) |
| Distance to fertilizer seller | -1.752 (1.664) | 0.492 (0.597) |
| Distance to motorable road | 0.540 (3.639) | 4.922** (2.164) |
| Distance to tarmac road | 4.586** (2.065) | -1.983** (0.784) |
| Zone dummy one | -66.183 (139.103) | -176.602*** (21.507) |
| Zone dummy two | -25.212 (137.802) | -176.553*** (30.429) |
| Zone dummy three | -127.160 (137.302) | -364.960*** (29.566) |
| Zone dummy four | -269.767** (135.128) | -363.047*** (25.463) |
| Zone dummy five | 8.265 (142.973) | -224.023*** (29.877) |
| Zone dummy six | -103.628 (138.235) | -317.653*** (22.265) |
| Zone dummy seven | -86.537 (186.688) | -376.857*** (37.188) |
| Time-average of district price lagged one month | -2.669*** (0.808) | -3.419*** (0.253) |
| Time-average of district price lagged two months | 3.377* (1.809) | 6.167*** (0.695) |
| Time-average of district price lagged three months | -3.202*** (0.593) | -3.073*** (0.256) |
| Time-average of district price lagged four months | 0.466 (0.984) | -1.033*** (0.358) |
| Time-average of district price lagged five months | 1.307* (0.732) | 2.576*** (0.210) |
| Time-average of head age | 0.326 (1.920) | 0.077 (0.444) |
| Time-average of head education | 0.097 (0.761) | 0.313 (0.236) |
| Time-average of landholding size | 3.599** (1.400) | 3.991*** (1.398) |
| Time-average of real value of assets | 6.248* (3.244) | -0.741 (1.284) |
| Time-average of household size | 3.149 (5.503) | -0.714 (1.623) |
| Time-average of distance to fertilizer seller | 0.965 (3.433) | -7.058*** (1.313) |
| Time-average of distance to motorable road | -24.604** (10.336) | 3.540 (3.855) |
| Time-average of distance to tarmac road | -4.652* (2.499) | 2.123** (0.863) |
| Constant | 1,868.895 (1,483.041) | -1,393.941* (778.477) |
| Observations | 1,871 | 2,957 |
| Adjusted R2 | 0.636 | 0.915 |
| Number of hhid | 822 | 1,069 |

Note: Robust standard errors in parentheses.
*** p<0.01, ** p<0.05, * p<0.1

Table 3A-3 Estimates of fertilizer use expectations

                                                    Fertilizer for traditional seed   Fertilizer for hybrids
Used fertilizer previous period                     0.033 (0.037)                     0.016 (0.015)
Fertilizer price                                    0.651** (0.319)                   0.379* (0.227)
Maize price expectation                             -0.010 (0.009)                    -0.001 (0.006)
Net seller dummy (=1)                               13.028*** (4.617)                 5.580* (3.177)
Used hybrid previous period                         3.391 (3.963)                     9.122** (4.519)
Age of head in years                                0.151 (0.292)                     -0.147 (0.247)
Education of head                                   0.338 (0.770)                     0.998*** (0.310)
Household size                                      0.490 (0.956)                     1.703** (0.746)
Zone dummy one                                      21.028* (12.558)                  57.794*** (13.597)
Zone dummy two                                      -19.409* (10.208)                 0.701 (14.004)
Zone dummy three                                    23.591* (14.221)                  88.285*** (13.735)
Zone dummy four                                     53.834*** (14.428)                120.125*** (15.108)
Zone dummy five                                     28.855** (12.430)                 98.397*** (13.656)
Zone dummy six                                      36.892*** (12.691)                84.029*** (14.298)
Zone dummy seven                                    30.231 (20.422)                   19.606 (20.993)
Distance to fertilizer seller                       0.887 (0.690)                     0.455 (0.563)
Distance to hybrid seller                           -0.199 (0.527)                    -0.120 (0.496)
Distance to motorable road                          -0.346 (1.364)                    -0.938 (1.357)
Distance to tarmac road                             -1.348*** (0.340)                 -0.835*** (0.244)
Landholding size                                    -2.471*** (0.954)                 0.061 (0.360)
Real value of assets                                0.000 (0.000)                     -0.000 (0.000)
Fertilizer use previous period                      0.120*** (0.041)                  0.033*** (0.011)
Time-average of fertilizer price                    7.684*** (2.812)                  6.853*** (1.348)
Time-average of maize price expectation             0.054 (0.045)                     0.117*** (0.035)
Time-average of net seller dummy                    20.658* (12.311)                  29.323*** (8.870)
Time-average of used hybrid previous                20.222** (8.321)                  12.266 (8.308)
Time-average of head age                            -0.339 (0.350)                    0.334 (0.274)
Time-average of head education                      1.025 (0.971)                     0.251 (0.164)
Time-average of Household size                      -0.505 (1.286)                    -1.302 (0.964)
Time-average of distance to fertilizer seller       -4.587*** (1.385)                 1.048 (0.918)
Time-average of distance to hybrid seller           1.487 (1.120)                     -1.421 (0.897)
Time-average of distance to motorable road          2.783 (3.290)                     -6.055*** (2.120)
Time-average of landholding size                    1.388 (1.490)                     -0.693 (0.446)
Time-average of real value of assets                -0.270 (1.209)                    0.914* (0.550)
Constant                                            -393.134*** (124.677)             -522.214*** (91.065)
Observations                                        1,044                             2,577
Number of household                                 549                               1,047
Chi2                                                317                               1954

Note: Robust standard errors in parentheses. *** p<0.01, ** p<0.05, * p<0.1

Table 3A-4 Estimates of yield response model Fertilizer use Fertilizer squared Landholding size Age of head in years Education of head Household size Distance to fertilizer seller Distance to hybrid seller Distance to motorable road Distance to tarmac road Landholding size Real value of assets Zone dummy one Zone dummy two Zone dummy three Trad 2004 9.212*** (2.608) -0.041*** (0.014) -26.613** (11.999) -1.646 (1.229) -1.223 (3.057) 1.957 (5.297) 1.747 (2.081) 6.829** (3.050) -10.253 (11.496) -1.525 (1.591) -5.465 (8.410) 17.519** (6.841) 105.573 (67.867) -14.044 (49.839) Hybr 2004 3.918*** (0.758) -0.007*** (0.002) -7.625 (7.722) 1.855 (1.672) 4.844 (4.031) 3.565 (8.551) 9.374 (7.960) 2.219 (8.601) 0.687 (15.808) -2.336 (3.298) 8.611* (5.022) 10.180** (4.595) -22.917 (74.024) 218.407*** 423.503*** (63.521) (72.168) 4.507 (2.834) 0.001 (0.024) Trad 2007 Hybr 2007 3.052*** (0.850) 0.006 (0.004) -7.811 (14.349) -0.552 (1.430) 8.729* (4.938) 11.937 (8.309) 9.132 (9.749) -10.252 (8.384) -44.457*** (15.417) 0.022 (1.384) 7.216 (4.397) -0.145 (5.169) 6.981 (5.345) -7.942 (6.403) 3.168 (16.873) 0.248 (1.974) 21.659 (13.430) 7.796 (7.215) -70.292 (68.864) -22.412 (63.187) 41.289 (88.382) 65.323*** (20.725) 0.958 (3.376) 2.333 (9.364) -1.541 (4.461) 160.765* (84.924) 367.192*** (66.476) 473.711*** (63.418) Trad 2010 8.477*** (2.708) -0.028 (0.023) -23.023 (18.209) 2.394* (1.282) 8.037 (4.997) -3.975 (4.541) -4.757* (2.686) 0.443 (1.869) -29.400** (11.393) 2.062 (1.669) 8.948 (15.932) -4.715 (4.569) 48.943 (70.551) 46.876 (53.721) 95.962 (92.238) Hybr 2010 1.955** (0.985) 0.007 (0.006) 3.869 (19.664) -1.101 (1.307) 10.012** (4.557)
10.756* (6.422) -7.224 (6.052) -0.765 (4.213) 17.322 (23.105) 12.102*** (2.436) 8.828 (9.072) -1.446 (4.622) 272.200*** (65.955) 192.077*** (61.322) 486.293*** (75.951) Table 3A-4 (cont’d) Zone dummy four Zone dummy five 505.481*** 767.974*** (83.883) 100.131 (70.972) (64.912) 196.002*** (60.701) Zone dummy six 463.785*** 366.619*** Zone dummy seven Constant (89.413) 148.862* (86.184) 306.174*** (69.747) 119.381* (66.177) 43.119 (146.759) Observations Adjusted R2 Note: Trad 2004 is maize yield with traditional in 2004, Hybr 2004 is maize yield with hybrids in 2004 347 0.134 758 0.301 (95.853) 449 0.333 715.882*** -226.600*** 272.517*** -21.152 (94.252) 22.470 (84.222) 88.707 (105.705) -90.729 (86.711) 350.187*** (131.236) (61.825) 315.111*** (62.759) 485.704*** (78.676) 153.032 (105.050) 143.715 (126.907) 860 0.244 (83.588) 62.568 (80.417) 139.063* (83.673) -248.613*** (83.515) 129.183 (98.936) 248 0.325 (64.926) 295.707*** (71.175) 503.418*** (75.843) -30.706 (90.844) 80.510 (121.530) 959 0.161 127 Table 3A-5 Full estimates of hybrid adoption model Static basic Used hybrid previous period Profitability factors Price expectation Fertilizer cost differential Yield differential Yield differential*G-Dummy Household characteristics Age of head in years Education of head Landholding size Real value of assets Household size Transportation costs Distance to fertilizer seller Distance to hybrid seller Distance to motorable road Static profit 0.887*** (0.098) -0.291*** (0.078) 0.820*** (0.287) 0.008 (0.006) 0.053*** (0.016) 0.023 (0.018) 0.023 (0.019) -0.051*** (0.019) -0.018 (0.013) -0.002 (0.011) -0.104*** (0.033) 0.021*** (0.008) 0.066*** (0.018) -0.002 (0.018) 0.037* (0.021) -0.067*** (0.025) -0.016 (0.010) 0.006 (0.010) -0.145*** (0.034) Dynamic basic 0.400*** (0.114) Dynamic profit 0.194 (0.150) Irregular Spacing basic 0.471*** (0.137) Irregular Spacing profit 0.426*** (0.151) 0.833*** (0.093) -0.280*** (0.101) 0.768*** (0.293) 0.007 (0.007) 0.050*** (0.015) 0.025 
(0.019) 0.026 (0.022) -0.044** (0.019) -0.017 (0.014) -0.003 (0.013) -0.109*** (0.036) 0.018*** (0.007) 0.062*** (0.016) 0.002 (0.016) 0.039** (0.018) -0.054** (0.023) -0.017 (0.011) 0.005 (0.012) -0.138*** (0.031) 128 0.005 (0.007) 0.050*** (0.019) 0.017 (0.020) 0.013 (0.015) -0.031* (0.017) 0.006 (0.021) 0.006 (0.016) -0.036 (0.057) 0.455*** (0.122) -0.211** (0.092) -0.591 (0.402) 1.040* (0.619) -0.005 (0.008) 0.038** (0.019) 0.039* (0.023) 0.003 (0.022) -0.007 (0.021) -0.007 (0.019) 0.005 (0.015) 0.037 (0.053) Table 3A-5 (cont’d) Distance to tarmac road Zone dummy one Zone dummy two Zone dummy three Zone dummy four Zone dummy five Zone dummy six Zone dummy seven Time-average of head age Time-average of head education Time-average of Household size Time-average of distance to fertilizer seller Time-average of distance to hybrid seller Time-average of distance to motorable road Time-average of landholding size Time-average of real value of assets -0.016** (0.007) 0.449* (0.233) -0.042 (0.218) 1.947*** (0.275) 2.714*** (0.248) 1.800*** (0.251) 1.844*** (0.271) 1.683*** (0.343) -0.026*** (0.009) -0.020 (0.019) 0.090*** (0.029) -0.028 (0.023) -0.005 (0.019) 0.127* (0.070) 0.118*** (0.027) -0.020 -0.012** (0.006) 0.203 (0.237) 0.345 (0.248) 1.757*** (0.374) 3.087*** (0.525) 1.277*** (0.348) 1.575*** (0.331) 1.619*** (0.482) -0.019*** (0.007) -0.011 (0.019) 0.110*** (0.026) -0.026 (0.026) -0.012 (0.028) 0.237** (0.093) 0.116*** (0.030) -0.032 -0.006 (0.006) 0.225 (0.223) 0.161 (0.215) 1.307*** (0.334) 2.246*** (0.463) 0.810** (0.332) 1.013*** (0.291) 1.010*** (0.369) -0.016* (0.008) -0.019 (0.018) 0.082*** (0.029) -0.023 (0.025) -0.011 (0.020) 0.190*** (0.070) 0.080*** (0.023) -0.027 -0.008* (0.005) 0.370** (0.183) -0.101 (0.177) 1.271*** (0.231) 1.748*** (0.240) 1.045*** (0.243) 1.048*** (0.260) 0.906*** (0.254) -0.023*** (0.007) -0.034* (0.017) 0.065** (0.027) -0.021 (0.020) -0.007 (0.020) 0.109* (0.063) 0.077*** (0.027) -0.022 129 -0.012** (0.006) 0.416* (0.220) 
-0.073 (0.198) 1.339*** (0.234) 1.810*** (0.249) 1.089*** (0.273) 1.111*** (0.257) 1.002*** (0.275) -0.009 (0.008) -0.012 (0.018) 0.056** (0.023) -0.028 (0.019) -0.006 (0.018) 0.059 (0.050) 0.058** (0.023) -0.005 -0.003 (0.007) 0.393* (0.225) 0.402** (0.199) 1.500*** (0.301) 2.496*** (0.409) 0.914*** (0.306) 1.192*** (0.280) 1.160*** (0.334) -0.007 (0.009) -0.009 (0.020) 0.064** (0.027) -0.027 (0.021) -0.013 (0.018) 0.145*** (0.056) 0.067** (0.030) -0.014 Table 3A-5 (cont’d) Time-average of fertilizer cost differential expectation Time-average of maize price expectation Time-average of yield differential expectation G-Dummy * used hybrid previous period G-Dummy * age of head G-Dummy * head education G-Dummy * distance to fertilizer seller G-Dummy * distance to hybrid seller G-Dummy * distance to motorable road G-Dummy * distance to tarmac road G-Dummy * landholding size G-Dummy * real value of assets G-Dummy * household size G-Dummy * fertilizer cost differential expectation G-Dummy * maize price expectation (0.029) (0.031) 1.253*** (0.209) -3.560*** (0.716) -4.432*** (1.025) (0.029) 0.965*** (0.190) -2.644*** (0.566) -3.381*** (0.937) (0.023) 130 (0.020) -0.001 (0.169) -0.004 (0.003) -0.019 (0.014) -0.025 (0.025) -0.019 (0.019) -0.060 (0.070) 0.011 (0.008) 0.012 (0.027) 0.032 (0.022) -0.036 (0.022) (0.024) 0.936*** (0.161) -2.024*** (0.463) -2.956*** (0.915) -0.027 (0.183) 0.008 (0.005) -0.000 (0.016) -0.003 (0.022) -0.015 (0.020) -0.085 (0.072) 0.004 (0.009) -0.008 (0.031) 0.029 (0.025) -0.029 (0.027) 0.093 (0.256) -0.832** (0.371) Table 3A-5 (cont’d) Used hybrid in 2000 Constant -0.977** (0.420) 3.176** (1.299) 0.729*** (0.107) -0.996*** (0.348) Observations 3,621 Number of hhid 1,207 chi2 6.385 Note: Robust standard errors in parentheses. 
*** p<0.01, ** p<0.05, * p<0.1 3,621 1,207 128.3 3,621 1,207 129.5 0.855*** (0.133) 1.726* (1.016) 3,621 1,207 18.00 0.732*** (0.089) -1.153*** (0.323) 3,621 1,207 5.821 0.725*** (0.115) 1.525** (0.758) 3,621 1,207 6.107