Michigan State University

This is to certify that the thesis entitled TRIPLE HURDLE MODEL OF SMALLHOLDER PRODUCTION AND MARKET PARTICIPATION IN KENYA’S DAIRY SECTOR presented by William J. Burke has been accepted towards fulfillment of the requirements for the Master of Science degree in Agricultural Economics.

TRIPLE HURDLE MODEL OF SMALLHOLDER PRODUCTION AND MARKET PARTICIPATION IN KENYA’S DAIRY SECTOR

By William J. Burke

A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE

Agricultural Economics

2009

Copyright by WILLIAM J. BURKE 2009

ABSTRACT

In Kenya, strong demand and the fact that most of the nation’s 3 million dairy cattle are in the hands of smallholders provide a tremendous opportunity for households to participate in the dairy market and increase rural incomes. Unfortunately, recent output has not kept pace with increasing demand, suggesting that barriers prevent rural farmers from tapping dairy’s underexploited potential. Using 11-year panel data from 1,275 smallholders, this study develops a model to determine the factors enabling smallholder participation in Kenya’s dairy market, and uses the findings to identify strategies to improve dairy productivity and promote successful smallholder commercialization.
Traditional double-hurdle market participation models are not adequate for addressing these objectives, primarily because they require the implicit assumption that all farmers are producers, whereas roughly 1/3 of rural Kenyan households do not produce milk in a given year. This study thus develops a “triple-hurdle” model, which allows for both non-producers and autarkic producers. Results suggest a bi-modal policy response to enable producers as well as the formal and informal purchasing enterprises to which they sell. Technical education, improved technologies, electrification, and access to credit are important to provide an enabling environment for producers. Along with the recent initiative to revive the parastatal dairy purchaser, evidence indicates that a more stable policy environment for small-scale traders, whose current market behavior is technically illegal and unpredictably regulated, would promote significant farmer response.

For Cassie and Liam

ACKNOWLEDGMENTS

I am extremely grateful to my major professor and advisor, Dr. Thom Jayne, for the amazing opportunity to work on the Kenya project, and for his guidance, insight, and encouragement throughout my work on this and many other projects. Without his support and attentive efforts this paper would not have been possible. I am also grateful to my committee members Dr. Robert Myers and Dr. Lisa Cook, whose efforts and suggestions have greatly improved my experience and the quality of this work. I would like to thank Margaret Beaver for her support and training. This study would not have been possible without funding from USAID and the TAPRA project. My thanks go to the Tegemeo Institute for use of the data and to its staff for their dedication to excellence in collecting survey data.
I would like to thank all of my fellow students at MSU, especially Joshua Ariga and Joleen Hadrich for their comments and advice, as well as Elliot Mghenyi for taking a considerable amount of time to introduce me to the data used in this study. Since childhood my parents have provided me with the greatest loving care, direction, and opportunity that any person could hope for. You convinced us that we really could be anything, so thank you. I only hope that I can be anywhere near as good a parent. Thanks to my brothers and sister for unending support. Thanks to Marty for showing me it can be done and encouraging me along the way. Thanks to Liam for the extraordinary motivation, and thank you most of all to my best friend and wife, Cassie, for being with me through it all. I will always admire you, and you amaze me almost every day.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
KEY TO SYMBOLS OR ABBREVIATIONS
1. INTRODUCTION
2. BACKGROUND OF KENYA AS A CASE STUDY
3. DATA
4. CONCEPTUAL FRAMEWORK
   4.1 Market Participation
   4.2 Structural Model
5. METHODS
6. VARIABLE CONSTRUCTION
   6.1 Community Characteristics and Transaction Cost Determinants
   6.2 Household Characteristics and Investments
   6.3 Prices
7. RESULTS
8. CONCLUSIONS
APPENDIX A: AUTARKY PRICES AND TRANSACTION COSTS
APPENDIX B: FIRST STAGE ESTIMATION WITH AND WITHOUT PRICES
APPENDIX C: FULL REGRESSION RESULTS FROM 3-STAGE ANALYSIS
APPENDIX D: COMPLETE SIMULATION ANALYSIS
APPENDIX E: MODEL EVALUATION
REFERENCES

LIST OF TABLES

Table 1: Distribution of Continuous Explanatory Variables
Table 2: Distribution of Binary Explanatory Variables over Time
Table 3: Three-Stage Model for Dairy Market Participation in Kenya (MLE)
Table 4: Average Household Production and Participation Probability Simulations by Credit Prevalence and Distance to Electricity
Table 5: Average Household Production and Net Sales Probability and Expectation by Credit Prevalence and Distance to Electricity
Table 6: Average Partial Effect (APE) of Distance to Electricity (km) on Unconditional Expected Volume of Net Dairy Sales (liters/hh)
Table 7: Average Household Production and Participation Probability Simulations
Table 8: Average Household Production and Net Sales Probability and Expectation
Table B1: Production Probit Coefficients with and without Controlling for Prices
Table B2: Price Distribution by Zone Over Time
Table C1: Full Regression Results from 3-stage Analysis
Table D1: Average Household Production and Participation Probability and Expectation by Credit Prevalence and Distance to Veterinary Services
Table D2: Average Household Production and Net Sales Probability and Expectation with Multiple Enterprise Interactions
Table E1: Mean Predicted Probability and Sample Share
Table E2: Most Likely Outcome by Actual Observed Outcome

LIST OF FIGURES

Figure 1: Production (1000 tonnes) Over Time for Milk and Maize in Kenya
Figure 2: Graphic Illustration of the Three Tiered Market Participation Model
Figure A1: Indirect Utility of Market Participants
Figure A2: Indirect Utility of Sellers Under Transaction Costs
Figure A3: Indirect Utility of Market Participants Under Transaction Costs

KEY TO SYMBOLS OR ABBREVIATIONS

AEZ     Agro-Ecological Zone
APE     Average Partial Effect
hh      household
IMR     Inverse Mills Ratio
IPW     Inverse Probability Weight
KCC     Kenya Creameries Company
kg      kilogram
km      kilometer
Ksh     Kenyan Shilling
MLE     Maximum Likelihood Estimator
mm      millimeter
MP      Market Participation
TAMPA   Tegemeo Agricultural Monitoring and Policy Analysis
USAID   United States Agency for International Development

1. INTRODUCTION

Structural transformation and economic growth have regrettably eluded much of Africa. In theory, increased productivity on the farm will lead to lower food prices, raise the disposable incomes of food consumers, make labor available for a growing industrial sector, and initiate the structural transformation processes in a self-perpetuating cycle of growth (Johnston and Mellor, 1961; Mellor, 1998).1 Evidence of this cycle was well documented through Asia’s Green Revolution, but the pattern has yet to emerge in Africa.
Although this theory is widely accepted, the specific determinants of smallholders’ ability to participate and succeed in the markets with the most growth potential are often poorly understood. In many parts of Africa the dairy sector has been identified for its potential to increase the income-generating productivity of smallholders’ assets (Walshe et al., 1991; Staal et al., 1997; Kodhek and Karin, 1999; Thorpe et al., 2000). In Kenya, for example, preferences for local “raw” milk, and the fact that most of the nation’s 3 million dairy cattle (85% of East Africa’s dairy cattle population) are in the hands of smallholders, provide a tremendous opportunity for households to participate in that growth market (Staal and Mullins, 1996; Thorpe et al., 2000).2 Indeed, roughly 69% of households in this study’s sample are producing dairy in any given year. Nevertheless, in many cases dairy production does not lead to increased disposable income, as one might expect: only 71% of producers are net sellers, while 12% of producers withdraw from the market and 17% of producers purchase more milk than they sell. Recent local media reports suggest that the prevailing opinion is that the dairy sector’s performance has not lived up to its potential.3 This underscores the importance of understanding what motivates households to produce dairy and participate in dairy markets, and why some households appear able to exploit seemingly attractive production and marketing opportunities while others cannot.

1 This paradigm was subsequently adapted to encompass the direct effects of increased productivity on rural poverty as well as linkages from the farm to the rural non-farm sector (Hazell and Haggblade, 1993).

2 “Growth markets” refer to markets for goods which require substantial initial investment (such as in education or capital), but provide higher returns than staple commodity markets, like maize. Many horticultural markets, in addition to dairy, are examples of “growth markets.”
Existing market participation (MP) models, however, are somewhat inadequate, since each requires that all households be producers (recall that nearly a third of the Kenyan sample does not produce dairy). So, the objectives of this study are two-fold: (1) to develop and estimate a model explaining the factors enabling smallholder participation in Kenya’s dairy sector, accounting for non-producers, and (2) to use the findings of the model to identify strategies to improve smallholder dairy productivity and promote successful commercialization of dairy production. These objectives will be met by addressing the following research questions:

• What are the determinants of rural smallholders’ ability and willingness to produce in the relatively high-value dairy sector?
• What are the determinants of a producer’s role in the market as a buyer, a seller, or an autarkic household?
• How do these determinants affect the level of participation, or amounts bought and sold, among participants?

3 See referenced material on-line at allafrica.com and africanews.com

For example, prior to 1992 the only legal marketing channel for dairy in Kenya was the Kenyan Creameries Company (KCC), which is effectively a parastatal firm. Since 1992, aside from household-to-household transactions, a variety of channels have become available to dairy farmers: the KCC now shares the marketing of dairy with farmer-owned cooperatives, private processors in the formal sector, and informal private traders. It is important to note that the latter group, sometimes called “hawkers,” is still operating illegally. Although generally tolerated, these entrepreneurs conduct their business in an extremely uncertain policy environment (Staal et al., 1997; Owango et al., 1998; Kijima et al., 2006).
In response to lower than desired production nationally, the recent policy thrust has focused on the revival of the KCC (which has been in a weakened state since the dissolution of the executive board in 1998), yet there has been little empirical analysis of how farmers respond to these various marketing channels, and thus of whether this is the most efficient use of government resources.

The research questions will be addressed using an econometric model of the market participation decision. Market participation models have previously described decisions as occurring in two steps: 1) whether to participate in the market (buy or sell, versus remain autarkic), and 2) what volume will be bought or sold (Goetz, 1992; Key et al., 2000; Holloway et al., 2001; Bellemare and Barrett, 2006; and others). Again, however, these have limitations for the current purpose because they rely on the assumption that all households in the population are producers. While this assumption may be appropriate when applied to, say, staple grains, it is unreasonable for commodities which are not produced by a large fraction of households. Most agricultural commodities produced by smallholder farmers in fact belong in this latter category, such as horticultural products, industrial cash crops, and animal products, including dairy products in Kenya. Therefore, findings from standard MP studies understate the effects of a given determinant, because they cannot account for the impact on the likelihood of production, the necessary precursor to any market-related decisions. Also, in subsuming the effects of production determinants, the estimates of market participation models tend to commingle constraints on production and constraints on participating in markets, and therefore overemphasize the role of market access in explaining non-participation in markets as opposed to asset constraints or low productivity in the use of factors of production.
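To make the extra tier concrete, the following is a minimal Python sketch, not the estimation procedure used in this study. It sorts a household-year into one of four outcomes (non-producer, autarkic producer, net buyer, net seller) and combines the sample shares quoted in the introduction into unconditional shares. The function and variable names are hypothetical, and the shares are the rounded figures from the text.

```python
# Illustrative sketch only: names are hypothetical; shares are the rounded
# sample figures quoted in the introduction (69% produce; among producers,
# 71% net sellers, 12% autarkic, 17% net buyers).

def classify(milk_produced, net_sales):
    """Assign a household-year to one outcome of the three-tier framework.

    net_sales is sales minus purchases (positive for net sellers).
    """
    if milk_produced <= 0:
        return "non-producer"          # fails the first (production) hurdle
    if net_sales > 0:
        return "net seller"
    if net_sales < 0:
        return "net buyer"
    return "autarkic producer"         # produces but does not trade on net


P_PRODUCE = 0.69
P_GIVEN_PRODUCE = {"net seller": 0.71, "autarkic producer": 0.12, "net buyer": 0.17}


def unconditional_shares(p_produce, p_given_produce):
    """P(outcome) = P(produce) * P(outcome | produce), plus the non-producer share."""
    shares = {"non-producer": 1.0 - p_produce}
    for outcome, p in p_given_produce.items():
        shares[outcome] = p_produce * p
    return shares


if __name__ == "__main__":
    for outcome, share in unconditional_shares(P_PRODUCE, P_GIVEN_PRODUCE).items():
        print(f"{outcome}: {share:.3f}")
```

Under these rounded shares, net sellers make up just under half of all households (0.69 × 0.71 ≈ 0.49), even though they are 71% of producers. A model that conditions on production cannot see this gap.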
A related issue is that many previous studies address market participation decisions as though they are made entirely a priori. One could argue, however, that the more appropriate discussion would be about market participation that is in part stochastic, and in some cases even unintended, determined by realized consumption and production shocks, which are in turn at least partially determined by factors exogenous to the household (e.g. rainfall or health-related shocks).

This study further develops the market participation class of models to provide a three-tiered decision framework that enables a nationally representative sample to be maintained even in markets where a sub-population does not produce. The expanded framework also addresses the possibility that market participation is partially determined by exogenous shocks originating outside the household, after production decisions are made.

2. BACKGROUND OF KENYA AS A CASE STUDY

In many parts of Africa, the dairy sector has long been identified as holding potential for smallholders to increase the income-generating productivity of their assets. This is particularly true in Kenya, where demand for local “raw” milk is strong domestically, as is export demand in the formal sector (Walshe et al., 1991; Staal et al., 1997; Kodhek and Karin, 1999; Thorpe et al., 2000). In fact, Karanja (2003) predicted that by the end of 2008 domestic demand for dairy, driven by a growing, wealthier urban population, would outpace domestic production. While production among smallholders has increased over the past several years, local media indicate that the response to rising prices has been lower than expected, leading some to conclude that Kenya will be a net importer of dairy in the near future.4

4 See referenced material on-line at allafrica.com and africanews.com

In the World Bank’s 2008 “Kenya Agricultural Policy Review,” the policy debate concerning the informal dairy market is summarized as a balance that weighs increased regulation and the safety of consumers against producers’ ability to thrive and provide increased productivity and output. The same report, however, states that consumers are safe under current regulations, citing that bacteria counts in the relatively unregulated “raw” milk are comparable to those in processed milk. Raw milk, which is actually boiled prior to consumption to kill milk-borne pathogens, has a smaller marketing margin than processed milk, leading to higher prices for the farmer and lower prices for the consumer. That is, the cost of boiling is so small relative to formal processing that, even if the boiling cost is paid by the consumer, both consumers and producers receive preferable prices in the raw milk market (Staal and Mullins, 1996; Thorpe et al., 2000; World Bank, 2008).

Indeed, evidence indicates not only that participation in the dairy market has been prevalent among Kenya’s rural smallholders over the last decade, but also that this activity is generally associated with increasing asset-wealth over time (Burke et al., 2007). Thus, if more regulation is thought to have only marginal benefit to consumers, the remaining question is: what would provide a more attractive environment for producers?

Regarding the formal market, recent policy has focused on the revival of the KCC (allafrica.com, 2008; africanews.com, 2007). This raises an interesting point. Consider Figure 1, which uses national-level data and superimposes dairy production (and production of maize, the country’s primary crop) over a political timeline. This shows that, after an initial lull during the period where a strong KCC and growing private sector coexisted, annual production has increased dramatically since the KCC central board was dissolved in 1998 amid a corruption scandal.
This calls into question whether the current policy focus is the most efficient use of government resources. Figure 1 also shows the gap in production trends between milk and maize has been narrowing steadily for decades.

That said, from the producer’s perspective, as with all entrepreneurial endeavors, there are obstacles which must be scaled by small farmers to realize the potential benefits of dairy production. First, milk production, like that of any other agricultural product, is partially a stochastic process, where factors outside the farmer’s control (e.g. rainfall) can affect output.

Figure 1: Production (1000 tonnes) Over Time for Milk and Maize in Kenya
[Figure not reproduced: annual production series for maize and cow milk, each with a linear trend line, plotted against a political timeline whose legible markers include Independence, Kenyatta’s death, Moi becoming President, “Liberalization,” and the KCC board.]
Source: FAOStat web site, FAOStat database: www.faostat.fao.org

Secondly, there is substantial overhead investment required to own, house and feed dairy cattle, as well as to have labor available to tend the herd. For example, high-yielding “grade” cows are considered essential to sustainable dairy farming (World Bank, 2008). In 2007 the median price of a single improved breed cow was 30,000 Ksh, which at 24% of the median annual net income of rural households is a sizable investment. Moreover, a number of veterinary services are available, such as artificial insemination, which could make an endeavor more likely to succeed, but which may also present additional barriers to entry. Once a household does enter the dairy sector, there is no guarantee of success.
Milk, being highly perishable, requires either cold storage or fairly immediate market access. Also, there is a variety of hidden action problems involved in any market exchange, since the quality (milk can be diluted) and hygiene (poorly handled milk is unsafe) can vary greatly, and are not easily verifiable.5

5 Despite these information asymmetries, some empirical evidence shows raw milk to be comparable in safety to processed milk, possibly because of the boiling process, or the importance of reputation building through repeated transactions and/or cooperative organizations (World Bank, 2008).

In short, there are three broad classes of variables that may explain heterogeneous household response and outcomes over time to investment in dairy: (1) a stochastic production process; (2) location-specific differences in the enabling environment, including access to and performance of localized input and output markets, and agro-ecology; and (3) household-specific characteristics such as knowledge and training, asset levels, etc. An important point emerging from this discussion is that successful participation in dairy marketing depends on far more than the willingness of the farmer to enter, or even his/her specific market-related conditions.

Although prior research has acknowledged the market’s potential and examined the industry using regional or national data, there has been little household-level panel analysis of dairy in Kenya. Presumably, this is due to the dearth of panel data required to properly investigate markets from the household perspective. It is important, however, to understand how the household unit fits into the expanding market. The lack of empirical research identifying determinants of whether and how households are participating in Kenya’s expanding dairy market leaves an important void to be filled.

3. DATA

This study uses panel data from four surveys implemented by the Tegemeo Institute of Egerton University in Nairobi, Kenya.
In 1997, the sampling frame was designed in consultation with the Central Bureau of Statistics, and contained 1,500 households randomly chosen to represent eight different agro-ecological zones (AEZ), reflecting population distribution. Of the original sample, 1,428 households (95%) were re-interviewed in 2000, 1,324 (88%) were re-interviewed in 2004, and 1,275 (85%) were re-interviewed in 2007. Holding consistently at or below 7% of the original sample per survey, this rate of attrition is reasonably low compared to similar surveys in developing countries, where it can range as high as 20% (Alderman et al., 2001).

Attrition bias should always be a concern when working with panel data, but preliminary analysis has shown the attrition in this sample does not appear to be highly systematic, and correction measures have not altered results meaningfully. Burke et al. (2007), for example, used inverse probability weights (IPW) based on household and community characteristics. This involves estimating an auxiliary model for attrition and weighting observations by the inverse of their re-interview probability. Ultimately, the majority of re-interview coefficients were not statistically significant. Moreover, results from that study’s weighted probit analysis had no more explanatory power than the un-weighted analysis, and most results were identical to the third decimal place.

This nationally representative panel data gives this study the unique opportunity to examine the household with respect to dairy market participation in order to address the research questions of concern.

4. CONCEPTUAL FRAMEWORK

4.1 Market Participation

Economists have generally treated the household’s decision to participate in markets as a two-step process: first, producing households decide whether to participate (buying or selling) or remain autarkic; then, conditional on participation, they decide how much to buy or sell (Goetz, 1992; Key et al., 2000; Holloway et
al., 2001; Bellemare and Barrett, 2006; and others). However, when considering a market such as dairy in Kenya, it is important to first acknowledge that not all households will be producers. For example, at the outset of the 10-year panel used for this study, only about 2/3 of the households were milk producers. This makes it important to add a third stage of analysis to the traditional 2-stage MP model that identifies factors influencing a household’s decision whether or not to produce.

Moreover, existing models treat participation entirely as an ex ante decision made by the household. By adding production decisions in a separate stage, however, this study will allow factors affecting production decisions to differ from those affecting market participation, thus acknowledging the fact that marketing decisions may be, in part, not an ex ante choice but often an artifact of constraints on production and/or a response to stochastic production shocks. Production decisions can only be made based on the farmer’s expectations. This distinction allows the farmer’s information to be updated after deciding whether to produce, but before deciding to participate (or not) in the market.

There is a drawback to imposing the assumption that production and marketing are discrete sequential decisions when using annually aggregated household data. Unlike annual crop production (where output is realized at a specific harvest time), dairy production is a temporally continuous process where annual yields are realized gradually. Imposing a sequential decision model implicitly assumes that when production decisions are made they are maintained throughout the year. Concerns over this implicit assumption can be partially assuaged by the high degree of seasonality in dairy production (which is highest during the rainy season).
Of course, the best way to resolve this issue would be to collect higher frequency data, which would provide greater information to assess how marketing decisions respond to stochastic production levels. Unfortunately, this is not feasible for this study, for which only annual data is available.

Given the sequential structure of the decision process, the problem must be solved recursively, similar to a dynamic programming model. First the determinants of producers’ market participation are solved for, assuming the production decision has already taken place. Then, the determinants of smallholders’ production decisions can be derived based on their a priori knowledge of the factors which will subsequently influence marketing. For example, the realized production and amount of dairy to be bought or sold will depend on the amount of rainfall in a given year.6 Thus, actual rainfall affects a household’s participation in the market, but when farmers are making production decisions they can only form an expectation of what rainfall may be.

6 Rainfall affects the amount of fodder or grain available as an input for dairy producers to feed cattle.

4.2 Structural Model

To more formally derive this model, start with the (second-stage) market participation decision, and follow the traditional model described by Key et al. (and later Bellemare and Barrett, Holloway et al., and others). They posit that a representative agent maximizes utility (4.1) subject to equations (4.2) through (4.5):

$$\max_{c} \; u(c) \tag{4.1}$$

$$\sum_{j=1}^{J}\left[\left(\left(p_j^m - t_{pj}^s(Z_t^s)\right)\delta_j^s + \left(p_j^m + t_{pj}^b(Z_t^b)\right)\delta_j^b\right) m_j - t_{fj}^s(Z_t^s)\,\delta_j^s - t_{fj}^b(Z_t^b)\,\delta_j^b\right] + T = 0 \tag{4.2}$$

$$q_j - n_j + A_j - m_j - c_j = 0, \quad j = 1, \ldots, J \tag{4.3}$$

$$G(q, n; Z_q) = 0 \tag{4.4}$$

$$c_j, \; q_j, \; n_j \geq 0 \tag{4.5}$$

In equation (4.1) $u$ is the agent’s utility as a function of a vector of their consumption, $c$. Equation (4.2), the budget constraint, is where the role of transaction costs is introduced. Here $p_j^m$ is the market price of good $j$, and $m_j$ is the amount of that good marketed, which is positive for sellers and negative for buyers.
The agent’s role in the market is represented by the two indicator functions: $\delta_j^s$ is 1 for sellers of good $j$ and 0 otherwise, and $\delta_j^b$ is 1 for buyers of good $j$ and 0 otherwise. Note an additional important condition:

$$\delta_j^s + \delta_j^b \leq 1 \tag{4.7}$$

which establishes $m$ as the net quantity marketed by stating that a household cannot be both buyer and seller in the same period. In the budget constraint, the proportional transaction costs for sellers of good $j$, $t_{pj}^s$, and fixed transaction costs for sellers of good $j$, $t_{fj}^s$, effectively change the price they receive and thus their behavior in the market. Similarly, the proportional transaction costs for buyers of good $j$, $t_{pj}^b$, and fixed transaction costs for buyers of good $j$, $t_{fj}^b$, effectively change the price they pay and thus their behavior in the market. However, as the authors point out, these transaction costs are largely unobserved in survey data, and are thus represented as functions of more readily enumerable factors explaining them, $Z_t^s$ and $Z_t^b$ respectively. One of the reasons transaction costs are an important element in this model is their role in explaining autarkic behavior of producers, as described in Appendix A. The inclusion of non-market transfers, $T$, which can be positive or negative, completes this constraint.

Equation (4.3) is a feasibility constraint which indicates that for any good $j$, the amount consumed, $c_j$, the amount marketed, $m_j$, and the amount used as an input, $n_j$, cannot together exceed the amount produced, $q_j$, plus the endowment, $A_j$. In the case of dairy, this model will impose the simplifying assumption that households enter each period without an endowment, and that dairy products are not used as inputs. As will be described below, this does not imply that households enter each period without endowments of factors related to dairy production, but that they begin (and end) each period without stocks of dairy products.
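As a purely numerical illustration of how the transaction cost terms in the cash constraint (4.2) and the indicator condition (4.7) operate for a single good, consider the minimal Python sketch below. All names and numbers are hypothetical and chosen only for arithmetic convenience; they are not survey values or estimates.

```python
# Minimal sketch of the cash-constraint terms for one good j.
# All names and numbers are hypothetical illustrations, not survey values.

def market_role(m):
    """Return the indicators (delta_s, delta_b) implied by net marketed quantity m."""
    delta_s = 1 if m > 0 else 0   # seller indicator
    delta_b = 1 if m < 0 else 0   # buyer indicator
    return delta_s, delta_b       # never (1, 1), so delta_s + delta_b <= 1


def net_cash_from_good(p_market, m, tp_sell, tp_buy, tf_sell, tf_buy):
    """Net cash generated by good j: the bracketed term of the cash constraint.

    Proportional costs lower the price a seller receives and raise the price
    a buyer pays; fixed costs are incurred only if the household trades.
    """
    delta_s, delta_b = market_role(m)
    unit_price = (p_market - tp_sell) * delta_s + (p_market + tp_buy) * delta_b
    return unit_price * m - tf_sell * delta_s - tf_buy * delta_b


if __name__ == "__main__":
    # Hypothetical market price of 20 Ksh/liter, proportional costs of 2 (sell)
    # and 3 (buy) Ksh/liter, fixed costs of 50 (sell) and 30 (buy) Ksh.
    print(net_cash_from_good(20, 100, 2, 3, 50, 30))   # net seller of 100 liters
    print(net_cash_from_good(20, -40, 2, 3, 50, 30))   # net buyer of 40 liters
    print(net_cash_from_good(20, 0, 2, 3, 50, 30))     # autarkic household
```

With these made-up numbers a seller effectively receives 18 Ksh per liter while a buyer effectively pays 23 Ksh per liter. A household whose own valuation of milk lies between those two effective prices has no incentive to trade, which is the price-band logic behind autarky.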
Equation (4.4) describes the relationship between inputs, $n_j$, and outputs through the production technology $G$, considering other supply shifters, $Z_q$. Recall that, at this stage, production decisions regarding $n_j$ have already been made, so we take $n_j$ as given. Traditionally, specifications of $Z_q$ have been limited primarily to community-level characteristics (such as share of local farmers using fertilizer or hybrid seeds), and endowments over which the household has little control (such as age of household head or the amount of land cultivated) (Key et al., 2000; and others). The importance of these factors is not disputed. However, one could argue that other important elements may exist in $Z_q$.

First, the household’s past investments play an important role in the production equation. When investment values have been included in prior analyses, such as livestock assets and trucks in Key et al. (2000) or total value of assets in Bellemare and Barrett (2007), they have been included contemporaneously. This may arguably lead to biased results, as they could be considered endogenous.7 That is, in a given time period does one own livestock assets because they participate in a given market, or do they participate in the market because they own livestock assets? In reality, the causality likely flows both ways, but this problem can be mitigated by including the lagged investments made by a household.

7 Some household investments, such as trucks in Key et al., have (correctly) been included as a factor explaining transaction costs rather than a supply shifter. Nevertheless, the argument with respect to being contemporaneously endogenous remains valid.

Secondly, it is important at this stage to separate out variables which would shock the stochastic production process, but which would not be known to producers when production decisions are made.
The early work of Goetz best addresses these exogenous shocks by acknowledging that quantities produced and consumed are random variables in his structural model, but studies which have followed downplay this fact. Fortunately, with sufficient data we can include realized values of these shocks in the model as observed determinants. Thus, we can substitute the following constraint, and instead of (4.4), we have:

$$G(q, n; Z_q) = 0 \tag{4.8}$$

where $Z_q = (Z_q^{cc}, Z_q^{hi}, Z_q^{us})$, and $Z_q^{cc}$ are supply shifters associated with community characteristics, as in previous studies, $Z_q^{hi}$ are lagged household investments, and $Z_q^{us}$ are unknown shocks to production, revealed to the farmer after production decisions are made, yet prior to final marketing decisions. Although the household investments in this model are lagged at least one survey period to circumvent endogeneity concerns, for clarity of notation individual household and time period subscripts have been left out of this description of the model.

Indirect utility functions are derived from this optimization, leading to market quantity functions, as in Key et al., and to the decision rules:

$$m^s = (q - c) = m^s(Z_t^s, Z_q^{cc}, Z_q^{hi}, Z_q^{us}, n, p^m, T, A), \text{ for net sellers,} \tag{4.9}$$

and

$$m^b = (c - q) = m^b(Z_t^b, Z_q^{cc}, Z_q^{hi}, Z_q^{us}, n, p^m, T, A), \text{ for net buyers.} \tag{4.10}$$

Notice that, like Bellemare and Barrett and Key et al., empirically we will separate positive and negative values of the marketed quantity into two non-negative variables for net purchases, $m^b$, and net sales, $m^s$. Allowing quantities to be determined by separate processes makes this model more flexible than if we were to use a switching regression, as in Goetz. Similar factors will define threshold quantities which determine whether a household is a net buyer, remains autarkic, or is a net seller.
That is, the factors determining whether and in what capacity a producer participates in the market can be thought of as the same as those which determine how much they buy or sell.8

There is also a key difference between equations (4.9) and (4.10) and the analogous equations from Key et al. That study asserts that factors of production (n) are solvable as a function of prices and production shifters, and thus substitutes inputs out of the quantity equations. To some extent, we agree, as will be illustrated below.9 This substitution, however, has two drawbacks. First, some of the production shifters outlined in that study (e.g. age of household head, household asset holdings, access to credit, etc.) may have an independent effect on commodity market participation, as discussed when we disaggregated shifters in equation (4.8) above. Including only the shifters, then, in a reduced form would confound their effects on input demand and market participation.

8 Although not emphasized here, a key theoretical consideration is that fixed transaction costs play a role in the decision on whether to participate in the market, but not on the quantities marketed. However, since these costs are being represented by the factors determining them, which are ultimately treated as the same for both, ignoring this theoretical consideration here has no empirical implications.

9 Some of the variables described as production shifters in Key et al. are characterized as either community characteristics (e.g. access to credit) or household characteristics (e.g. age of household head and value of household assets), which appear in the latent model for input demand in equation (4.11).

Secondly, substituting for n downplays the fact that it is a vector that includes a wide range of inputs and alternative technologies. For example, Kenyan dairy farmers can feed by either grazing cattle or using a zero-graze system. They can use (more productive) grade cows or indigenous species.
Moreover, several sources contend that successful commercialization hinges on the choice of technology (World Bank, 2008; Kijima et al., 2006). This hypothesis cannot be tested if n is substituted out of the market quantity equations. For these reasons the elements of n are not substituted out of the market quantity equations, and are instead treated as predetermined (and thus contemporaneously exogenous) farmer decisions.

With this established, we turn to the first stage production decision. This a priori decision is solved recursively, where farmers are now maximizing expected utility (4.11), subject to the uncertain objective function for market participation (4.12):

$$\max E[u(c)] \tag{4.11}$$

$$D[m] = D[m(Z_t, Z_q^{cc}, Z_q^{hi}, Z_q^{us}, n, p^m, T, A)] \tag{4.12}$$

where $Z_t$ summarizes factors determining transaction costs for buyers and sellers. In other words, farmers decide on input demand (i.e. production decisions) considering the distribution of possible marketing outcomes (e.g. expected value and, assuming farmers may be risk averse, higher moments). In practice, the determinants of the production decision will be a function of the distribution of marketing determinants themselves. That is, to derive the decision rule for production, we need only consider what the farmer knows about marketing determinants before production takes place.

First of all, the factors affecting transaction costs, $Z_t^s$ and $Z_t^b$, could reasonably be considered constant within a given year, and thus considered known to farmers making production decisions. For example, it is unlikely that the distance to a market, or whether there is a private trader in the village, would be unknown to the farmer making production decisions. This is also true of community characteristics, $Z_q^{cc}$, such as distance to a veterinarian, whether there is a credit or dairy cooperative, and so on. Lagged household investments, $Z_q^{hi}$, are obviously known to farmers a priori.
Resources allocated to production, n, are a determinant in the marketing stage, but a decision variable in the production stage. Because the allocation of resources to production is determined simultaneously with the decision to produce, n should not appear as a determinant in the first stage production decision equation.

The next vector, which by definition is unknown to the farmer prior to production decisions, is the unknown shocks to production, $Z_q^{us}$. Rather than ignore the fact that this will eventually be a determinant in marketing decisions, smallholders may develop some expected value of that shock on which they can base their production decision. Also, since we have implicitly assumed a concave (risk averse) utility function, higher moments of the distribution, such as the variance, will also determine production. This distribution can be written $D(Z_q^{us})$. The same is true for any prices that will affect marketing decisions (e.g. input, output, and substitute prices), where production is decided based on the distribution of market prices, $D(p^m)$. In summary, the decision rules for production decisions can be written:

$$n = n(Z_t, Z_q^{cc}, Z_q^{hi}, D(Z_q^{us}), D(p^m), T, A) \tag{4.13}$$

$$w_1 = w_1(Z_t, Z_q^{cc}, Z_q^{hi}, D(Z_q^{us}), D(p^m), T, A) \tag{4.14}$$

$$w_1 = 0 \text{ if } n = 0, \text{ and } w_1 = 1 \text{ if } n > 0 \tag{4.15}$$

where $w_1$ is the binary indicator of whether the farmer decides to produce. Although the level of input use, n, is also decided at this point, its determinants are not the focus of this study. The empirical model will only include the binary indicator.

5. METHODS

Methods for estimating market participation models have existed in the literature since the early 1990s. These analyze the household unit with respect to agricultural markets, usually employing a "two-tiered" model, otherwise known as a "double hurdle" model. Double hurdles were first introduced as a class of models by Cragg (1971) as a more flexible alternative to tobit models.
Goetz (1992), often recognized for his early work in this class of models, analyzes market participation by first separating producing households into participants (buyers and sellers) and non-participants (that is, households which consume only and all of their own production) using probit regression in the first stage. Then, in the second stage, the quantities bought and sold are analyzed along the real line of production less consumption, where negative values denote net buyers and positive values denote net sellers, with a switching regression.10 In more recent work, Bellemare and Barrett (2006) allow for the consideration of buyers and sellers of livestock separately by first segregating producers into buyers, autarkic households, and sellers with ordered probit regression in the first tier of their analysis. Then, in the second tier, truncated normal regressions are estimated for net quantities bought and sold separately for net buyers and net sellers respectively. There are a number of other prominent models between the early work of Goetz and the more recent work of Bellemare and Barrett, although the latter seems to be the most flexible to date.11

10 This is called a switching regression because as observations move from negative space to positive space they "switch" from being buyers to sellers.

11 See Key et al., Holloway et al., and others.

One major drawback to all of these approaches, as described in the previous section, is that each is reliant on the implicit assumption that all observations (and therefore all members of the population analyzed) are producers of the good of interest. In some cases such two-tiered analysis may indeed be appropriate. For instance, if one is interested in staple grain production of the rural farm population of a developing country, it is fairly reasonable to accept the assumption that all observations are at least producing some.
However, in analyzing higher return growth markets (like Kenyan dairy), traditional models are not appropriate if a portion of the population are not producers. The common solution is to circumvent the issue by focusing on a sub-population of producers. Bellemare and Barrett, for example, study the determinants of livestock market participation for a population of "pastoralists in... Northern Kenya and Southern Ethiopia," within which a random sample is drawn. Holloway et al. examine Ethiopia's dairy market participation, but focus only on dairy producers in their sample. While this solves the data problem and allows the researcher to draw conclusions on determinants within that sub-population, it is not possible to extrapolate policy relevant implications to a national level. Moreover, such a model does not allow the researcher to identify determinants of production itself, the necessary precursor to any market related decisions.

In this analysis of Kenya's dairy market a nationally representative perspective will be maintained by adding an additional stage of analysis, thus employing a 3-stage, or triple hurdle, model. Starting with a nationally representative sample, the first stage distinguishes producers from non-producers using probit analysis. In the second stage, similar to the first stage of Bellemare and Barrett, an ordered probit is used to identify factors within producing households which determine whether they are net buyers, autarkic households, or net sellers. Finally, in the third stage, the determinants of buyer and seller quantities are identified in separate log-Normal regressions, which are appropriate given the truncated nature of the dependent variables.

The triple hurdle model also has a major advantage over double hurdle models in general. Double hurdles are useful because they allow a subset of the data to "pile up" at some value without causing bias in estimating the determinants of the continuous variable for the remaining sample.
In many cases this is used for "corner solutions" where optimization behavior results in a zero value (e.g. for charitable contributions, as in Wooldridge, 2002). Double hurdles can also be used in estimating "selection models," where the continuous variable is unobserved for some subset of the sample. Thus far, double hurdles have been able to allow for either censored zeros or selected zeros. The 3-stage model, on the other hand, allows for the simultaneous existence of both types of zeros. In the case of dairy, that is, the model allows for the market participation variables to be zero either because the household selected itself out of production altogether, or because the producing household's optimizing market participation is autarkic. Moreover, for any given household, the triple hurdle model will predict separately the probability that the household is a non-producer and the probability that it is an autarkic producer.

The remainder of this section will be used to develop the likelihood function for the 3-stage market participation model just described, which is illustrated in Figure 2.

Figure 2. Graphic Illustration of the Three Tiered Market Participation Model
[Flow diagram. Nationally Representative Sample; Stage 1 (Probit): Non-Producers, Producers; Stage 2 (Ordered Probit): Net Buyers, Autarkic Households, Net Sellers; Stage 3 (Log-Normal, x2): Net Quantity Bought, Net Quantity Sold]

There are two methods available for estimating double hurdle models: Heckit and standard Full Maximum Likelihood (FML). Heckit estimation, first introduced by Heckman (1976) and employed by Bellemare and Barrett (2006), controls (and tests) for selection bias by using a predicted inverse Mills ratio (IMR), generated using first stage results, as a regressor in the second stage. Note that this method is not appropriate in corner solution models under the assumption of joint normality of error terms (Wooldridge, forthcoming).
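For readers unfamiliar with the inverse Mills ratio mentioned above, the following is a minimal sketch of its textbook form after a binary probit first stage, $\lambda(z) = \phi(z)/\Phi(z)$, evaluated at the fitted index. This is an illustration under that assumption only: the IMR terms that follow an ordered probit first stage, as in Bellemare and Barrett, take a different form, and the function names and index values here are hypothetical.

```python
import math


def norm_pdf(z):
    """Standard Normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills_ratio(z):
    """Textbook IMR for the selected sample of a binary probit: pdf / cdf."""
    return norm_pdf(z) / norm_cdf(z)


# Hypothetical first-stage index values (not estimates from this study).
for index in (-1.0, 0.0, 1.0):
    print(index, round(inverse_mills_ratio(index), 4))
```

In a Heckit procedure the predicted ratio would enter the second-stage regression as an additional regressor, and a test of its coefficient serves as a test for selection bias.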
The FML method, on the other hand, maximizes the likelihood function which describes the full probability distribution of all stages. In this study, both methods could arguably be employed, since the first two stages can be thought of as a selection double hurdle, while the second and third stages represent a combination of corner solution double hurdles.

To derive a likelihood function, we begin in the first stage, where households are identified according to whether they are producers or not using probit analysis. To simplify notation from the conceptual framework, let $y_1$ be the level of milk production, $x$ the vector of all variables thought to explain production and market participation behavior throughout the model, and $w_1$ a binary indicator function such that:

$$w_1 = 1 \text{ if } y_1 > 0 \tag{5.1}$$

$$w_1 = 0 \text{ if } y_1 = 0 \tag{5.2}$$

Then, following the standard probit method, we assume:

$$\Pr(w_1 = 1 \mid x_1, \gamma) = \Phi(x_1\gamma) \tag{5.3}$$

$$\Pr(w_1 = 0 \mid x_1, \gamma) = 1 - \Phi(x_1\gamma) \tag{5.4}$$

where $\Phi$ is the standard Normal cumulative distribution function, $x_1$ are the independent variables thought to determine production, and $\gamma$ is a vector of parameters to be estimated. Thus, the full distribution of $w_1$ is:

$$f(w_1 \mid x_1) = [1 - \Phi(x_1\gamma)]^{1[w_1=0]}\,[\Phi(x_1\gamma)]^{1[w_1=1]} \tag{5.5}$$

Now, focusing on the second stage, we define $y_2$ as the level of milk consumption and $w_2$ the ordered indicator function such that:

$$w_2 = 0 \text{ if } y_1 - y_2 < 0 \tag{5.6}$$

$$w_2 = 1 \text{ if } y_1 - y_2 = 0 \tag{5.7}$$

$$w_2 = 2 \text{ if } y_1 - y_2 > 0 \tag{5.8}$$

In words, $w_2$ is zero for producing households that are net buyers of milk, $w_2$ is one for autarkic producing households, and $w_2$ is two for producing households that are net sellers of milk.
Then, following the ordered probit model, define the latent variable $w_2^*$:

$$w_2^* = x_2\beta + e, \qquad e \mid x_2 \sim \text{Normal}(0, 1) \tag{5.9}$$

Let $\alpha_1 < \alpha_2$ be unknown threshold parameters defined such that:

$$w_2 = 0 \text{ if } w_2^* \le \alpha_1 \tag{5.10}$$

$$w_2 = 1 \text{ if } \alpha_1 < w_2^* \le \alpha_2 \tag{5.11}$$

$$w_2 = 2 \text{ if } w_2^* > \alpha_2 \tag{5.12}$$

Then, letting $x_2$ be the independent variables explaining market participation:

$$\Pr(w_2 = 0 \mid x_2, \alpha, \beta) = \Pr(w_2^* \le \alpha_1 \mid x_2) = \Phi(\alpha_1 - x_2\beta) \tag{5.13}$$

$$\Pr(w_2 = 1 \mid x_2, \alpha, \beta) = \Phi(\alpha_2 - x_2\beta) - \Phi(\alpha_1 - x_2\beta) \tag{5.14}$$

$$\Pr(w_2 = 2 \mid x_2, \alpha, \beta) = 1 - \Phi(\alpha_2 - x_2\beta) \tag{5.15}$$

Thus, the distribution of $w_2$ is the ordered probit:

$$f(w_2 \mid x_2) = [\Phi(\alpha_1 - x_2\beta)]^{1[w_2=0]}\,[\Phi(\alpha_2 - x_2\beta) - \Phi(\alpha_1 - x_2\beta)]^{1[w_2=1]}\,[1 - \Phi(\alpha_2 - x_2\beta)]^{1[w_2=2]} \tag{5.16}$$

Finally, in the third stage, let $y_3$ be defined as the net purchases for net buyers, while $y_4$ is the net sales for the net sellers. Mathematically:

$$y_3 = y_2 - y_1 \text{ if } y_2 > y_1, \text{ and is undefined otherwise} \tag{5.17}$$

$$y_4 = y_1 - y_2 \text{ if } y_1 > y_2, \text{ and is undefined otherwise} \tag{5.18}$$

As stated above, each of these random variables is assumed to be log-Normal, so, letting $x_3$ be the independent variables explaining net purchases, and $x_4$ those explaining net sales, the individual distribution of each can be written:

$$f(y_3 \mid x_3, \delta_3) = \phi[\{\log(y_3) - x_3\delta_3\}/\sigma_3]\,/\,(y_3\sigma_3) \tag{5.19}$$

$$f(y_4 \mid x_4, \delta_4) = \phi[\{\log(y_4) - x_4\delta_4\}/\sigma_4]\,/\,(y_4\sigma_4) \tag{5.20}$$

where $\phi$ is the standard Normal probability density function, and $\sigma_j$ is the standard deviation of $\log(y_j)$, that is, the scale parameter of the log-Normal distribution of $y_j$. It should be noted here that, as with the double hurdle models, there are no restrictions regarding the elements of $x_1$, $x_2$, $x_3$, and $x_4$ (i.e. they can be the same or different explanatory variables in each stage). That said, as discussed during the derivation of the structural model for this study, the elements of $x_2$, $x_3$, and $x_4$ will be the same. Nevertheless, we can finalize the derivation of our likelihood function in the more general case, where each vector is treated separately.
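The stage-wise building blocks in (5.13) through (5.15) and (5.19) through (5.20) can be sketched numerically as below. This is only an illustration of the formulas under assumed inputs: the function names, index values, and thresholds are hypothetical and are not estimates from this study.

```python
import math


def norm_pdf(z):
    """Standard Normal probability density function."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def norm_cdf(z):
    """Standard Normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def ordered_probit_probs(xb, a1, a2):
    """Probabilities of net buyer, autarkic, net seller, as in (5.13)-(5.15).

    xb is the index x2*beta; a1 < a2 are the threshold parameters.
    """
    p_buyer = norm_cdf(a1 - xb)
    p_autarkic = norm_cdf(a2 - xb) - norm_cdf(a1 - xb)
    p_seller = 1.0 - norm_cdf(a2 - xb)
    return p_buyer, p_autarkic, p_seller


def lognormal_density(y, xd, sigma):
    """Log-Normal density of a positive quantity y, as in (5.19)-(5.20).

    xd is the index x*delta; sigma is the standard deviation of log(y).
    """
    return norm_pdf((math.log(y) - xd) / sigma) / (y * sigma)


# Hypothetical values chosen only to exercise the formulas.
probs = ordered_probit_probs(xb=0.0, a1=-1.0, a2=1.0)
print(probs, sum(probs))
print(lognormal_density(y=1.0, xd=0.0, sigma=1.0))
```

Because the three second-stage probabilities partition the real line at the two thresholds, they sum to one for any index value, which is a convenient check on an implementation.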
Finally, defining $\theta$ as the vector of all the above described parameters, and using exponential indicator functions, the joint distribution function for $w_1$, $w_2$, $y_3$, $y_4$ can be written:

$$f(w_1, w_2, y_3, y_4 \mid x, \theta) = [1 - \Phi(x_1\gamma)]^{1[w_1=0]} \times \left\{ \Phi(x_1\gamma) \left[ \Phi(\alpha_1 - x_2\beta)\,\frac{\phi[\{\log(y_3) - x_3\delta_3\}/\sigma_3]}{y_3\sigma_3} \right]^{1[w_2=0]} [\Phi(\alpha_2 - x_2\beta) - \Phi(\alpha_1 - x_2\beta)]^{1[w_2=1]} \left[ (1 - \Phi(\alpha_2 - x_2\beta))\,\frac{\phi[\{\log(y_4) - x_4\delta_4\}/\sigma_4]}{y_4\sigma_4} \right]^{1[w_2=2]} \right\}^{1[w_1=1]} \tag{5.21}$$

[...]

$$\Pr{}_i(\text{Net seller}) = \Pr(w_{1i} = 1, w_{2i} = 2 \mid x) = \Phi(x_{1i}\gamma)\,(1 - \Phi(\alpha_2 - x_{2i}\beta)) \tag{5.26}$$

12 "Unconditional" is a bit of a misnomer, since all predicted probabilities are conditional on the independent variables. Also, since this study uses a balanced panel, estimates are conditional on households being intransient. Here, unconditional is only meant to imply that the probabilities (and later expected values) are for any given observation, not conditional on any of the dependent variables (production or market participation) taking a specific value.

Once again, one of the major benefits of the model is the ability to treat non-producers and autarkic producers separately, as shown in equations (5.23) and (5.25). Results can also be used to predict the unconditional expected values of net sales and net purchases for any given observation:

$$E_i(\text{Net Purchases}) = E(y_{3i} \mid x) = \Phi(x_{1i}\gamma)\,\Phi(\alpha_1 - x_{2i}\beta)\,\exp(x_{3i}\delta_3 + \sigma_3^2/2) \tag{5.27}$$

$$E_i(\text{Net Sales}) = E(y_{4i} \mid x) = \Phi(x_{1i}\gamma)\,(1 - \Phi(\alpha_2 - x_{2i}\beta))\,\exp(x_{4i}\delta_4 + \sigma_4^2/2) \tag{5.28}$$

The ability to predict expected values in equations (5.27) and (5.28) without conditioning on the observation being a producer is only possible with a triple hurdle model, and allows us to extrapolate implications to a national level.

Finally, using the product rule, the partial effect of any continuous explanatory variable ($x_k$) on the unconditional expected values can be derived. Since the market participation of net sellers is of primary interest to this study, we focus on the partial effect on the unconditional value of net sales: a . 2 E‘; 'x) =7.¢(x..-7X