SEAFOOD MISLABELING, FISH EFFICIENCY, AND CHILD TIME USE: THREE ESSAYS
           IN AQUACULTURE AND AGRICULTURAL ECONOMICS
                                            By
                                       Eric Abaidoo
                                   A DISSERTATION
                                        Submitted to
                                Michigan State University
                        in partial fulfillment of the requirements
                                     for the degree of
         Agricultural, Food, and Resource Economics – Doctor of Philosophy
                                           2023


                                            ABSTRACT
        This dissertation consists of three essays, exploring (1) the potential disruptive effects of
seafood mislabeling, (2) how rural non-farm employment (RNFE) conditions the relationship
between agricultural diversification and aquaculture efficiency, and (3) the impact of parental
education on child time use.
        The first chapter, titled “Fish demand in the U.S. Great Lakes region in the face of seafood
mislabeling,” investigates whether consumer willingness to pay (WTP) for local seafood is affected by
information about seafood fraud. The globalization of seafood trade has heightened the vulnerability
of the seafood supply network to fraud. Consumer perceptions of these vulnerabilities are not
limited to imported seafood products, as spillover effects are likely to influence purchasing
behavior for domestically produced seafood as well. Applying a discrete choice methodology, I
show that consumers derive positive utility from consuming locally sourced relative to imported
seafood. Upon further disaggregation, however, I find that for one consumer segment, which I
term the price-sensitive group, localness does not command a significant positive premium. Most
importantly, I demonstrate that information regarding international seafood fraud largely did not
alter local seafood demand. That said, I find some evidence of a negative spillover effect of the
information treatment on US-labeled seafood in one consumer subgroup.
        The second chapter, titled “Does rural non-farm employment (RNFE) resolve (or
exacerbate) the agricultural diversification-farm efficiency tradeoff?” studies how RNFE
conditions the relationship between agricultural diversification and fish efficiency. Competition
for scarce productive resources typically implies a compromise between agricultural diversification
and efficiency. Yet the potential for non-farm income to resolve this tradeoff remains understudied.
Cash from non-farm sources may support productivity-enhancing input purchases, thereby
improving efficiency. On the other hand, by diversifying both on- and off-farm, households may
be spreading their labor resources too thin, thus lowering fish efficiency. Using micro-level data
on fish farming households in Southern Bangladesh, I show that at higher levels of the non-farm
income share, diversification into crops results in significant allocative inefficiencies. Results are
weaker for the technical efficiency measure.
        The third chapter, titled “Parental educational attainment and child labor outcomes:
Evidence from Malawi,” revisits a hot-button topic: child labor use in agricultural production.
Prior studies have largely presented anecdotal evidence, and the causal interpretation of the
relationship between parental education and child labor has rarely been explored. I draw on insights
from the demography literature, wherein findings
suggest that the direct influence of grandparents or lack thereof on grandchildren’s socioeconomic
outcomes hinges crucially on familial living arrangements. Hence, conditional on a range of
parental characteristics and multigenerational co-residence, I use grandparents’ educational
attainment as a set of instruments to exploit plausibly exogenous variation in parents’
schooling. Using a nationally representative Malawian household panel data set, I generally find
evidence of a negative impact of parental educational attainment on child labor outcomes. The effect
of maternal education on household farm work, however, is not significant. My 2SLS results are
also shown to be robust to varying degrees of violation of the exclusion restriction. With respect
to potential mechanisms, the results suggest that engagement in non-farm employment pursuits
among educated parents may mediate these effects.


                                    ACKNOWLEDGEMENTS
        Glory be to God Almighty for bringing me over the finish line. I sought His face at the
beginning of this journey and He stood with me every step of the way. For that, I am forever
grateful.
        I would also like to extend my sincere gratitude to my major professor, Dr. Ben Belton, for
his mentorship and support over the years. His commitment to seeing me succeed in this program
has been pivotal to the timely completion of this dissertation. I am also grateful to the other members
of my dissertation committee, Drs. Thomas Reardon, Songqing Jin, and Trey Malone, whose
insights and constructive feedback have immensely benefitted this dissertation.
        Thanks are also due to my fellow graduate student colleagues (past and present) in the AFRE
and Economics programs, for their advice and friendship during this journey. In the same breath, I
say a massive thank you to my church, Grace International Outreach Church (GIOC), for being my
community in Christ and helping me grow my faith as I worked towards this degree.
        Most importantly, I would like to acknowledge the support and sacrifices of my beloved
family, Mrs. Dorcas Tetteh (my partner), Joel Abaidoo (our son), Mrs. Rebecca Sackey (my mom),
Mrs. Becky Hubbell (my U.S. mom), Mr. Ebenezer Abaidoo (my dad), Mr. John Hubbell (my U.S.
dad), and siblings for their constant supply of inspiration, energy, and calm throughout this journey.
        Special thanks to the Bailey Scholars Program (BSP) for the community and for providing me
with the platform for self-discovery. And to all my well-wishers: you have all contributed your part in
ways that you could never imagine. Thank you!


                                             TABLE OF CONTENTS
CHAPTER 1: FISH DEMAND IN THE U.S. GREAT LAKES REGION IN THE FACE OF
SEAFOOD MISLABELING
     1.1 Introduction
     1.2 Background
     1.3 Mapping Fraud in Seafood Supply Chains
     1.4 Consumer Preferences under Fraud Uncertainty
     1.5 Methods
     1.6 Empirical Strategy
     1.7 Data and Descriptives
     1.8 Econometric Results
     1.9 Conclusions
     BIBLIOGRAPHY
     APPENDIX A: TABLES AND FIGURES
     APPENDIX B: DEFINITIONS AND EXCERPT
CHAPTER 2: DOES RURAL NON-FARM EMPLOYMENT RELIEVE (OR EXACERBATE)
THE AGRICULTURAL DIVERSIFICATION-FARM EFFICIENCY TRADEOFF: THE CASE
OF AQUACULTURE IN BANGLADESH
     2.1 Introduction
     2.2 Data and Descriptives
     2.3 Empirical Strategy
     2.4 Regression Results
     2.5 Conclusions
     BIBLIOGRAPHY
     APPENDIX A: TABLES AND FIGURES
     APPENDIX B: SUPPLEMENTARY TABLES
CHAPTER 3: PARENTAL EDUCATIONAL ATTAINMENT AND CHILD LABOR:
EVIDENCE FROM MALAWI
     3.1 Introduction
     3.2 Related Literature
     3.3 Data
     3.4 Empirical Strategy
     3.5 Addressing Endogeneity
     3.6 Results
     3.7 Imperfect Instruments Sensitivity Analysis
     3.8 Potential Mechanisms
     3.9 Conclusions
     BIBLIOGRAPHY
     APPENDIX A: TABLES AND FIGURES
     APPENDIX B: THEORETICAL MODEL


  CHAPTER 1: FISH DEMAND IN THE U.S. GREAT LAKES REGION IN THE FACE
                                 OF SEAFOOD MISLABELING
1.1 Introduction
        Growing demand and increasingly sophisticated globalized agri-food networks have
coincided with a precipitous increase in international seafood trade over the past few decades.
Indeed, seafood is currently one of the world’s most widely traded food commodities (Asche et
al., 2022; Gephart et al., 2019; Kroetz et al., 2020). Between 1986 and 2018, global seafood
export volume almost doubled, while seafood exports climbed from $37 billion to $164 billion in
value (FAO, 2020). This growth has fostered wider access to seafood originating far from the point
of purchase which, in turn, has fueled the recent demand for traceability and origin-labeling (FAO,
2020). Figure 1.1 presents evidence of global seafood export volume growth since the mid-1980s.
During this time, wild-caught fishery production has remained relatively stable (Abaidoo et al.,
2021). While global seafood markets are predicted to double in size by the year 2050, wild-caught
production is expected to contribute little to this additional growth (Waite et al., 2014; Belton,
Reardon, & Zilberman, 2020).
        This trend is particularly important for U.S. seafood markets where more than 70 percent
of domestic seafood consumption originates outside the United States (NOAA, 2021). Other
accounts suggest a more conservative estimate due to reexports. According to Gephart et al.
(2019), foreign imports account for 62 to 65% of domestic seafood consumption. Population
growth and changing consumer tastes and preferences will likely drive this percentage up further.
To meet this demand, U.S. retailers and restaurants rely on imported aquatic products from
countries such as Norway, China, and Canada (Abaidoo et al., 2021). As traditional seafood
production regions become increasingly strained, supply networks will grow longer and more
complex, creating additional vulnerabilities to fraud. Indeed, food fraud concerns have emerged in
recent food policy debates (Meerza & Gustafson, 2020; Spink & Moyer, 2011; Spink, Ortega,
Chen, & Wu, 2017). In particular, seafood fraud has dominated global news headlines in recent
years (Warner et al., 2013).
        Seafood is one of the food categories most susceptible to fraud (see Figure 1.2)
(Johnson, 2014; Kroetz et al., 2020; Meerza & Gustafson, 2020). Prior studies indicate that
seafood markets are the most susceptible to adulteration in the United States, followed by dairy
and meat (Bitzios et al., 2017; Schug, 2016). In one such investigation, 44% of visited retail outlets
sold mislabeled fish (O'Neill et al., 2015; Warner et al., 2013). This nationwide investigation further
highlighted the considerable variability in seafood mislabeling rates by retail outlet type, with sushi
venues showing the highest rate (74%), followed by restaurants (38%), then grocery stores
(18%) (Warner et al., 2013).
        To date, the effect of fraud on consumer preferences and the ensuing demand for affected
food products remains understudied (Theolier et al., 2021). In particular, despite the prevalence of
seafood fraud, few studies have explored its effects on consumer preferences. A notable exception
is McCallum et al. (2022), who use an artefactual field experiment with European consumers to
estimate their willingness to avoid the risk and/or uncertainty of purchasing inauthentic fish; the
authors find that consumers are indeed willing to pay a premium to avoid food fraud.
        Instead, prior research has largely focused on estimating consumer willingness-to-pay
(WTP) for select food safety attributes for a variety of imported food products (Hayes et al., 1995;
Ortega et al., 2014, 2015). To illustrate, in the food safety domain, Ortega et al. (2014) evaluated
U.S. consumer preferences for enhanced food safety claims, finding that U.S. consumers have a
higher WTP for the food safety attributes of U.S.-farmed seafood products relative to those
primarily sourced from Asia.
        This article studies how U.S. consumers weigh tradeoffs between locally sourced and
imported seafood given the potential for fraud. We contribute to the literature in three ways. First,
we map out a seafood supply chain and highlight possible areas of food fraud vulnerability along
the chain. Second, we model consumer demand for select traceability, production method, and
processing attributes of a diversity of seafood species in the Great Lakes region. Finally, we
examine whether and how different consumer market segments respond to information about the
prevalence of seafood fraud across domestic and imported seafood supply chains. This research
question is motivated in large part by concerns from local seafood producers regarding negative
spillover effects due to fraud perceptions. That is, producers might lose out on product premiums
if consumers perceive that some degree of fraud is inevitable.
        This study also adds to a growing literature on information effects on U.S. consumer
preferences and demand for seafood (Marette et al., 2008a, 2008b; Uchida et al., 2017; Weir,
Uchida, & Vadiveloo, 2021). While studying the market potential for genetically modified (GM)
salmon, Weir et al. (2021) conclude that ex ante negative biases do matter, in that providing both
negative and positive information about GM fresh salmon had similar effects on WTP as
presenting negative information only. By contrast, Uchida et al. (2017) find evidence of no
spillover effect of unfavorable information about the mercury content of swordfish on consumer
bids for wild and farmed salmon, paralleling one of the main findings of our study.
        The remainder of this article is structured as follows. In the next section, we provide a brief
background on food fraud as it pertains to domestic and international food systems. We then
present a conceptual framework of seafood supply chains, highlighting potential areas of
vulnerability to a range of fraudulent activities. We then describe our main hypotheses. After
presenting our empirical strategy, we report summary statistics on key variables in our data set as
well as the estimation results. We follow this up with a detailed discussion of our results and
conclude.
1.2 Background
        Food fraud is variously described as the adulteration, substitution, dilution, stealing, tampering,
diversion, and misrepresentation or mislabeling of food products for economic gain. Highly
publicized food fraud incidents have received substantial media attention in the past two decades
(Spink & Moyer, 2011; Spink et al., 2017). Scandals ranging from horsemeat in European beef
markets (Premanandh, 2013) to melamine in infant formula in China are some examples of globally
recognized food fraud events in recent history (Ingelfinger, 2008; Chan et al., 2008). In extreme
cases, failure to detect these adulterants promptly can result in devastating health outcomes for
consumers. For example, the high-profile melamine scandal of 2008 affected about 300,000
children, resulting in almost 50,000 hospitalizations and 6 deaths (Ingelfinger, 2008; Chan et al.,
2008; FAO, 2008; Meerza & Gustafson, 2020; Spink et al., 2017; Yang et al., 2022).
        Wherever conditions for opportunistic behavior exist, and mechanisms to remedy such
shortcomings are non-existent or ineffective, food fraud is sure to occur. Indeed, almost
every food product has had some history of fraud. Alum and chalk in bread flour, exhausted tea
leaves in tea bags, and inferior spirits in branded spirit bottles have all made notable appearances
in agri-food supply networks, jeopardizing human health and causing significant economic losses
to consumers (Shears, 2010; Meerza & Gustafson, 2020). As a consequence, food fraud remains
an enduring concern with far-reaching repercussions due to globalized food supply chains. This
phenomenon is so prevalent that in some cases consumers have begun to develop a strong taste for
adulterated food products and beverages (Shears, 2010).
        Food fraud events can occur at any point throughout the system, but some food types are
more vulnerable and thus easier to manipulate than others. For example, the adulteration of high-
quality extra virgin olive oil (EVOO) with inexpensive, low-quality seed oil is a widespread
practice given that such manipulations are impossible to detect without the aid of accurate science
and high-precision technologies. Also, quality differentiation along origin, olive type, and
chemical composition lines implies that consumers will have a tough time detecting mislabeled
or adulterated EVOO, as is often the case with credence attributes more generally (Meerza &
Gustafson, 2020). Food safety management systems may be effective at detecting harmful
additives but may miss or fail to alert consumers to substitutions or dilutions that do not present a
human health risk.
        In the fisheries and aquaculture sector, food fraud can be extremely challenging to detect
(Reilly, 2018; Warner et al., 2013). Asymmetric information between consumers and suppliers has
fostered fraudulent activities including species substitution, intentional mislabeling, and
undisclosed use of water-adhesive agents to increase fish weight for economic gain (Reilly, 2018).
Moreover, the practice of processing seafood offsite and then reexporting to the origin further
complicates traceability, fostering mislabeling (Asche et al., 2022). Despite routine monitoring
and testing by food safety surveillance agencies such as the U.S. Food and Drug Administration
(FDA), imported food products including seafood have become common targets of food fraud
(Reilly, 2018; Meerza & Gustafson, 2020). To date, the Seafood Import Monitoring Program
(SIMP), established to oversee compliance with general recordkeeping and reporting requirements
for imported seafood, covers only 13 seafood species groups (Warner et al., 2013).¹ This
risk-based traceability program was designed to mitigate instances of illegal, unreported, and
unregulated (IUU) fishing and seafood fraud. However, given the non-exhaustive nature of the
program’s coverage, seafood types not currently covered by SIMP are fraught with mislabeling
(Warner et al., 2013). As a result, consumer demand for food safety and authentication labeling is
fast gaining traction as certification entities represent only a partial solution to the asymmetric
information problem underlying food fraud activities (Giannakas, 2002; Ortega et al., 2014;
Zilberman et al., 2018).
        Instead of focusing on fraud more broadly, the prior literature has largely emphasized the
food safety of imported food products. For example, Ortega et al. (2015) explored media coverage
effects of food safety incidents on U.S. consumer preferences for imported aquaculture products
originating in Asia. Findings from this study suggest that U.S. consumer WTP for aquatic food
products was impacted by exposure to major food safety news headlines. Specifically, their results
indicate that consumer WTP for enhanced food safety claims declined substantially for shrimp
originating in China and Thailand following exposure to food safety media information. By
contrast, no notable changes in consumer valuation of the enhanced food safety attribute were
observed for domestic seafood products given the information shock.
        Admittedly, not all forms of food fraud pose food safety risks or health challenges to
consumers. However, such practices can dislodge consumer confidence in food labeling and the
safety of certain food industries (Giannakas, 2002; Meerza & Gustafson, 2020). For instance, some
pork products have been found to be fraudulent, dislodging the trust of some religiously affiliated
customers in the meat sector (Bonne & Verbeke, 2008; Premanandh, 2013). Consumers are then
forced to rely on authenticity cues such as price, country of origin, and security package labels via
certification to make informed food purchasing decisions (El Benni et al., 2019; Ortega et al.,
2014). Indeed, previous studies have noted a consumer preference for domestic finfish over
imported seafood due to concerns about potential mislabeling (Garlock et al., 2020; Marko et al.,
2004).
¹ Seafood species groups covered by SIMP include Abalone, Atlantic Cod, Blue Crab (Atlantic), Dolphinfish (Mahi
Mahi), Grouper, King Crab (Red), Pacific Cod, Red Snapper, Sea Cucumber, Sharks, Shrimp, Swordfish, and Tunas
(Albacore, Bigeye, Skipjack, Yellowfin, and Bluefin) (Warner et al., 2013).
        Fraud events linked to one product can have spillover effects on an entire market. While
studying the effect of fraud on the olive oil market, Meerza & Gustafson (2020) found evidence
suggestive of a negative spillover effect. Specifically, the authors note that exposure to information
about Italian olive oil fraud negatively impacted U.S. consumer demand for both U.S. and Greek
EVOO. Of the handful of studies eliciting consumer demand for food products under the risk of
fraud, Meerza & Gustafson (2020) constitute one of a few with some application to U.S.
consumers. Our paper builds on the idea of spillover effects as described by Meerza & Gustafson
(2020) to examine whether the premium for local seafood drops or perhaps increases with
knowledge about seafood fraud possibly initiated overseas. That said, we deviate from prior
research in the following important ways. First, this study focuses on fraud information-induced
consumer demand response in the context of seafood, making it the first to do so to our knowledge.
Second, we explore a richer definition of local by considering seafood produced in the Great Lakes
region as well as seafood sourced from other U.S. states. Finally, we explore potential heterogeneous effects in
consumer response to food fraud information published by the media.
1.3 Mapping Fraud in Seafood Supply Chains
        We base our conceptual framework on the potential vulnerability of seafood to fraud at
various stages of the supply network. Figure 1.3 maps the supply chain for seafood from source to
the final consumer. We present a simplified version of the U.S. seafood supply chain and reference
existing or emerging avenues for a range of common fraudulent activities along the chain. We
adapt the comprehensive seafood supply chains depicted in Fox et al. (2018), wherein key stages
of the supply networks for finfish, shellfish, and crustaceans are laid out in succession. The length
and complexity of seafood supply chains follow directly from the production method (that is, via
aquaculture or wild-caught production) and, at an extra level of granularity, the species. This is
particularly important for our study as global trade includes both wild-caught and aquaculture,
which account for 54 and 46 percent of the global production volume, respectively. For instance,
in aquaculture production, eggs and fishmeal procured through either domestic or international
sources represent critical inputs in the production process, whereas naturally occurring juvenile
fish in the wild already have the requisite conditions for survival.
         Seafood supply networks are susceptible to a variety of fraudulent activities (Kroetz et al.,
2020), with fraud manifesting in diverse ways at multiple levels of the value chain.² ³ For example,
species substitution can take many forms and occur at any stop in the seafood supply network. For
upstream supply chain actors such as fishers and farmers, post-harvest species substitution as fish
are held in storage units awaiting further processing can be tempting if there are substantial gains
to be made. For instance, the practice whereby high-value species are replaced with low-value
species that are still sold at a premium is a common form of seafood fraud (Fox et al., 2018; Reilly,
2018). It is also common for high-value species to be misclassified as low-value for tax evasion
purposes (Reilly, 2018). Further down the supply chain, the detection of species substitution may
be complicated by processing, as the morphological identification of species becomes
infeasible after they are transformed into fish sticks, fillets, and other pre-prepared fish meals
(Marko et al., 2004; Chen et al., 2014; Fox et al., 2018; Reilly, 2018).
² Examples of these fraudulent practices include species substitution, mislabeling, short weighting, adulteration, and
indiscriminate antibiotic use, among others.
³ In a move to comprehensively capture other lesser-known fraudulent opportunities, Fox et al. (2018) extend the
scope of seafood fraud to include modern-day slavery and animal welfare infractions. For the purposes of this study,
we examine seafood fraud outside of these latter ethical considerations.
        In the processing and distribution segment of the seafood supply network, a unique form
of seafood fraud referred to as “short-weighting” can also occur. This involves the overglazing or
overbreading of seafood products to artificially inflate their true weight for economic gain (Reilly,
2018). More recently, Asche et al. (2022) revealed a major discrepancy between Chinese exports,
on one hand, and imports plus domestic seafood production numbers, on the other, suspected to
result from certain forms of mislabeling and “short-weighting.” A typical example of this
fraudulent practice involves the addition of glaze water to frozen seafood products during
processing. While seafood species that are primarily marketed fresh, such as crab, salmon, trout, and
halibut, will likely be less subject to this form of fraud, predominantly frozen seafood, such as
tilapia and shrimp, will be common targets (Love et al., 2022). Relatedly, seafood adulteration
with carbon monoxide to enhance fish flesh appearance during frozen storage is just as prevalent
although such practices ought to be declared on fish product labels in compliance with most
national food safety protocols (Reilly, 2018). Seafood may also be adulterated with antibiotics
either directly or indirectly through fish feed to improve production efficiency and fish quality
(Fox et al., 2018).
        The majority of food fraud investigations are conducted downstream, with several studies
uncovering fraudulent behavior among seafood retailers (Warner et al., 2013; Fox et al., 2018;
Reilly, 2018). Although it is not entirely clear whether these actors are themselves victims of
seafood fraud initiated higher up the supply chain, surveillance reports have returned some
alarming results. For example, DNA tests of 1,200 seafood samples across 674 retail outlets within
the United States revealed that a third of the tested samples were substituted (Warner et al., 2013).
         Beyond species substitution, mislabeling and misleading production claims are equally
endemic. While seafood might be initially labeled correctly by name, the product may be
mislabeled later as wild-caught when it was in fact farmed. This form of mislabeling can occur at
any point along the seafood marketing chain, but most frequently occurs among distributors and
final seafood retailers such as restaurants and fishmongers (Jacquet & Pauly, 2008). A review of
recently published reports on seafood fraud by Pardo et al. (2016) indicated that 30 percent of
DNA-tested seafood product samples were mislabeled, with a majority occurring in the food
services sector. While mislabeling may not necessarily lead to outright food safety issues, other
considerations on sustainability grounds cannot be ignored as such practices could also exacerbate
current challenges with depleting fish stocks (Asche et al., 2022; Kroetz et al., 2020).
1.4 Consumer Preferences under Fraud Uncertainty
         In this section, we develop a random utility model of heterogeneous consumers with
preferences for locally sourced seafood in the face of fraud (McFadden, 1973). Suppose a
consumer 𝑛 derives utility 𝑈!"# from purchasing seafood alternative 𝑗 in choice situation 𝑡 such
that:
                                      𝑈!"# = 𝑋"#$ 𝛽!" + 𝜖!"#                                      (1)
where 𝑋"#$ denotes the product attributes of alternative 𝑗, 𝛽!" are coefficients to be estimated, and
𝜖!"# is the unobserved component of utility that is independent and identically Gumbel distributed.
We then allow the observable component to have the following structure:
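\[
X_{jt}' \beta_{nj} = \mathit{ASC}_j + \beta_{\mathit{Pr}} \mathit{Price}_{jt} + \sum_{k \in K \setminus \{\mathit{GL}, \mathit{US}\}} \left( \bar{\beta}_k + \sigma_k Z_{nk} \right) x_{jt} + \left( \bar{\beta}_{\mathit{GL}} + \sigma_{\mathit{GL}} Z_{n,\mathit{GL}} \right) \mathit{GL}_{jt} + \left( \bar{\beta}_{\mathit{US}} + \sigma_{\mathit{US}} Z_{n,\mathit{US}} \right) \mathit{US}_{jt} \tag{2}
\]
where $\mathit{ASC}_j$ is an alternative specific constant for alternative 𝑗, $\mathit{Price}_{jt}$ is the price, 𝐾 is a set of
experimentally-designed non-price attributes, $Z_{nk}$ is a standard normal random variable, $x_{jt}$ is a
$(|K| - 2) \times 1$ vector4 of observable product attributes of alternative 𝑗, $\mathit{GL}_{jt}$ and $\mathit{US}_{jt}$ denote the
seafood alternative’s place of origin (that is, the Great Lakes and other US states, respectively)5,
$\bar{\beta}_k$ is the mean of attribute 𝑘’s parameter estimate while $\sigma_k$ is the standard deviation of the
distribution around this mean. Under the scenario with no seafood mislabeling, let us define the
mean marginal willingness-to-pay between the local and imported attribute levels as follows:
\[ \mathit{mWTP}_{\mathit{GL},\mathit{IM}} = -\frac{\bar{\beta}_{\mathit{GL}}}{\beta_{\mathit{Pr}}} \tag{3} \]
\[ \mathit{mWTP}_{\mathit{US},\mathit{IM}} = -\frac{\bar{\beta}_{\mathit{US}}}{\beta_{\mathit{Pr}}} \tag{4} \]
where $\mathit{mWTP}_{\mathit{GL},\mathit{IM}} > 0$ indicates that the 𝐺𝐿 attribute level attracts a higher premium or a lower
discount relative to the 𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑 attribute level, on average.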
          Now, suppose there is a mislabeling information shock such that there exists a non-zero
probability 𝜋 ∈ (0,1) that seafood alternative 𝑗 is mislabeled. Then consumer 𝑛’s expected utility
becomes: 6
4
  Where |𝐾| denotes the cardinality of set 𝐾.
5
  Where the 𝐺𝐿 and 𝑈𝑆 attribute levels are expressed relative to the omitted category, 𝐼𝑀 representing imports.
6
  Notice that implicit in the representation of the expected utility function is the assumption that the consumer is risk-
neutral. The additive structure of the observable component of the utility function implies this assumption.
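\[
\mathit{EU}_{njt} = (1 - \pi) \mathit{ASC}_j + \pi \mathit{ASC}_l + \beta_{\mathit{Pr}} \mathit{Price}_{jt} + \cdots + \sum_{g \in \{\mathit{GL}, \mathit{US}\}} \left\{ \left[ (1 - \pi) \bar{\beta}_g + \pi \bar{\beta}_g' \right] + \left[ (1 - \pi) \sigma_g + \pi \sigma_g' \right] Z_{n,g} \right\} g_{jt} + (1 - \pi) \epsilon_{njt} + \pi \epsilon_{njt}' \tag{5}
\]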
where 𝑗 ≠ 𝑙 and $\mathit{ASC}_l$ denotes the alternative specific constant under mislabeling. Analogously,
$\tilde{\beta}_g \equiv (1 - \pi) \bar{\beta}_g + \pi \bar{\beta}_g'$ for all 𝑔 ∈ {𝐺𝐿, 𝑈𝑆} is defined as the estimated parameter denoting the
place of origin attribute level 𝑔 under mislabeling uncertainty.7 Taken together, we obtain the
following expression for the mean marginal willingness-to-pay between the 𝐺𝐿 and 𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑
attribute levels, for example, in the seafood fraud information setting:
\[ \mathit{mWTP}_{\mathit{GL},\mathit{IM}}^{\mathit{Treat}} = -\frac{\tilde{\beta}_{\mathit{GL}}}{\beta_{\mathit{Pr}}} \tag{6} \]
          We hypothesize that consumers’ willingness-to-pay for locally produced versus imported
seafood partly depends on beliefs of food fraud risk in the respective supply chains. Hence, we
expect consumers’ preferences for the local relative to the imported place of origin attribute levels
to vary under differing informational settings. We use the theoretical model above to examine two
opposing hypotheses regarding the effect of seafood fraud information on consumer preferences
for locally produced as opposed to imported food fish. We term these competing hypotheses (1)
the spillover effect, and (2) the signaling effect.
          First, the spillover effect posits that coupling information about the dominance of imports
in U.S. seafood markets with knowledge of fraudulent behavior will reduce the premium for
locally produced seafood products. Toledo & Villas-Boas (2019) and Meerza & Gustafson (2020)
present findings consistent with this assertion. In these studies, the authors argue that
7
  It is important to note the distinction between uncertainty and risk. Risk implies that the probability of seafood
mislabeling is known, whereas uncertainty implies that it is not.
contaminating or negative spillover effects could prevail even if food safety or fraud
disproportionately affects a specific product or source and not others. For example, while studying
consumer egg purchasing responses to recalls during the 2010 Salmonella outbreak, Toledo &
Villas-Boas (2019) observed that consumers also reduced egg purchases from unaffected stores
due to the outbreak. These findings suggest that unfavorable food fraud news can result in negative
spillover effects:
\[ H_{\text{spillover}}: \mathit{mWTP}_{g,\mathit{IM}}^{\mathit{Treat}} - \mathit{mWTP}_{g,\mathit{IM}} < 0 \tag{7} \]
where 𝑔 ∈ {𝐺𝐿, 𝑈𝑆} indicates local seafood varieties and 𝑇𝑟𝑒𝑎𝑡 denotes the scenario under which
consumers are subjected to unfavorable seafood fraud news.
         Second, the signaling effect hypothesizes that given general information on seafood fraud,
the indication of origin might signal to consumers that they can trust food product quality or safety
if consumers associate localness with stricter food safety standards or more effective surveillance
and monitoring. In other words, consumers worried about food fraud may perceive local products
as less likely to be subject to fraud. This perception could be born out of the strong association of
lengthy food supply chains with more opportunities for fraud (Theolier et al., 2021). The core
assumptions of this hypothesis are consistent with findings in studies such as Umberger et al.
(2003) and Loureiro & Umberger (2003, 2005), who note that most consumers who preferred
country of origin labels interpreted these labels as providing additional food safety guarantees. The
signaling effect implies that unfavorable food fraud news can benefit products with a shorter
supply chain:
\[ H_{\text{signaling}}: \mathit{mWTP}_{g,\mathit{IM}}^{\mathit{Treat}} - \mathit{mWTP}_{g,\mathit{IM}} > 0 \tag{8} \]
where 𝑔 ∈ {𝐺𝐿, 𝑈𝑆}.
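
As a purely illustrative aid, the following Python sketch evaluates the mean mWTP expressions with and without the mislabeling shock and maps the sign of their difference to the two hypotheses in equations (7) and (8). All numerical values (`beta_price`, `beta_gl`, `beta_gl_mis`, and `pi`) and all function names are hypothetical assumptions chosen for illustration; they are not estimates from this study.

```python
def mwtp(beta_attr: float, beta_price: float) -> float:
    """Mean marginal WTP: the negative ratio of attribute to price coefficient."""
    return -beta_attr / beta_price


def mixed_mean(beta_bar: float, beta_bar_mis: float, pi: float) -> float:
    """Mean origin coefficient under mislabeling uncertainty:
    (1 - pi) * beta_bar + pi * beta_bar_mis."""
    if not 0.0 < pi < 1.0:
        raise ValueError("pi must lie strictly between 0 and 1")
    return (1.0 - pi) * beta_bar + pi * beta_bar_mis


def classify(delta: float) -> str:
    """Map the sign of mWTP(treated) - mWTP(baseline) to a hypothesis."""
    if delta < 0:
        return "spillover"
    if delta > 0:
        return "signaling"
    return "no effect"


# Hypothetical values, for illustration only.
beta_price = -0.25   # price coefficient (negative by demand theory)
beta_gl = 1.50       # mean GL coefficient without mislabeling
beta_gl_mis = 0.50   # mean GL coefficient if the product is mislabeled
pi = 0.30            # perceived probability of mislabeling

baseline = mwtp(beta_gl, beta_price)
treated = mwtp(mixed_mean(beta_gl, beta_gl_mis, pi), beta_price)
delta = treated - baseline
print(f"baseline={baseline:.2f}, treated={treated:.2f}, "
      f"delta={delta:.2f}, hypothesis={classify(delta)}")
```

In this hypothetical case the mislabeled-state coefficient is smaller than the baseline coefficient, so the premium falls and the spillover condition holds; a larger mislabeled-state coefficient would instead satisfy the signaling condition.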
1.5 Methods
          We utilize a discrete choice experiment (DCE) to estimate consumer demand for seafood
and investigate the effect of seafood fraud information on WTP for local, domestic, and imported
seafood products. DCEs have been extensively used to ascertain consumer preferences for food
product attributes in similar settings (Tonsor et al., 2009; Olynk et al., 2010; Ortega et al., 2011,
2014). In our application, the product attributes represent a bundle of characteristics including the
species, place of origin, production method, and form of processing.
          Table 1.1 lists the species, attributes, and attribute levels in our DCE. The first attribute,
seafood species, includes two species popular with U.S. consumers (salmon and trout) and one
species popular with consumers around the Great Lakes (whitefish). These species were selected
following a pilot survey of extension scientists and educators in the region which elicited their
opinions about seafood product characteristics. The other attributes include price, place of origin,
production method, and the form of processing. These attributes were selected as our pilot survey
identified them as the attributes consumers mostly think about when making seafood purchasing
decisions. Price levels ranging from $7.99 to $13.99 were selected based on retail prices (per 8oz
fillets) in major grocery stores in the Great Lakes Region. In all, four price levels were considered,
as well as three places of origin (Great Lakes, United States8, and imported), three production
methods (wild-caught, farmed/aquaculture, and unlabeled), and two forms of processing (fresh and
frozen) labels.
          Figure 1.4 shows an example of the choice questions presented to respondents. Consumers
in each choice task were asked to select among three alternative profiles of fish along with a no-
buy option. A full factorial experimental design would require 373,248 ($2^{3 \times 1} \times 3^{3 \times 2} \times 4^{3 \times 1}$)
8
  United States represents any other state outside the Great Lakes Region.
choice tasks. Using an orthogonal fractional factorial design (labeled design), we reduce this
number to 36 choice tasks. The 36 choice tasks were then blocked into three segments of 12 choice
questions each to reduce the number of treatment combinations presented to any one participant
(Stopher & Hensher, 2000; Louviere, 2004; Hensher et al., 2005; Caussade et al., 2005). Thus,
each participant is faced with only 12 choice questions with three product alternatives and a no-
purchase option in the final design (D-error of 0.04). The order in which consumers answered the
choice questions was also randomized to account for possible order effects. We also presented a
cheap talk script at the beginning of the DCE section of the survey to partly mitigate potential
hypothetical bias in our WTP estimates (Lusk & Schroeder, 2004).
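
The size of the full factorial design can be checked with a few lines of Python. The sketch assumes that each of the three labeled alternatives (salmon, trout, and whitefish) carries all four attributes at the numbers of levels listed above; the variable names are ours and purely illustrative.

```python
from math import prod

# Attribute levels per alternative, as described in the text.
levels = {"price": 4, "origin": 3, "production_method": 3, "processing": 2}
n_alternatives = 3  # salmon, trout, whitefish (labeled design)

# Number of distinct profiles for a single alternative: 4 * 3 * 3 * 2.
profiles_per_alternative = prod(levels.values())

# A full factorial over three labeled alternatives combines the profiles.
full_factorial = profiles_per_alternative ** n_alternatives

# The same count written by number of levels: one 2-level, two 3-level,
# and one 4-level attribute, each repeated across three alternatives.
by_levels = 2 ** (3 * 1) * 3 ** (3 * 2) * 4 ** (3 * 1)

# Blocking the fractional design: 36 tasks split into 3 blocks.
n_tasks, n_blocks = 36, 3
tasks_per_block = n_tasks // n_blocks

print(full_factorial, by_levels, tasks_per_block)
```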
        A randomly selected half of the respondents were provided a news article excerpt
describing the results of a recent food fraud investigation, which also explained that a considerable
share of domestically consumed seafood originated from international sources. A copy of this
excerpt is located in the APPENDIX. No such information was presented to the control units.
1.6 Empirical Strategy
1.6.1 Latent Class Model (LCM)
        We estimate a latent class model (LCM) to capture heterogeneity in consumer preferences
by sorting the sample into a finite number of groups or classes. While they offer less flexibility
than mixed logit models (MXL), latent class models impose fewer distributional assumptions
about random parameters to capture unobserved heterogeneity (Hensher et al., 2005). The model
accommodates heterogeneity across latent consumer groups while estimating common parameters
for respondents within each group. At its core, the choice probability for each class is derived from
estimating a multinomial logit model. Thus, conditional on assignment to class 𝑠, the probability
that individual 𝑛 chooses alternative 𝑗 while faced with a choice among 𝐽 alternatives in choice
situation 𝑡 is expressed as:
\[ P_{nt|s}(j) = \frac{\exp\left( X_{njt}' \beta_s \right)}{\sum_{i=1}^{J} \exp\left( X_{nit}' \beta_s \right)} \tag{9} \]
where 𝑋 denotes a vector of select food product attributes, and $\beta_s$ is a vector of parameters to be
estimated common to all members in class 𝑠. Analogously, the prior probabilities for class
membership for individual 𝑛 can be specified as:
\[ C_{ns} = \frac{\exp\left( Z_n' \omega_s \right)}{\sum_{s=1}^{S} \exp\left( Z_n' \omega_s \right)} \tag{10} \]
where 𝑠 ∈ {1, 2, 3, ..., 𝑆}, with $Z_n$ denoting a set of observable covariates factored into modeling
the class membership probabilities. Under the independence of choice tasks assumption given class
assignment, the log-likelihood for the entire sample is defined as:
\[ \ln L = \sum_{n=1}^{N} \ln \left[ \sum_{s=1}^{S} C_{ns} \left( \prod_{t=1}^{T} P_{nt|s} \right) \right] \tag{11} \]
with the vector of parameters (including the latent class membership parameters) estimated via the
conventional maximum likelihood estimation methods (Hensher et al., 2015).
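
To illustrate how the pieces of the latent class likelihood fit together, the following Python sketch evaluates the sample log-likelihood for given parameter values. It is a minimal sketch, not the estimation routine used in this study: the data layout, the function names, and the toy inputs are all hypothetical, and in practice the membership parameters of one class are normalized to zero for identification.

```python
import math


def softmax(values):
    """Numerically stable logit probabilities from a list of utilities."""
    m = max(values)
    exps = [math.exp(v - m) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Inner product of two equal-length lists."""
    return sum(x * y for x, y in zip(a, b))


def lcm_log_likelihood(X, chosen, Z, betas, omegas):
    """Latent class log-likelihood.

    X[n][t][j]   : attribute vector of alternative j in task t for person n
    chosen[n][t] : index of the alternative person n picked in task t
    Z[n]         : covariate vector entering class membership
    betas[s]     : taste parameters of class s
    omegas[s]    : membership parameters of class s
    """
    log_lik = 0.0
    for n in range(len(X)):
        # Prior class membership probabilities (logit over classes).
        class_probs = softmax([dot(Z[n], w) for w in omegas])
        person_lik = 0.0
        for s, beta in enumerate(betas):
            # Product over tasks of the class-conditional logit probabilities.
            seq_prob = 1.0
            for t, task in enumerate(X[n]):
                probs = softmax([dot(x_j, beta) for x_j in task])
                seq_prob *= probs[chosen[n][t]]
            person_lik += class_probs[s] * seq_prob
        log_lik += math.log(person_lik)
    return log_lik


# Toy data (hypothetical): one respondent, two tasks, two alternatives,
# one attribute, and two classes with all parameters set to zero.
X = [[[[1.0], [0.0]], [[0.0], [1.0]]]]
chosen = [[0, 1]]
Z = [[1.0]]
ll = lcm_log_likelihood(X, chosen, Z, betas=[[0.0], [0.0]], omegas=[[0.0], [0.0]])
print(ll)  # equals log(0.25): every choice probability is 0.5 in both classes
```

A maximum likelihood routine would pass the negative of this function to a numerical optimizer over `betas` and `omegas`.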
        In what follows, we estimate the effect of the information treatment on consumer
preferences for the localness attribute levels by latent classes. In doing so, we estimate the
following model:
\[
U_{njt,s} = \mathit{ASC}_s^{j} + \beta_{s,\mathit{Pr}}^{*} \mathit{Pr}_{jt} + \beta_{s,\mathit{GL}}^{*} \mathit{GL}_{jt} + \beta_{s,\mathit{US}}^{*} \mathit{US}_{jt} + \beta_{s,\mathit{FR}}^{*} \mathit{FR}_{jt} + \beta_{s,\mathit{WC}}^{*} \mathit{WC}_{jt} + \beta_{s,\mathit{FAR}}^{*} \mathit{FAR}_{jt} + \alpha_{1,s} \mathit{GL}_{jt} \times \mathit{Treat} + \alpha_{2,s} \mathit{US}_{jt} \times \mathit{Treat} + \varepsilon_{njt,s} \tag{12}
\]
where 𝑇𝑟𝑒𝑎𝑡 is an indicator variable which takes the value 1 if the respondent was presented the
news article excerpt, and 0 otherwise; the estimated coefficients on the interaction terms, $\alpha_{1,s}$ and
$\alpha_{2,s}$, denote the difference in the preferences for the respective place of origin attribute levels over
treatment status for consumer 𝑛 in market segment 𝑠; $\mathit{ASC}_s^{j}$ denotes the alternative specific
constants representing salmon, whitefish, and trout, with the constant of the no purchase option set
to 0; $\mathit{Pr}_{jt}$ is a continuous price variable representing each of the four price levels considered in the
study; $\mathit{GL}_{jt}$ and $\mathit{US}_{jt}$ constitute indicator variables for the experimentally-designed place of origin
attribute (the Great Lakes region and the United States, respectively), whose coefficients are
interpreted relative to the 𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑 attribute level; $\mathit{WC}_{jt}$ and $\mathit{FAR}_{jt}$ are dummy variables which
take the value 1 if the seafood product carries the wild-caught and farmed production method
labels, respectively and 0 otherwise, with the estimated coefficients expressed in relation to the no
label attribute level; $\mathit{FR}_{jt}$ is a dummy variable which takes the value 1 if the seafood product is
fresh, and 0 if frozen; $\beta_{s,\cdot}^{*}$ represents the non-stochastic estimated parameter coefficients and $\varepsilon_{njt,s}$
is the unobserved independent and identically Gumbel distributed error term.
1.6.2 Mixed Logit Model (MXL)
          We also consider an alternative approach to capturing consumer preference heterogeneity
by estimating a mixed logit model (MXL).9 We estimate an MXL model separately for the treatment
and control group using 500 Halton draws. Parameter coefficients of the experimentally designed
attributes are assumed to follow a normal distribution and are estimated in WTP space. That is, we
reparameterize the MXL models such that WTP measures are directly estimable. This approach
has been generally recommended in the discrete choice experiment literature for producing more
9
  In the rest of the paper, we use the terms mixed logit (MXL) and random parameters logit (RPL) interchangeably.
reliable WTP estimates (Train & Weeks, 2005; Scarpa et al., 2008; Train, 2009). The price
coefficients and the alternative specific constants, however, were assumed to be non-stochastic.
We estimate these models using the simulated maximum likelihood estimation technique (Train,
2009).
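
The logic of simulating a choice probability in WTP space can be sketched in Python as follows. This is an illustrative sketch under stated assumptions rather than the estimation code used here: utility is written as a fixed price scale `lam` times the difference between the WTP-weighted attributes and the price, the WTP terms are independent normals drawn from Halton sequences, and the alternative specific constants are omitted for brevity. The function names and all numerical inputs are hypothetical.

```python
import math
from statistics import NormalDist


def halton(index: int, base: int) -> float:
    """Radical-inverse (Halton) value in (0, 1) for a 1-based index."""
    value, fraction = 0.0, 1.0 / base
    while index > 0:
        value += (index % base) * fraction
        index //= base
        fraction /= base
    return value


def simulated_choice_prob(alternatives, chosen, lam, wtp_mean, wtp_sd, n_draws=500):
    """Average logit probability of `chosen` over Halton draws of WTP.

    Utility in WTP space: V_j = lam * (sum_k w_k * x_jk - price_j),
    with w_k ~ Normal(wtp_mean[k], wtp_sd[k]) and lam > 0 held fixed.
    alternatives: list of (price, [x_1, ..., x_K]) tuples, K <= 6.
    """
    primes = [2, 3, 5, 7, 11, 13]  # one Halton base per random WTP term
    std_normal = NormalDist()
    total = 0.0
    for r in range(1, n_draws + 1):
        # Transform uniform Halton points into normal WTP draws.
        w = [m + s * std_normal.inv_cdf(halton(r, primes[k]))
             for k, (m, s) in enumerate(zip(wtp_mean, wtp_sd))]
        v = [lam * (sum(wk * xk for wk, xk in zip(w, x)) - price)
             for price, x in alternatives]
        vmax = max(v)
        expv = [math.exp(u - vmax) for u in v]
        total += expv[chosen] / sum(expv)
    return total / n_draws


# Hypothetical inputs: two origin dummies (GL, US); the third product is imported.
alternatives = [
    (9.99, [1.0, 0.0]),  # Great Lakes product
    (9.99, [0.0, 1.0]),  # other US product
    (9.99, [0.0, 0.0]),  # imported product
]
probs = [simulated_choice_prob(alternatives, j, lam=0.3,
                               wtp_mean=[1.5, 2.0], wtp_sd=[1.0, 1.0])
         for j in range(3)]
print([round(p, 3) for p in probs])
```

Simulated maximum likelihood would sum the logs of such simulated probabilities over respondents (using the product over each respondent's choice sequence inside the average) and maximize over `lam`, `wtp_mean`, and `wtp_sd`.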
1.7 Data and Descriptives
         Survey data were collected online in early September 2021 in collaboration with market
research and data collection company Qualtrics. The survey targeted consumers in the Great Lakes
region including associated ceded territories of Tribal Nations, who were over the age of 18, were
the primary shoppers for food in their respective households and had purchased seafood in the past
year. The survey consisted of socio-demographic questions, as well as queries on household
seafood purchasing and consumption behaviors. We restrict our analyses to the 1,272 consumers
who completed the entire survey, resulting in an equal assignment of respondents to the treatment
and control groups.
         Tables 1.2 and 1.3 present descriptive statistics on key socio-demographic variables for the
sampled participants. In particular, Table 1.2 provides summary statistics by treatment status and
a balance test, while Table 1.3 compares our sample to census and other nationally representative
survey data. For comparability, we consider census data from the 2021 American Community
Survey (ACS) and the 2017-2018 National Health and Nutrition Examination Survey
(NHANES).10 Table 1.3 shows broad agreement between our sample demographics and the Great
Lakes region-specific census data. However, important deviations include an oversampling of
individuals with more education and persons aged 35-44. Most of the sampled respondents are
middle-aged (35-44 years old) with at least some college education (78%).
10
   The ACS estimates are restricted to the adult population residing in the Great Lakes region, while the NHANES
applies to the US population subgroup who indicated seafood consumption in the past 30 days.
          Table 1.4 provides descriptive statistics on seafood consumption variables across treatment
groups. The average respondent typically consumed seafood at home, 2-3 times a month, and
preferred wild-caught seafood. To ascertain consumers’ ex ante level of seafood fraud concern,
we posed a choice question on a 0-100 numeric sliding scale, with 100 indicating the maximum
level of concern. On average, respondents were moderately concerned about seafood fraud
(55.2).11 Reassuringly, a balance test revealed no statistically significant
differences in observable respondent characteristics across the information treatment and control
groups.12 The only exception is the household size variable, which differs across groups at the
5% level.
1.8 Econometric Results
1.8.1 Market Segments
          Table 1.5 presents the LCM results. We focus our discussion on the LCM with four distinct
classes.13 The estimates indicate that classes generally differ from one another by the degree of
preference for localness. To this end, we label these classes such that consumers fall into one of
the following groups: (1) 𝐿𝑜𝑐𝑎𝑣𝑜𝑟𝑒𝑠, (2) 𝐶𝑂𝑂 (where COO denotes country of origin), (3)
Information-sensitive (hereafter, referred to as 𝐼𝑆), and (4) Price-sensitive (hereafter, 𝑃𝑆) groups.
For instance, locavores and respondents in the 𝐶𝑂𝑂 group find the locality attributes relevant to
their seafood choices but differ in their relative preference for the 𝐺𝐿 and the US attribute levels.
11
   One might be concerned about potential anchoring bias in consumers’ reported levels of seafood fraud concern due
to non-randomization of the slider position between subjects, ex ante. Given that the slider was positioned at 50 by
default, we conduct a simple t-test of the null that the average level of concern is not different from 50. We reject this
null in favor of the alternative at the 1% level (𝑝 − 𝑣𝑎𝑙𝑢𝑒 < 0.0001), indicating that the consumers’ average level of
concern is significantly different from 50.
12
   Balance test results are presented in Tables 1.2 and 1.4.
13
   While the Akaike and Bayesian Information Criteria indicate that extending the number of “latent” classes does
improve the model fit, doing so yielded unwieldy results and overcomplicated model interpretation. In particular, the
estimated standard errors of some coefficients were very large, in part because of the small number of
observations assigned to some classes (Heckman & Singer, 1984).
As the group names imply, locavores indicate a stronger preference for seafood produced in the
Great Lakes region relative to seafood produced in other parts of the United States. The opposite
is true for the 𝐶𝑂𝑂 group. Additional characteristics representative of each group are presented in
Table 1.6. As the table indicates, slightly less than half (48%) of respondents in the locavore
group identified as female, while females are overrepresented across the remaining classes. The
locavore and 𝐼𝑆 groups mostly consist of younger consumers, whereas the 𝐶𝑂𝑂 and 𝑃𝑆 groups are
over-representative of participants aged 65 years and older. We also observe that at least 58% of
the sampled respondents across all classes consume seafood at home. Table 1.6 also shows that
the price-sensitive group is the least concerned about seafood fraud, while the remaining classes
report average levels of concern close to the full sample mean of 55.2. Market shares for each
latent class of consumers are 52% (locavores), 25% (COO), 11% (IS), and 12% (PS).
         Results for the locavore latent class reveal that the coefficients on the GL and US attribute
levels are positive and statistically significant. That is, consumers in this group strongly favor the
two local place of origin labels. This finding is in line with conclusions drawn in previous work
(Davidson et al., 2012; Fonner & Sylvia, 2015; Brayden et al., 2018). The results also show that
relative to the imported label, the presence of the GL attribute label induced a larger utility
increment than the US label. The reverse is true for the 𝐶𝑂𝑂 group. Both IS and PS groups,
however, do not appear to prefer either the GL or US attribute labels. We also find evidence of
disutility from price increases across all four classes, consistent with consumer demand theory.
         While locavores and consumers in the 𝐶𝑂𝑂 group indicated a strong preference for the
fresh label, no statistically significant result was reported for the other groups. Perhaps for these
consumer segments, seafood bearing a fresh label indicates a lower likelihood of being imported
(Campbell et al., 2014; Fonner & Sylvia, 2015). In other results, consumers in the 𝐼𝑆 group
exhibited a positive and statistically significant preference for any of the production method claims
(either wild-caught or farmed) relative to no label. However, this was not the case for the 𝑃𝑆 group.
The alternative specific constants also suggest stark differences across the respective classes.
While locavores demonstrate a strong preference for the purchase alternatives (that is, salmon,
trout, and whitefish) to the no-buy option, such a preference is absent across the remaining classes.
In fact, in some cases, consumers exhibited a strong preference for the no-buy option (for example,
the 𝐶𝑂𝑂 group). Estimates for the 𝐼𝑆 group deviate slightly from this pattern, with consumers
appearing to prefer the salmon alternative to the no purchase option. To some extent, while we do
not collect self-reported attribute nonattendance (ANA) data, ANA behavior can be inferred from
our LCM estimates. For instance, the 𝑃𝑆 group appears to ignore all the non-price attributes, while
the 𝐼𝑆 consumer segment does not attend to any of the place of origin attribute labels.
         We report marginal willingness-to-pay (mWTP) estimates for the LCM, which are expressed
as the ratios of the corresponding coefficients of the attributes of interest and the price coefficient.
Linearity of the different attributes and the price variable in the indirect utility specification yields
the following expression for the marginal WTP for attribute 𝑘 for a given latent class 𝑠:
\[ \mathit{WTP}_{k,s} = -\frac{\beta_{s,k}}{\beta_{s,\mathit{Pr}}} \tag{13} \]
where the corresponding asymptotic standard errors of these ratios are estimated using the Delta
method with 10,000 draws. Table 1.7 reports these marginal WTP estimates for each class with
the corresponding 95% confidence intervals.
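
For readers less familiar with the mechanics, the following Python sketch computes a WTP ratio together with a first-order (analytic) delta-method standard error and a normal-approximation confidence interval. It is a sketch under stated assumptions: the function name, the coefficient values, and the variance and covariance inputs are hypothetical, and the analytic formula shown here involves no simulation draws, so it may differ in detail from the procedure used to produce Table 1.7.

```python
import math


def wtp_delta(beta_k, beta_p, var_k, var_p, cov_kp, z=1.96):
    """Point estimate, delta-method SE, and CI for WTP = -beta_k / beta_p."""
    wtp = -beta_k / beta_p
    # Gradient of -beta_k / beta_p with respect to (beta_k, beta_p).
    g_k = -1.0 / beta_p
    g_p = beta_k / beta_p ** 2
    # First-order variance: g' * Sigma * g for the 2 x 2 covariance block.
    variance = g_k ** 2 * var_k + g_p ** 2 * var_p + 2.0 * g_k * g_p * cov_kp
    se = math.sqrt(variance)
    return wtp, se, (wtp - z * se, wtp + z * se)


# Hypothetical class-level estimates, for illustration only.
wtp, se, ci = wtp_delta(beta_k=1.2, beta_p=-0.2,
                        var_k=0.04, var_p=0.0004, cov_kp=0.0)
print(f"WTP={wtp:.2f}, SE={se:.3f}, 95% CI=({ci[0]:.2f}, {ci[1]:.2f})")
```

A nonzero covariance between the attribute and price coefficients enters through the cross term, which is why the full estimated covariance matrix, rather than the standard errors alone, is needed.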
         The mWTP estimates across all attributes for consumers in the 𝑃𝑆 group are not
statistically significant, indicating that consumers in this class do not place significant importance
on any of the production method, place of origin, or processing form attributes represented in the
survey. By contrast, the WILD and FARM attribute levels generate a positive WTP for respondents
in the 𝐼𝑆 group though we do not find any significant premiums across the remaining attributes for
this group. Turning to the 𝐶𝑂𝑂 group, we find that the US attribute label carries a higher premium
of $4.63 per 8oz fillets of seafood relative to the GL label, which carries a mWTP of $3.62 per 8oz
fillets. By contrast, locavores have a higher mWTP for the GL place of origin label followed by
the US label (mWTP estimate of $6.38 vs $5.71 per 8oz fillets of seafood). Within the 𝐶𝑂𝑂 group,
consumers exhibited the highest marginal willingness-to-pay for the wild-caught label with a
mWTP estimate of $9.76 per 8oz fillets of seafood. These consumers also indicated positive
mWTP values of $3.28 and $1.24 per 8oz fillets for the farmed and fresh attribute labels,
respectively. Similarly, the presence of labels denoting that seafood was wild-caught, fresh, and
farmed are associated with positive and statistically significant mWTP estimates of $4.68, $4.92,
and $3.13 per 8oz fillets of seafood, respectively for the locavore group.
1.8.2 Information Treatment Effects for LCM
         To test for a differential effect of the treatment on consumers’ preferences for the locality
attribute labels, we estimate equation (12). The estimated coefficients on the interaction terms,
𝐺𝐿 × 𝑇𝑟𝑒𝑎𝑡 and 𝑈𝑆 × 𝑇𝑟𝑒𝑎𝑡 capture the changes in utility for localness given the information
shock. Results are presented in Table 1.8. Our results generally indicate that consumer preferences
for the 𝐺𝐿 and 𝑈𝑆 attributes were not significantly altered by information about fraud for any of
the consumer groups except the 𝐼𝑆 group. For this sub-group, the information shock appears to
have eroded any positive premium for the 𝑈𝑆 attribute label, which is consistent with the spillover
effect hypothesis.
         Further, to get a sense of whether the signaling or spillover effect holds for the entire
sample, we derive full sample differences in mWTP estimates for the local relative to the imported
attribute labels due to the information shock. Results are reported in Table 1.9. Using the estimated
class probabilities as weights, we obtain weighted average changes in mWTP for the 𝑙𝑜𝑐𝑎𝑙
attribute labels given unfavorable food fraud news. Our results suggest a decline in the premium
for the 𝐺𝐿 relative to the 𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑 label (due to the mislabeling information) amounting to 9 cents per 8oz fillets;
however, this effect is not statistically significant. More strikingly, we find that the information
shock resulted in a $1.96 per 8oz fillets reduction in the average mWTP for the 𝑈𝑆 relative to the
𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑 attribute label across the entire sample. This effect is statistically significantly different
from zero at the 10 percent level. Taken together, the evidence provided in Table 1.9 offers some
support for the spillover effect for US-label seafood products, though not GL-labeled products.
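
The weighting step can be illustrated with a short Python sketch. The class shares are those reported in Section 1.8.1; the class-level interaction and price coefficients are hypothetical placeholders, and we assume for illustration that the class-level change in mWTP is the negative ratio of the interaction coefficient to the price coefficient. The function name is ours.

```python
def weighted_mwtp_change(shares, estimates):
    """Share-weighted average of class-level changes in mWTP.

    estimates maps class name -> (alpha, beta_price), where alpha is the
    coefficient on the origin x Treat interaction and beta_price < 0.
    The class-level change in mWTP is taken to be -alpha / beta_price.
    """
    total_share = sum(shares.values())
    weighted = sum(shares[c] * (-alpha / beta_price)
                   for c, (alpha, beta_price) in estimates.items())
    return weighted / total_share


# Class shares as reported in the text.
class_shares = {"locavores": 0.52, "COO": 0.25, "IS": 0.11, "PS": 0.12}

# Hypothetical (alpha, beta_price) pairs, for illustration only.
estimates = {
    "locavores": (-0.02, -0.20),
    "COO": (0.00, -0.10),
    "IS": (-0.30, -0.15),
    "PS": (0.05, -0.50),
}

change = weighted_mwtp_change(class_shares, estimates)
print(f"weighted change in mWTP: {change:.2f}")
```

With these placeholder values a large negative change in a small class dominates the weighted average, which mirrors how a single segment can drive a full-sample effect.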
1.8.3 MXL Model Results
        Next, we obtain and plot the distribution of individual-specific conditional WTP estimates
for the 𝑈𝑆 and 𝐺𝐿 attribute levels across treatment status from the MXL model. These respondent-
specific estimates are essentially means of the conditional distribution of the WTP parameter
estimates, where we condition on the choices we observe the respondents make (Hensher et al.,
2015). The distributions of these WTP estimates are shown in Figures 1.5 and 1.6 for the 𝐺𝐿 and
𝑈𝑆 attribute levels, respectively. As can be seen, the treatment appears to have eroded premia
across the two levels but more so for the 𝑈𝑆 attribute level. Interestingly, we also observe a
bunching of the WTP estimates around the median with respect to the 𝑈𝑆 attribute level for the
treatment group.
        We then present the MXL model estimates on all attributes featured in the choice
experiment across treatment arms. Results are reported in Table 1.10. First, as consumer demand
theory predicts, the price coefficients are negative and statistically different from zero at all conventional
levels of significance across the treatment and control groups. We also observe that consumers are
willing to pay a premium of $1.58 and $1.96 per 8oz fillets for the 𝐺𝐿 and 𝑈𝑆 attribute levels,
respectively relative to the 𝑖𝑚𝑝𝑜𝑟𝑡𝑒𝑑 label for the control group. These WTP estimates are both
significant at the 5 percent level or better. Nonetheless, we observe a decline in these WTP
values with treatment (that is, the values move closer to zero). Specifically, the mean marginal
willingness-to-pay for the 𝐺𝐿 and 𝑈𝑆 attribute levels dropped to $1.26 and $1.50, respectively
following the information shock, providing some evidence in favor of a negative spillover effect.
         In other results, the treatment appears to have induced a positive and significant premium
for farmed fish of $0.68 per 8oz fillets. The premium for the farmed attribute level was marginally
distinguishable from zero in the control group. That said, we do observe significant preference
heterogeneity for this attribute level within this sub-group. In particular, the mean and standard
deviation of the WTP estimates suggest that roughly 55% of consumers in the control group have
a positive marginal WTP for farmed fish relative to the no label option. Likewise, we observe
significant preference heterogeneity for the other non-price experimentally designed attributes for
both treatment and control groups. However, we do notice that the standard deviation on the 𝑈𝑆
attribute level tends toward zero and is no longer statistically significant after treatment exposure,
suggesting that the treatment homogenizes consumers in terms of their WTP for the 𝑈𝑆 label.
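
The share of consumers with a positive WTP follows directly from the assumed normal distribution of the random WTP parameter. The Python sketch below shows the calculation; the mean and standard deviation are hypothetical values chosen to give a share near the one discussed above and are not the estimates in Table 1.10.

```python
from statistics import NormalDist


def share_with_positive_wtp(mean: float, sd: float) -> float:
    """Share of a Normal(mean, sd) WTP distribution lying above zero."""
    if sd <= 0:
        raise ValueError("sd must be positive")
    return 1.0 - NormalDist(mu=mean, sigma=sd).cdf(0.0)


# Hypothetical mean and standard deviation of WTP for the farmed label.
share = share_with_positive_wtp(mean=0.25, sd=2.0)
print(f"{share:.1%}")  # roughly 55% for these illustrative values
```

A mean close to zero combined with a large standard deviation therefore implies a population split almost evenly between positive and negative valuations.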
         We also test whether preferences across the treatment and control groups are the same
using a likelihood ratio test of equality of WTP and parameter coefficients across the two groups.
In doing so, we follow the approach set forth by Layton & Brown (2000) by pooling across the
two models (that is, across the treatment and control groups) and conducting the likelihood ratio
test. We document from this test that we can reject the null hypothesis that preferences can be
restricted to be the same across the treatment and control groups, with a likelihood ratio statistic
of 391.94 against a $\chi^2_{16, 0.05}$ critical value of 26.30 (at the 5% level of significance).
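
The test decision can be reproduced with a short Python sketch. It uses the reported statistic of 391.94 and 16 degrees of freedom (the value at which the 5% chi-square critical value is 26.30). The closed-form tail probability below is valid only for even degrees of freedom, and the log-likelihood values passed to `lr_statistic` are hypothetical; the function names are ours.

```python
import math


def chi2_sf_even_df(x: float, df: int) -> float:
    """Upper-tail chi-square probability, closed form for even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("df must be a positive even integer")
    half = x / 2.0
    term, total = 1.0, 1.0
    for i in range(1, df // 2):
        term *= half / i      # builds (x/2)^i / i!
        total += term
    return math.exp(-half) * total


def lr_statistic(ll_pooled: float, ll_treat: float, ll_control: float) -> float:
    """LR statistic: twice the gap between unrestricted and pooled fits."""
    return 2.0 * ((ll_treat + ll_control) - ll_pooled)


# Tail probability at the critical value quoted in the text (close to 0.05).
p_at_critical = chi2_sf_even_df(26.30, 16)

# p-value of the reported statistic.
p_reported = chi2_sf_even_df(391.94, 16)

# Hypothetical log-likelihoods, only to show how the statistic is formed.
example_lr = lr_statistic(ll_pooled=-1100.0, ll_treat=-500.0, ll_control=-500.0)

print(round(p_at_critical, 4), p_reported < 0.05, example_lr)
```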
1.8.4 Market Shares Estimation
        Next, we investigate the impact of the treatment on predicted market share estimates for
each of the seafood alternatives with all prices fixed at $10.99 per 8oz fillets. We follow Lusk &
Tonsor (2016) and Van Loo et al. (2020) by estimating an RPL model with the systematic
component expressed as follows:
          $$\hat{V}_{ij} = \Big(\hat{\beta}_{j} + \sum_{k=1}^{3} \hat{\sigma}_{jk}\, z_{ik}\Big) + \hat{\alpha}_{j}\, Pr_{j} \qquad (14)$$
where $z_{ik}$ has a standard normal distribution; $\hat{\beta}_{j}$ is the alternative specific constant for seafood
alternative $j$; $\hat{\sigma}_{jk}$ denotes the elements of the lower triangular Cholesky factor of the variance-covariance
matrix of the random parameters, with the off-diagonals set equal to zero (that is, $\hat{\sigma}_{jk} = 0$ for $j \neq k$)
(Lusk & Tonsor, 2016). In other words, the seafood alternative specific constants are assumed to be
independently distributed. We then substitute equation (12) into a multinomial logit formula to
derive the estimated market shares for the seafood alternatives. We approximate the mean market
shares using simulations with a set of 5,000 draws for $z_{ik}$. Results are reported in Table 1.11.
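The simulation step described above can be sketched as follows. This is a minimal illustration rather than the code used to produce Table 1.11. The parameter values are hypothetical placeholders, the function name `simulate_market_shares` is our own, and the utility of the no-purchase option is assumed to be normalized to zero. Prices are fixed at $10.99, as in the text.

```python
import math
import random


def simulate_market_shares(asc_mean, asc_sd, price_coef, prices,
                           n_draws=5000, seed=0):
    """Average multinomial logit shares over standard normal draws.

    For each draw, V_j = asc_mean[j] + asc_sd[j] * z_j + price_coef[j] * prices[j],
    with independent z_j (zero off-diagonals) and V_none = 0 (assumed).
    """
    rng = random.Random(seed)
    names = list(asc_mean)
    totals = {name: 0.0 for name in names}
    totals["none"] = 0.0
    for _ in range(n_draws):
        exp_v = {"none": 1.0}  # exp(0) for the no-purchase option
        for name in names:
            z = rng.gauss(0.0, 1.0)
            v = asc_mean[name] + asc_sd[name] * z + price_coef[name] * prices[name]
            exp_v[name] = math.exp(v)
        denom = sum(exp_v.values())
        for name, value in exp_v.items():
            totals[name] += value / denom
    return {name: total / n_draws for name, total in totals.items()}


# Hypothetical parameter values, for illustration only (not estimates
# from this chapter).
asc_mean = {"salmon": 3.0, "trout": 2.8, "whitefish": 2.8}
asc_sd = {"salmon": 1.0, "trout": 1.0, "whitefish": 1.0}
price_coef = {"salmon": -0.10, "trout": -0.10, "whitefish": -0.10}
prices = {"salmon": 10.99, "trout": 10.99, "whitefish": 10.99}

shares = simulate_market_shares(asc_mean, asc_sd, price_coef, prices)
```

Running the function once with control-group estimates and once with treatment-group estimates would give the two sets of shares being compared. The shares sum to one by construction.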
        As the table shows, the predicted unconditional market shares for salmon, trout, and
whitefish are 34%, 28%, and 29%, respectively, in the absence of the treatment. Following the
information shock, whitefish becomes the species with the lowest market share (17%).
Interestingly, the share of consumers who prefer the no-purchase option falls from 9% to 6% after
the treatment. By contrast, the estimated market shares for salmon and trout increase to 39% and
38%, respectively. That is, the information shock appears to have reduced the choice
share for whitefish, while it had a favorable effect on the shares for salmon and trout.
Results from the conditional market share estimates reiterate these findings.


1.9 Conclusions
        Increasingly globalized food supply chains create added opportunities for fraud, which is
likely to influence consumer behavior. Indeed, no food product is immune to potential food fraud
risk. As one of the most targeted food fraud categories, seafood that is domestically sourced could
also suffer the consequences of fraudulent activities initiated elsewhere. In this paper, we examine
consumers’ risk mitigating responses as reflected in their valuation of localness when making
seafood purchasing decisions in the face of fraud. Using a between-sample approach, we
randomize respondents into differing informational settings to investigate whether consumer WTP
for local seafood is impacted by a seafood fraud information shock.
        Our results indicate that consumers broadly derive positive utility from consuming locally
sourced seafood (that is, seafood produced in the Great Lakes Region or other states within the
United States). Upon further disaggregation, however, we find that for more price-sensitive market
segments, localness does not command a significant positive premium. Further, we demonstrate
that providing information regarding fraud is unlikely to significantly alter preferences toward the
local options across most consumer market segments. That said, we find that the information shock
resulted in a $1.96 per 8oz fillets decline in the willingness-to-pay for 𝑈𝑆-labeled seafood.
        In other results, we also show that the intervention disproportionately affected market
shares for certain seafood species relative to others. Specifically, the predicted market share for
whitefish recorded the largest drop with exposure to unfavorable seafood fraud information, with
salmon and trout experiencing an uptick in market share following the treatment. An investigation
into the mechanisms driving such differences in consumer response across the various seafood
species is beyond the scope of this article. However, for producers and marketers of whitefish, a
deeper dive into possible explanations for such consumer risk mitigating behavior can be of value.
        As a note of caution, the fact that we do not find overwhelmingly compelling evidence in
support of either the spillover or signaling effect for most consumer segments does not suggest
that seafood fraud is not of concern. First, we must point out that we consider a specific form of
seafood fraud (that is, mislabeling) in this study. To the extent that other fraudulent activities such
as indiscriminate antibiotic use, short weighting, and adulteration, among others, stir a stronger
consumer demand response, our results may not generalize. Second, our results do not
necessarily suggest that consumers will not attach a significant positive premium to product
integrity assurances in the form of “food fraud-free” certification labels. For different actors along
the seafood supply chain, innovations in seafood DNA testing and authentication are fast
emerging. Nonetheless, whether such labeling features will be economically worthwhile remains
to be seen and calls for further research.


                                         BIBLIOGRAPHY
Abaidoo, E., Melstrom, M., & Malone, T. (2021). The Growth of Imports in US Seafood
        Markets. Choices, 36(4), 1-10.
Asche, F., Yang, B., Gephart, J. A., Smith, M. D., Anderson, J. L., Camp, E. V., . . . Straume, H.-
        M. (2022). China's seafood imports-Not for domestic consumption? Science, 375(6579),
        386-388.
Belton, B., Reardon, T., & Zilberman, D. (2020). Sustainable commoditization of seafood.
        Nature Sustainability, 3(9), 677-684.
Bitzios, M., Jack, L., Krzyzaniak, S.-A., & Xu, M. (2017). Country-of-origin labelling, food
        traceability drivers and food fraud: Lessons from consumers' preferences and perceptions.
        European Journal of Risk Regulation, 8(3), 541-558.
Bonne, K., & Verbeke, W. (2008). Religious values informing halal meat production and the
        control and delivery of halal credence quality. Agriculture and Human Values, 25(1), 35-
        47.
Brayden, W. C., Noblet, C. L., Evans, K. S., & Rickard, L. (2018). Consumer preferences for
        seafood attributes of wild-harvested and farm-raised products. Aquaculture Economics &
        Management, 22(3), 362-382.
Campbell, L. M., Boucquey, N., Stoll, J., Coppola, H., & Smith, M. D. (2014). From vegetable
        box to seafood cooler: applying the community-supported agriculture model to fisheries.
        Society & Natural Resources, 27(1), 88-106.
Caussade, S., de Dios Ortúzar, J., Rizzi, L. I., & Hensher, D. A. (2005). Assessing the influence
        of design dimensions on stated choice experiment estimates. Transportation Research
        Part B: Methodological, 39(7), 621-640.
Chan, E., Griffiths, S., & Chan, C. (2008). Public-health risks of melamine in milk products. The
        Lancet, 372(9648), 1444-1445.
Chen, S., Zhang, Y., Li, H., Wang, J., Chen, W., Zhou, Y., & Zhou, S. (2014). Differentiation of
        fish species in Taiwan Strait by PCR-RFLP and lab-on-a-chip system. Food Control, 44,
        26-34.
Davidson, K., Pan, M., Hu, W., & Poerwanto, D. (2012). Consumers' willingness to pay for
        aquaculture fish products vs. wild-caught seafood--A case study in Hawaii. Aquaculture
        Economics & Management, 16(2), 136-154.
El Benni, N., Stolz, H., Home, R., Kendall, H., Kuznesof, S., Clark, B., . . . Chan, M.-Y. (2019).
        Product attributes and consumer attitudes affecting the preferences for infant milk
        formula in China-A latent class approach. Food Quality and Preference, 71, 25-33.
FAO. (2008). Food safety and quality - Melamine. Food and Agriculture Organization.
FAO. (2020). The State of the World Fisheries and Aquaculture: Sustainability in Action. Food
        and Agriculture Organization.
Fonner, R., & Sylvia, G. (2015). Willingness to pay for multiple seafood labels in a niche
        market. Marine Resource Economics, 30(1), 51-70.
Fox, M., Mitchell, M., Dean, M., Elliott, C., & Campbell, K. (2018). The seafood supply chain
        from a fraudulent perspective. Food Security, 10(4), 939-963.
Garlock, T., Nguyen, L., Anderson, J., & Musumba, M. (2020). Market potential for Gulf of
        Mexico farm-raised finfish. Aquaculture Economics & Management, 24(2), 128-142.
Gephart, J. A., Froehlich, H. E., & Branch, T. A. (2019). To create sustainable seafood
        industries, the United States needs a better accounting of imports and exports.
        Proceedings of the National Academy of Sciences, 116(19), 9142-9146.
Giannakas, K. (2002). Information asymmetries and consumption decisions in organic food
        product markets. Canadian Journal of Agricultural Economics, 50(1), 35-50.
Hayes, D. J., Shogren, J. F., Shin, S. Y., & Kliebenstein, J. B. (1995). Valuing food safety in
        experimental auction markets. American Journal of Agricultural Economics, 77(1), 40-
        53.
Heckman, J., & Singer, B. (1984). A method for minimizing the impact of distributional
        assumptions in econometric models for duration data. Econometrica: Journal of the
        Econometric Society, 271-320.
Hensher, D. A., Rose, J. M., & Greene, W. H. (2005). Applied choice analysis: a primer.
        Cambridge University Press.
Hensher, D. A., Rose, J. M., & Greene, W. H. (2015). Applied Choice Analysis (Vol. 2).
        Cambridge University Press.
Ingelfinger, J. R. (2008). Melamine and the global implications of food contamination. New
        England Journal of Medicine, 359(26), 2745-2748.
Jacquet, J. L., & Pauly, D. (2008). Trade secrets: renaming and mislabeling of seafood. Marine
        Policy, 32(3), 309-318.
Johnson, R. (2014). Food fraud and economically motivated adulteration of food and food
        ingredients. Washington DC: Congressional Research Service.
Kroetz, K., Luque, G. M., Gephart, J. A., Jardine, S. L., Lee, P., Chicojay Moore, K., . . . Donlan,
        C. (2020). Consequences of seafood mislabeling for marine populations and fisheries
        management. Proceedings of the National Academy of Sciences, 117(48), 30318-30323.
Layton, D. F., & Brown, G. (2000). Heterogeneous preferences regarding global climate change.
        Review of Economics and Statistics, 82(4), 616-624.
Loureiro, M. L., & Umberger, W. J. (2003). Estimating consumer willingness to pay for country-
        of-origin labeling. Journal of Agricultural and Resource Economics, 287-301.
Loureiro, M. L., & Umberger, W. J. (2005). Assessing consumer preferences for country-of-
       origin labeling. Journal of Agricultural and Applied Economics, 37(1), 49-63.
Louviere, J. J. (2004). Random utility theory-based stated preference elicitation methods:
       applications in health economics with special reference to combining sources of
       preference data. Centre for the Study of Choice (CenSoC) working paper(04-001), 22.
Love, D. C., Asche, F., Young, R., Nussbaumer, E. M., Anderson, J. L., Botta, R., . . . Gephart, J.
       A. (2022). An overview of retail sales of seafood in the USA, 2017-2019. Reviews in
       Fisheries Science & Aquaculture, 30(2), 259-270.
Lusk, J. L., & Schroeder, T. C. (2004). Are choice experiments incentive compatible? A test with
       quality differentiated beef steaks. American Journal of Agricultural Economics, 86(2),
       467-482.
Lusk, J. L., & Tonsor, G. T. (2016). How meat demand elasticities vary with price, income, and
       product category. Applied Economic Perspectives and Policy, 38(4), 673-711.
Marette, S., Roosen, J., & Blanchemanche, S. (2008). Health information and substitution
       between fish: Lessons from laboratory and field experiments. Food Policy, 33(3), 197-
       208.
Marette, S., Roosen, J., Blanchemanche, S., & Verger, P. (2008). The choice of fish species: an
       experiment measuring the impact of risk and benefit information. Journal of Agricultural
       and Resource Economics, 1-18.
Marko, P. B., Lee, S. C., Rice, A. M., Gramling, J. M., Fitzhenry, T. M., McAlister, J. S., . . .
       Moran, A. L. (2004). Mislabeling of a depleted reef fish. Nature, 430(6997), 309-310.
McCallum, C., Cerroni, S., Derbyshire, D., Hutchinson, W. G., & Nayga Jr, R. (2022).
       Consumers' responses to food fraud risks: an economic experiment. European Review of
       Agricultural Economics, 49(4), 942-969.
McFadden, D. (1973). Conditional logit analysis of qualitative choice behavior. Oakland:
       Institute of Urban and Regional Development, University of California Oakland.
Meerza, S. I., & Gustafson, C. R. (2020). Consumers' Response to Food Fraud: Evidence from
       Experimental Auctions. Journal of Agricultural and Resource Economics, 45(2), 219-
       231.
NOAA. (2021). U.S. Aquaculture. Retrieved from NOAA:
       https://www.fisheries.noaa.gov/national/aquaculture/us-aquaculture
Olynk, N. J., Tonsor, G. T., & Wolf, C. A. (2010). Verifying credence attributes in livestock
       production. Journal of Agricultural and Applied Economics, 42(3), 439-452.
O'Neill, L., Holbrook, T., & Russell, C. (2015). Fishy Business: Economically Motivated
       Adulteration of Fish in Minnesota Retail Markets. Minneapolis: Food and Agriculture
       Organization.
Ortega, D. L., Wang, H. H., & Olynk Widmar, N. J. (2014). Aquaculture imports from Asia: an
        analysis of US consumer demand for select food quality attributes. Agricultural
        Economics, 45(5), 625-634.
Ortega, D. L., Wang, H. H., & Olynk Widmar, N. J. (2015). Effects of media headlines on
        consumer preferences for food safety, quality and environmental attributes. Australian
        Journal of Agricultural and Resource Economics, 59(3), 433-445.
Ortega, D. L., Wang, H. H., Wu, L., & Olynk, N. J. (2011). Modeling heterogeneity in consumer
        preferences for select food safety attributes in China. Food Policy, 36(2), 318-324.
Pardo, M. Á., Jiménez, E., & Pérez-Villarreal, B. (2016). Misdescription incidents in seafood
        sector. Food Control, 62, 277-283.
Premanandh, J. (2013). Horse meat scandal-A wake-up call for regulatory authorities. Food
        Control, 34(2), 568-569.
Reilly, A. (2018). Overview of food fraud in the fisheries sector. FAO Fisheries and Aquaculture
        Circular(C1165), 1-21.
Scarpa, R., Thiene, M., & Train, K. (2008). Utility in willingness to pay space: a tool to address
        confounding random scale effects in destination choice to the Alps. American Journal of
        Agricultural Economics, 90(4), 994-1010.
Schug, D. (2016). Preventing food fraud. Food Engineering, 88(1), 109.
Shears, P. (2010). Food fraud-a current issue but an old problem. British Food Journal.
Spink, J., & Moyer, D. C. (2011). Defining the public health threat of food fraud. Journal of
        Food Science, 76(9), 157-163.
Spink, J., Ortega, D. L., Chen, C., & Wu, F. (2017). Food fraud prevention shifts the food risk
        focus to vulnerability. Trends in Food Science & Technology, 62, 215-220.
Stopher, P. R., & Hensher, D. A. (2000). Are more profiles better than fewer?: searching for
        parsimony and relevance in stated choice experiments. Transportation Research Record,
        1719(1), 165-174.
Theolier, J., Barrere, V., Charlebois, S., & Godefroy, S. B. (2021). Risk analysis approach
        applied to consumers' behaviour toward fraud in food products. Trends in Food Science
        & Technology, 107, 480-490.
Toledo, C., & Villas-Boas, S. B. (2019). Safe or not? Consumer responses to recalls with
        traceability. Applied Economic Perspectives and Policy, 41(3), 519-541.
Tonsor, G. T., Olynk, N., & Wolf, C. (2009). Consumer preferences for animal welfare
        attributes: The case of gestation crates. Journal of Agricultural and Applied Economics,
        41(3), 713-730.
Train, K. E. (2009). Discrete choice methods with simulation. Cambridge University Press.
Train, K., & Weeks, M. (2005). Discrete choice models in preference space and willingness-to-
       pay space. In Applications of simulation methods in environmental and resource
       economics (pp. 1-16). Springer.
Uchida, H., Roheim, C. A., & Johnston, R. J. (2017). Balancing the health risks and benefits of
       seafood: how does available guidance affect consumer choice? American Journal of
       Agricultural Economics, 99(4), 1056-1077.
Umberger, W. J., Feuz, D. M., Calkins, C. R., & Sitz, B. M. (2003). Country-of-origin labeling
       of beef products: US consumers' perceptions. Journal of Food Distribution Research, 34,
       103-116.
Van Loo, E. J., Caputo, V., & Lusk, J. L. (2020). Consumer preferences for farm-raised meat,
       lab-grown meat, and plant-based meat alternatives: Does information or brand matter?
       Food Policy, 95, 101931.
Waite, R., Beveridge, M., Brummett, R., Castine, S., Chaiyawannakarn, N., Kaushik, S., . . .
       Phillips, M. (2014). Improving productivity and environmental performance of
       aquaculture. WorldFish.
Warner, K., Timme, W., Lowell, B., & Hirschfield, M. (2013). Oceana study reveals seafood
       fraud nationwide. Washington, DC: Oceana.
Weir, M. J., Uchida, H., & Vadiveloo, M. (2021). Quantifying the effect of market information on
       demand for genetically modified salmon. Aquaculture Economics & Management, 25(1),
       1-26.
Yang, Z., Zhou, Q., Wu, W., Zhang, D., Mo, L., Liu, J., & Yang, X. (2022). Food fraud
       vulnerability assessment in the edible vegetable oil supply chain: A perspective of
       Chinese enterprises. Food Control, 109005.
Zilberman, D., Kaplan, S., & Gordon, B. (2018). The political economy of labeling. Food Policy,
       78, 6-13.


                          APPENDIX A: TABLES AND FIGURES
Table 1.1
          Attributes and Attribute Levels included in the Discrete Choice Experiment
 Product attribute              Levels
 Price                          $7.99/8oz fillets
                                $9.99/8oz fillets
                                $11.99/8oz fillets
                                $13.99/8oz fillets
 Origin                         Great Lakes Region
                                United States (but outside the Great Lakes)
                                Imported
 Processing form                Fresh
                                Frozen
 Production Method              Wild-caught
                                Farmed
                                Unlabeled


Table 1.2
                                    Sample Demographics
 Variable                                          All Treatment Control Diff: p-value
 Female (%)                                        53     53       53    0.96
 Age (%)                                                                 0.07
    18 – 24 years old                               7      6        8
    25 – 34 years old                              21     22       20
    35 – 44 years old                              27     30       24
    45 – 54 years old                              12     12       12
    55 – 64 years old                              13     13       13
    65+ years old                                  20     17       23
 Marital Status (%)                                                      0.65
    Married                                        57     58       56
    Divorced                                        9      8       10
    Separated                                       2      2        1
    Single, Never Married                          28     28       28
    Widowed                                         5      5        5
 Educational level (%)                                                   0.33
    Less than High School                           2      2        1
    High School/GED                                21     23       19
    Some College                                   21     20       21
    2-Year College Deg. (Assoc.)                    9      9       10
    4-Year College Deg. (BA, BS)                   25     25       24
    Master’s Degree                                19     18       20
    Professional Deg. (Ph.D., J.D., M.D., etc.)     4      3        5
 Number of HH members (%)                                                0.04
    1                                              19     17       22
    2                                              32     29       34
    3                                              19     24       15
    4                                              20     20       20
    5+                                             10     10       10
 Annual pre-tax HH income in $ (%)                                       0.42
    Less than 20,000                               12     12       13
    20,000 – 39,999                                20     18       22
    40,000 – 59,999                                17     19       15
    60,000 – 79,999                                15     15       15
    80,000 – 99,999                                 8      8        8
    100,000 – 119,999                               7      8        6
    120,000 – 139,999                               6      7        5
    140,000 – 159,999                               6      7        6
    160,000 or greater                              9      8       11
 Region of residence (%)                                                 0.32
    Midwest                                        67     67       67
    Northeast                                      31     31       31
    South                                           1      2        1
    West                                            1      1        1
 Observations                                   1,272    636      636


Table 1.3
                          Overall sample demographics and representability
 Variable                                                    Sample ACS* (Great Lakes)           NHANES#
 Female (%)                                                     53                51                 52
 Age (%)
      18 – 24 years old                                          7                12                 12
      25 – 34 years old                                         21                17                 16
      35 – 44 years old                                         27                16                 15
      45 – 54 years old                                         12                16                 17
      55 – 64 years old                                         13                17                 19
      65+ years old                                             20                22                 20
 Marital Status (%)
      Married                                                   57                48                 56
      Divorced                                                   9                11                 10
      Separated                                                  2                 1                  3
      Single, Never Married                                     28                34                 25
      Widowed                                                    5                 6                  6
 Educational level (%)
      Less than High School                                      2                 9                  9
      High School/GED                                           21                29                 26
      Some College/2-year college deg.                          30                29                 31
      4-Year College Deg. (BA, BS) and beyond                   48                40                 34
 Number of HH members (%)
      1                                                         19                28                 13
      2                                                         32                34                 35
      3                                                         19                16                 19
      4                                                         20                13                 15
      5+                                                        10                10                 18
 Annual pre-tax HH income in $ (%)
      100,000 or less                                           72                75                 81
      > 100,000                                                 28                25                 19
Notes: *The ACS estimates are derived from the 2021 American Community Survey 1-Year estimates for survey
respondents in the Great Lakes region (Illinois, Indiana, Michigan, Minnesota, Ohio, New York, Pennsylvania, and
Wisconsin). #The survey weight-adjusted National Health and Nutrition Examination Survey (NHANES) estimates
are obtained from the 2017 to 2018 demographics file on US seafood consumers.


Table 1.4
                                        Seafood purchasing behavior
 Variable                                                       All   Treatment       Control  Diff: p-value
 Seafood purchase frequency (%)                                                                   0.14
     Every day                                                 2.44        3.14          1.73
     Two to three times a week                                 9.21        8.33         10.08
     Weekly                                                   22.27       21.86         22.68
     Two to three times a month                               32.10       30.82         33.39
     About once a month                                       20.06       20.75         19.37
     Less than once a month                                   12.98       14.15         11.81
     Never                                                     0.94        0.94          0.94
 Seafood consumption location (%)                                                                 0.95
     At home                                                  67.53       67.61         67.45
     Away from home (e.g., restaurants, bars, etc.)           32.47       32.39         32.55
 Preferred seafood source (%)                                                                     0.11
     Wild caught                                              36.34       33.60         39.08
     Farmed                                                    8.87        9.67          8.07
     Indifferent                                              41.81       43.90         39.72
     Not sure                                                 12.98       12.84         13.13
 Level of seafood fraud concern (%)                            55.2        55.6          54.8     0.59
                                                              (27.9)      (27.1)       (28.7)
 Observations                                                 1,272         636          636
Notes: Standard deviations are reported in parentheses. P-values from the null hypotheses testing of no difference
between treatment and control subgroups are also reported. Seafood level of concern is measured on a 0 – 100 scale
where 100 depicts maximum level of concern.


Table 1.5
                                Latent Class Model Parameter Estimates
                                                     4 Latent Classes, Fixed Parameters
                                  Locavores           COO        Information-sensitive Price-sensitive
 Variable                           Class 1         Class 2              Class 3           Class 4
 𝐴𝑆𝐶
     Salmon                       2.749***           -0.078             4.201***             0.475
                                    (0.128)         (0.176)              (0.558)           (0.830)
     Trout                        2.763***        -0.940**                0.071              0.013
                                    (0.128)         (0.183)              (0.608)           (0.866)
     Whitefish                    2.743***        -0.384**                0.030              0.895
                                    (0.127)         (0.183)              (0.615)           (0.856)
 𝑃𝑅𝐼𝐶𝐸                            -0.052***      -0.136***              -0.112**         -0.426***
                                    (0.007)         (0.015)              (0.046)           (0.079)
 𝐺𝐿                               0.334***        0.490***                0.268              0.007
                                    (0.035)         (0.069)              (0.218)           (0.292)
 𝑈𝑆                               0.299***        0.626***                0.324             -0.022
                                    (0.036)         (0.069)              (0.209)           (0.265)
 𝐹𝑅𝐸𝑆𝐻                            0.258***        0.444***                -0.048             0.165
                                    (0.028)         (0.058)              (0.196)           (0.228)
 𝑊𝐼𝐿𝐷                             0.246***        1.322***              1.017***             0.037
                                    (0.037)         (0.079)              (0.275)           (0.277)
 𝐹𝐴𝑅𝑀                             0.164***         0.169**              0.869***            -0.154
                                    (0.036)         (0.080)              (0.268)           (0.319)
 Class prob.                         0.524            0.251               0.108              0.117
 Log Likelihood                                                     -16252
 AIC                                                                 32581
 AIC (Sample adjusted)                                               2.135
 Observations                                                        1,272
Notes: Standard errors are reported in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.
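
The coefficients in Table 1.5 can be related to the marginal WTP values in Table 1.7 through the usual preference-space ratio: the negative of an attribute coefficient divided by the price coefficient. The sketch below assumes that ratio and uses the rounded coefficients reported above for two classes and two attributes. The function name `marginal_wtp` is our own, and the WTP estimates and confidence intervals in Table 1.7 may have been obtained by a different procedure, so small differences from the reported values are to be expected.

```python
def marginal_wtp(attribute_coef, price_coef):
    """Marginal WTP in $ per 8oz fillets: -(attribute coef.) / (price coef.)."""
    if price_coef == 0:
        raise ValueError("price coefficient must be non-zero")
    return -attribute_coef / price_coef


# Rounded coefficients copied from Table 1.5 (two classes, origin attributes).
table_1_5 = {
    "Locavores": {"PRICE": -0.052, "GL": 0.334, "US": 0.299},
    "COO": {"PRICE": -0.136, "GL": 0.490, "US": 0.626},
}

wtp = {
    cls: {
        attr: marginal_wtp(coef, coefs["PRICE"])
        for attr, coef in coefs.items()
        if attr != "PRICE"
    }
    for cls, coefs in table_1_5.items()
}
```

For these four entries the ratios fall within a few cents of the corresponding values in Table 1.7.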


Table 1.6
                           Descriptive Statistics by Latent Classes
                                                            Latent Classes
 Variable                              Locavores COO            Information-  Price-
                                                                  sensitive  sensitive
 Female (%)                               48.0        59.6          58.7       53.7
 Age (%)
    18 – 24 years old                      7.4         6.3          10.5        4.1
    25 – 34 years old                     25.5        16.3          25.6        8.2
    35 – 44 years old                     34.5        18.4          21.8       16.3
    45 – 54 years old                     11.3        11.9          13.5       12.2
    55 – 64 years old                     10.7        16.9          12.8       15.0
    65+ years old                         10.6        30.3          15.8       44.2
 Marital Status (%)
    Divorced                               8.0        10.0           7.5       10.9
    Married                               57.9        58.1          50.4       53.7
    Separated                              1.6         1.9           1.5        1.4
    Single, Never Married                 28.9        24.7          38.4       22.5
    Widowed                                3.6         5.3           2.3       11.6
 Educational level (%)
    Less than High School                  1.9         0.9           3.8        1.4
    High School/GED                       19.1        22.8          19.6       27.9
    Some College                          21.1        19.1          21.8       21.1
    2-Year College Deg. (Assoc.)           8.2         8.1          12.0       15.7
    4-Year College Deg. (BA, BS)          24.4        27.8          24.8       20.4
    Master’s Degree                       21.3        16.9          15.8       11.6
    Professional Deg. (Ph.D., J.D.,        4.0         4.4           2.3        2.0
 M.D., etc.)
 Annual pre-tax HH income in $ (%)
    Less than 20,000                      13.1        10.3          15.0       11.6
    20,000 – 39,999                       18.8        19.7          21.1       25.9
    40,000 – 59,999                       15.3        17.8          23.3       17.7
    60,000 – 79,999                       13.1        15.3          16.5       17.0
    80,000 – 99,999                        6.1        10.6           4.5       10.2
    100,000 – 119,999                      8.6         6.9           2.3       11.6
    120,000 – 139,999                      7.0         6.6           3.0        4.8
    140,000 – 159,999                      8.8         4.1           2.3        0.7
   160,000 or greater              9.2    8.8         12.0    3.4
Seafood consumption location (%)
   At home                        65.8   75.3         58.7   66.7
   Away from home                 34.2   24.7         41.3   33.3
                                                  Mean
                                               (Std. Dev.)
Level of seafood fraud concern    57.2   56.7         51.4   46.2
                                 (27.0) (27.0)       (31.1) (29.4)
Observations                      672    320          133    147


Table 1.7
                      Marginal WTP estimates with 95% confidence intervals
                                       mWTP estimate ($/ 8oz fillets of seafood)
                                                        [95% C.I.]
 Variable        Locavores                COO             Information-sensitive             Price-sensitive
     𝐺𝐿              6.38                  3.62                      2.37                        0.01
                [4.34, 8.41]          [2.35, 4.89]              [-1.71, 6.45]                [-1.32, 1.35]
     𝑈𝑆              5.71                  4.63                      2.86                        -0.05
                [3.79, 7.62]          [3.25, 6.00]              [-1.27, 6.99]                [-1.27, 1.16]
   𝐹𝑅𝐸𝑆𝐻             4.92                  3.28                      -0.42                       0.39
                [3.27, 6.57]          [2.09, 4.47]              [-3.85, 3.00]                [-0.65, 1.43]
   𝑊𝐼𝐿𝐷              4.68                  9.76                      9.04                        0.10
                [2.91, 6.45]         [7.26, 12.27]              [0.53, 17.56]                [-1.18, 1.37]
   𝐹𝐴𝑅𝑀              3.13                  1.24                      2.22                        -0.35
                [1.57, 4.70]          [0.05, 2.43]               [0.45, 4.00]                [-1.82, 1.13]
Notes: Asymptotic standard errors for the 95% confidence intervals calculated using the Delta Method with 10,000
draws.


Table 1.8
                       Latent Class Model Parameter Estimates with interactions
                                                      4 Latent Classes, Fixed Parameters
                                  Locavores           COO         Information-sensitive Price-sensitive
 Variable                            Class 1         Class 2               Class 3          Class 4
 𝐴𝑆𝐶
     Salmon                        2.746***          -0.103               4.097***            0.537
                                     (0.127)         (0.176)               (0.546)          (0.807)
     Trout                         2.754***        -0.949***                -0.226            0.050
                                     (0.127)         (0.183)               (0.609)          (0.829)
     Whitefish                     2.743***         -0.387**                -0.208            0.903
                                     (0.126)         (0.181)               (0.589)          (0.799)
 𝑃𝑅𝐼𝐶𝐸                            -0.053***        -0.136***              -0.115**        -0.425***
                                     (0.007)         (0.015)               (0.046)          (0.077)
 𝐺𝐿                                0.328***         0.528***                 0.251            0.132
                                     (0.050)         (0.087)               (0.308)          (0.318)
 𝐺𝐿 × 𝑇𝑟𝑒𝑎𝑡                            0.013         -0.069                  0.047           -0.489
                                     (0.069)         (0.112)               (0.397)          (0.499)
 𝑈𝑆                                0.326***         0.688***              0.959***            0.173
                                     (0.050)         (0.088)               (0.321)          (0.287)
 𝑈𝑆 × 𝑇𝑟𝑒𝑎𝑡                           -0.056         -0.113              -1.076***           -0.649
                                     (0.069)         (0.112)               (0.399)          (0.503)
 𝐹𝑅𝐸𝑆𝐻                             0.258***         0.454***                -0.095            0.105
                                     (0.028)         (0.058)               (0.191)          (0.227)
 𝑊𝐼𝐿𝐷                              0.245***         1.319***              1.158***            0.043
                                     (0.037)         (0.078)               (0.281)          (0.274)
 𝐹𝐴𝑅𝑀                              0.166***          0.149*               0.951***           -0.049
                                     (0.036)         (0.082)               (0.268)          (0.351)
 Class prob.                           0.526          0.249                  0.108            0.117
 Log Likelihood                                                      -16246
 AIC                                                                  32585
 AIC (Sample adjusted)                                                2.135
 Observations                                                         1,272
Notes: Standard errors are reported in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.


Table 1.9
       Marginal WTP estimates with interaction terms and 95% confidence intervals
                                       mWTP estimate ($/ 8oz fillets of seafood)
                                                       [95% C.I.]
 Variable      Locavores         COO               Information-sensitive   Price-sensitive
 𝐺𝐿            6.25              3.89              2.17                    0.31
               [3.86, 8.63]      [2.38, 5.41]      [-3.22, 7.56]           [-1.15, 1.77]
 𝐺𝐿 × 𝑇𝑟𝑒𝑎𝑡    0.24              -0.51             0.41                    -1.15
               [-2.33, 2.81]     [-2.13, 1.11]     [-6.33, 7.15]           [-3.49, 1.19]
 𝑈𝑆            6.20              5.07              8.31                    0.41
               [3.83, 8.58]      [3.43, 6.71]      [0.14, 16.47]           [-0.91, 1.73]
 𝑈𝑆 × 𝑇𝑟𝑒𝑎𝑡    -1.07             -0.84             -9.32                   -1.53
               [-3.67, 1.54]     [-2.43, 0.75]     [-19.24, 0.60]          [-3.92, 0.86]
 𝐹𝑅𝐸𝑆𝐻         4.91              3.35              -0.82                   0.25
               [3.27, 6.55]      [2.16, 4.54]      [-4.12, 2.47]           [-0.79, 1.28]
 𝑊𝐼𝐿𝐷          4.67              9.73              10.04                   0.10
               [2.91, 6.42]      [7.28, 12.17]     [1.22, 18.85]           [-1.16, 1.36]
 𝐹𝐴𝑅𝑀          3.15              1.10              8.24                    -0.12
               [1.60, 4.71]      [-0.09, 2.30]     [0.87, 15.61]           [-1.74, 1.50]
Notes: Asymptotic standard errors for the 95% confidence intervals calculated using the Delta Method with 10,000
draws.
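As a rough illustration of how marginal WTP estimates of this kind relate to the coefficients in Table 1.8, the sketch below computes mWTP as the negative ratio of an attribute coefficient to the price coefficient, with a delta-method standard error. It is a sketch under stated assumptions, not the estimation code behind the table: the function name `mwtp_delta` is hypothetical, the covariance between the two coefficients is set to zero because it is not reported, and the inputs are the rounded estimates from Table 1.8, so the output only approximates the reported intervals.

```python
import math


def mwtp_delta(b_attr, se_attr, b_price, se_price, cov=0.0):
    """Marginal WTP = -b_attr / b_price with a delta-method 95% interval.

    cov is the covariance between the two coefficient estimates; it is
    ASSUMED to be zero here because the tables do not report it.
    """
    wtp = -b_attr / b_price
    # Gradient of g(b_attr, b_price) = -b_attr / b_price
    d_attr = -1.0 / b_price
    d_price = b_attr / b_price ** 2
    var = (d_attr ** 2) * se_attr ** 2 + (d_price ** 2) * se_price ** 2
    var += 2.0 * d_attr * d_price * cov
    se = math.sqrt(var)
    return wtp, (wtp - 1.96 * se, wtp + 1.96 * se)


# Locavores class, GL attribute: rounded coefficients and standard errors
# taken from Table 1.8 (GL = 0.328 (0.050), PRICE = -0.053 (0.007)).
wtp, ci = mwtp_delta(b_attr=0.328, se_attr=0.050, b_price=-0.053, se_price=0.007)
print(f"mWTP = {wtp:.2f}, 95% CI = [{ci[0]:.2f}, {ci[1]:.2f}]")
```

With these rounded inputs the point estimate is about 6.19, close to the 6.25 reported for Locavores in Table 1.9; the small gap is plausibly due to rounding and the omitted covariance term.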


Table 1.10
                                Random Parameters Logit Model Estimates
 Variable                                                        Treatment                 Control
 𝑃𝑅𝐼𝐶𝐸                                                            -0.37***                -0.37***
                                                                    (0.04)                  (0.03)
 𝐴𝑆𝐶
     Salmon                                                        2.80***                 3.06***
                                                                    (0.11)                  (0.12)
     Trout                                                         2.21***                 2.48***
                                                                    (0.10)                  (0.12)
     Whitefish                                                     2.29***                  2.62**
                                                                    (0.11)                  (0.12)
 Estimates in WTP space
 𝐺𝐿
     Mean                                                     1.26*** (0.20)           1.58*** (0.20)
     S.D.                                                     1.00*** (0.35)           1.16*** (0.20)
 𝑈𝑆
     Mean                                                     1.50*** (0.21)           1.96*** (0.20)
     S.D.                                                        0.03 (0.42)           0.84*** (0.27)
 𝐹𝑅𝐸𝑆𝐻
     Mean                                                     2.30*** (0.19)           2.04*** (0.21)
     S.D.                                                     3.73*** (0.36)           3.64*** (0.24)
 𝑊𝐼𝐿𝐷
     Mean                                                     2.06*** (0.27)           3.66*** (0.25)
     S.D.                                                     5.83*** (0.28)           5.45*** (0.29)
 𝐹𝐴𝑅𝑀
     Mean                                                     0.68*** (0.25)            -0.46* (0.27)
     S.D.                                                     1.57*** (0.33)           3.93*** (0.27)
 Std. Dev. of error component                                 8.13*** (0.44)           9.51*** (0.54)
 Log Likelihood                                                     -8,513                  -8,373
 AIC (Sample adjusted)                                                2.24                   2.20
 Number of parameters                                                  16                     16
 Observations                                                        7,632                  7,632
Notes: Standard errors are reported in parentheses. ***p < 0.01, **p < 0.05, *p < 0.1.


Table 1.11
                        Estimated Market Shares by Treatment
                                      Unconditional                                Conditional
 Species                       Treatment                Control            Treatment              Control
 Salmon                           39%                     34%                 42%                  37%
 Trout                            38%                     28%                 38%                  31%
 Whitefish                        17%                     29%                 20%                  32%
 None                              6%                      9%
Notes: All prices are fixed at $10.99 per 8oz fillets with mean shares approximated over 5,000 draws.


Figure 1.1: Global seafood export volume, 1986 – 2018
Source: FAO.


Figure 1.2: Bar graph of total number of pathogen/toxin violations from imported foods by
industry, 2002 - 2019
Source: USDA - Economic Research Service.


Figure 1.3: Seafood supply chain with fish fraud vulnerability assessment
Notes: 1 = species substitution, 2 = mislabeling, 3 = short-weighting, 4 = adulteration, 5 = indiscriminate
antibiotic use.


Figure 1.4: Choice experiment question sample


Figure 1.5: Boxplot of respondent-specific WTP for GL attribute level
Notes: The black lines in the middle of the colored rectangles denote the medians; the colored boxes show the
interquartile range (IQR), and the whiskers extend to 1.5 × IQR. WTP estimates outside the IQR are shown as grey circles.


Figure 1.6: Boxplot of respondent-specific WTP for US attribute level
Notes: The black lines in the middle of the colored rectangles denote the medians; the colored boxes show the
interquartile range (IQR), and the whiskers extend to 1.5 × IQR. WTP estimates outside the IQR are shown as grey circles.


                       APPENDIX B: DEFINITIONS AND EXCERPT
Definitions
Origin refers to where fish was farmed or caught:
       Great Lakes Region refers to the region spanning the following states: Illinois, Indiana,
       Michigan, Minnesota, New York, Ohio, Pennsylvania, and Wisconsin.
       The United States refers to any other state within the United States outside the Great
       Lakes Region.
       Imported refers to any country outside the borders of the United States.
Processing form refers to the form in which fish was bought by the final consumer or restaurant:
       Fresh means fish has never been frozen since harvest or catch.
       Frozen means fish has undergone frozen storage since harvest.
Production method refers to the method of fish production:
       Wild-caught means fish was captured in their natural habitat.
       Farmed means fish was raised by a fish farmer in a controlled setting (i.e., aquaculture).
       Unlabeled means no claim about the fish production method is made.


Excerpt on Fish Fraud from a News Article
“There’s something, well, fishy going on with certain favorite fish dishes, according to a new
study from the conservation group Oceana.
DNA tests showed that about 21% of the fish [that] researchers sampled was not what it was
called on the label or menu.
With so many species and with 80% of the fish Americans eat coming from international
sources, labeling is complicated.”
NB: This excerpt was derived from a published news article on the Cable News Network (CNN)
website on March 7th, 2019.
URL to the full CNN article attached:
https://www.cnn.com/2019/03/07/health/fish-mislabeling-investigation-oceana/index.html


        CHAPTER 2: DOES RURAL NON-FARM EMPLOYMENT RELIEVE (OR
   EXACERBATE) THE AGRICULTURAL DIVERSIFICATION-FARM EFFICIENCY
              TRADEOFF: THE CASE OF AQUACULTURE IN BANGLADESH
2.1 Introduction
         This paper examines the efficiency of fish farming and agricultural diversification, and how
each, and the relation between them, is conditioned by rural non-farm employment (RNFE). We
motivate the analysis with a gap in the literature, which we document by reviewing the evolution of
the crop diversification, RNFE, and efficiency literatures.
         Adam Smith theorized that there are gains from specialization. Conversely, diversification,
such as in agriculture, could entail efficiency losses. We define technical efficiency as a farm’s
ability to obtain maximal output from a given input bundle. Allocative efficiency occurs when a
household allots resources in a manner that maximizes farm profits given input and output prices.
Crop diversification may reduce both the technical and the allocative efficiency of crop production
for the following reasons.
         First, efficiency of the production of crop 𝑖 could be undermined because of competition
for labor and other inputs from adding crop 𝑗. For example, in Bangladesh, rice efficiency is
undermined by adding jute, which competes with rice for water, land, and labor (Rahman, et al.,
2017).
         Second, multi-cropping could exert pressure on household labor and management time. In
Papua New Guinea, Coelli & Fleming (2004) argue that overlapping labor and management needs
among multiple cash crops create diseconomies of diversification.
        By contrast, some studies find a positive or null effect of crop diversification on a target
crop’s efficiency. This may happen for several reasons. First, there may be physical
complementarities between crops. For example, nitrogen-fixing crops can enhance grain efficiency,
as found in the farming systems literature of the 1970s and 1980s. In Bangladesh, Emran et al.
(2022) find that sequentially cropped systems (i.e., rice with mungbean, lathyrus, or groundnut)
improve rice productivity.
        Second, crop diversification could stimulate cash and knowledge spillovers, which can help
the target crop. For example, income from cash cropping can purchase capital and labor (Von
Braun & Kennedy, 1994). Crop diversification could also spur productivity spillovers via cross-
crop knowledge and skill transfer (Von Braun & Kennedy, 1994).
        Third, seasonality can permit serial specialization within diversification, so crops do not
compete. See Schreinemachers et al. (2016) for the case of off-season vegetable production in
Bangladesh, with low impacts on on-season rice productivity.
        Fourth, cash cropping can spur the off-farm development of businesses, such as input dealers
selling fungible inputs like fertilizer and services like logistics, which could also benefit food
crops (Kennedy & Cogill, 1987).
        Diversification into crop production among fish farmers is gaining importance as a way
around malfunctioning food crop markets and a reliable means of rural
income diversification. Relatedly, synergies within diversified agricultural systems make a
compelling case for integrated aquaculture-agriculture (IAA) technology adoption. Recent studies
suggest that IAA adoption is a viable approach to sustainable agricultural intensification with
considerable potential for improving agricultural productivity and food security in Bangladesh
(Islam, 2021).
        However, a factor that can affect each of the above (i.e., crop diversification and efficiency)
and their relation is RNFE. The literature shows that RNFE is important to rural household
livelihoods (Lanjouw & Shariff, 2004; Deichmann, et al., 2009). For fish farmers in Bangladesh, we
show that non-farm activities account for less than half of rural household income, on average.
By comparison, Deichmann et al. (2009) report that the non-farm share of rural household income
exceeds 50% in Bangladesh.
        RNFE conditions crop diversification and a target crop’s technical and allocative
efficiency. There are two channels of effect.
        First, an emerging literature shows that RNFE affects crop 𝑖’s efficiency. RNFE can
provide cash for crop input purchases. For example, Begum et al. (2013) show a positive
correlation between RNFE and shrimp efficiency in Bangladesh, which the authors attribute to
RNFE income being used to buy inputs. In China, Rozelle et al. (1999) find that remittances improve
access to physical capital, raising maize yields. Chavas et al. (2005) also find that RNFE improves
allocative efficiency in The Gambia, indicative of capital market imperfections in the study area.
        Second, RNFE affects agricultural diversification. Some studies find that RNFE fuels
diversification, for example into fish production. In Myanmar, Faxon (2020) reports that
migrant-sending households built fishponds using remittances, spurring diversification from paddy
into aquaculture. However, RNFE can also compete for resources and time with agricultural
diversification, such as into fruit trees (Huang, et al., 2009).
        The literature leaves several gaps, which we attempt to fill. First, how RNFE conditions
the crop diversification-farm efficiency tradeoff has not been studied for either type of efficiency.
Second, although Begum et al. (2013) study the relation between RNFE and technical efficiency
in aquaculture, they do not address the impact of RNFE on allocative efficiency. Hence, how RNFE
affects optimal input (such as labor) choices given input prices in fish systems remains unclear.
Moreover, shrimp farming is not representative of Bangladeshi aquaculture: it accounts for only 4%
of aquaculture production in the country. Inferring Bangladeshi aquaculture productivity from
shrimp technical efficiency estimates could therefore mislead, as shrimp farming uses significantly
fewer inputs. Third, while Faxon (2020) shows that RNFE affects agricultural diversification, the
evidence she presents is qualitative. An empirical analysis using a larger sample can help validate
these qualitative results.
        We attempt to fill the aforementioned gaps in the literature using panel data on fish farming
households in Southern Bangladesh. We address the following questions. (1) Does non-farm income
diversification resolve or exacerbate the crop diversification-efficiency tradeoff? We hypothesize
that RNFE relieves the crop diversification-farm efficiency tradeoff. Cash from non-farm activities
can buy labor to offset competition for household labor across multiple agricultural activities. On
the other hand, RNFE may exacerbate this tradeoff if it diverts family labor away from the farm
altogether, further constraining household labor. (2) Does RNFE affect fish efficiency (both
technical and allocative)? We postulate that RNFE increases fish efficiency by providing cash to
buy skilled labor and other inputs. By contrast, RNFE could draw labor away from the farm,
especially if off-farm work is year-round. (3) Does RNFE affect crop diversification for fish
systems? We conjecture that RNFE helps buy labor to invest in multiple lucrative agricultural
ventures. However, RNFE may reduce crop diversification if off-farm work competes with
agricultural production for labor.
        The rest of the paper is organized as follows. Section 2.2 introduces our data and reports
descriptive statistics on key variables. Section 2.3 outlines our empirical strategy. Section 2.4
presents our results, and we offer concluding remarks in section 2.5.


2.2 Data and Descriptives
         This study uses data on fish farming households in the seven most important fish producing
districts in Southern Bangladesh (see Figure 2.1 for a map of the sampled districts). Aquaculture
farms in the seven surveyed districts cover 275,970 ha, accounting for 41% of the national area of
aquaculture farms and 24% of national aquaculture production. The seven selected districts
account for 80% of aquaculture production in Southern Bangladesh (DoF, 2022).
         Aquaculture in this zone is a mix of fish and shrimp, grown in traditional extensive and
improved semi-intensive systems. Agriculture is dominated by rainfed monsoon and irrigated dry
season rice, vegetable crops, and some off-season oilseeds and pulses. Rice production is oriented
predominantly to subsistence, while vegetable cultivation and aquaculture tend to be market
oriented. Aquaculture and agriculture may be distributed across separate plots on a given farm or
integrated within a single plot (Jahan et al., 2015; Ali et al., 2022). The average number of ponds
operated per farm was 1.5 in 2020.
         The survey was a two-wave panel, conducted in 2014 and 2020. The sampling procedure was as
follows.
         In 2014, all upazila (sub-districts) with negligible fish production (per the 2008 agricultural
census) were excluded from the initial sample frame, and the sample was drawn from among the
remaining upazila with probability proportional to size, leaving 13 upazila from a total of 56.
         In each selected upazila, all mouza (the smallest administrative unit listed in the
Bangladesh agricultural census) underwent a second stage of trimming to eliminate those with
fewer than 20 aquaculture farms. Two to three of the remaining mouza were selected randomly in
each of the 13 upazila for inclusion in the farm survey. Prior to the survey, a census of fish farmers
was conducted in all selected mouza. In each selected mouza, 20 farms were selected randomly
from this census list for interview.
          In 2020, we conducted a resurvey of households from the 2014 survey. Prior to the survey,
we conducted a census of all fish farming households in villages included in the 2014 survey and
attempted to identify all households previously interviewed. Of the 721 households interviewed in
2014, 579 were re-interviewed in 2020, implying an attrition rate of 20 percent. While
the attrition rate is high, we find no significant differences between attrited and non-attrited
subsamples across most variables, although more remote farms and households with a higher
non-farm income share were slightly more likely to attrite.
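The attrition figure can be checked directly from the two household counts reported above. The short sketch below only restates that arithmetic; the variable names are illustrative.

```python
# Household counts reported in the text for the two panel waves.
interviewed_2014 = 721
reinterviewed_2020 = 579

attrited = interviewed_2014 - reinterviewed_2020
attrition_rate = attrited / interviewed_2014

# Prints the number of households lost and the attrition share,
# which rounds to the 20 percent quoted in the text.
print(f"{attrited} households lost, attrition rate = {attrition_rate:.1%}")
```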
          Table 2.1 describes the variables relevant to our analyses. We consider the following
production input variables: the quantity of feed and non-feed inputs applied, total person-days of
hired as well as family labor used, and the quantity of fish seed stocked.14 We also collected data
on household demographics such as the household size, dependency ratio, household head’s
gender, and educational status. Other key variables include the fish price per kilogram (kg), the
household-level daily wage rate, an off-farm participation indicator variable, the crop farm, fish
farm, and non-farm income shares, a crop diversification indicator variable, and the distances of fish
plots from the nearest road and from the household’s residence. We also construct a Simpson
diversification index (SDI), ranging between 0 and 1, to capture crop diversification at the intensive
margin:
                    \[ SDI_i = 1 - \sum_j \left( \frac{A_j}{\sum_j A_j} \right)^2 \tag{1} \]
where $A_j$ denotes the acreage allocated to crop $j$.
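A minimal sketch of Equation (1) follows. The function name and the example acreages are hypothetical, and treating a household with no crop area as having an index of zero is an assumption made here for illustration, not something the text specifies.

```python
def simpson_diversification_index(acreages):
    """Return 1 minus the sum of squared acreage shares across crops.

    acreages: iterable of non-negative areas, one entry per crop.
    ASSUMPTION: a household with no crop area gets an index of 0.
    """
    acreages = list(acreages)
    total = sum(acreages)
    if total <= 0:
        return 0.0
    return 1.0 - sum((a / total) ** 2 for a in acreages)


# Hypothetical households: one crop, two equal crops, three unequal crops.
print(simpson_diversification_index([3.0]))            # single crop: no diversification
print(simpson_diversification_index([1.0, 1.0]))       # two equal shares
print(simpson_diversification_index([2.0, 1.0, 1.0]))  # shares 0.5, 0.25, 0.25
```

The index is 0 for a household growing a single crop and moves toward 1 as acreage is spread more evenly over more crops, which is why it captures diversification at the intensive margin.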
14
   The pond size (water area) was originally reported in decimals, where 100 decimals = 0.404648 hectares. We use
this variable to scale the aforementioned input variables to produce per hectare measures.


         Table 2.2 presents descriptive results pooled across both panel waves and disaggregated
by crop diversification status. The following results are noteworthy. A substantial share of the
sampled farm households (69%) engage in crop production. The table also shows that crop
diversified households used more family labor and almost twice as much hired labor on fish farms
as households not growing any crops. The former also applied more feed. By contrast,
undiversified farms utilized more non-feed inputs and stocked more fish seed, on average.
However, these differences are not statistically significant.
         Turning to the household demographics, our results indicate that household size does not
vary much by crop diversification status. The average household has approximately 4.6 members,
with a dependency ratio of 0.6. Male household heads dominate our sample (96% male headship),
and household heads in crop diversified households tend to have higher levels of education.
         We also observe that a considerable share (60%) of the sampled households participated
in off-farm activities. The off-farm participation rate was significantly higher among crop
diversified households (63%) than among those producing no crops (54%). Turning to the non-farm
income share, the average estimate stands at 43%, ranging between 41% (crop diversified)
and 46% (not diversified). This difference is significant at the 5 percent level, and the overall
share is consistent with similar estimates reported for Sub-Saharan Africa by Reardon (1997).
         That said, aquaculture production remains the dominant source of earnings, accounting for
49% of total rural household income. This is true across both crop diversified (47%) and
undiversified households (54%). Table 2.2 also shows that the difference in fish farm income share
by crop diversification status is statistically significant. By contrast, crop production accounts for
a relatively lower share (less than 10%) of total household income.


2.3 Empirical Strategy
2.3.1 Technical Efficiency Estimation
        We derive technical efficiency estimates by specifying a stochastic frontier production
function (SFPF) model in panel data setting as follows:
                    \[ Y_{it} = f(\mathbf{x}_{it}; \beta) \exp(v_{it} - u_{it}) \tag{2} \]
where $Y_{it}$ denotes the fish output level; the deterministic portion of the model, $f(\mathbf{x}_{it}; \beta)$, represents
household $i$’s fish production frontier with input vector $\mathbf{x}_{it}$ at time $t$; $\beta$ denotes model parameters
to be estimated; $v_{it}$ is a symmetric disturbance term that captures statistical noise; and $u_{it}$ is the
inefficiency term. Following Stevenson (1980), we impose a normal-truncated-normal
distributional assumption on the $v$-$u$ error pair. In particular, the symmetric random error term,
$v_{it}$, is assumed to be iid normally distributed with zero mean and standard deviation $\sigma_v$ (that is,
$v_{it} \sim N(0, \sigma_v^2)$). We also maintain that the one-sided inefficiency term is iid whose distribution
derives from the truncation of $N(\mu, \sigma_u^2)$ at zero. Hence, the SFPF model reduces to the standard
neoclassical production function if $u_{it} = 0$ (Kumbhakar, et al., 2020). Following Kumbhakar et
al. (1991), we deploy the more efficient single-step approach which simultaneously accounts for
the determinants of inefficiency while estimating the production frontier model parameters.
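To make the composed error in Equation (2) concrete, the sketch below simulates output under the normal-truncated-normal assumption and reports the implied technical efficiency, exp(-u). It is an illustration only and does not estimate the model: the function names, parameter values, and frontier output level are all hypothetical, and rejection sampling is used as a simple stand-in for drawing from the truncated normal.

```python
import math
import random


def draw_inefficiency(mu, sigma_u, rng):
    """Draw u from N(mu, sigma_u^2) truncated below at zero (rejection sampling).

    Note: this loops for a long time if mu is far below zero relative to sigma_u.
    """
    while True:
        u = rng.gauss(mu, sigma_u)
        if u >= 0.0:
            return u


def simulate_output(frontier_output, mu, sigma_u, sigma_v, rng):
    """Return (observed output, technical efficiency) for one farm-year."""
    v = rng.gauss(0.0, sigma_v)              # symmetric statistical noise
    u = draw_inefficiency(mu, sigma_u, rng)  # one-sided inefficiency
    y = frontier_output * math.exp(v - u)    # Equation (2)
    return y, math.exp(-u)


rng = random.Random(0)  # fixed seed so the illustration is reproducible
for _ in range(3):
    # Hypothetical values: frontier output of 1000 kg/ha, mu=0.2, sigma_u=0.3, sigma_v=0.1
    y, te = simulate_output(1000.0, 0.2, 0.3, 0.1, rng)
    print(f"output = {y:.1f} kg/ha, technical efficiency = {te:.3f}")
```

Because u is non-negative, technical efficiency always lies in (0, 1], and setting u to zero recovers the frontier itself apart from noise, in line with the remark that the model then reduces to the standard production function.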
        To address potential bias in the estimated production function parameters due to
unobserved household heterogeneity, we apply the Mundlak-Chamberlain approach by including
the means of the time-varying input variables as controls (Wooldridge, 2019). The Mundlak test
of the null hypothesis that the unobserved heterogeneity can be ignored is rejected at the 1 percent
level (p-value = 0.0005). Hence, we decide against the true random effects model, which
maintains the strong independence assumption between the production input covariates and the
unobserved heterogeneity term.
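The Mundlak-Chamberlain device described above amounts to adding each household's time average of every time-varying input as an extra regressor. The sketch below shows that data step only, not the frontier estimation; the function name, column names, and the four-row panel are hypothetical.

```python
from collections import defaultdict


def add_mundlak_means(rows, id_key, variables):
    """Append household-level means of time-varying variables to each row.

    rows: list of dicts, one per household-year observation.
    Returns new dicts with an extra '<var>_mean' entry per variable.
    """
    sums = defaultdict(lambda: defaultdict(float))
    counts = defaultdict(int)
    for row in rows:
        counts[row[id_key]] += 1
        for var in variables:
            sums[row[id_key]][var] += row[var]

    augmented = []
    for row in rows:
        new_row = dict(row)
        for var in variables:
            new_row[f"{var}_mean"] = sums[row[id_key]][var] / counts[row[id_key]]
        augmented.append(new_row)
    return augmented


# Hypothetical two-wave panel with one log input per household-year.
panel = [
    {"hh": 1, "year": 2014, "ln_feed": 2.0},
    {"hh": 1, "year": 2020, "ln_feed": 4.0},
    {"hh": 2, "year": 2014, "ln_feed": 1.0},
    {"hh": 2, "year": 2020, "ln_feed": 3.0},
]
augmented = add_mundlak_means(panel, "hh", ["ln_feed"])
for row in augmented:
    print(row)
```

Including these means lets the unobserved household effect be correlated with the inputs, which is the independence assumption that the true random effects model would otherwise impose.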
          We use a transcendental logarithmic (translog) specification for the SFPF estimation. Not
only is the translog functional form more flexible for estimating production technology, but it is
also supported by results from a likelihood ratio test for our sample. Hence, we estimate the following
translog SFPF:
\[ \ln Y_{it} = \beta_0 + \sum_{k=1}^{5} \beta_k \ln x_{ikt} + \frac{1}{2} \sum_{j=1}^{5} \sum_{k=1}^{5} \beta_{jk} \ln x_{ikt} \ln x_{ijt} + \sum_{k=2}^{5} \lambda_k D_{ikt} + \eta_i + \zeta_t + v_{it} - u_{it} \tag{3} \]
where $i$ indexes the fish farm; $Y_{it}$ represents the quantity of fish harvested in kilograms per hectare
of pond water area; $x_{ikt}$ denotes the quantity of input variable $k$ used per hectare (see Table 2.1
for details on how the various input variables are defined), $\eta_i$ is the unobserved heterogeneity term
and $\zeta_t$ are time dummies; $D_{ikt}$ are dummy variables which take the value 1 for zero-valued
observations of the input variables, and 0 otherwise, which are included to facilitate the logarithmic
transformation of the explanatory variables with zeros.15 The technical inefficiency term takes the
following form:
\[ \begin{aligned}
u_{it} = {} & \alpha_0 + \alpha_1 CDI_{it} + \alpha_2 RNFE_{it} + \alpha_3 DEP\_RATIO_{it} + \alpha_4 GENDER_i \\
            & + \alpha_5 EDUC_i + \alpha_6 DIST\_ROAD_{it} + \alpha_7 DIST\_HH_{it} + \alpha_8 ANY\_PRAWN_i \\
            & + \alpha_9 SHRIMP\_NOPRAWN_i \\
            & + \mathbf{QUINT\_COMM}_{it} \boldsymbol{\beta} + \zeta_d + \zeta_t + \varepsilon_{it}
\end{aligned} \tag{4} \]
15
   See Battese (1997) and Henderson (2015) for details. There is no such indicator variable for the fish seed variable
since it has no zero values.
where the distribution of the disturbance term, $\varepsilon_{it}$, derives from a normal distribution with zero
mean and variance $\sigma_u^2$, truncated at $-\mathbf{z}_{it}\alpha$, with $\mathbf{z}_{it}$ denoting a vector of the aforementioned
determinants of technical inefficiency (see below for specifics on how each of these variables is
defined).
    •  𝐶𝐷𝐼?# denotes the crop diversification indicator variable, which takes the value 1 if the farm
       produces any crops.
    •  𝑅𝑁𝐹𝐸?# is the share of total household income from non-farm activities, ranging from 0
       to 1.
    •  𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂?# is the dependency ratio.
    •  𝐺𝐸𝑁𝐷𝐸𝑅? is a dummy variable, which takes the value of 1 if the household head is male.
    •  𝐸𝐷𝑈𝐶? denotes the household head’s level of education in years.
    •  𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷?# is the mean distance of fishponds to the nearest road (in kilometers)
    •  𝐷𝐼𝑆𝑇_𝐻𝐻?# is the average distance of fishponds from the household’s residence (in
       kilometers)
    •  𝐴𝑁𝑌_𝑃𝑅𝐴𝑊𝑁? is a dummy variable for whether the household produces any prawn,
       where the “only fish” subgroup is the comparison category.
    •  𝑆𝐻𝑅𝐼𝑀𝑃_𝑁𝑂𝑃𝑅𝐴𝑊𝑁? is a dummy variable indicating that the household cultivates
       shrimp but no prawn, with the “only fish” category as the base.
    •  𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀?# is a vector of dummies representing the commercialization quintiles,
       whose coefficients are interpreted relative to the lowest quintile (the omitted category).
       The 5th quintile is the most commercialized.
    A negative and significant estimated coefficient signifies a decline in technical inefficiency in
response to a marginal increase in the variable of interest. We include farmer district dummies, $\zeta_d$,
to control for agroecological and infrastructural differences across districts and a time trend, $\zeta_t$,
to capture secular effects common to all fish farmers in the inefficiency part of the model. While
controlling for unobserved heterogeneity at a much more granular level (i.e., the farm household
level) is preferred, doing so worsened the model fit as indicated by the log-likelihood ratio statistic.
Moreover, the estimated coefficients in the inefficiency part of the model appeared inflated, which
is symptomatic of underlying model convergence issues. Hence, we instead include farmer district
dummies to partly control for this unobserved heterogeneity.
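The regressors of the translog frontier in equation (3) can be assembled as in the following minimal sketch. It assumes the zero-value treatment attributed to Battese (1997) in footnote 15, under which the log of a zero input is replaced by ln(1) = 0 and a dummy flags the observation, and it assumes symmetric second-order coefficients so that the half-weighted double sum collapses to one term per pair. The function name, the input names, and the ordering of the inputs are hypothetical and are not taken from the dissertation.

```python
import math
from itertools import combinations_with_replacement


def translog_row(inputs, zero_dummy_inputs):
    """Build the translog regressors for one farm-period observation.

    inputs: dict mapping input name -> quantity per hectare (non-negative).
    zero_dummy_inputs: names of inputs that may be zero and receive a dummy D_k.
    """
    row = {}
    logs = {}
    for name, x in inputs.items():
        if x < 0:
            raise ValueError(f"negative quantity for {name}")
        d = 1.0 if x == 0 else 0.0
        if name in zero_dummy_inputs:
            row[f"D_{name}"] = d
        elif d == 1.0:
            raise ValueError(f"{name} is zero but has no zero-value dummy")
        # ln(max(x, D)): equals ln(x) for positive x and ln(1) = 0 when x is zero
        logs[name] = math.log(max(x, d))
        row[f"ln_{name}"] = logs[name]
    # Second-order terms: 0.5 * ln(x_j)^2 for squares, ln(x_j) * ln(x_k) for j < k
    for a, b in combinations_with_replacement(list(inputs), 2):
        weight = 0.5 if a == b else 1.0
        row[f"ln_{a}_x_ln_{b}"] = weight * logs[a] * logs[b]
    return row


# Hypothetical observation with no hired labor
example = translog_row(
    {"family_labor": 800.0, "hired_labor": 0.0, "seed": 1600.0,
     "feed": 2200.0, "nonfeed": 1200.0},
    zero_dummy_inputs={"family_labor", "hired_labor", "feed", "nonfeed"},
)
```

With five inputs this yields 5 first-order terms, 15 second-order terms, and 4 zero-value dummies (none for fish seed), which matches the structure of equation (3).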
         Following Coelli et al. (2005), we define technical efficiency as the ratio of observed fish
output, $Y_{it}$, to the maximum potential output that a fully efficient fish farmer can produce using
the same set of inputs:
\[
TE_{it} = \frac{Y_{it}}{Y_{it}^{*}}
        = \frac{f(\mathbf{x}_{it}; \beta)\, e^{v_{it} - u_{it}}}{f(\mathbf{x}_{it}; \beta)\, e^{v_{it}}}
        = \exp(-u_{it})
\tag{5}
\]
Varying between 0 and 1, this unobservable technical efficiency score is approximated using the
sample analog of the following conditional expectation term, $E[e^{-u_{it}} \mid \epsilon_{it}]$, where
$\epsilon_{it} = v_{it} - u_{it}$ (Kumbhakar, et al., 2020).
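For intuition, the conditional expectation above has a closed form when the noise term is normal and the inefficiency term is truncated normal. The sketch below uses the standard textbook expression for that case. It is an illustration under those distributional assumptions, not the estimation code used for the results, and the parameter values in the example are hypothetical.

```python
import math


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def conditional_te(composed_error, mu, sigma_u, sigma_v):
    """E[exp(-u) | v - u] for v ~ N(0, sigma_v^2) and u ~ N+(mu, sigma_u^2).

    composed_error: residual v - u from the frontier.
    mu: pre-truncation mean of u (z'alpha in a determinants model).
    """
    s2 = sigma_u ** 2 + sigma_v ** 2
    # Mean and standard deviation of u given the composed error (before truncation at zero)
    mu_star = (mu * sigma_v ** 2 - composed_error * sigma_u ** 2) / s2
    sigma_star = sigma_u * sigma_v / math.sqrt(s2)
    ratio = norm_cdf(mu_star / sigma_star - sigma_star) / norm_cdf(mu_star / sigma_star)
    return ratio * math.exp(-mu_star + 0.5 * sigma_star ** 2)


# Hypothetical parameter values, for illustration only
te_example = conditional_te(composed_error=-0.3, mu=0.2, sigma_u=0.5, sigma_v=0.3)
```

The score lies strictly between 0 and 1 and falls as the composed residual becomes more negative, since a more negative residual is attributed in part to greater inefficiency.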
2.3.2 Allocative Efficiency Estimation
         Next, we obtain our allocative inefficiency measure as the natural logarithm of the ratio
between the observed wage rate and the estimated marginal revenue product of labor ($MRPL$),
$\ln(w_{it} / MRPL_{it})$ (Barrett, et al., 2008; Henderson, 2014). The $MRPL_{it}$ is derived from the
stochastic frontier production function estimates as the product of per unit fish price and the marginal
physical product of family labor employed on fish farms, which is estimated as follows:
\[
\frac{\partial Y_{it}}{\partial L^{f}_{it}}
  = \frac{\partial \ln Y_{it}}{\partial \ln L^{f}_{it}} \times \frac{Y_{it}}{L^{f}_{it}}
\tag{6}
\]
        An allocative inefficiency ($AI$) score of zero is what theory predicts in the absence of labor
market frictions and other input or credit market failures. By contrast, $AI_{it} < 0$ ($AI_{it} > 0$)
signifies an undersupply (oversupply) of on-farm labor relative to the optimum.
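The chain from the translog estimates to the allocative inefficiency score can be sketched as follows. The labor elasticity formula assumes symmetric second-order coefficients in equation (3) and ignores the zero-value dummies; the function names and all numerical values are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def labor_elasticity(beta_labor, beta_labor_cross, log_inputs):
    """Output elasticity of family labor in a translog frontier.

    d lnY / d lnL = beta_L + sum_k beta_Lk * ln(x_k), where the sum runs over
    all inputs (including labor itself) and beta_jk is symmetric.
    """
    return beta_labor + sum(beta_labor_cross[k] * log_inputs[k] for k in beta_labor_cross)


def allocative_inefficiency(wage, fish_price, elasticity, yield_kg, labor_days):
    """AI = ln(wage / MRPL), with MRPL = price * elasticity * Y / L (equation 6)."""
    marginal_physical_product = elasticity * yield_kg / labor_days
    mrpl = fish_price * marginal_physical_product
    if mrpl <= 0:
        raise ValueError("MRPL must be positive to take logs")
    return math.log(wage / mrpl)


# Hypothetical household: the wage (300) exceeds the MRPL (200), so AI > 0 (oversupply)
ai_example = allocative_inefficiency(
    wage=300.0, fish_price=200.0, elasticity=0.5, yield_kg=2000.0, labor_days=1000.0
)
```

In this example the marginal physical product is 1 kg per person-day, the MRPL is 200 BDT, and the score equals ln(1.5), which is positive and therefore signals an oversupply of family labor.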
        In line with Barrett et al. (2008) and Henderson (2014), for non-wage employed
households, we impute their 𝐴𝐼 scores from the predicted values resulting from the regression of
𝐴𝐼 on select household- and farm-level covariates. In doing so, we account for sample selection
using Heckman’s (1979) two-step approach. Among the covariates included in the selection
equation are fish price, the dependency ratio, household size, the household head’s gender and
educational attainment, as well as a time dummy. For consistent inference, we adjust the standard
errors accordingly via bootstrapping with 1,000 replications in the second stage.
        In what follows, we present our main finding from the two-step Heckman procedure. This
method has the advantage of removing bias due to nonrandom selection as a result of missing wage
data for households not involved in wage labor (Heckman, 1979). First, we estimate a probit model
on the entire sample using households’ participation in off-farm work for wages as the dependent
variable. We then include the resulting inverse Mills ratio (IMR) in an augmented regression of
the 𝐴𝐼 scores on select household- and farmer-level characteristics for the selected subsample.
Next, we impute 𝐴𝐼 scores for the households without wage labor from the resulting predicted
values. Results from the selectivity and 𝐴𝐼 regressions are summarized in Table 2.8 in the
APPENDIX. Most importantly, the estimated coefficient on the 𝐼𝑀𝑅 is statistically significant at
the 1 percent level, implying that sample selection bias could be an issue if not accounted for.
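The selection-correction term in the second step is the inverse Mills ratio evaluated at each household's fitted probit index. A minimal sketch of that calculation follows; the index values are hypothetical, and the probit fit and the augmented regression themselves are not shown.

```python
import math


def norm_pdf(x):
    """Standard normal density."""
    return math.exp(-0.5 * x * x) / math.sqrt(2.0 * math.pi)


def norm_cdf(x):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def inverse_mills_ratio(probit_index):
    """IMR = phi(index) / Phi(index) for households selected into wage work."""
    return norm_pdf(probit_index) / norm_cdf(probit_index)


# Hypothetical fitted probit indices for three wage-earning households
imr_values = [inverse_mills_ratio(z) for z in (-1.0, 0.0, 1.0)]
```

The ratio is larger for households with a low predicted probability of wage work, which is where selection on unobservables matters most.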
         Further, for ease of comparison with the 𝑇𝐸 results, we transform the 𝐴𝐼 variable into an
allocative efficiency (𝐴𝐸) equivalent such that 𝐴𝐸 → 1 as 𝐴𝐼 → 0 and 𝐴𝐸 → 0 when |𝐴𝐼| → ∞.
Following Henderson (2014), we perform this transformation using the kernel of the normal
density function:
\[
AE_{it} = \exp\left( -\frac{AI_{it}^{2}}{2\sigma_{AI}^{2}} \right)
\tag{7}
\]
centered around the “ideal” 𝐴𝐼 mean of zero, where $\sigma_{AI}^{2}$ denotes the variance of 𝐴𝐼 around a
mean of zero.
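The transformation in equation (7) can be sketched in a few lines. The variance is computed around zero rather than around the sample mean, as the text specifies; the function name and the example scores are hypothetical.

```python
import math


def allocative_efficiency(ai_scores):
    """Map AI scores to AE in (0, 1] with the normal kernel centered at zero."""
    if not ai_scores:
        raise ValueError("need at least one AI score")
    # Variance of AI around the "ideal" mean of zero (not the sample mean)
    var_about_zero = sum(a * a for a in ai_scores) / len(ai_scores)
    if var_about_zero == 0:
        return [1.0 for _ in ai_scores]
    return [math.exp(-(a * a) / (2.0 * var_about_zero)) for a in ai_scores]


# Hypothetical AI scores: the variance around zero is (0 + 1 + 1 + 4) / 4 = 1.5
ae_example = allocative_efficiency([0.0, 1.0, -1.0, 2.0])
```

Because the kernel is symmetric, an undersupply and an oversupply of the same magnitude receive the same efficiency score.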
         We specify the following regression equation to quantify the associations between the
different diversification strategies and allocative efficiency:
\[
AE_{it} = \lambda_0 + \lambda_1 \mathit{CDI}_{it} + \lambda_2 \mathit{RNFE}_{it} + \mathbf{X}_{it}\boldsymbol{\gamma} + \eta_i + \epsilon_{it}
\tag{8}
\]
using a fixed effects panel estimator, where the right-hand-side variables are as previously defined;
$\mathbf{X}_{it}$ is a vector of the time-varying subset of controls included in the 𝑇𝐸 regression16; $\eta_i$ is a
household fixed effects term to control for unobserved household-specific heterogeneity and $\epsilon_{it}$ is
the zero-mean disturbance term. We adjust for within-household correlation in the errors over
time by clustering the standard errors at the household level (Abadie, et al., 2023).
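The within (fixed effects) transformation behind equation (8) subtracts each household's own mean from every variable, which is also why time-invariant regressors drop out of this regression. The following minimal sketch illustrates the demeaning step only, using hypothetical data; it does not reproduce the estimator or the clustered standard errors.

```python
from collections import defaultdict


def within_transform(household_ids, values):
    """Subtract each household's mean from its own observations."""
    sums = defaultdict(float)
    counts = defaultdict(int)
    for h, v in zip(household_ids, values):
        sums[h] += v
        counts[h] += 1
    return [v - sums[h] / counts[h] for h, v in zip(household_ids, values)]


# Hypothetical two-period panel with two households
ids = ["a", "a", "b", "b"]
rnfe_share = [0.2, 0.4, 0.1, 0.5]   # time-varying regressor
head_is_male = [1.0, 1.0, 0.0, 0.0]  # time-invariant regressor

rnfe_demeaned = within_transform(ids, rnfe_share)
gender_demeaned = within_transform(ids, head_is_male)  # all zeros: absorbed by the fixed effect
```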
2.4 Regression Results
2.4.1 Determinants of Technical Efficiency
         Table 2.3 reports the SFPF regression estimation results for both the Cobb-Douglas (C-D)
and Translog specifications in columns (1) and (2), respectively. In both specifications, we control
for unobserved household heterogeneity using the Mundlak-Chamberlain method and cluster
16 We also include the households’ value of aquaculture-related assets and its quadratic term. However, we do not
include the production system dummies as standalone variables in this regression. This is because they do not vary
over time; hence, they are absorbed into the fixed effects term.
standard errors at the household level. Column (1) shows that the coefficients on all the input
variables are positive and statistically significant at the 1 percent level, indicating that fish yield is
monotonically increasing in all inputs. However, as shown in Table 2.3, results from a Wald likelihood
ratio test point towards a rejection of the joint test null that the coefficients on all quadratic and
interaction terms are equal to zero (test statistic: 94.76 > 24.99, the 5 percent critical value of
$\chi^2_{15}$). Hence, we base the rest of our analyses on the translog specification results.
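The 15 degrees of freedom in this test follow from counting the second-order terms that the Cobb-Douglas form sets to zero: with five inputs there are five squared terms and ten pairwise interactions. The sketch below checks that count and the comparison reported in the text. The likelihood-ratio helper is included only to show how such a statistic is formed; the log-likelihood values passed to it in the example are hypothetical.

```python
def translog_second_order_terms(n_inputs):
    """Number of distinct squared and interaction terms: n(n + 1) / 2."""
    return n_inputs * (n_inputs + 1) // 2


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic: 2 * (lnL_unrestricted - lnL_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def rejects_null(test_statistic, critical_value):
    """Reject the restricted (Cobb-Douglas) model if the statistic exceeds the critical value."""
    return test_statistic > critical_value


degrees_of_freedom = translog_second_order_terms(5)
# Statistic and critical value as quoted in the text
cobb_douglas_rejected = rejects_null(94.76, 24.99)
# Hypothetical log-likelihoods, for illustration only
lr_example = lr_statistic(loglik_restricted=-100.0, loglik_unrestricted=-90.0)
```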
        A test of the null hypothesis that production technology is characterized by constant returns
to scale (CRS) follows the derivation of output elasticities for the respective production input
variables evaluated at the pooled sample means. We find no evidence to reject the null that the
sum of the individual input elasticities equals 1, indicating constant returns to scale. Of the 5 inputs
considered, the output elasticity of familial labor is the highest (0.634), followed by fish seed
(0.497). These estimated elasticities are significant at the 5 percent level or better. By contrast, the
output elasticities for hired labor, feed, and non-feed inputs are not statistically distinguishable
from zero.
        Before turning to our main regression results, we first present graphical evidence on the
distribution of the 𝑇𝐸 and 𝐴𝐸 scores. The average 𝑇𝐸 score is 66%, which is lower than the
median as indicated by the left-skewed distribution of the 𝑇𝐸 scores histogram plot (see Figure
2.2). By contrast, we observe a pile-up at zero for the 𝐴𝐸 scores, indicating that allocative
inefficiency is rather the norm. In particular, the average 𝐴𝐸 score hovers around 34%.
        Results from the inefficiency part of the SFPF estimation are shown in Table 2.4. Column
(1) presents regression results from the most parsimonious specification. In column (2), we include
a squared RNFE term to capture potential nonlinearities between RNFE and technical inefficiency.
Column (3) includes an interaction between the crop diversification dummy and RNFE variable to
test for the extent to which RNFE conditions the crop diversification-technical efficiency relation.
         As indicated earlier, a negative (positive) estimated coefficient as reported in Table 2.4
indicates that technical inefficiency is declining (increasing) in the said household- or farmer-level
characteristic. We first discuss the results in column (1). The following results are of note. The
coefficient of the crop diversification indicator is not statistically different from zero. This result
suggests that, if anything, crop diversification does not compete with fish production on average,
all else held constant. Similarly, we do not find a significant association between non-farm income
diversification and technical efficiency. This result corroborates findings in other studies that also
report an insignificant relationship between off-farm employment and technical efficiency (Chavas
et al., 2005; Yang et al., 2016).
         Table 2.4 also reports weak evidence of a nonlinear effect of RNFE on technical efficiency.
The coefficient of the squared term is positive and statistically significant at the 10 percent level.
This is consistent with evidence from Bangladesh, where Mondal et al. (2020) also report similar
nonlinear effects of RNFE among crop producers.
         In other results, we find that producing any prawn has a negative and statistically
significant effect on technical efficiency. This effect is significant at the 10 percent level or better.
By contrast, the coefficient on the shrimp, but no prawn variable is not statistically different from
zero.
         The results also show that technical inefficiency decreases significantly
with aquaculture commercialization. Note that the parameter estimates on the commercialization
quintile dummies are interpreted relative to the bottom quintile and are each significant at the 10
percent level or better, depending on the specification. We also find that technical inefficiency is
increasing with fishpond remoteness, suggesting that farms are more productive the closer they
are to fish input and output markets.
         To address our first research question, in column (3), we interact the crop diversification
dummy with the non-farm income share variable. The coefficient on the interaction term is not
significant, indicating that the association between non-farm income and technical efficiency does
not differ by crop diversification status, on average. This result is slightly at odds with the results
reported in Table 2.9 in the APPENDIX, where we instead use a continuous crop diversification
variable, the Simpson diversification index.17 Table 2.9 shows that at higher levels of the non-farm
income share, diversifying into crop production results in technical inefficiencies. This effect is
significant at the 10 percent level. Perhaps, the Simpson index captures a richer variation in crop
diversification than the crop diversification indicator variable; hence, the differences we observe.
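To see why a continuous index carries more variation than a 0/1 indicator, consider the common form of the Simpson index, one minus the sum of squared shares. The sketch below assumes that form and assumes the shares are computed from crop-level amounts such as cultivated area; the exact definition used for Table 2.9 is not restated in this section, so both points are assumptions.

```python
def simpson_index(crop_amounts):
    """Simpson diversification index: 1 - sum of squared shares.

    crop_amounts: non-negative amounts per crop (for example, area cultivated).
    Returns 0 for a single crop or no crops; approaches 1 as crops multiply.
    """
    total = sum(crop_amounts)
    if total <= 0:
        return 0.0
    return 1.0 - sum((amount / total) ** 2 for amount in crop_amounts)


# Hypothetical households: all three have the same 0/1 indicator value of 1
one_crop = simpson_index([2.0])                    # 0.0
two_equal_crops = simpson_index([1.0, 1.0])        # 0.5
four_equal_crops = simpson_index([1.0, 1.0, 1.0, 1.0])  # 0.75
```

All three hypothetical households would be coded as diversified by the indicator variable, yet the index separates them.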
2.4.2 Relationship between diversification and allocative efficiency
         Next, we turn to estimating the association between livelihood diversification and
allocative efficiency. Regression results are reported in Table 2.5. We begin with the results in
column (1). We find a positive association between crop diversification and allocative efficiency
on average, ceteris paribus. The estimated coefficient of the crop diversification dummy is
significant at the 10 percent level. Since most fish farms (62%) in our sample tend to overuse
family labor, crop diversification could be absorbing some of this surplus household labor, thereby
improving allocative efficiency. By contrast, we do not detect any meaningful impact of RNFE on
allocative efficiency. This suggests that fish farms may not be buying labor with cash from
RNFE to optimize household labor allocation between farm and off-farm activities.
        As a robustness check, and because of the pile-up at zero for the allocative efficiency scores,
we also present results from a Tobit regression for our most parsimonious specification. The estimated
coefficients and marginal effects are reported in Table 2.11 in the APPENDIX. As can be seen, the
results are very similar in magnitude to our FE-OLS regression estimates.
        Other interesting results also emerge. We find that the dependency ratio is positively and
significantly associated with allocative efficiency, all else held constant. Allocative efficiency was
also found to be decreasing in the average distance of fish plots from the household’s residence.
This effect is significant at the 10 percent level.
        In column (3), we include an interaction between the crop diversification dummy and the
RNFE variable. We find a negative relationship between crop diversification and allocative
efficiency at higher levels of non-farm income share. This effect is statistically significant at the 1
percent level. We suspect that undertaking both diversification strategies severely constrains
household labor. Further, the toll on managerial ability from juggling both on-farm and off-farm
diversification could undermine how efficiently farms use household labor. These results are
qualitatively similar to alternative specifications, where we instead use the continuous crop
diversification variable (see Table 2.10 in the APPENDIX).
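With an interaction term, the association between crop diversification and allocative efficiency is no longer a single number: it equals the coefficient on the crop diversification dummy plus the interaction coefficient times the non-farm income share. The sketch below evaluates that expression. The coefficient values are hypothetical placeholders, not the estimates in Table 2.5.

```python
def crop_diversification_effect(coef_cdi, coef_interaction, rnfe_share):
    """Association of CDI with AE at a given non-farm income share.

    For AE = ... + coef_cdi * CDI + coef_interaction * CDI * RNFE + ...,
    moving CDI from 0 to 1 changes AE by coef_cdi + coef_interaction * RNFE.
    """
    if not 0.0 <= rnfe_share <= 1.0:
        raise ValueError("rnfe_share must lie between 0 and 1")
    return coef_cdi + coef_interaction * rnfe_share


# Hypothetical coefficients: positive main term, negative interaction
effect_at_zero_share = crop_diversification_effect(0.10, -0.25, 0.0)
effect_at_full_share = crop_diversification_effect(0.10, -0.25, 1.0)
```

With a positive main coefficient and a negative interaction coefficient, the association changes sign at a share equal to the negative of their ratio (0.4 under these placeholder values).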
2.4.3 Diversification and Fish Input Demand
        Table 2.6 presents the effect of both crop diversification and RNFE on fish input demand.
We estimate effects on demand for household labor, hired labor, fish seed, feed, and nonfeed inputs
per hectare of pond water area as defined in Table 2.1. However, for the nonlabor input variables,
we instead use expenditure data to better capture input quality.
        Results indicate that crop diversification increases demand for family labor on fish farms,
all else held constant. This can be due to the relatively higher demand for labor to manage
subsystems on the farm in the face of labor market frictions. Moreover, crops are typically cultivated
in close proximity to fishponds, and sometimes integrated within a single plot. Hence, more family
labor use in crop production may imply more intense household labor use on fish farms. Also, this
may reflect cash investment from crop sales into aquaculture, demanding more family labor for
pond repair, guarding of fishponds, harvesting, among others. This explanation, however, is less
plausible given that a relatively small share of household income comes from crop sales (less than
10%).
         Similarly, we find that a higher non-farm income share increases household labor demand
on average, ceteris paribus. This is a surprising result as we would expect higher non-farm incomes
to increase hired labor use, substituting for family labor on fish farms. That said, this result aligns
with Takahashi and Otsuka’s (2009) findings, which demonstrated increased utilization of family
labor on rice farms when rice income was the primary source of earnings. By contrast, the
coefficient on the interaction between RNFE and the crop diversification indicator is not
statistically significant. The constraining effect of adopting both diversification strategies could
explain this result as there may be no residual family labor pool to draw from.
         Table 2.6 also shows that neither diversification strategy increased hired labor demand,
which partly justifies the household labor supply effects we observe. This could also suggest large
transaction costs in hiring labor. In other results, we do not find any significant association
between income diversification and fish seed, feed, and nonfeed input expenditure.
2.4.4 Association between RNFE and Crop Diversification
         Table 2.7 presents regression results on the relationship between RNFE and crop
diversification. As hypothesized, RNFE can compete with crop diversification for household labor
time. Further, RNFE could substitute for crop diversification as a source of cash for fish input
purchase. On the other hand, RNFE can stimulate agricultural diversification such as into cash
cropping from traditional food crops. Our results show that there is a negative and significant
association between RNFE and crop diversification, suggesting a substitution between off-farm
activities and diversification into crops.
2.5 Conclusions
         Using panel survey data on fish farming households in southern Bangladesh, we examine
fish efficiency and agricultural diversification, and how RNFE conditions each of them as well as
the relationship between them.
We apply a fixed effects estimator to control for unobserved household-specific heterogeneity and
derive technical efficiency estimates by fitting a stochastic frontier production function (SFPF).
Following Barrett et al. (2008) and Henderson (2014), we also obtain a proxy for allocative
efficiency using the imputation methods depicted therein. We derive the following key findings:
         First, we do not find any significant relationship between crop diversification and technical
efficiency. Similarly, there is no significant association between non-farm income diversification
and technical efficiency. This result is supported by the finding that RNFE does not significantly
increase non-family labor input purchase, on average.
         Second, we do not find a significant interaction effect between crop diversification and
RNFE on technical efficiency. By contrast, we find that higher levels of the non-farm income share
result in a negative and significant (at the 10% level) association between crop diversification and
technical efficiency, when we define crop diversification using the Simpson diversification index.
We hypothesize that this may be due to the constraining effect of both diversification strategies on
family labor. Indeed, we show that the effect of undertaking both diversification strategies on
household labor demand is nil. Coupled with complexities of multitasking, adopting both strategies
may place enormous strain on family labor, thereby negatively impacting aquaculture productivity.
         Third, we find a positive and significant crop diversification effect on allocative efficiency.
This may reflect a reallocation of surplus family labor from aquaculture to crop production, where
family labor is overused. On the other hand, RNFE does not exert any meaningful impact on
allocative efficiency, all else held constant.
         Fourth, our results also indicate that for crop diversified households, higher levels of the
non-farm income share reduce allocative efficiency. Following the same reasoning, allocative
inefficiencies could result from the strain on family labor resources. Moreover, we show that
hired labor demand does not increase significantly with income diversification to relieve the
pressure on familial labor.
         Fifth, we find evidence of a substitution between crop diversification and RNFE. Perhaps,
this points to the competing demand for household labor across both activities. On the other hand,
this may reflect the view that cash from RNFE for input purchase may substitute for liquidity from
cash cropping.
         This study opens up other avenues for future research. To start with, research into the nature
of crop diversification activities undertaken by the households could offer richer insights into
which crops complement or compete with aquaculture.
         Further, it will be useful to disentangle the household labor effects from the cash impact of
undertaking both on-farm and off-farm diversification strategies. The general impression is that
the cash effect is minimal given the limited impacts on non-household labor input expenditure.
         Moreover, an investigation of the impact of interspecies diversification on fish efficiency
is also of research interest. Such diversification can be seen as a practical way of diversifying risks
associated with species-specific disease outbreaks and price volatility.
        It should be noted that the results we present here are associational and should be
interpreted with caution. Hence, additional work on the causal interpretation of the three-way
relationship among crop diversification, RNFE, and farm efficiency is a valuable research pursuit
and left to future research.

                                         BIBLIOGRAPHY
Abadie, A., Athey, S., Imbens, G. W. & Wooldridge, J. M., 2023. When should you adjust
        standard errors for clustering? The Quarterly Journal of Economics, 138(1), pp. 1-35.
Ali, H. et al., 2022. Economic performance characterization of intensive shrimp (Penaeus
        monodon) farming systems in Bangladesh. Aquaculture, Fish and Fisheries, 2(1), pp. 57-
        70.
Barrett, C. B., Sherlund, S. M. & Adesina, A. A., 2008. Shadow wages, allocative inefficiency,
        and labor supply in smallholder agriculture. Agricultural Economics, 38(1), pp. 21-34.
Begum, E. A., Hossain, M. I. & Papanagiotou, E., 2013. Technical Efficiency of Shrimp
        Farming in Bangladesh: An Application of the Stochastic Production Frontier Approach.
        Journal of the World Aquaculture Society, 44(5), pp. 641-654.
Bezemer, D., Balcombe, K., Davis, J. & Fraser, I., 2005. Livelihoods and farm efficiency in rural
        Georgia. Applied Economics, Volume 37, pp. 1737-1745.
Chavas, J.-P., Petrie, R. & Roth, M., 2005. Farm Household Production Efficiency: Evidence
        from The Gambia. American Journal of Agricultural Economics, 87(1), pp. 160-179.
Coelli, T. & Fleming, E., 2004. Diversification economies and specialisation efficiencies in a
        mixed food and coffee smallholder farming system in Papua New Guinea. Agricultural
        Economics, 31(2-3), pp. 229-239.
Deichmann, U., Shilpi, F. & Vakis, R., 2009. Urban Proximity, Agricultural Potential and Rural
        Non-farm Employment: Evidence from Bangladesh. World Development, 37(3), pp. 645-
        660.
DoF, 2022. Yearbook of Fisheries Statistics of Bangladesh, 2020-21, Bangladesh: Fisheries
        Resources Survey System (FRSS), Department of Fisheries. Bangladesh: Ministry of
        Fisheries and Livestock.
Emran, S.-A. et al., 2022. Impact of cropping system diversification on productivity and resource
        use efficiencies of smallholder farmers in south-central Bangladesh: a multi-criteria
        analysis. Agronomy for Sustainable Development, 42(4), p. 78.
Faxon, H. O., 2020. The Peasant and Her Smartphone: Agrarian Change and Land Politics in
        Myanmar, s.l.: s.n.
Heckman, J. J., 1979. Sample selection bias as a specification error. Econometrica: Journal of
        the Econometric Society, pp. 153-161.
Henderson, H., 2014. Considering Technical and Allocative Efficiency in the Inverse Farm Size–
        Productivity Relationship. Journal of Agricultural Economics, 66(2), pp. 442-469.
Huang, J., Wu, Y. & Rozelle, S., 2009. Moving off the farm and intensifying agricultural
        production in Shandong: a case study of rural labor market linkages in China.
        Agricultural Economics, Volume 40, pp. 203-218.
Islam, A. H. M. S., 2021. Dynamics and Determinants of Participation in Integrated
        Aquaculture–Agriculture Value Chain: Evidence from a Panel Data Analysis of
        Indigenous Smallholders in Bangladesh. The Journal of Development Studies,
        57(11), pp. 1871-1892.
Jahan, K. M. et al., 2015. Aquaculture technologies in Bangladesh: An assessment of technical
        and economic performance and producer behavior, Penang, Malaysia: WorldFish.
Kennedy, E. T. & Cogill, B., 1987. Income and nutritional effects of the commercialization of
        agriculture in southwestern Kenya. s.l.:Intl Food Policy Res Inst.
Kilic, T., Carletto, C., Miluka, J. & Savastano, S., 2009. Rural non-farm income and its impact
        on agriculture: evidence from Albania. Agricultural Economics, Volume 40, pp. 139-160.
Kumbhakar, S. C., Parmeter, C. F. & Zelenyuk, V., 2020. Stochastic frontier analysis:
        Foundations and advances I. Handbook of production economics, pp. 1-40.
Lanjouw, P. & Shariff, A., 2004. Rural non-farm employment in India: Access, incomes, and
        poverty impact. Economic and Political Weekly, pp. 4429-4446.
Mondal, R. K., Selvanathan, E. A. & Selvanathan, S., 2020. Nexus between rural non-farm
        income and agricultural production in Bangladesh. Applied Economics, 53(10), pp. 1184-
        1199.
Pfeiffer, L., López-Feldman, A. & Taylor, J. E., 2009. Is off-farm income reforming the farm?
        Evidence from Mexico. Agricultural Economics, Volume 40, pp. 125-138.
Rahman, S., Kazal, M. M. H., Begum, I. A. & Alam, M. J., 2017. Exploring the future potential
        of jute in Bangladesh. Agriculture, 7(12), p. 96.
Schreinemachers, P. et al., 2016. Farmer training in off-season vegetables: Effects on income
        and pesticide use in Bangladesh. Food Policy, Volume 61, pp. 132-140.
Stevenson, R. E., 1980. Likelihood functions for generalized stochastic frontier estimation.
        Journal of Econometrics, 13(1), pp. 57-66.
Takahashi, K. & Otsuka, K., 2009. The increasing importance of non-farm income and the
        changing use of labor and capital in rice farming: the case of Central Luzon, 1979–2003.
        Agricultural Economics, 40, pp. 231-242.
Von Braun, J. & Kennedy, E. T., 1994. Agricultural commercialization, economic development,
        and nutrition, s.l.: Johns Hopkins University Press.
Wooldridge, J. M., 2019. Correlated random effects models with unbalanced panels. Journal of
        Econometrics, 211(1), pp. 137-150.
Yang, J. et al., 2016. Migration, local off-farm employment, and agricultural production
        efficiency: evidence from China. Journal of Productivity Analysis, 45, pp. 247-259.

                                   APPENDIX A: TABLES AND FIGURES
Table 2.1
                                             Description of key variables
  Variable                                 Description
  Yield (kg/ha)                            Total quantity of harvested fish (including shrimp and
                                           prawns) from the whole farm in last production cycle per
                                           hectare of pond water area.
  Familial labor (days/ha)                 Total person-days of household labor employed per hectare
                                           spent on activities such as pond and dyke repair, stocking,
                                           feeding, fertilizer application, weeding, guarding,
                                           harvesting, and marketing.
  Hired labor (days/ha)                    Total person-days of hired labor per hectare.
  Fish seed stocked (kg/ha)                Total quantity of fish seed stocked in the last production
                                           cycle per hectare.
  Feed inputs (kg/ha)                      Total quantity of both commercial pelleted and own farm-
                                           made feed applied over last cropping cycle per hectare.
  Non-feed inputs (kg/ha)                  Total quantity of urea, NPK18, TSP, DAP, cow dung, lime,
                                           salt, and other organic manure applied per hectare.19
  Fish price (BDT/kg)                      Price per kilogram of fish sold in Bangladeshi Taka
  Wage rate (BDT/day)                      Daily wage rate in Bangladeshi Taka
  Fish farm income share                   Share of total household income from fish sales
  Crop farm income share                   Share of total household income from crop sales
  Non-farm income share                    Share of total household income from non-farm sources
                                           (i.e., wage and self-employment as well as remittances)
  Crop diversification dummy (0/1)         An indicator variable which takes the value 1 if household
                                           produces any crop, mostly rice and vegetables
  Simpson Diversification Index20          Degree of crop diversification (0 indicates no crop
                                           diversification)
  Marketed surplus share (%)               Share of harvested fish that is sold
  Distance to nearest road (km)            Mean distance of fish plots from the nearest road in
                                           kilometers
18 NPK denotes nitrogen, phosphorus, and potassium; TSP denotes triple super phosphate; DAP denotes diammonium
phosphate.
19 We also explored the sensitivity of our main results to alternative feed and nonfeed input use variables. In particular,
we use feed and nonfeed input values in place of the quantity measures. While these alternative input variables may
better reflect input quality, the model fit for our stochastic frontier production function (SFPF) regressions is slightly
worse. Further, the input parameter estimates are attenuated. This result could be partially attributed to a worsening
of the measurement error problem, especially for the fixed effects estimator, since the input values incorporate self-
reported price information. Hence, we prefer the quantity-based feed and nonfeed input measures.
                                                              77


Table 2.1 (cont’d)
Distance to household (km) Mean distance of fish plots from the household’s residence
                           in kilometers
Household size             Total number of adults and children living in the household
Dependency ratio           Number of household members aged under 15 or 65 and over,
                           divided by the number aged 15-64
Gender (Male = 1)          Household head’s gender
Education (years)          Household head’s years of schooling
Off-farm (0/1)             Dummy for off-farm work participation
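
The Simpson Diversification Index listed in Table 2.1 can be illustrated with a short calculation. The sketch below uses the standard form of the index, one minus the sum of squared shares; whether the shares in this study are based on cropped area, revenue, or another measure is not stated in the table, so the inputs here are hypothetical, as is the function name.

```python
def simpson_index(values):
    """Standard Simpson diversification index: 1 - sum(share_i ** 2).

    `values` holds non-negative amounts per crop (hypothetical units,
    e.g. area or revenue; the study's exact basis is an assumption here).
    """
    total = sum(values)
    if total <= 0:
        return 0.0  # no crop activity is treated as no diversification
    shares = [v / total for v in values]
    return 1.0 - sum(s * s for s in shares)


single_crop = simpson_index([2.0])       # one crop only
two_equal = simpson_index([1.0, 1.0])    # two crops with equal shares
rice_heavy = simpson_index([3.0, 1.0])   # 75% / 25% split

print(single_crop, two_equal, rice_heavy)
```

A household growing a single crop scores 0, consistent with the table's note that 0 indicates no crop diversification, and the index rises as production is spread more evenly across crops.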


Table 2.2
                    Summary statistics on key variables by crop diversification status
                                                                  Total         Crop diversification status
                                                                                 Diversified       Not diversified
 Production variables²¹
          Yield (kg/ha)                                         2189.46           2274.81                1997.65
          Familial labor (days/ha)*                              805.33            882.97                 630.85
          Hired labor (days/ha)                                   54.78             64.17                  33.68
          Fish seed (kg/ha)                                     1626.11           1594.43                1697.32
          Feed inputs (kg/ha)*                                  2206.24           2484.90                1580.03
          Nonfeed inputs (kg/ha)                                1269.09           1071.59                1712.90
 Price variables
          Fish price (BDT/kg)*                                   234.32            217.88                 272.07
          Wage rate (BDT/day)                                    311.13            314.88                 301.34
 Household characteristics
          Gender (Male = 1)                                        0.96              0.97                   0.95
          Education (years)                                         5.4               5.6                    5.0
          Household size                                            4.6               4.7                    4.5
          Dependency ratio                                         0.60              0.61                   0.56
          Off-farm (0/1)*                                          0.60              0.63                   0.54
 Income Diversification variables
          Fish farm income share (%)*                               49                47                     54
          Crop farm income share (%)*                                8                12
          Non-farm income share (%)*                                43                41                     46
          Crop diversification (0/1)*                              0.69              1.00                   0.00
          Simpson index*                                           0.17              0.24                   0.00
 Plot-level variables
          Distance to nearest road (km)                            0.66              0.61                   0.77
          Distance to household (km)                               0.55              0.57                   0.51
 Observations                                                     1156               800                    356
Notes: Values reported are means unless otherwise stated. Monetary values are expressed in 2014 constant prices to
account for inflation. The Simpson index ranges from 0 to 1, where 0 represents no diversification. * indicates a significant
difference in means by crop diversification status at the 5% level or better.


Table 2.3
              Stochastic Frontier Production Function (SFPF) estimation results
                                     Dependent variable: ln(Yield)
                                                  (1)                           (2)
 Variable                                   Cobb-Douglas                    Translog
 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙 𝑙𝑎𝑏𝑜𝑟)                            0.180***                       -0.164
                                                (0.025)                      (0.128)
 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑 𝑙𝑎𝑏𝑜𝑟)                               0.131***                       0.147
                                                (0.026)                      (0.144)
 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑)                                 0.118***                     -0.273**
                                                (0.031)                      (0.132)
 𝑙𝑛(𝐹𝑒𝑒𝑑)                                      0.182***                      0.270*
                                                (0.024)                      (0.157)
 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)                                0.132***                      0.248*
                                                (0.020)                      (0.134)
 ½ × 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙 𝑙𝑎𝑏𝑜𝑟)²                                                     0.061***
                                                                             (0.015)
 ½ × 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑 𝑙𝑎𝑏𝑜𝑟)²                                                          0.008
                                                                             (0.028)
 ½ × 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑)²                                                            0.033
                                                                             (0.023)
 ½ × 𝑙𝑛(𝐹𝑒𝑒𝑑)²                                                                -0.011
                                                                             (0.020)
 ½ × 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)²                                                        -0.039**
                                                                             (0.015)
 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙) × 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑)                                                     -0.001
                                                                             (0.009)
 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙) × 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑)                                                0.028*
                                                                             (0.016)
 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙) × 𝑙𝑛(𝐹𝑒𝑒𝑑)                                                     -0.013*
                                                                             (0.007)
 𝑙𝑛(𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙) × 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)                                               -0.016*
                                                                             (0.009)
 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑) × 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑)                                                    -0.017
                                                                             (0.014)
 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑) × 𝑙𝑛(𝐹𝑒𝑒𝑑)                                                         0.002
                                                                             (0.006)
 𝑙𝑛(𝐻𝑖𝑟𝑒𝑑) × 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)                                                   0.009
                                                                             (0.009)


 Table 2.3 (cont’d)
 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑) × 𝑙𝑛(𝐹𝑒𝑒𝑑)                                                                        -0.001
                                                                                                (0.010)
 𝑙𝑛(𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑) × 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)                                                                  0.023
                                                                                                (0.014)
 𝑙𝑛(𝐹𝑒𝑒𝑑) × 𝑙𝑛(𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑)                                                                       0.001
                                                                                                (0.006)
 𝑦𝑟2020                                                    -0.245**                              -0.042
                                                            (0.111)                             (0.119)
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                                  2.888***                           4.681***
                                                            (0.239)                             (0.923)
 Output elasticity wrt
          𝐹𝑎𝑚𝑖𝑙𝑖𝑎𝑙 𝑙𝑎𝑏𝑜𝑟                                                                      0.634***
                                                                                                (0.116)
          𝐻𝑖𝑟𝑒𝑑 𝑙𝑎𝑏𝑜𝑟                                                                            0.155
                                                                                                (0.123)
          𝐹𝑖𝑠ℎ 𝑠𝑒𝑒𝑑                                                                            0.497**
                                                                                                (0.202)
          𝐹𝑒𝑒𝑑                                                                                   0.023
                                                                                                (0.174)
          𝑁𝑜𝑛 − 𝑓𝑒𝑒𝑑                                                                             -0.188
                                                                                                (0.120)
 𝐻₀: Constant returns to scale
          LR-statistic (p-value)                                                             0.12 (0.731)
 Joint test of significance of second-order terms (𝛃ⱼₖ = 0)
          LR-statistic                                                                        94.76***
 Household FE                                                    ✓                                       ✓
 Log pseudolikelihood                                      -1305.26                            -1257.88
 Observations                                                1,158                               1,158
Notes: Standard errors are clustered at the household level and are reported in parentheses. LR-statistic denotes the
likelihood ratio statistic. We control for household fixed effects using the Mundlak-Chamberlain approach. ∗ 𝑝 <
0.10, ∗∗ 𝑝 < 0.05, ∗∗∗ 𝑝 < 0.01. wrt denotes “with respect to.”
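
The likelihood ratio statistics in Table 2.3 can be checked against the reported log pseudolikelihoods. The sketch below computes the statistic for the joint test as twice the difference between the translog and Cobb-Douglas log pseudolikelihoods, and then evaluates a chi-square p-value for the constant-returns-to-scale statistic. The single degree of freedom used for that p-value is an assumption (it is consistent with the reported p-value but is not stated in the table), and the function names are illustrative.

```python
import math


def lr_statistic(loglik_restricted, loglik_unrestricted):
    """Likelihood ratio statistic: 2 * (lnL_unrestricted - lnL_restricted)."""
    return 2.0 * (loglik_unrestricted - loglik_restricted)


def chi2_sf_df1(x):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom.

    For 1 df, P(X > x) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# Log pseudolikelihoods reported in Table 2.3 (Cobb-Douglas vs. translog).
lr_translog = lr_statistic(-1305.26, -1257.88)

# Reported LR statistic for constant returns to scale; 1 df is assumed.
p_crs = chi2_sf_df1(0.12)

print(round(lr_translog, 2), round(p_crs, 3))
```

The first value reproduces the reported joint-test statistic of 94.76. The second comes out close to 0.73; it differs slightly from the reported 0.731 because the statistic itself is only available rounded to two decimals.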


Table 2.4
             Determinants of Technical inefficiency from Translog SFPF
                                       Dependent variable: Technical inefficiency
                                                 Coefficient estimates
                                                          (S.E.)
 Variables                            (1)               (2)                 (3)
 𝐶𝐷𝐼                                 0.225             0.542              -1.469
                                   (0.684)           (0.987)             (1.138)
 𝑅𝑁𝐹𝐸                                1.543           -10.255              -0.792
                                   (1.068)           (6.479)             (1.590)
 𝑅𝑁𝐹𝐸²                                               13.143*
                                                     (7.636)
 𝐶𝐷𝐼 × 𝑅𝑁𝐹𝐸                                                                3.337
                                                                         (2.091)
 𝐴𝑁𝑌_𝑃𝑅𝐴𝑊𝑁                         1.815**            2.466*              1.672*
                                   (0.812)           (1.384)             (0.935)
 𝑆𝐻𝑅𝐼𝑀𝑃_𝑁𝑂_𝑃𝑅𝐴𝑊𝑁                     1.712             2.077               1.696
                                   (1.217)           (1.713)             (1.354)
 𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
           𝑄2                     -3.012**            -2.549             -3.150*
                                   (1.243)           (1.858)             (1.668)
           𝑄3                    -4.874***           -5.382*            -5.072**
                                   (1.714)           (3.186)             (2.390)
           𝑄4                    -4.684***           -4.928*            -4.896**
                                   (1.643)           (2.860)             (2.248)
           𝑄5                    -4.301***           -4.700*            -4.381**
                                   (1.484)           (2.615)             (1.985)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                          -0.910            -1.323              -0.969
                                   (0.662)           (1.110)             (0.783)
 𝐺𝐸𝑁𝐷𝐸𝑅                             -2.529            -2.637              -2.909
                                   (1.907)           (2.538)             (2.379)
 𝐸𝐷𝑈𝐶                             -0.222**            -0.286             -0.235*
                                   (0.104)           (0.187)             (0.138)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                        1.149***            1.509*             1.142**
                                   (0.362)           (0.809)             (0.517)
 𝐷𝐼𝑆𝑇_𝐻𝐻                            -0.021             0.006              -0.022
                                   (0.145)           (0.208)             (0.158)
 𝑦𝑟2020                              5.277             9.169               5.669
                                   (4.727)           (8.465)             (5.683)


  Table 2.4 (cont’d)
  𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                           -7.140              -11.516                    -6.291
                                                    (4.864)               (9.194)                  (5.842)
  𝜎ᵤ                                               2.162***             2.393***                  2.224***
                                                    (0.278)               (0.569)                  (0.442)
  𝜎ᵥ                                               0.520***             0.527***                  0.519***
                                                    (0.024)               (0.029)                  (0.026)
  𝜆 = 𝜎ᵤ/𝜎ᵥ                                        4.159***             4.539***                  4.286***
                                                    (0.274)               (0.557)                  (0.433)
  District FE                                            ✓                     ✓                        ✓
  Observations                                        1,158                1,158                     1,158
Notes: Negative coefficients indicate a decline in technical inefficiency with a marginal increase in the variable of
interest. Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.
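
Column (2) of Table 2.4 includes both RNFE and its square, so the implied association with technical inefficiency changes sign at RNFE = -b1 / (2 × b2), where b1 and b2 are the linear and squared coefficients. The sketch below evaluates that turning point from the reported point estimates. It assumes RNFE is measured as a share between 0 and 1, which this table does not state, and the function names are illustrative. Since the linear coefficient is not statistically significant, the result should be read as indicative only.

```python
def turning_point(b_linear, b_squared):
    """Value of x at which b_linear * x + b_squared * x**2 has zero slope."""
    if b_squared == 0:
        raise ValueError("no turning point without a squared term")
    return -b_linear / (2.0 * b_squared)


def marginal_effect(b_linear, b_squared, x):
    """Slope of the quadratic at x: b_linear + 2 * b_squared * x."""
    return b_linear + 2.0 * b_squared * x


# Point estimates from Table 2.4, column (2).
B_RNFE = -10.255
B_RNFE_SQ = 13.143

rnfe_star = turning_point(B_RNFE, B_RNFE_SQ)
slope_low = marginal_effect(B_RNFE, B_RNFE_SQ, 0.20)   # below the turning point
slope_high = marginal_effect(B_RNFE, B_RNFE_SQ, 0.60)  # above the turning point

print(round(rnfe_star, 3), slope_low < 0, slope_high > 0)
```

Under these assumptions the turning point is about 0.39: below it, the point estimates associate more RNFE with lower technical inefficiency, and above it with higher technical inefficiency.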


Table 2.5
                              Allocative efficiency regression using FE-OLS
                                                         Dependent variable: Allocative efficiency
                                                                     Coefficient estimates
                                                                               (S.E.)
 Variables                                              (1)                        (2)                    (3)
 𝐶𝐷𝐼                                                  0.059*                     0.058*               0.145***
                                                     (0.033)                    (0.033)                 (0.043)
 𝑅𝑁𝐹𝐸                                                  0.054                      0.164               0.198***
                                                     (0.049)                    (0.158)                 (0.074)
 𝑅𝑁𝐹𝐸²                                                                          -0.128
                                                                                (0.176)
 𝐶𝐷𝐼 × 𝑅𝑁𝐹𝐸                                                                                           -0.217***
                                                                                                        (0.083)
 𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
                    𝑄2                                -0.056                     -0.060                 -0.059
                                                     (0.047)                    (0.048)                 (0.046)
                    𝑄3                                -0.054                     -0.060                 -0.052
                                                     (0.049)                    (0.050)                 (0.048)
                    𝑄4                             -0.168***                  -0.174***               -0.159***
                                                     (0.049)                    (0.050)                 (0.049)
                    𝑄5                               -0.095*                   -0.100**                -0.091*
                                                     (0.050)                    (0.050)                 (0.049)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                          0.091***                   0.091***               0.089***
                                                     (0.031)                    (0.031)                 (0.031)
 𝐸𝐷𝑈𝐶                                                  0.006                      0.006                  0.006
                                                     (0.005)                    (0.005)                 (0.005)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                                           0.030**                    0.029**                0.032**
                                                     (0.014)                    (0.014)                 (0.014)
 𝐷𝐼𝑆𝑇_𝐻𝐻                                             -0.022*                    -0.021*                -0.022*
                                                     (0.013)                    (0.013)                 (0.012)
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸/109                                 1.24e-03***                1.22e-03***             1.22e-03***
                                                  (4.42e-04)                  (4.48e-04)             (4.36e-04)
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸²/10K                               -3.83e-07***                 -3.77e-07**            -3.79e-07**
                                                  (1.44e-07)                  (1.46e-07)             (1.42e-07)
 𝑦𝑟2020                                            -0.333***                  -0.337***               -0.324***
                                                     (0.023)                    (0.023)                 (0.023)
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                           0.418***                   0.417***               0.352***
                                                     (0.061)                    (0.061)                 (0.065)
 Household FE                                             ✓                          ✓                      ✓
 Observations                                          1,109                      1,109                  1,109
Notes: Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.
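
Column (3) of Table 2.5 interacts CDI with RNFE, so the estimated association between crop diversification and allocative efficiency depends on the level of RNFE: it equals 0.145 - 0.217 × RNFE. The sketch below evaluates this expression and the RNFE level at which it reaches zero. It assumes RNFE is a share between 0 and 1; the evaluation point of 0.43 is the sample mean non-farm income share from Table 2.2 and is used purely for illustration, on the assumption that RNFE corresponds to that share. Standard errors for the combined effect are not computed here because they require the coefficient covariance, which is not reported.

```python
def conditional_effect(b_main, b_interaction, moderator):
    """Effect of the main variable at a given moderator level."""
    return b_main + b_interaction * moderator


def zero_crossing(b_main, b_interaction):
    """Moderator level at which the conditional effect equals zero."""
    if b_interaction == 0:
        raise ValueError("effect does not vary with the moderator")
    return -b_main / b_interaction


# Point estimates from Table 2.5, column (3).
B_CDI = 0.145
B_CDI_X_RNFE = -0.217

effect_no_rnfe = conditional_effect(B_CDI, B_CDI_X_RNFE, 0.0)
effect_at_mean = conditional_effect(B_CDI, B_CDI_X_RNFE, 0.43)  # illustrative level
rnfe_threshold = zero_crossing(B_CDI, B_CDI_X_RNFE)

print(round(effect_no_rnfe, 3), round(effect_at_mean, 3), round(rnfe_threshold, 3))
```

On these point estimates, the positive association between crop diversification and allocative efficiency shrinks as RNFE rises and reaches zero at an RNFE level of roughly 0.67.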


Table 2.6
                               Effect of diversification on fish input demand
                                                              Dependent variables
                                 ln(familial          ln(hired        ln(fish seed       ln(feed       ln(nonfeed
                                    labor)             labor)            in BDT)          in BDT)         in BDT)
                                                              Coefficient estimates
 Variables                                                             (S.E.)
 𝐶𝐷𝐼                               0.470**             1.236                0.459          -0.758          -0.078
                                   (0.202)            (1.772)             (0.246)          (0.568)        (0.335)
 𝑅𝑁𝐹𝐸                              0.670**             1.029               -0.751         -1.599*          -0.692
                                   (0.323)            (2.661)             (0.599)          (0.863)        (0.626)
 𝐶𝐷𝐼 × 𝑅𝑁𝐹𝐸                         -0.422             -2.145              -0.205           1.189          0.277
                                   (0.337)            (3.060)             (0.546)          (0.995)        (0.722)
 𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
             𝑄2                     -0.190             1.136              0.511*            0.773        1.349***
                                   (0.223)            (1.669)             (0.291)          (0.592)        (0.413)
             𝑄3                     -0.210            4.185**            0.725**            0.508        1.350***
                                   (0.197)            (1.641)             (0.294)          (0.568)        (0.415)
             𝑄4                    -0.359*            3.782**               0.405           0.764        1.517***
                                   (0.196)            (1.624)             (0.290)          (0.540)        (0.388)
             𝑄5                  -0.594***            4.146**            0.665**            0.394        1.739***
                                   (0.206)            (1.719)             (0.280)          (0.574)        (0.405)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                           0.092           2.840***              -0.054          -0.206          -0.174
                                   (0.103)            (1.077)             (0.226)          (0.292)        (0.182)
 𝐸𝐷𝑈𝐶                            -0.079***             -0.066              -0.011          -0.022          0.028
                                   (0.022)            (0.168)             (0.020)          (0.059)        (0.040)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                           0.019             0.293               -0.012           0.091          0.056
                                   (0.058)            (0.452)             (0.056)          (0.151)        (0.089)
 𝐷𝐼𝑆𝑇_𝐻𝐻                            -0.129             0.235               -0.039          -0.014          0.099
                                   (0.084)            (0.388)             (0.052)          (0.149)        (0.086)
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸/109                 -7.51e-04*       -6.73e-03***        -6.50e-04**         1.16e-03      -1.38e-04
                                 (3.93e-04)         (1.39e-03)         (3.04e-04)       (8.35e-04)      (6.85e-04)
 𝑦𝑟2020                           2.205***          16.502***           -5.336***        7.481***        2.750***
                                   (0.101)            (0.773)             (0.143)          (0.274)        (0.165)
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                         4.565***             -0.637          16.449***         4.131***        5.341***
                                   (0.260)            (2.236)             (0.377)          (0.767)        (0.529)
 Household FE                           ✓                  ✓                   ✓               ✓              ✓
 Observations                        1,158             1,158                1,158           1,158          1,158
Notes: Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.


Table 2.7
                                    Effect of RNFE on crop diversification
                                                            Dependent variable: Crop diversification (0/1)
                                                                             Coefficient estimates
 Variables                                                                            (S.E.)
 𝑅𝑁𝐹𝐸                                                                              -0.137**
                                                                                     (0.065)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                                                             0.009
                                                                                     (0.034)
 𝐸𝐷𝑈𝐶                                                                                  0.006
                                                                                     (0.008)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                                                                            -0.001
                                                                                     (0.026)
 𝐷𝐼𝑆𝑇_𝐻𝐻                                                                               0.013
                                                                                     (0.017)
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸/109                                                                    5.35e-04
                                                                                  (7.65e-04)
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸 < /10K                                                                -1.14e-07
                                                                                  (2.50e-07)
 𝑦𝑟2020                                                                            0.141***
                                                                                     (0.031)
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                                                          0.640***
                                                                                     (0.062)
 Household FE                                                                             ✓
 Observations                                                                          1,160
Notes: Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.


Figure 2.1: Map of sampled Bangladeshi districts


Figure 2.2: Distribution of Technical and Allocative Efficiency estimates
[Density plot of technical efficiency (TE) and allocative efficiency (AE) scores. Horizontal axis: Allocative, Technical Efficiency Scores (0 to 1); vertical axis: Density.]


                           APPENDIX B: SUPPLEMENTARY TABLES
Table 2.8
                                       Two-step Heckman Correction
                                             Dependent variable: Off-farm work participation (0/1)
 Variable                                                      Coeff. estimates                 S.E.
 Panel A: Probit selection equation
 𝐹𝐼𝑆𝐻_𝑃𝑅𝐼𝐶𝐸                                                         0.0006**                  0.0002
 𝐻𝐻𝑆𝐼𝑍𝐸                                                                0.029                   0.027
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                                          -0.184**                   0.091
 𝐸𝐷𝑈𝐶                                                                -0.020*                   0.011
 𝐺𝐸𝑁𝐷𝐸𝑅                                                              0.699**                   0.278
 𝑦𝑟2020                                                             2.229***                   0.108
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                                         -1.532***                    0.304
 Observations                                                              1,138
                                             Dependent variable: Allocative inefficiency (AI)
 Variable                                                      Coeff. estimates                 S.E.
 Panel B: AI equation
 𝐻𝐻𝑆𝐼𝑍𝐸                                                             0.060***                   0.015
 𝐸𝐷𝑈𝐶                                                              -0.021***                   0.007
 𝐺𝐸𝑁𝐷𝐸𝑅                                                                0.255                   0.157
 𝐸𝑋𝑃𝐸𝑅                                                                 0.001                   0.003
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                                                            0.039*                   0.021
 𝐷𝐼𝑆𝑇_𝐻𝐻                                                           -0.160***                   0.025
 𝐼𝑀𝑅                                                                1.952***                   0.151
 𝑦𝑟2020                                                             4.675***                   0.192
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                                         -3.696***                    0.335
 Observations                                                              1,138
Notes: 𝐼𝑀𝑅 denotes the inverse Mills ratio. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05, ∗∗∗ 𝑝 < 0.01.
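
The inverse Mills ratio that enters Panel B is, in the usual two-step Heckman procedure, the ratio of the standard normal density to the standard normal distribution function evaluated at the fitted probit index. The sketch below builds an index from the Panel A point estimates and computes the ratio for a purely hypothetical household whose characteristics are set near the sample means in Table 2.2. The formula for the selected (participating) group is assumed, since the table does not state which form was used, and the function names are illustrative.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


def inverse_mills_ratio(index):
    """IMR for the selected group: pdf(index) / cdf(index)."""
    return normal_pdf(index) / normal_cdf(index)


# Probit point estimates from Table 2.8, Panel A.
coefficients = {
    "constant": -1.532, "fish_price": 0.0006, "hhsize": 0.029,
    "dep_ratio": -0.184, "educ": -0.020, "gender": 0.699, "yr2020": 2.229,
}
# Hypothetical household (values chosen near the sample means; an assumption).
household = {
    "constant": 1.0, "fish_price": 234.32, "hhsize": 4.6,
    "dep_ratio": 0.60, "educ": 5.4, "gender": 0.96, "yr2020": 0.0,
}

probit_index = sum(coefficients[k] * household[k] for k in coefficients)
imr = inverse_mills_ratio(probit_index)

print(round(probit_index, 3), round(imr, 3))
```

A lower probit index, meaning a lower predicted probability of off-farm work participation, produces a larger inverse Mills ratio, which is the term the second-stage equation uses to correct for selection.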


Table 2.9
     Determinants of Technical inefficiency using alternative crop diversification variable
                                            Dependent variable: Technical inefficiency
                                                      Coefficient estimates
                                                               (S.E.)
 Variables                                  (1)              (2)                  (3)
 𝑆𝑖𝑚𝑝𝑠𝑜𝑛 𝑖𝑛𝑑𝑒𝑥                            -1.099           -0.960              -5.471*
                                         (1.228)          (1.313)              (2.974)
 𝑅𝑁𝐹𝐸                                      1.392         -8.984**               0.120
                                         (1.075)          (3.736)              (0.953)
 𝑅𝑁𝐹𝐸²                                                  11.384***
                                                          (4.069)
 𝑆𝑖𝑚𝑝𝑠𝑜𝑛 𝑖𝑛𝑑𝑒𝑥 × 𝑅𝑁𝐹𝐸                                                           8.599*
                                                                               (5.044)
 𝐴𝑁𝑌_𝑃𝑅𝐴𝑊𝑁                              1.758**          2.165***               1.579*
                                         (0.881)          (0.781)              (0.857)
 𝑆𝐻𝑅𝐼𝑀𝑃_𝑁𝑂_𝑃𝑅𝐴𝑊𝑁                           1.636            1.791               1.703
                                         (1.251)          (1.264)              (1.257)
 𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
              𝑄2                        -2.913**         -2.263**              -2.838*
                                         (1.439)          (1.132)              (1.548)
              𝑄3                        -4.751**        -4.785***             -4.484**
                                         (2.072)          (1.441)              (2.241)
              𝑄4                        -4.606**        -4.424***             -4.402**
                                         (1.961)          (1.426)              (2.092)
              𝑄5                        -4.226**        -4.217***             -4.074**
                                         (1.742)          (1.288)              (1.882)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                -0.865           -1.142               -0.810
                                         (0.703)          (0.700)              (0.701)
 𝐺𝐸𝑁𝐷𝐸𝑅                                   -2.467           -2.314               -2.090
                                         (1.995)          (1.789)              (2.005)
 𝐸𝐷𝑈𝐶                                    -0.217*        -0.253***               -0.209
                                         (0.117)          (0.095)              (0.127)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                              1.114***         1.317***              0.983**
                                         (0.427)          (0.274)              (0.448)
 𝐷𝐼𝑆𝑇_𝐻𝐻                                  -0.022            0.013               -0.034
                                         (0.141)          (0.169)              (0.131)
 𝑦𝑟2020                                    5.408            8.077               4.371
                                         (4.834)          (5.142)              (4.455)
  𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                           -6.675               -9.284*                   -4.975
                                                     (4.975)               (4.840)                 (4.560)
  𝜎T                                               2.143***              2.295***                 2.078***
                                                     (0.372)               (0.153)                 (0.448)
  𝜎S                                               0.519***              0.527***                 0.513***
                                                     (0.025)               (0.023)                 (0.026)
  𝜆 = 𝜎T/𝜎S                                        4.132***              4.359***                 4.053***
                                                     (0.365)               (0.157)                 (0.437)
  District FE                                            ✓                     ✓                       ✓
  Observations                                        1,158                 1,158                   1,158
Notes: Negative coefficients indicate a decline in technical inefficiency with a marginal increase in the variable of
interest. Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.


Table 2.10
         Allocative efficiency regression using alternative crop diversification variable
                                               Dependent variable: Allocative efficiency
                                                        Coefficient estimates
                                                                (S.E.)
 Variables                                        (1)                   (2)               (3)
 𝑆𝑖𝑚𝑝𝑠𝑜𝑛 𝑖𝑛𝑑𝑒𝑥                                   0.055                0.054             0.150*
                                               (0.054)               (0.054)           (0.083)
 𝑅𝑁𝐹𝐸                                            0.048                0.169              0.084
                                               (0.049)               (0.158)           (0.055)
       <
 𝑅𝑁𝐹𝐸                                                                 -0.140
                                                                     (0.175)
 𝑆𝑖𝑚𝑝𝑠𝑜𝑛 𝑖𝑛𝑑𝑒𝑥 × 𝑅𝑁𝐹𝐸                                                                   -0.220
                                                                                       (0.146)
 𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
               𝑄2                               -0.059                -0.063            -0.056
                                               (0.047)               (0.048)           (0.047)
               𝑄3                               -0.052                -0.058            -0.049
                                               (0.049)               (0.051)           (0.049)
               𝑄4                            -0.171***             -0.177***         -0.165***
                                               (0.049)               (0.050)           (0.049)
               𝑄5                             -0.099**              -0.104**           -0.095*
                                               (0.050)               (0.050)           (0.050)
 𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                    0.090***              0.091***          0.090***
                                               (0.031)               (0.031)           (0.031)
 𝐸𝐷𝑈𝐶                                            0.006                0.006              0.007
                                               (0.005)               (0.005)           (0.005)
 𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                                     0.031**               0.030**           0.031**
                                               (0.014)               (0.014)           (0.014)
 𝐷𝐼𝑆𝑇_𝐻𝐻                                       -0.022*               -0.021*            -0.020
                                               (0.012)               (0.012)           (0.012)
                     9
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸/10                             1.25e-03***           1.23e-03***       1.24e-03***
                                             (4.42e-04)            (4.47e-04)        (4.49e-04)
                 <     K
 𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸 /10                           -3.82e-07***           -3.77e-07**       -3.76e-07**
                                             (1.44e-07)            (1.46e-07)        (1.47e-07)
 𝑦𝑟2020                                      -0.333***             -0.338***         -0.335***
                                               (0.025)               (0.025)           (0.025)
 𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                     0.453***              0.450***          0.431***
                                               (0.057)               (0.057)           (0.059)
 Household FE                                       ✓                     ✓                 ✓
  Observations                                          1,109                      1,109                1,109
Notes: Standard errors are reported in parentheses and are clustered at the household level. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.


Table 2.11
                                     Tobit allocative efficiency regression
                                          Dependent variable: Allocative efficiency
                                               Coefficient estimate                   Avg. marginal effect
  Variable                                               (S.E)
  𝐶𝐷𝐼                                                  0.056*                                  0.047*
                                                       (0.032)                                 (0.026)
  𝑅𝑁𝐹𝐸                                                   0.075                                  0.063
                                                       (0.052)                                 (0.044)
  𝑄𝑈𝐼𝑁𝑇_𝐶𝑂𝑀𝑀:
                  𝑄2                                    -0.043                                 -0.037
                                                       (0.028)                                 (0.024)
                  𝑄3                                    -0.008                                 -0.007
                                                       (0.028)                                 (0.024)
                  𝑄4                                 -0.077***                               -0.065***
                                                       (0.029)                                 (0.025)
                  𝑄5                                    -0.036                                 -0.031
                                                       (0.028)                                 (0.024)
  𝐷𝐸𝑃_𝑅𝐴𝑇𝐼𝑂                                           0.079***                                0.067***
                                                       (0.028)                                 (0.024)
  𝐸𝐷𝑈𝐶                                                   0.004                                  0.003
                                                       (0.004)                                 (0.004)
  𝐷𝐼𝑆𝑇_𝑅𝑂𝐴𝐷                                              0.022                                  0.018
                                                       (0.014)                                 (0.012)
  𝐷𝐼𝑆𝑇_𝐻𝐻                                               -0.017                                 -0.014
                                                       (0.017)                                 (0.015)
  𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸/10!                                    1.26e-03**                             1.07e-03**
                                                     (6.23e-04)                              (5.21e-04)
  𝐴𝑆𝑆𝐸𝑇_𝑉𝐴𝐿𝑈𝐸 " /10#                                  -3.92e-07
                                                     (2.12e-06)
  𝑦𝑟2020                                             -0.346***
                                                       (0.022)
  𝐶𝑜𝑛𝑠𝑡𝑎𝑛𝑡                                            0.450***
                                                       (0.038)
  Household FE                                                                  ✓
  Observations                                                              1,109
Notes: Standard errors are reported in parentheses and are obtained via bootstrapping with 500 replications. We control
for household-specific heterogeneity using Mundlak-Chamberlain’s Correlated Random Effects (CRE) approach by
including the time averages of all the right-hand-side variables in the regression equation. ∗ 𝑝 < 0.10, ∗∗ 𝑝 < 0.05,
∗∗∗ 𝑝 < 0.01.


   CHAPTER 3: PARENTAL EDUCATIONAL ATTAINMENT AND CHILD LABOR:
                                  EVIDENCE FROM MALAWI
3.1 Introduction
        Does child labor respond inversely to parental education? If so, whose education matters
more, and for which forms of child labor? Curbing child labor in all its forms remains an elusive
undertaking, especially in low- and middle-income settings. For the first time in nearly two
decades, the global effort against child labor has stalled (ILO, 2021). Recent International [Labor]
Organization (ILO) statistics indicate that 160 million children aged 5 – 17 were involved in child
labor—up from 152 million in 2016 (ILO, 2017). This trend, however, masks significant
heterogeneity in child labor prevalence at the sub-regional level. At present, child labor mitigation
appears more challenging across many sub-Saharan African, South Asian, and Latin American
countries (see Figure 3.1). Child labor is especially pronounced in sub-Saharan Africa, where 1 in
every 5 children aged 5 – 17 is a child laborer (ILO, 2017, 2021).
        At the same time, one of the factors widely acknowledged to hold promise for child labor
mitigation on the sub-continent is human capital acquisition. The cascading intergenerational
effects of parental education on child labor and schooling outcomes are well documented in the
development economics literature (Patrinos and Psacharopoulos, 1995; Rosati and Tzannatos, 2000;
Das and Mukherjee, 2007; Emerson and Souza, 2007; Cigno et al., 2001; Kurosaki et al., 2006).
While there exists overwhelming evidence suggesting a negative association between parental
education and child labor participation, studies that aim to establish a causal link are rare.
Potential confounders that jointly determine child labor decisions and parental education, such as
cultural inclinations and prevailing local economic activity levels, could limit the extent to
which previous research findings can inform policy.


         To overcome this identification challenge, I draw on insights from the demography
literature, wherein findings suggest that the direct influence of grandparents or lack thereof on
grandchildren’s socioeconomic outcomes hinges crucially on familial living arrangements.
Consistent with this finding, Zeng and Xie (2014) note that non-co-resident grandparents’
educational attainment has no bearing on grandchildren’s schooling outcomes conditional on
parental characteristics.
         While some studies provide support for this result (Warren and Hauser, 1997; Erola and
Moisio, 2007), others find evidence to the contrary (Jæger, 2012; Chan and Boliver, 2013). Despite
controlling for parents’ education, income, and wealth, Chan and Boliver (2013) note that
grandparents exert a significant direct effect on grandchildren’s occupational classes in Britain.
This result, however, does not account for multigenerational co-residence (Chan and Boliver,
2013). Hence, conditional on multigenerational co-residence and a range of parental and
household-level characteristics, I use grandparents’ educational attainment as a set of instruments
to exploit plausibly exogenous variation in parents’ schooling.22 Further, to explore the robustness
of my results, I apply practical methods that relax the exclusion restriction assumption in an
imperfect instruments framework (Conley et al., 2012). In particular, I derive bounds for the causal
parameters of interest when the instruments are allowed to violate the exclusion restriction.
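
The bounding idea can be illustrated with a minimal sketch of the union-of-confidence-intervals approach, one of the methods proposed by Conley et al. (2012). It assumes a single instrument, no additional controls, a continuous outcome, and a direct effect 𝛾 of the instrument on the outcome that lies in [-0.05, 0.05]. The data are simulated, and the function names, variable names, and parameter values are hypothetical; this is not the estimation code used in this chapter.

```python
import math
import random


def iv_single(y, x, z):
    """Just-identified IV slope of y on x using instrument z (intercept partialled out).

    Returns the point estimate and a homoskedastic standard error.
    """
    n = len(y)
    my, mx, mz = sum(y) / n, sum(x) / n, sum(z) / n
    yd = [v - my for v in y]
    xd = [v - mx for v in x]
    zd = [v - mz for v in z]
    zx = sum(a * b for a, b in zip(zd, xd))
    zy = sum(a * b for a, b in zip(zd, yd))
    zz = sum(a * a for a in zd)
    beta = zy / zx
    rss = sum((a - beta * b) ** 2 for a, b in zip(yd, xd))
    se = math.sqrt(rss / (n - 2) * zz) / abs(zx)
    return beta, se


def union_of_intervals(y, x, z, gammas, crit=1.96):
    """Union of 95% intervals for beta over assumed direct effects gamma of z on y."""
    lows, highs = [], []
    for g in gammas:
        # Strip the assumed direct effect of the instrument from the outcome.
        y_adj = [yi - g * zi for yi, zi in zip(y, z)]
        b, se = iv_single(y_adj, x, z)
        lows.append(b - crit * se)
        highs.append(b + crit * se)
    return min(lows), max(highs)


# Simulated data (hypothetical values, not the dissertation's sample).
random.seed(0)
n = 4000
z = [1.0 if random.random() < 0.4 else 0.0 for _ in range(n)]  # grandparent completed primary
u = [random.gauss(0, 1) for _ in range(n)]                      # unobserved confounder
x = [4 + 3 * zi + ui + random.gauss(0, 1) for zi, ui in zip(z, u)]        # parent's schooling
y = [-0.05 * xi + 0.5 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]  # outcome index

b0, se0 = iv_single(y, x, z)
strict = (b0 - 1.96 * se0, b0 + 1.96 * se0)      # exclusion restriction imposed (gamma = 0)
gammas = [-0.05 + 0.01 * k for k in range(11)]   # assumed support for gamma
relaxed = union_of_intervals(y, x, z, gammas)
print("interval with gamma = 0:", strict)
print("union over the gamma grid:", relaxed)
```

The relaxed interval is wider than the conventional one by construction; the conclusion of interest is whether it still excludes zero once the instrument is allowed a direct effect of the assumed size.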
         Using quasi-random access to education in Nyasaland (now Malawi) during the colonial
period (mostly between 1859 and 1964), I present evidence on the effect of parental education on
child labor outcomes. This setting is opportune for two reasons. First, the early stages of
this period coincided with the peak of the slave trade in Malawi, at the height of which men, women,
22
   That is, I use maternal grandparents’ education as instruments for mother’s schooling, whereas the paternal
grandparents’ educational levels serve as instrumental variables for the father’s schooling conditional on
multigenerational co-residence as well as parental and household-specific characteristics.


and sometimes children were abducted in organized raids. Not long after settling in Nyasaland, the
Christian missionaries realized that these slave raids were more than mere distractions from
their evangelical mandate. Relying mostly on firearms and sometimes on ransom payments, the
missionaries gradually brought the slave-raiding forays under control. As the freed populations
became the responsibility of the missionaries, they were educated in missionary
schools as part of their rehabilitation (Allen, 2008).
        Second, even after missionary education was opened up to non-slave populations, the lack
of colonial government support and the attendant resource constraints meant that missionaries
had to turn away scores of students due to limited capacity. Scarce qualified teaching personnel
and school infrastructure prompted a form of rationing of access to missionary education during
this time. Hence, in my instrumental variables (IV) analysis, I use both grandparents’ education
(represented by indicator variables taking the value 1 if a grandparent completed at least primary
school) to instrument for a given parent’s level of education. Given that there are more
instruments than potentially endogenous variables, this strategy also allows me to assess the
credibility of the exclusion restriction through an over-identification test.
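
The logic of the over-identification test can be sketched on simulated data. The example below assumes one endogenous regressor (a parent's years of schooling), two binary instruments (one per grandparent), no additional controls, and homoskedastic errors; it computes the Sargan statistic as the sample size times the R-squared from regressing the two-stage least squares (2SLS) residuals on the instruments. All names and parameter values are hypothetical, and the test variant actually used in this chapter may differ (for example, a heteroskedasticity-robust version).

```python
import math
import random


def solve(a, b):
    """Solve the linear system a * coef = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for i in range(n):
        p = max(range(i, n), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, n):
            f = m[r][i] / m[i][i]
            for c in range(i, n + 1):
                m[r][c] -= f * m[i][c]
    coef = [0.0] * n
    for i in reversed(range(n)):
        coef[i] = (m[i][n] - sum(m[i][c] * coef[c] for c in range(i + 1, n))) / m[i][i]
    return coef


def ols(cols, y):
    """OLS of y on a constant plus the given columns; returns (coefficients, fitted values)."""
    rows = [[1.0] + [col[i] for col in cols] for i in range(len(y))]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    coef = solve(xtx, xty)
    fitted = [sum(c * v for c, v in zip(coef, r)) for r in rows]
    return coef, fitted


def tsls_with_sargan(y, x, instruments):
    """2SLS slope of y on x, plus the Sargan statistic and its degrees of freedom."""
    n = len(y)
    _, x_hat = ols(instruments, x)                     # first stage
    (a, b), _ = ols([x_hat], y)                        # second stage
    resid = [yi - a - b * xi for yi, xi in zip(y, x)]  # residuals use the actual x
    _, fit_u = ols(instruments, resid)
    mean_u = sum(resid) / n
    ss_tot = sum((r - mean_u) ** 2 for r in resid)
    ss_exp = sum((f - mean_u) ** 2 for f in fit_u)
    return b, n * ss_exp / ss_tot, len(instruments) - 1


# Simulated data (hypothetical values, not the dissertation's sample).
random.seed(2)
n = 3000
z1 = [1.0 if random.random() < 0.40 else 0.0 for _ in range(n)]  # grandfather completed primary
z2 = [1.0 if random.random() < 0.25 else 0.0 for _ in range(n)]  # grandmother completed primary
u = [random.gauss(0, 1) for _ in range(n)]                        # unobserved confounder
x = [3 + 2 * a + 1.5 * b + c + random.gauss(0, 1) for a, b, c in zip(z1, z2, u)]
y = [-0.05 * xi + 0.5 * ui + random.gauss(0, 1) for xi, ui in zip(x, u)]

b_ols = ols([x], y)[0][1]
b_iv, sargan, df = tsls_with_sargan(y, x, [z1, z2])
p_value = math.erfc(math.sqrt(sargan / 2))  # chi-squared tail probability, valid for df = 1 only
print("OLS slope:", b_ols)
print("2SLS slope:", b_iv)
print("Sargan statistic:", sargan, "df:", df, "p-value:", p_value)
```

With two instruments and one endogenous regressor there is one over-identifying restriction, so the statistic is compared with a chi-squared distribution with one degree of freedom. A large p-value is consistent with instrument validity but does not prove it, since the test has no power when both instruments violate the exclusion restriction in the same direction.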
        The nexus between parental educational attainment and child labor outcomes warrants
significant research attention given the growing evidence of strong intergenerational persistence
in child labor among households in low-income countries (Emerson and Souza, 2003; Aransiola
and Justus, 2017). Educated parents typically demonstrate a proclivity for investing in their
children’s education, which could rival alternative uses of the child’s time such as child labor
work. Moreover, the productivity of the parents’ time as an input in the child’s education may
increase with parental schooling (Behrman et al., 1999; Cigno et al., 2002; Andrabi et al., 2012).
For instance, Andrabi et al. (2012) observed that children with educated mothers spend more time
on school-related activities at home, which competes with time spent working within and outside
the home.
         This paper focuses on another channel: parental engagement in non-farm employment.
Using a nationally representative survey data set on household demographics and child time use
in Malawi, this study contributes to the broader child labor literature in several ways. First, it
provides deeper insights into parental educational effects on child labor outcomes with a focus on
non-farm employment as a key mechanism. Second, by employing an instrumental variables (IV)
strategy, it attempts to address endogeneity issues inherent in the standard estimation of
the relationships of interest. Third, it leverages a relatively longer reference period to
classify child laborers, an improvement upon earlier approaches, where a one-week reference
period is routinely used. Finally, it accounts for child labor heterogeneity by considering two
categories of child labor work: (1) household farm work, and (2) casual, part-time or “ganyu”
labor.23
         The IV estimation results indicate that there is a negative and statistically significant
relationship between parental education and “ganyu” labor participation. By contrast, I do not
find a significant effect of maternal education on household farm work, while the father’s
education is significantly negatively associated with this child labor measure. In other results, I
find that maternal education significantly improves school attendance. On the other hand, I do not
find a meaningful impact of paternal education on school attendance. These results are robust to
relaxing the exclusion restriction assumption. I also find that engagement in non-farm
employment among educated parents might play a role in mediating these effects.
23
   “Ganyu” labor refers to any form of low-wage, short-term labor arrangement outside the household.


         The rest of the paper is organized as follows. Section 3.2 presents a review of the related
literature. Section 3.3 presents the data and some descriptive statistics. Section 3.4 illustrates the
empirical strategy, and section 3.5 addresses some endogeneity concerns. Section 3.6 summarizes
the results, section 3.7 presents a sensitivity analysis, section 3.8 tests some possible mechanisms
and section 3.9 concludes.
3.2 Related Literature
         Parental education has the potential to reduce child labor participation and improve school
attendance (Canagarajah and Coulombe, 1997; Grootaert, 1998; Bhalotra and Heady, 1998;
Canagarajah and Nielsen, 1999; Tzannatos, 2003; Kurosaki et al., 2006; Hsin, 2007; Emerson and
Souza, 2007; Das and Mukherjee, 2007). While not considered a policy variable per se (Grootaert,
1998), parental education—as a mitigation strategy—can be appealing as it is less intrusive
compared to overt child labor bans or prohibitions and has potentially longer-lasting effects (Cigno
et al., 2002). Besides altering parental preferences for/against child labor, education also affects
parents’ labor market choices. Moreover, even among non-altruistic parents, making their children
work might no longer be in their best interest if the return to childhood education is sufficiently
high. As such, most studies analyzing the correlates of child labor participation often examine both
schooling and child labor work as these decisions are interlinked. Estimating a multi-stage
sequential probit model, Grootaert (1998) notes that parental education improves the odds of
exclusively attending school as well as combining school and work in Côte d’Ivoire. In a similar
setting, Canagarajah and Coulombe (1997) discuss the influence of factors that jointly determine
schooling and child labor decisions among Ghanaian children and find that school participation
appears more responsive to parental education. I build on these early contributions to the literature
while focusing on parental education as the key variable of interest.


        Also relevant to this line of research is the extent to which the unitary household model,
in lieu of the collective model, adequately captures critical intra-household power dynamics in the
decision-making process (Thomas, 1990; Browning et al., 1994; Thomas, 1994; Duflo, 2003).
More importantly, the motivation for the collective model raises the question of whose
education matters most. This paper also contributes to a growing literature on the heterogeneous
impacts of the mother’s and father’s education on child labor and schooling outcomes. In South
Asia, Kurosaki et al. (2006) find direct empirical evidence suggesting that the mother’s education
is more important in reducing child labor and improving school attendance in Andhra Pradesh. By
contrast, Emerson and Souza (2007) stress that paternal education has a stronger negative influence
on the child labor status of sons than the mother’s education in Brazil. With respect to school
attendance, the authors find that maternal schooling exerts a stronger positive impact on girls’
school attendance, whereas the father’s education positively predicts higher school attendance for
sons.
        Consistent with Kurosaki et al. (2006), Das and Mukherjee (2007) also reveal that despite
the influence of the father’s education, maternal education significantly reduces school dropout
rates and child labor incidence among boys in urban India. Similarly, Patrinos and Psacharopoulos
(1995) also note that maternal schooling has a strong and negative influence on future employment
prospects. Further, Bhalotra and Heady (1998) review evidence of a negative and significant
relationship between maternal schooling and household farm work among children in Pakistan and
Ghana.
        An important, under-explored issue in the existing literature pertains to the
heterogeneity of child labor work itself. Some child labor activities are undoubtedly more harmful
than others. As a consequence, we might expect such forms of child labor to decline dramatically
with improvements in household socioeconomic conditions. Ali (2019) provides some evidence
on the importance of accounting for child labor heterogeneity by showing that only the worst forms
of child work experienced significant declines with increasing levels of household income. The
author interprets this result as perhaps reflecting parents’ non-pecuniary motivations behind
engaging their children in non-hazardous forms of child work such as unpaid family work. I extend
the scope of my research question to also account for this heterogeneity by considering both
household farm work and casual, part-time or “ganyu” employment.
3.3 Data
        Analysis for this study leverages data from the Malawian Integrated Household Survey
(IHS) Program. The IHS is a product of collaborative work between the World Bank and the
Malawian National Statistical Office as part of the LSMS - ISA (Living Standards Measurement
Study - Integrated Surveys on Agriculture) household survey project. Extending across multiple
rounds, the survey started off in 2010 with the implementation of the Third Integrated Household
Survey (IHS3). The IHS3 sample was designed to be representative at the national, regional, and
urban/rural levels. Following the IHS3, the Integrated Household Panel Survey (IHPS) 2013 was
administered to follow up on the 3,246 households initially interviewed in 2010. The tracking of
split-off individuals during follow-up resulted in a final IHPS 2013 sample of 4,000 households
that could be linked back to 3,104 households interviewed at baseline.
        The two most recent rounds of the panel survey, the Fourth (IHS4) and Fifth (IHS5)
Integrated Household Surveys, were conducted in 2016/17 and 2019/20, respectively. Due to
funding challenges, the initial sampling frame was halved from 204 enumeration areas (EAs) to
102 during these latter rounds. The IHS4 ended with 2,508 households, tracking an original target
of 1,989 households in 102 EAs from the IHPS 2013. Following a similar tracking guideline, the
IHS5 grew to include 3,245 households, who were interviewed to collect detailed information on
individual and household demographic variables, agricultural production, other socioeconomic
activities, as well as community-level characteristics. The data were collected using survey
questionnaires, chiefly through interviews with the household head. Ultimately, for my analyses,
I use the two latest IHPS waves (that is, the IHS4 and IHS5) for reasons I specify below.
          A major limitation that plagues many child labor studies is the lack of a suitable
measure of child labor outcomes. In many developing country studies, a one-week reference period
has been widely used to characterize child labor participation. Such a short reference period,
however, may induce very little variation in child labor outcomes, which could be further
exacerbated by measurement error, resulting in imprecise estimates (Dorman, 2008). As an
example, the main child labor measures in the 2010 and 2013 IHPS include: (1) the number of
hours in the last 7 days (before the survey) the child spent on agricultural activities, (2) the number
of hours in the past week the child ran or did any kind of non-agricultural work, and (3) the number
of hours spent yesterday collecting water. Since interviewer visits mostly occurred between March
and November, which overlaps with the peak season,24 this might predict more child involvement
in agricultural work relative to other child labor activities (e.g., non-farm work).
          Further, to the extent that the one-week reference period is too short to capture any
meaningful variation in child labor activities, results may fail to fully reflect the true intensity of
child labor work.25 Hence, to partially obviate the threat of finding an insignificant relationship
between parental education and child labor for this reason, I use the two latest rounds of the IHPS
24
   Peak season refers to a time of the year when crop harvest reaches its maximum. Agricultural labor demand tends
to be highest during this time as farmers strive to harvest and get their produce to the market in time to avoid spoilage
and loss of quality.
25
   Nevertheless, see Andrabi et al. (2012), Kazianga et al. (2012) and Ali (2019) for the use of a similar time frame
for the collection of child time use data.


for my analysis. Unlike the earlier waves, the child labor measures in the IHS4 and
IHS5 cover a relatively longer reference period (specifically, the past 12 months),
circumventing the aforementioned data limitations. Therefore, my two child labor measures
are indicator variables: one for whether the child contributed to household farming
activities in the past year and another for whether the child engaged in any casual, part-time or
“ganyu” labor in the last 12 months.
         For the child level analysis, I restrict the sample to children aged 5 – 17 as that is typically
the age range over which the ILO reports child labor statistics. Moreover, since a majority of
children within this age bracket are of primary school-going age, this allows for studying the
potential trade-off between school attendance and child labor work. Summary statistics on both
child- and household-level characteristics for the resulting sample are reported in Table 3.1. In the
first column (the “All” column), I pool observations across both years and report descriptive
statistics at both the child- and household-level in panels A and B, respectively. The average age
of a child in the sample is about 11 years and there is an even split in gender representation. As
panel A depicts, roughly half of all child-year observations are female, with females slightly
overrepresented in the 2019 panel. A high proportion of children in the sample were reported to
be currently attending school or to have attended the just-ended school session. It can also be
inferred from the rather high school attendance rate that few of these children can be considered
full-time workers. Roughly 40% of the children in the sample contributed to household farm
work, and about 15% of them engaged in casual, part-time employment. Turning attention to panel
B, women appear to receive less education relative to men.26 This result is also reflected in the
26
   For the rest of the analysis, for mother’s and father’s education variables with missing values, the missing values
are substituted with the value zero and this imputation is controlled for by including indicator variables that take a
value of 1 if an observation is missing and zero otherwise as additional controls. This approach is widely utilized in


relatively lower educational attainment rate among both paternal and maternal grandmothers
relative to grandfathers in the sample.
         Table 3.2 presents evidence on child labor incidence and school attendance for the full
sample by child gender and age. As expected, the household farm work incidence ratio is higher
for boys (44%) compared to girls (39%). Also, older children (74%) are more likely to be involved
in household farming activities relative to younger children (33%). This result seems intuitive
given that older children are more likely to be out of school, signaling better availability to support
household farm work. Moreover, farm work can be intense; hence, the sturdier build of boys and
older children makes them better suited to working on-farm. The data also reveal a gender
disparity in casual, part-time or “ganyu” employment. Boys are disproportionately more involved
in this type of child labor work. The incidence ratio for casual, part-time employment is roughly
18% for boys and 13% for girls. Again, the summary statistics indicate that older children have a
greater incidence of casual, part-time labor participation. Further, while a substantial proportion
of the children in the sample indicated that they attend school, school attendance is rather low
among older children. By contrast, the younger cohort is significantly more likely to attend
school. This result partly explains the disproportionately greater fraction of older children
participating in the various forms of child work, especially household farm work.
3.4 Empirical Strategy
         To quantify the effect of parental education on child labor participation, I first estimate the
following linear effects model:
$$ y_{iht} = \alpha\, Educ_{ih} + \mathbf{x}_{ht}\boldsymbol{\beta} + \delta_{d} + \delta_{t} + \epsilon_{iht} \qquad (1) $$
the development economics literature to reduce the number of dropped observations due to missing data (Kurosaki et
al., 2006).


where $y_{iht}$ is an indicator variable that takes the value 1 if child $i$ in household $h$ in time
period $t$, with $t \in \{2016, 2019\}$, (1) was engaged in household farm work or (2) was involved in
any casual, part-time or “ganyu” employment in the past 12 months, and zero otherwise; $Educ_{ih}$
denotes the mother’s or father’s years of schooling; $\alpha$ denotes the effect of parental education
on child labor; $\mathbf{x}_{ht}$ is a vector of time-varying controls including household size, a
wealth index, the number of male and female household members below age 6, total area of cultivated
land, and household distance from the nearest road (I also include controls for female-headship
status and religious affiliation); $\boldsymbol{\beta}$ is a vector of parameters on the household-level
covariates to be estimated; $\delta_{d}$ is a district fixed effect; $\delta_{t}$ is a time dummy; and
$\epsilon_{iht}$ is an idiosyncratic error term with zero mean and standard deviation
$\sigma_{\epsilon}$. Standard errors are clustered at the child level to allow for correlation of
errors for a child across years but not across children.
         It is important to note that parents’ education will likely be predetermined by 2016 for a
large fraction of children in the sample; hence, the key explanatory variables of interest should not
vary much over the survey years, if at all. That is, parental education is more or less a time-constant
variable. As such, estimating the relationship between parental education and child labor using the
traditional fixed effects approach will likely result in the estimated coefficients of interest getting
dropped. This empirical challenge presents a justification for the pooled ordinary least squares
(POLS) estimator. However, while the linear probability model (LPM) yields estimates that are
easy to interpret, the estimated probabilities can sometimes lie outside the unit interval (that is,
above one or below zero). Hence, as a robustness check, I also estimate equation (1) using a non-
linear model.
3.5 Addressing Endogeneity
           The key explanatory variables of interest—mother’s and father’s education—might fail to
satisfy the strict exogeneity assumption in the linear effects model. Reverse causality does not
seem to be a problem here as the data set suggests that parental educational attainment would have
been determined prior to “future” child labor supply decisions for most children in the sample.
That said, correlation between parental education and confounders in the idiosyncratic
error term that also predict child labor participation would still yield inconsistent estimates. For
example, to the extent that higher household agricultural productivity leads to increased demand
for labor, child labor could worsen with higher productivity if hired agricultural labor is scarce. As
a consequence, a positive correlation between parental education and household agricultural
productivity could result in an underestimation of a negative parental educational effect on child
labor.
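The direction of this bias can be made explicit with a stylized omitted-variable calculation. Suppose, purely for illustration, that the error term in equation (1) contains household agricultural productivity, so that $\epsilon_{iht} = \theta\,Prod_{ht} + u_{iht}$ with $\theta > 0$. The symbols $\theta$, $Prod_{ht}$, and $u_{iht}$ are assumptions introduced here and are not part of the model above. Ignoring the other controls, the POLS estimator then converges to

\[
\operatorname{plim}\ \hat{\alpha}_{POLS} = \alpha + \theta\,\frac{\operatorname{Cov}(Educ_{ih},\, Prod_{ht})}{\operatorname{Var}(Educ_{ih})}.
\]

With $\theta > 0$ and a positive covariance, the second term is positive, so a truly negative $\alpha$ would be estimated as closer to zero than it is.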
           To address this endogeneity concern, I use an IV strategy. The IV method requires that we
have an instrument or set of instruments that are strongly correlated with parental education (the
relevance condition) but affect child labor outcomes only through the key explanatory variable(s)
of interest (the exclusion restriction). I use grandparents’ literacy as instruments for parental
educational attainment.27 In particular, the mother’s education is instrumented by each of her
parents’ level of education, which are measured as indicator variables taking the value 1 if they at
least completed primary school and 0, otherwise.28 The choice of these instruments is motivated
in large part by Malawian grandparents’ limited discretion over their educational attainment,
predominantly in the colonial era. In the absence of a clear educational policy, combined with a
27
   Variation in the grandparents’ education is partly driven by the quasi-random nature of missionary educational
access as explained in the introduction.
28
   I instrument for the father’s education in a similar manner.
lack of governmental support, Christian missionaries became the primary custodians of education
in Nyasaland pre-independence (McCracken, 2012). For enslaved persons (mostly children) who
were freed by the missions, their path to educational attainment was arguably due to chance. Upon
rescue, these ex-slaves were trained in missionary schools, with some going on to become priests
(McCracken, 2012). Further, even after missionary schools were open to non-slave populations,
the financial toll on these missions meant that not all who sought missionary education could be
admitted.
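The mechanics of the IV strategy described above can be sketched on simulated data. The example below is a minimal two-stage least squares routine with two binary instruments standing in for the grandparents’ literacy indicators and an unobserved confounder that biases OLS toward zero. All names (`two_stage_least_squares`, `gm_lit`, `gf_lit`), the data-generating process, and the true effect of -0.02 are hypothetical. The sketch returns point estimates only, because standard errors from a manually run second stage are not valid.

```python
import numpy as np

def two_stage_least_squares(y, exog, endog, instruments):
    """Return 2SLS coefficients; the last entry belongs to the endogenous regressor."""
    # First stage: project the endogenous regressor on controls and instruments.
    W = np.column_stack([exog, instruments])
    first_stage = np.linalg.lstsq(W, endog, rcond=None)[0]
    endog_hat = W @ first_stage
    # Second stage: replace the endogenous regressor with its fitted values.
    X_hat = np.column_stack([exog, endog_hat])
    return np.linalg.lstsq(X_hat, y, rcond=None)[0]

rng = np.random.default_rng(1)
n = 40_000
gm_lit = (rng.random(n) < 0.10).astype(float)  # grandmother completed primary school
gf_lit = (rng.random(n) < 0.20).astype(float)  # grandfather completed primary school
confounder = rng.normal(size=n)                # unobserved, e.g. farm productivity
educ = 5 + 2 * gm_lit + 3 * gf_lit + 1.5 * confounder + rng.normal(scale=1.5, size=n)
# True effect of education on the probability of child work is -0.02 (assumed).
p_work = np.clip(0.55 - 0.02 * educ + 0.06 * confounder, 0, 1)
works = (rng.random(n) < p_work).astype(float)

const = np.ones((n, 1))
ols_coef = np.linalg.lstsq(np.column_stack([const, educ]), works, rcond=None)[0]
iv_coef = two_stage_least_squares(works, const, educ, np.column_stack([gm_lit, gf_lit]))
alpha_ols, alpha_iv = ols_coef[-1], iv_coef[-1]
print("OLS:", alpha_ols, "2SLS:", alpha_iv)
```

Because the simulated confounder raises both education and child work, the OLS coefficient is pulled toward zero while the 2SLS coefficient stays close to the assumed true effect.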
         As I show later in the results section, it is rather straightforward to see how the chosen set
of instruments satisfies the relevance assumption. However, what remains obscure is whether the
necessary exclusion restrictions are satisfied since this assumption is not directly testable. That is,
is it indeed the case that grandparents’ literacy is not correlated with other factors beyond the
parents’ education that might also influence child labor participation? One potential threat to
the exclusion restriction is that an educated grandparent can increase the returns to the child’s
school attendance, perhaps by helping out with schoolwork at home, which can in turn affect
child labor participation decisions.
         To guard against this and related violations of the exclusion restriction, I control for
multigenerational co-residence in all my IV regressions. This
strategy is motivated by findings in the demography literature that non-co-resident
grandparents’ educational attainment exerts little to no influence on grandchildren’s schooling
outcomes conditional on parental characteristics (Warren and Hauser, 1997; Erola and Moisio,
2007; Zeng and Xie, 2014). The multigenerational co-residence variables are measured as two
indicator variables: one for whether the grandfather is dead or lives away from the child and
another for whether the grandmother is deceased or lives away from the household. I include both
variables in all my IV regressions, but in cases where these two variables are strongly correlated,
I control for only one of them to avoid multicollinearity.
         Further, intrinsically linked to the missionaries’ educational curricula was the mandate to
produce native purveyors of the Christian faith. As such, among the Muslim share of
the population, self-selection out of missionary education was likely common (Bone, 1982). Hence, I
also include religion dummies as additional controls. As a further robustness check, I also
obtain 2SLS estimates after dropping parents who identified as Muslim to investigate the
stability of my results.29
3.6 Results
3.6.1 Descriptive Statistics
         In Figures 3.2 and 3.3, I plot raw means for the two child labor outcomes against parental
educational status for both survey years. A few important patterns emerge. First, Figure 3.2 shows
that household farm labor participation is relatively common among children with uneducated
parents. This pattern holds irrespective of the parent’s gender. Second, I find that household farm
labor participation is rather remarkably stable over time among children with educated parents. By
contrast, I observe an uptick in this outcome variable over time when parents are uneducated.
Turning attention to Figure 3.3, I observe patterns that diverge somewhat from the trends
reported in Figure 3.2. In relative terms, participation in this form of child labor appears more
pervasive among children with uneducated parents. However, the figure shows that participation
rates in casual, part-time or “ganyu” employment worsen over time for both sub-groups
29
   While data on grandparents’ religious affiliations would have been most beneficial for this exercise, such
information on the grandparents in the data set is rather sparse. Hence, I use parents’ religious affiliations as a proxy.
Insofar as children adopt their parents’ religion in this setting, this exercise still serves its intended purpose.
irrespective of parental educational status. From a policy standpoint, this finding reflects in part
the importance of accounting for child labor heterogeneity.
        Figures 3.4 and 3.5 plot the relationship between child labor outcomes and household
wealth. While Figure 3.5 shows a consistently negative relationship between casual,
part-time or “ganyu” employment and household wealth quintiles across survey waves, the pattern
in Figure 3.4 is non-linear. Figure 3.4 indicates that child participation in household farm
work initially rises with household wealth and then begins to fall at very
high levels of wealth. This pattern is in line with Bhalotra and Heady’s (1998) finding that child
labor on household farms could worsen with wealth in the presence of multiple factor market
failures.
3.6.2 Regression Results
        Linear probability model estimates for equation (1) are reported in the first two columns of
Table 3.3. The columns present estimated coefficients of parental educational effects on household
farm labor participation, and casual, part-time employment, respectively. Column (3) shows how
school attendance responds to changes in parental education. Across all columns, I include controls
for household-level covariates including the household size, a wealth index, number of male and
female household members under age 6, household area of cultivated land, distance to the nearest
road, religious affiliation of the household head, and female headship status. A few results stand
out.
        The estimated coefficients for the mother’s educational attainment variable are negative
and statistically significant for both child labor measures as reported in columns (1) and (2). In
particular, an additional year of maternal schooling is associated with a 0.4 (0.7) percentage point
decline in the likelihood of child labor involvement in household farm work (casual, part-time
employment), on average. There is also a strong negative association between paternal
education and “ganyu” labor: results in column (2) indicate that an additional year of the
father’s schooling is associated with a 0.9 percentage point decrease in child participation in
casual, part-time employment, on average, ceteris paribus. By contrast, the father’s education
does not exert a significant effect on household farm work.
        Table 3.3 also reports the effect of parental educational attainment on school attendance.
Results are presented in column (3). Consistent with Kurosaki et al. (2006), school attendance
appears more responsive to maternal schooling. The point estimate of the coefficient on maternal
education is 0.01, while the estimated coefficient for the father’s education is 0.003—both
estimated coefficients are statistically significant. In Table 3.4, I re-estimate equation (1) for both
child labor outcomes and the school attendance dependent variable using a probit model. The
estimated average partial effects from these probit models appear remarkably similar to the LPM
estimates. Hence, in what follows, I prioritize the LPM estimates for ease of interpretation.
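For reference, the average partial effect that is compared with the LPM coefficient can be written as follows. This is a sketch under the assumption that the probit model uses the same index as equation (1) and that parental education is treated as a continuous variable; the chapter does not spell out the exact computation.

\[
\Pr(y_{iht} = 1 \mid \cdot) = \Phi\!\left(\alpha\,Educ_{ih} + \mathbf{x}_{ht}\boldsymbol{\beta} + \delta_d + \delta_t\right),
\qquad
\widehat{APE}_{Educ} = \frac{1}{N}\sum_{i,h,t} \phi\!\left(\hat{\alpha}\,Educ_{ih} + \mathbf{x}_{ht}\hat{\boldsymbol{\beta}} + \hat{\delta}_d + \hat{\delta}_t\right)\hat{\alpha},
\]

where $\Phi$ and $\phi$ denote the standard normal distribution and density functions and $N$ is the number of child-year observations. Because $\phi(\cdot)$ varies across observations, the probit coefficient $\hat{\alpha}$ itself is not comparable to the LPM coefficient; only the averaged quantity is.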
        Table 3.5 reports the estimated effects of parental education on child labor outcomes and
school attendance by the child’s gender. While girls appear less likely to engage in “ganyu”
employment, I do not find any significant heterogeneous effects of parental education on child
labor and school attendance by child gender. The absence of significant sex-specific
parental education effects on child time use is consistent with waning discrimination against
girls in human capital investments.
        Next, I present results from the two-stage least squares (2SLS) estimator. First, I report
estimates from the first stage of the IV analysis in Table 3.6. In column 1 (2), I report estimates
from the regression of the mother’s (father’s) years of education on the maternal (paternal)
grandparents’ literacy variables and other “exogenous” covariates for the full sample. Columns (3)
through (6) show first stage results disaggregated by child gender. I broadly find evidence of a
strong correlation between parental educational attainment and grandparents’ literacy.
         Table 3.7 reports the 2SLS estimation results. Columns (1)–(3) present the estimated
coefficients for the two child labor measures and school attendance for the full sample in that order.
The 2SLS estimates for the mother’s educational attainment variable are reported in panel A, while
panel B presents the 2SLS estimates for the father’s education variable. Following Olea and
Pflueger (2013), I report the effective F statistic from a heteroskedasticity- and cluster-robust test of
the null of weak instruments across my IV specifications. A rejection of the null hypothesis signals
a strong first stage. Further, I also report the Hansen J statistic with its corresponding p-value from
the test of the null that the over-identifying restrictions are indeed valid. Failure to reject the null
in favor of the alternative hypothesis lends credence to the assumption that the necessary exclusion
restrictions are satisfied.
         The weak IV tests reveal that the reported effective F statistics exceed the critical value for
the 𝜏 = 30% weak instrument threshold across all specifications, so I reject the null of weak
instruments. In addition, the over-identification tests are reassuring, as I broadly fail to reject
the null hypothesis that the over-identifying restrictions are valid. I now turn to the results.
         First, I do not find a significant association between maternal education and child
household farm work participation. That is, after instrumentation, the effect of the mother’s
education on household farm work is attenuated (that is, it tends toward zero). By contrast, there
is a strong negative association between maternal education and “ganyu” labor involvement. In
particular, an additional year of maternal schooling is associated with a 1.5 percentage point
decrease in casual, part-time employment.
        Second, I find a strong positive association between the mother’s education and school
attendance. Turning attention to panel B, the 2SLS estimates indicate a negative and statistically
significant impact of paternal education on both child labor measures. Specifically, an additional
year of paternal schooling is associated with a 2.6 (2.3) percentage point decline in the incidence
of household farm work (casual, part-time employment), on average. On the other hand, the
estimated coefficient for the school attendance outcome variable is not statistically different from
zero.
        Table 3.8 presents 2SLS estimates of the impact of parental education on child time use by
child gender. Panel A reports similar effects of maternal education on “ganyu” labor for boys and girls.
By contrast, I do not find a significant effect of maternal schooling on female school attendance,
while the mother’s education strongly improves male school attendance. Similarly, the negative
effect of paternal education on “ganyu” labor is only significant for the male subsample.
        As a robustness check, I re-run my IV analysis while restricting the sample to non-Islamic
parents. One might be concerned about selection of Islamic grandparents out of formal education
due to Christian bias in the missionary educational curriculum. Results are reported in Table 3.9.
The 2SLS estimates using this restricted sample are remarkably similar to my main results in Table
3.7. The insensitivity of the main results to this robustness check suggests that potential selection
of Islamic grandparents out of missionary education does not bias my main findings in any
meaningful way.
        Next, I explore how child labor outcomes respond to parental educational attainment for
children of differing age groups. To estimate these heterogeneous effects, I interact the parental
education variables with age dummies. Results are presented in
Figures 3.6 and 3.7 for the maternal educational attainment effect while Figures 3.8 and 3.9 present
the 2SLS estimates for the father’s education by child age. Some general patterns emerge. I
find some evidence of heterogeneous parental educational attainment effects by child age. In
particular, the estimated coefficients are statistically indistinguishable from zero for
relatively younger children. By contrast, the estimated effects of parental education on the child
labor outcomes for older children are negative and statistically significant, although less precisely
estimated (that is, the confidence intervals are wider). This finding could be rationalized in part by
the relatively lower child labor participation rate among younger children to begin with.
3.7 Imperfect Instruments Sensitivity Analysis
        In this sub-section, I examine the sensitivity of my IV results to a relaxation of the
exclusion restriction. Following Conley et al. (2012), I obtain bounds on the causal effect of
parental education, while allowing for a direct effect of grandparents’ literacy on child time use.
While the over-identification tests suggest that the instruments may be valid, passing them is a
necessary but not sufficient condition for instrument validity (Clarke and Matta, 2018).
Consider the IV model below:
                                         𝒀 = 𝑿𝜷 + 𝒁𝜸 + 𝝐                                           (2)
                                          𝑿 = 𝒁𝚷 + 𝑽                                               (3)
where 𝒀 is a vector of the child time use variables; 𝑿 is a vector of the parental education variables;
𝒁 are the instruments (grandparents’ literacy); 𝚷 is a vector of first-stage coefficients; 𝛄 captures
the direct effect of the instruments on the outcome variables. The exclusion restriction implies that
𝜸 = 0, signaling that the instruments affect child time use only through parental education.
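To see why a non-zero $\gamma$ matters, consider the simplest case of one instrument and one endogenous regressor, so that $\beta$, $\gamma$, and $\Pi$ are scalars. This is a simplification relative to the two-instrument setting used in this chapter. Substituting equation (2) into the IV estimator and using equation (3) gives

\[
\hat{\beta}_{IV} = \frac{Z'Y}{Z'X} = \beta + \gamma\,\frac{Z'Z}{Z'X} + \frac{Z'\epsilon}{Z'X}
\;\xrightarrow{\;p\;}\; \beta + \frac{\gamma}{\Pi},
\]

provided that $Z$ is uncorrelated with both $\epsilon$ and $V$. With a positive first stage ($\Pi > 0$) and a direct negative effect of the instrument on child labor ($\gamma < 0$), the IV estimate would be more negative than the true $\beta$, and the distortion grows as the first stage weakens.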
        The imperfect instruments framework allows for relaxing the 𝜸 = 0 assumption. In
particular, I assume that there is a direct negative association between grandparents’ literacy and
child labor. In doing so, I set priors such that $\gamma$ falls within the range $[\gamma_{\min}, 0]$, where $\gamma_{\min} \in
\{-0.001, -0.002, -0.003\}$ captures the degree of violation of the exclusion restriction.30 Bounds
are then obtained as the union of the confidence intervals computed at each value of $\gamma$ inside the assumed range $[\gamma_{\min}, 0]$.31
Results are presented in Tables 3.10 and 3.11 for the maternal and paternal education effects,
respectively. The results indicate that the estimated bounds are relatively robust to worsening
violations of the exclusion restriction.32 Reassuringly, the 2SLS estimates fall within the estimated
bounds, which do not include zero for the significant 2SLS results. Hence, despite substantial
deviations from perfect exogeneity, my 2SLS results are robust to varying degrees of violation of
the exclusion restriction.33
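The union-of-confidence-intervals idea can be sketched in a few lines. The example below is a simplified illustration rather than the procedure used to produce Tables 3.10 and 3.11: it assumes a single instrument, a single endogenous regressor, heteroskedasticity-robust standard errors, a 95% normal critical value, and simulated data. The function names `iv_ci` and `uci_bounds` and the grid size are hypothetical. For each candidate $\gamma_0$ in $[\gamma_{\min}, 0]$, the outcome is adjusted to $Y - Z\gamma_0$, the IV estimate and its confidence interval are computed, and the bounds are the smallest lower limit and the largest upper limit over the grid.

```python
import numpy as np

def iv_ci(y, x, z, crit=1.96):
    """Just-identified IV estimate with a robust confidence interval."""
    zt, xt, yt = z - z.mean(), x - x.mean(), y - y.mean()  # demeaning absorbs the intercept
    b = (zt @ yt) / (zt @ xt)
    u = yt - b * xt
    se = np.sqrt(np.sum(zt**2 * u**2)) / abs(zt @ xt)
    return b, b - crit * se, b + crit * se

def uci_bounds(y, x, z, gamma_min, n_grid=21):
    """Union of confidence intervals over gamma in [gamma_min, 0]."""
    lows, highs = [], []
    for g in np.linspace(gamma_min, 0.0, n_grid):
        # Remove the assumed direct effect of the instrument, then run IV.
        _, lo, hi = iv_ci(y - g * z, x, z)
        lows.append(lo)
        highs.append(hi)
    return min(lows), max(highs)

# Simulated data (hypothetical): true effect of x is -0.02 and true gamma is 0.
rng = np.random.default_rng(2)
n = 5000
z = (rng.random(n) < 0.3).astype(float)
c = rng.normal(size=n)
x = 4 + 3 * z + c + rng.normal(size=n)
y = 0.5 - 0.02 * x + 0.03 * c + rng.normal(scale=0.3, size=n)

b0, lo0, hi0 = iv_ci(y, x, z)
narrow = uci_bounds(y, x, z, gamma_min=-0.001)
wide = uci_bounds(y, x, z, gamma_min=-0.003)
print("IV estimate:", b0)
print("bounds for gamma_min = -0.001:", narrow)
print("bounds for gamma_min = -0.003:", wide)
```

Allowing a larger violation widens the bounds, which is why results are reported for several values of $\gamma_{\min}$.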
3.8 Potential Mechanisms
          Higher educational attainment is typically associated with greater non-farm labor force
participation. Strong pull factors such as the relatively higher expected returns from non-farm
employment can induce a preference for non-farm engagements among the educated. Hence, in
identifying the effect of parental education on child labor outcomes, the role of the parents’
occupation cannot be ignored. In what follows, I explore whether educated parents are more likely
to participate in non-farm employment activities. A positive result from this investigation will
partly explain the strong and negative effect of parental education on child labor work, especially
household farm work. Indeed, I find strong evidence indicative of positive sorting among educated
parents into non-farm business engagements and wage employment. I test for this evidence using
the 2SLS estimates from instrumenting for parents’ education with the corresponding
30
   For the school attendance variable, $\gamma_{\min}$ becomes $\gamma_{\max} \in \{0.001, 0.002, 0.003\}$.
31
   See Clarke and Matta (2018) for details on the union of confidence intervals (UCI) procedure.
32
   Note that the priors on 𝛾 need not be the same for both instruments and may be extended to differing violations of
perfect exogeneity.
33
   Some of the priors on 𝛾 are as high as 90% of the POLS estimates of parental education on child time use.
grandparents’ literacy indicators. Standard errors are clustered at the parent level and results are reported
in Table 3.12.
        Column (1) reports that an additional year of the mother’s (father’s) education is associated
with a 4.2 (7.9) percentage point increase in non-farm business participation. I likewise find strong
positive effects of parental education on the likelihood of wage employment for both mothers and
fathers. Taken together, these results have noteworthy implications. First, given that education
improves the odds of non-farm engagement and wage employment, which usually requires parents
to be away from home, we might expect child labor to decline if child labor work typically requires
close parental supervision. Second, for non-farm households with relatively younger children,
there are good reasons to expect that keeping their children in school as they (the parents) work is
typically preferred. Hence, these children are, by design, kept out of any form of child labor.
However, in a “full” parent household, this mechanism will depend on whether both parents are
engaged in non-farm and/or wage employment. One can envision a scenario where the mother is
a salaried employee while the father attends to the household farm. In that case, child labor might
not fall with parental education if multiple factor markets such as land and labor are missing
(Bhalotra and Heady, 1998), as the child’s services will be needed on-farm.
3.9 Conclusions
        Child labor remains a pervasive phenomenon in sub-Saharan Africa. Despite laws at both
national and international levels to minimize child labor, the seemingly innocuous nature of household
child labor participation makes it less noticeable and more challenging to eradicate. In this paper, I revisit an
important empirical question: does child labor respond inversely to parental education? There is a
wide range of anecdotal evidence suggesting that parental education reduces child labor
participation; however, studies that attempt to address possible endogeneity issues as well as child
labor work heterogeneity are rare. Moreover, very few studies have attempted to explore parental
engagement in non-farm employment as a potential mechanism driving these effects. To assess
the sensitivity of my results to violations of the exclusion restriction, I employ the imperfect
instruments method proposed by Conley et al. (2012).
        Using a nationally representative Malawian panel data set, I find that parental education is
generally child labor mitigating. There is a strong and negative effect of maternal schooling on
“ganyu” labor involvement, but no effect on household farm work. In particular, an additional
year of maternal (paternal) schooling is associated with a roughly 1.5 (2.3) percentage point
decline in casual, part-time or “ganyu” employment, on average. Similarly, an
additional year of paternal schooling is associated with a 2.6 percentage point decrease in household farm work.
I find limited evidence of differing estimated effects by child gender for my LPM estimates;
however, the estimated effects appear more pronounced for boys and older children for the 2SLS
estimates. Results suggest that the impact of parental education on both child labor measures is
mostly driven by older children, who are more likely to work on household farms at that age. The
study’s findings also indicate that child school attendance improves especially with higher
maternal education. This finding is consistent with Das and Mukherjee (2007) and Kurosaki et al.
(2006), who also find strong and positive effects of maternal schooling on child school attendance.
Nevertheless, evidence of such effects on child school attendance is weak for the paternal
education variable.
        Finally, I also show that parental engagement in non-farm employment pursuits could be a
mechanism underlying the negative effect of parental education on child labor outcomes. I find
strong evidence that educated parents are more likely to engage in non-farm businesses and wage
employment. Nonetheless, there are a few caveats to consider. First, there could be other
pathways through which the effect of parental education on child labor could be mediated. In
addition, further analysis is required to uncover how parental engagement in the non-farm
economy directly impacts child time use. Is it earned non-farm income or the transition to the non-
farm sector per se that predicts lower child labor participation? Supplementary qualitative data via
interviews can provide additional insights into this empirical question.

                                        BIBLIOGRAPHY
Ali, F. R. M. (2019). In the same boat, but not equals: The heterogeneous effects of parental
        income on child labour. The Journal of Development Studies, 55(5):845–858.
Allen, J. (2008). Slavery, Colonialism, and the Pursuit of Community Life: Anglican Mission
        Education in Zanzibar and Northern Rhodesia 1864–1940. History of Education.
Andrabi, T., Das, J., and Khwaja, A. I. (2012). What did you do all day? Maternal education and
        child outcomes. Journal of Human Resources, 47(4):873–912.
Appleton, S. and Balihuta, A. (1996). Education and agricultural productivity: evidence from
        Uganda. Journal of International Development, 8(3):415–444.
Aransiola, T. J. and Justus, M. (2017). Intergenerational persistence of child labor in Brazil. In
        International Conference on Applied Economics, pages 613–630. Springer.
Behrman, J. R., Foster, A. D., Rosenzweig, M. R., and Vashishtha, P. (1999). Women’s
        schooling, home teaching, and economic growth. Journal of Political Economy,
        107(4):682–714.
Bhalotra, S. and Heady, C. (1998). Child labour in rural Pakistan and Ghana. University of
        Bristol and University of Bath, Bristol, mimeo.
Bone, D. S. (1982). Islam in Malawi. Journal of Religion in Africa, 13:126–138.
Browning, M., Bourguignon, F., Chiappori, P.-A., and Lechene, V. (1994). Income and
        outcomes: A structural model of intrahousehold allocation. Journal of Political Economy,
        102(6):1067–1096.
Canagarajah, S. and Coulombe, H. (1997). Child labor and schooling in Ghana. Available at
        SSRN 620598.
Canagarajah, S. and Nielsen, H. S. (1999). Child labor and schooling in Africa: A comparative
        study. World Bank, Social Protection Team.
Chan, T. W. and Boliver, V. (2013). The grandparents’ effect in social mobility: Evidence from
        British birth cohort studies. American Sociological Review, 78(4):662–678.
Cigno, A., Rosati, F. C., and Tzannatos, Z. (2001). Child labor, nutrition, and education in rural
        India: An economic analysis of parental choice and policy options. Washington, DC: The
        World Bank.
Cigno, A., Rosati, F. C., and Tzannatos, Z. (2002). Child Labor Handbook. Washington: The
        World Bank.
Clarke, D. and Matta, B. (2018). Practical considerations for questionable IVs. The Stata
        Journal, 18(3):663–691.
Conley, T. G., Hansen, C. B., and Rossi, P. E. (2012). Plausibly exogenous. Review of
        Economics and Statistics, 94(1):260–272.
Das, S. and Mukherjee, D. (2007). Role of women in schooling and child labour decision: The
        case of urban boys in India. Social Indicators Research, 82(3):463–486.
Dorman, P. (2008). Child labour, education, and health: A review of the literature. ILO Geneva.
Duflo, E. (2003). Grandmothers and granddaughters: old-age pensions and intrahousehold
        allocation in South Africa. The World Bank Economic Review, 17(1):1–25.
Dumas, C. (2020). Productivity Shocks and Child Labor: The Role of Credit and Agricultural
        Labor Markets. Economic Development and Cultural Change, 68(3):763–812.
Emerson, P. M. and Souza, A. P. (2003). Is there a child labor trap? Intergenerational persistence
        of child labor in Brazil. Economic Development and Cultural Change, 51(2):375–398.
Emerson, P. M. and Souza, A. P. (2007). Child labor, School Attendance, and Intrahousehold
        Gender Bias in Brazil. The World Bank Economic Review, 21(2):301–316.
Erola, J. and Moisio, P. (2007). Social mobility over three generations in Finland, 1950–2000.
        European Sociological Review, 23(2):169–183.
Grootaert, C. (1998). Child labor in Cote d’Ivoire: incidence and determinants, volume 1905.
        World Bank Publications.
Hayami, Y. and Ruttan, V. W. (1970). Agricultural productivity differences among countries.
        The American Economic Review, 60(5):895–911.
Hsin, A. (2007). Children’s time use: Labor divisions and schooling in Indonesia. Journal of
        Marriage and Family, 69(5):1297–1306.
ILO (2017). Global Estimates of Child Labour: Results and Trends, 2012–2016.
ILO (2021). Child Labour Global Estimates 2020, Trends and the Road Forward.
Jæger, M. M. (2012). The extended family and children’s educational success. American
        Sociological Review, 77(6):903–922.
Kazianga, H., De Walque, D., and Alderman, H. (2012). Educational and child labour impacts of
        two food-for-education schemes: Evidence from a randomised trial in rural Burkina Faso.
        Journal of African Economies, 21(5):723–760.
Kurosaki, T., Ito, S., Fuwa, N., Kubo, K., and Sawada, Y. (2006). Child labor and school
        enrollment in rural India: Whose education matters? The Developing Economies,
        44(4):440–464.
McCracken, J. (2012). A History of Malawi: 1859–1966. Boydell & Brewer Inc.
Olea, J. L. M. and Pflueger, C. (2013). A robust test for weak instruments. Journal of Business &
        Economic Statistics, 31(3):358–369.
Patrinos, H. A. and Psacharopoulos, G. (1995). Educational performance and child labor in
        Paraguay. International Journal of Educational Development, 15(1):47–60.
Reardon, T., Stamoulis, K., Balisacan, A., Cruz, M., Berdegué, J., and Banks, B. (1998). Rural
        non-farm income in developing countries. The State of Food and Agriculture, 1998:283–356.
Reggio, I. (2011). The influence of the mother’s power on her child’s labor in Mexico. Journal
        of Development Economics, 96(1):95–105.
Reimers, M. and Klasen, S. (2013). Revisiting the role of education for agricultural productivity.
        American Journal of Agricultural Economics, 95(1):131–152.
Rosati, F. C. and Tzannatos, Z. (2000). Child labor in Vietnam: An Economic Analysis. The
        World Bank: mimeo.
Singh, I., Squire, L., and Strauss, J. (1986). Agricultural Household Models: Extensions,
        Applications, and Policy. Number 11179. The World Bank.
Thomas, D. (1990). Intra-household resource allocation: An inferential approach. Journal of
        Human Resources, 635–664.
Thomas, D. (1994). Like father, like son; like mother, like daughter: Parental resources and child
        height. Journal of Human Resources, 950–988.
Tzannatos, Z. (2003). Child labor and school enrollment in Thailand in the 1990s. Economics of
        Education Review, 22(5):523–536.
Warren, J. R. and Hauser, R. M. (1997). Social stratification across three generations: New
        evidence from the Wisconsin Longitudinal Study. American Sociological Review, pages
        561–572.
Zeng, Z. and Xie, Y. (2014). The effects of grandparents on children’s schooling: Evidence from
        rural China. Demography, 51(2):599–617.

                                APPENDIX A: TABLES AND FIGURES
Table 3.1
                                              Summary Statistics
                                                                              Mean (S.E)
 Variable                                                      All                2016                2019
 Panel A. Child Characteristics
 Age (in years)                                          10.82 (3.60)         10.69 (3.59)        10.93 (3.61)
 Female (0/1)                                                  0.51                0.50                0.51
 Attends school (0/1)                                          0.92                0.92                0.92
 Contributes to household farm work (0/1)                      0.42                0.41                0.42
 Engaged in casual, part-time employ. (0/1)                    0.15                0.13                0.17
 Observations                                                 7,133               3,193               3,940
 Panel B. Household Characteristics
 Mother’s education (in years)                            5.35 (3.68)          5.04 (3.53)         5.59 (3.78)
 Father’s education (in years)                            6.71 (3.97)          6.44 (3.88)         6.92 (4.04)
 Mother’s age (in years)                                 37.95 (9.96)        37.87 (10.01)        38.00 (9.93)
 Father’s age (in years)                                43.20 (10.87)        43.19 (10.59)       43.21 (11.09)
 Household size                                           5.79 (1.94)          5.95 (1.93)         5.68 (1.93)
 Number of male HH members under 6                        0.53 (0.70)          0.52 (0.70)         0.53 (0.70)
 Number of female HH members under 6                      0.54 (0.70)          0.55 (0.73)         0.52 (0.68)
 Area of cultivated land (in acres)                       1.83 (1.88)          1.88 (2.00)         1.80 (1.77)
 Maternal grandmother is educated (0/1)                        0.07                0.06                0.07
 Maternal grandfather is educated (0/1)                        0.16                0.17                0.15
 Paternal grandmother is educated (0/1)                        0.05                0.04                0.05
 Paternal grandfather is educated (0/1)                        0.11                0.11                0.12
 Female head (0/1)                                             0.27                0.24                0.28
 Religion
 % No religion                                                 1.96                2.44                1.59
 % Traditional                                                 0.63                0.03                1.10
 % Christian                                                  79.44               79.26               79.58
 % Islam                                                      15.29               15.76               14.93
 % Other religion                                              0.39                0.32                0.44
 Observations                                                 2,635               1,109               1,526
Notes: Summary statistics are reported for households with children aged 5–17; observations are weighted using the
2016 panel sampling weights. Standard errors are reported in parentheses.


Table 3.2
                 Child labor incidence and school attendance by gender and age
                                            Gender                                 Age Cohort
 Variable                    Total    Girls    Boys    p-value: Δ    5 - 14 yrs    15 - 17 yrs    p-value: Δ
 Household farm work          41.6     39.4    43.8       0.000         33.1           74.2          0.000
 Casual, part-time or         15.4     13.0    17.8       0.000         10.1           35.7          0.000
 “ganyu” labor
 Attends school               92.1     91.7    92.5       0.203         96.6           76.2          0.000
 Notes: Sample means are reported as percentages using the pooled sample across the 2016 and 2019 panel
 waves.
Source: Author’s own calculations.


Table 3.3
                        Estimates of parental education effects on child time use
                                                (1)                      (2)                       (3)
                                                             Linear Probability Model
                                                               Dependent Variables
 Variable                               HH Farm Work               “Ganyu” labor            Attends school
                                               (0/1)                    (0/1)                     (0/1)
 Mother’s education                          -0.004*                 -0.006***                 0.009***
 (years)                                     (0.002)                   (0.002)                  (0.001)
 Father’s education (years)                   -0.001                 -0.009***                  0.003**
                                             (0.002)                   (0.002)                  (0.001)
 Controls                                         ✓                        ✓                         ✓
 District FE                                      ✓                        ✓                         ✓
 Year dummy                                       ✓                        ✓                         ✓
 Observations                                 6,951                     6,951                    6,334
Notes: Standard errors are reported in parentheses and are clustered at the child level. Control variables include
household size, wealth index, size of household cultivated land, number of female household members under age 6,
number of male household members under age 6, child’s gender, religion, female headship status, and household
distance from nearest road. * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.4
          Average partial effects of parental education on child time use from probit model
                                                  (1)                     (2)                        (3)
                                                                 Dependent Variables
 Variable                                 HH Farm Work              “Ganyu” labor             Attends school
                                                 (0/1)                   (0/1)                      (0/1)
 Mother’s education                            -0.004*                -0.007***                 0.010***
 (years)                                       (0.002)                  (0.002)                   (0.001)
 Father’s education (years)                     -0.001                -0.009***                   0.003**
                                               (0.002)                  (0.002)                   (0.001)
 Controls                                           ✓                       ✓                          ✓
 District FE                                        ✓                       ✓                          ✓
 Year dummy                                         ✓                       ✓                          ✓
 Observations                                   6,951                    6,951                     6,334
Notes: Control variables include household size, wealth index, size of household cultivated land, number of female
household members under age 6, number of male household members under age 6, child’s gender, religion, female
headship status, and household distance from nearest road. Standard errors are reported in parentheses and are
clustered at the child level. * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.5
                     Effect of parental education on child time use by gender - LPM
                                                   (1)                        (2)                       (3)
                                                                   Dependent Variables
 Variable                                  HH Farm Work                “Ganyu” labor            Attends school
                                                  (0/1)                      (0/1)                     (0/1)
 Mother’s education (years)                      -0.003                  -0.006***                  0.008***
                                                (0.003)                    (0.002)                   (0.002)
 Father’s education (years)                      -0.002                  -0.010***                   0.003**
                                                (0.003)                    (0.002)                   (0.001)
 1[𝐺𝑖𝑟𝑙 = 1]                                     -0.022                  -0.048***                    -0.023
                                                (0.022)                    (0.018)                   (0.015)
 Mother’s educ × 1[𝐺𝑖𝑟𝑙 = 1]                     -0.002                     -0.001                     0.003
                                                (0.004)                    (0.003)                   (0.002)
 Father’s educ × 1[𝐺𝑖𝑟𝑙 = 1]                     0.001                       0.003                    -0.001
                                                (0.004)                    (0.003)                   (0.002)
 Controls                                            ✓                          ✓                         ✓
 District FE                                         ✓                          ✓                         ✓
 Year dummy                                          ✓                          ✓                         ✓
 Observations                                    6,951                       6,951                     6,334
Notes: Standard errors are reported in parentheses and are clustered at the child level. Control variables include
household size, wealth index, size of household cultivated land, number of female household members under age 6,
number of male household members under age 6, religion, female headship status, and household distance from nearest
road. 1[𝐺𝑖𝑟𝑙 = 1] is an indicator for whether the child is a female. * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.6
                                       First stage regression results – LPM estimates
                               (1)                (2)                (3)              (4)        (5)            (6)
                                    Full sample                           Females                     Males
                                                                  Dependent variables
                           Mother’s            Father’s          Mother’s          Father’s  Mother’s        Father’s
 Variable                 education           education         education         education education       education
 Grandmother’s literacy    1.494***            0.817***          1.301***         0.839***   1.719***        0.782**
         (0/1)              (0.193)             (0.245)           (0.265)          (0.320)    (0.282)        (0.369)
 Grandfather’s literacy    1.800***            1.536***          2.097***         1.732***   1.476***       1.312***
         (0/1)              (0.144)             (0.178)           (0.216)          (0.243)    (0.187)        (0.262)
 Household size           -0.244***           -0.060***         -0.213***         -0.068**  -0.271***         -0.045
                            (0.027)             (0.023)           (0.039)          (0.031)    (0.037)        (0.033)
 Cultivated plot area       0.063**              0.022              0.055           -0.009    0.068*           0.052
         (acres)            (0.026)             (0.027)           (0.035)          (0.037)    (0.038)        (0.040)
 Wealth index             -0.006***            0.008***          0.006***         0.008***   0.007***       0.008***
                            (0.001)             (0.001)           (0.001)          (0.001)    (0.001)        (0.001)
 No. of female HH          0.255***              0.040           0.313***            0.129   0.366***         -0.080
         members < age 6    (0.059)             (0.061)           (0.081)          (0.080)    (0.090)        (0.095)
 No. of male HH            0.336***              0.027            0.221**           -0.022   0.282***          0.051
         members < age 6    (0.061)             (0.061)           (0.090)          (0.090)    (0.078)        (0.083)
 Female household head       -0.298             -0.148             -0.569            0.184     -0.144         -0.427
         (0/1)              (0.229)             (0.254)           (0.322)          (0.371)    (0.320)        (0.350)
 Distance to nearest road -0.024***           -0.019***            -0.011         -0.023**  -0.039***         -0.014
         (km)               (0.006)             (0.007)           (0.009)          (0.009)    (0.009)        (0.009)
 Mother’s education                            0.305***                           0.295***                  0.315***
         (years)                                (0.015)                            (0.021)                   (0.022)
 Father’s education        0.301***                              0.295***                    0.308***
         (years)            (0.014)                               (0.020)                     (0.021)
 Constant                  4.290***           8.8086***          4.551***         7.919***   2.791***       7.893***
                            (1.094)             (0.511)           (1.498)          (0.716)    (0.799)        (0.720)
 Multigenerational co-           ✓                  ✓                  ✓                ✓          ✓              ✓
 residence controls
 Religion dummies                ✓                  ✓                  ✓                ✓          ✓              ✓
 District FE                     ✓                  ✓                  ✓                ✓          ✓              ✓
 Year dummy                      ✓                  ✓                  ✓                ✓          ✓              ✓
 Observations                6,790              6,951              3,536            2,800      3,415          2,801
Notes: Standard errors are reported in parentheses and are clustered at the child level. *p<0.10,**p<0.05,***p<0.01


Table 3.7
                  2SLS estimates of the impact of parental education on child time use
                                                        (1)                       (2)                      (3)
                                                                      Dependent Variables
 Variable                                        HH Farm work             “Ganyu” labor           Attends school
                                                       (0/1)                    (0/1)                     (0/1)
 Panel A: Mother’s education
 instrumented
 Mother’s education                                    0.005                 -0.015***                0.010***
                                                      (0.008)                  (0.005)                  (0.004)
 Multigenerational co-residence                           ✓                         ✓                        ✓
 controls
 Other Controls                                           ✓                         ✓                        ✓
 Religion dummies                                         ✓                         ✓                        ✓
 District FE                                              ✓                         ✓                        ✓
 Year dummy                                               ✓                         ✓                        ✓
 Montiel Olea & Pflueger F stat                       195.19                   195.19                   185.40
 Hansen J stat (p-value)                            1.89 (0.17)              0.00 (0.99)             2.83 (0.09)
 Observations                                          6,951                    6,951                     6,334
 Panel B: Father’s education
 instrumented
 Father’s education                                  -0.026**                -0.023***                   -0.005
                                                      (0.013)                  (0.008)                  (0.006)
 Multigenerational co-residence                           ✓                         ✓                        ✓
 controls
 Other Controls                                           ✓                         ✓                        ✓
 District FE                                              ✓                         ✓                        ✓
 Religion dummies                                         ✓                         ✓                        ✓
 Year dummy                                               ✓                         ✓                        ✓
 Montiel Olea & Pflueger F stat                        63.11                    63.11                     61.60
 Hansen J stat (p-value)                            0.02 (0.90)              1.16 (0.28)             2.14 (0.14)
 Observations                                          5,736                    5,736                     5,229
Notes: Standard errors are reported in parentheses and are clustered at the child level. Other control variables include
household size, wealth index, size of household cultivated land, number of female household members under age 6,
number of male household members under age 6, religion, female headship status, and household distance from nearest
road. * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.8
                                 2SLS estimates of the impact of parental education on child time use by gender
                                        (1)                 (2)                     (3)                   (4)                   (5)                   (6)
                                                          Females                                                            Males
 Variable                         HH Farm work        “Ganyu” labor         Attends school         HH Farm work        “Ganyu” labor        Attends school
                                      (0/1)               (0/1)                 (0/1)                 (0/1)                (0/1)                (0/1)
 Panel A: Mother’s education instrumented
 Mother’s education                    0.000             -0.012*                   0.007                 0.010               -0.015*              0.014***
                                     (0.010)             (0.007)                 (0.005)                (0.011)              (0.008)                (0.005)
 Multi. co-residence                      ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Other controls                           ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Religion dummies                         ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 District FE                              ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Year dummy                               ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Montiel Olea &                        92.26              92.26                  107.40                  93.74                93.74                  81.05
 Pflueger F stat
 Hansen J stat (p-value)           0.01 (0.92)         0.00 (0.95)            2.57 (0.11)            1.63 (0.20)           0.03 (0.86)           0.70 (0.40)
 Observations                          3,536              3,536                    3,245                 3,415                3,415                  3,089
 Panel B: Father’s education instrumented
 Father’s education                   -0.019              -0.000                  -0.011                -0.021             -0.032***                 0.001
                                     (0.016)             (0.009)                 (0.009)                (0.018)              (0.012)                (0.008)
 Multi. co-residence                      ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Other controls                           ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Religion dummies                         ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 District FE                              ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Year dummy                               ✓                   ✓                       ✓                     ✓                     ✓                     ✓
 Montiel Olea &                        43.12              43.12                    38.82                 31.55                31.55                  24.04
 Pflueger F stat
 Hansen J stat (p-value)           0.14 (0.71)         0.55 (0.46)            1.54 (0.21)            0.02 (0.89)           1.54 (0.22)           0.57 (0.45)
 Observations                          2,868              2,868                    2,637                 2,868                2,868                  2,592
Notes: Standard errors are reported in parentheses and are clustered at the child level. Control variables include household size, wealth index, size of household
cultivated land, number of female household members under age 6, number of male household members under age 6, religion, female headship status, and household
distance from nearest road. *p<0.10,**p<0.05,***p<0.01


Table 3.9
      2SLS estimates of the impact of parental education on child time use – robustness check
                                                         (1)                      (2)                      (3)
                                                                    Omitted Muslim sample
                                                                      Dependent Variables
 Variable                                        HH Farm work             “Ganyu” labor           Attends school
                                                       (0/1)                    (0/1)                     (0/1)
 Panel A: Mother’s education
 instrumented
 Mother’s education                                    0.009                  -0.014**                0.012***
                                                      (0.009)                  (0.006)                  (0.004)
 Multigenerational co-residence                            ✓                        ✓                        ✓
 controls
 Other Controls                                            ✓                        ✓                        ✓
 Religion dummies                                          ✓                        ✓                        ✓
 District FE                                               ✓                        ✓                        ✓
 Year dummy                                                ✓                        ✓                        ✓
 Montiel Olea & Pflueger F stat                       142.96                   142.96                   136.84
 Hansen J stat (p-value)                            1.24 (0.26)              0.07 (0.79)             6.92 (0.01)
 Observations                                          5,852                    5,852                     5,341
 Panel B: Father’s education
 instrumented
 Father’s education                                   -0.025*                -0.028***                   -0.006
                                                      (0.015)                  (0.009)                  (0.007)
 Multigenerational co-residence                            ✓                        ✓                        ✓
 controls
 Other Controls                                            ✓                        ✓                        ✓
 Religion dummies                                          ✓                        ✓                        ✓
 District FE                                               ✓                        ✓                        ✓
 Year dummy                                                ✓                        ✓                        ✓
 Montiel Olea & Pflueger F stat                        53.59                    53.59                     51.68
 Hansen J stat (p-value)                            0.06 (0.81)              2.01 (0.16)             2.97 (0.08)
 Observations                                          4,845                    4,845                     4,425
Notes: Standard errors are reported in parentheses and are clustered at the child level. Other control variables include
household size, wealth index, size of household cultivated land, number of female household members under age 6,
number of male household members under age 6, religion, female headship status, and household distance from nearest
road. * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.10
                    2SLS maternal educational impact - relaxing 𝜸 = 𝟎 assumption
                                        Estimated coefficient              Lower bound              Upper bound
                                                              Panel A: HH Farm Work
  𝛾₁ = 𝛾₂ = −0.001                                0.005                         -0.012                    0.014
  𝛾₁ = 𝛾₂ = −0.002                                0.005                         -0.012                    0.015
  𝛾₁ = 𝛾₂ = −0.003                                0.005                         -0.012                    0.015
                                                               Panel B: “Ganyu” labor
  𝛾₁ = 𝛾₂ = −0.001                             -0.015***                        -0.024                   -0.007
  𝛾₁ = 𝛾₂ = −0.002                             -0.015***                        -0.024                   -0.006
  𝛾₁ = 𝛾₂ = −0.003                             -0.015***                        -0.024                   -0.006
                                                               Panel C: Attends school
  𝛾₁ = 𝛾₂ = 0.001                               0.010***                         0.002                    0.018
  𝛾₁ = 𝛾₂ = 0.002                               0.010***                         0.002                    0.018
  𝛾₁ = 𝛾₂ = 0.003                               0.010***                         0.001                    0.018
Notes: 𝛾₁ and 𝛾₂ represent the direct effects of the maternal grandparents’ literacy variables on child time use. Bounds
are derived using the union of confidence intervals method of Conley et al. (2012). * p < 0.10, ** p < 0.05, *** p < 0.01
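The bounds in this table can be illustrated with a minimal sketch of the union of confidence intervals idea: for each assumed direct effect of the instruments, the outcome is adjusted by that effect, 2SLS is re-estimated, and the reported interval is the union of the resulting confidence intervals. The sketch below rests on several assumptions that are not taken from the dissertation: the data are simulated, the function names `tsls` and `uci_bounds` are hypothetical, the standard errors are homoskedastic rather than clustered, and a 95% normal critical value is used.

```python
import numpy as np

def tsls(y, X, Z):
    """2SLS of y on X using instrument matrix Z (both include exogenous controls).

    Returns the coefficient vector and a homoskedastic covariance matrix
    (a simplification; the tables report clustered standard errors).
    """
    X_hat = Z @ np.linalg.lstsq(Z, X, rcond=None)[0]   # first-stage fitted values
    XtPX_inv = np.linalg.inv(X_hat.T @ X_hat)          # (X' P_Z X)^{-1}
    beta = XtPX_inv @ (X_hat.T @ y)
    resid = y - X @ beta                               # residuals use the actual X
    sigma2 = resid @ resid / (len(y) - X.shape[1])
    return beta, sigma2 * XtPX_inv

def uci_bounds(y, x_endog, W, Z_excl, gamma_grid, crit=1.96):
    """Union of confidence intervals for the coefficient on x_endog.

    gamma_grid lists assumed direct effects of the excluded instruments on y.
    """
    X = np.column_stack([x_endog, W])
    Z = np.column_stack([Z_excl, W])
    lowers, uppers = [], []
    for gamma in gamma_grid:
        # Remove the assumed direct effect of the instruments from the outcome.
        y_adj = y - Z_excl @ np.asarray(gamma, dtype=float)
        beta, V = tsls(y_adj, X, Z)
        se = np.sqrt(V[0, 0])
        lowers.append(beta[0] - crit * se)
        uppers.append(beta[0] + crit * se)
    return min(lowers), max(uppers)

# Simulated (hypothetical) data: two binary instruments, one endogenous regressor.
rng = np.random.default_rng(0)
n = 2000
Z_excl = rng.integers(0, 2, size=(n, 2)).astype(float)
W = np.ones((n, 1))                                    # constant as the only control
u = rng.normal(size=n)
x = 4.0 + Z_excl @ np.array([1.5, 1.8]) + u
y = 0.5 - 0.015 * x + 0.3 * u + rng.normal(scale=0.3, size=n)

exact = uci_bounds(y, x, W, Z_excl, [(0.0, 0.0)])      # exclusion restriction imposed
grid = [(g, g) for g in (0.0, -0.001, -0.002, -0.003)]
relaxed = uci_bounds(y, x, W, Z_excl, grid)            # restriction relaxed
print(exact, relaxed)
```

Because the grid includes a zero direct effect, the relaxed interval is never narrower than the interval obtained under the exact exclusion restriction.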


Table 3.11
                    2SLS paternal educational impact - relaxing 𝜸 = 𝟎 assumption
                                        Estimated coefficient              Lower bound              Upper bound
                                                              Panel A: HH Farm Work
  𝛾₁ = 𝛾₂ = −0.001                              -0.026**                        -0.048                   -0.005
  𝛾₁ = 𝛾₂ = −0.002                              -0.026**                        -0.048                   -0.004
  𝛾₁ = 𝛾₂ = −0.003                              -0.026**                        -0.048                   -0.003
                                                               Panel B: “Ganyu” labor
  𝛾₁ = 𝛾₂ = −0.001                             -0.023***                        -0.036                   -0.009
  𝛾₁ = 𝛾₂ = −0.002                             -0.023***                        -0.036                   -0.008
  𝛾₁ = 𝛾₂ = −0.003                             -0.023***                        -0.036                   -0.007
                                                               Panel C: Attends school
  𝛾₁ = 𝛾₂ = 0.001                                 -0.005                        -0.014                   0.007
  𝛾₁ = 𝛾₂ = 0.002                                 -0.005                        -0.015                   0.007
  𝛾₁ = 𝛾₂ = 0.003                                 -0.005                        -0.015                   0.007
Notes: 𝛾₁ and 𝛾₂ represent the direct effects of the paternal grandparents’ literacy variables on child time use. Bounds
are derived using the union of confidence intervals method of Conley et al. (2012). * p < 0.10, ** p < 0.05, *** p < 0.01


Table 3.12
              2SLS estimates of the effect of parental education on non-farm employment
                                                                      (1)                            (2)
                                                                           Dependent Variables
 Variable                                               Non-farm business (0/1)   Wage employment (0/1)
 Panel A: Mother’s education
 instrumented
 Mother’s education                                               0.042***                        0.027***
                                                                   (0.012)                         (0.009)
 Multigenerational co-residence controls                                ✓                              ✓
 Other Controls                                                         ✓                              ✓
 Religion dummies                                                       ✓                              ✓
 District FE                                                            ✓                              ✓
 Year dummy                                                             ✓                              ✓
 Montiel Olea & Pflueger F stat                                     56.12                           57.01
 Hansen J stat (p-value)                                         1.06 (0.30)                     2.31 (0.13)
 Observations                                                       6,564                           6,940
 Panel B: Father’s education instrumented
 Father’s education                                               0.079***                        0.081***
                                                                   (0.021)                         (0.022)
 Multigenerational co-residence controls                                ✓                              ✓
 Other Controls                                                         ✓                              ✓
 Religion dummies                                                       ✓                              ✓
 District FE                                                            ✓                              ✓
 Year dummy                                                             ✓                              ✓
 Montiel Olea & Pflueger F stat                                     21.79                           22.02
 Hansen J stat (p-value)                                         0.22 (0.64)                     1.92 (0.17)
 Observations                                                       5,548                           5,736
Notes: Standard errors are reported in parentheses and are clustered at the parent level. Wage employment is measured
as an indicator variable for whether the mother worked as an employee for wages or salary in the past year in Panel
A, while it is measured as a dummy variable taking the value one if the father’s primary economic activity over the
past 12 months was wage employment in Panel B. Other control variables include household size, wealth index, size
of household cultivated land, number of female household members under age 6, number of male household members
under age 6, religion, female headship status, and household distance from nearest road. * p < 0.10, ** p < 0.05, ***
p < 0.01


Figure 3.1: Regional Prevalence of Child Labor


Figure 3.2: Household Farm Labor Participation by parents’ education status
Notes: No education implies zero years of education. Observations are weighted using 2016 panel weights.


Figure 3.3: Casual, part-time employment by parents’ education status
Notes: No education implies zero years of education. Observations are weighted using 2016 panel weights.


Figure 3.4: Household farm labor participation by wealth quintiles
Notes: Q1 denotes the lowest wealth quintile. The wealth index was constructed by applying Principal Component
Analysis to household assets such as cars, motorcycles, bicycles, televisions, electric or gas stoves, generators,
washing machines, air conditioners, fans, and radios, among others, which are given varying weights depending on
the rarity of ownership among the sampled households. Observations are weighted using 2016 panel weights.


Figure 3.5: Casual, part-time employment by wealth quintiles
Notes: Q1 denotes the lowest wealth quintile. The wealth index was constructed by applying Principal Component
Analysis to household assets such as cars, motorcycles, bicycles, televisions, electric or gas stoves, generators,
washing machines, air conditioners, fans, and radios, among others, which are given varying weights depending on
the rarity of ownership among the sampled households. Observations are weighted using 2016 panel weights.


Figure 3.6: Effect of maternal education on household farm work - instrumented


Figure 3.7: Effect of maternal education on casual, part-time or “ganyu” labor employment – instrumented


Figure 3.8: Effect of paternal education on household farm work - instrumented


Figure 3.9: Effect of paternal education on casual, part-time or “ganyu” labor - instrumented


                                 APPENDIX B: THEORETICAL MODEL
          In this section, I present a simple conceptual framework to formally model the child labor
supply response to parental educational attainment. I use a version of the well-known agricultural
household model of Singh et al. (1986), wherein households are simultaneously involved in both
consumption and production.34 In principle, parental education can influence child labor
participation through a variety of pathways. First, education can induce higher agricultural
productivity (Hayami and Ruttan, 1970; Appleton and Balihuta, 1996; Reimers and Klasen, 2013).
For instance, in settings with a rapid or accelerating rate of technical change, educated farmers can
capitalize on the availability of new technological innovations to expand production scale. In a
cross-country study, Reimers and Klasen (2013) find a highly significant and positive relationship
between education and agricultural productivity using panel data on 95 countries. The authors
show that this effect is robust to alternative specifications, data sets, and estimation strategies. The
resulting rise in agricultural incomes due to positive agricultural productivity shifts can relax a
household’s liquidity constraint, inducing a reorientation of the child’s time towards school.
          Second, education drives up the economic returns from non-farm work (that is, either in
wage or self-employment) which often requires skilled labor (Reardon et al., 1998). Reardon et al.
(1998) revealed that education is a strong determinant of non-farm employment participation,
projected to overtake landholdings as the major driver of non-farm income at least among rural
households. Consequently, educated parents might find their skills better suited to non-farm
activities especially in urban areas, where non-farm jobs are relatively plentiful. For children in
such households, their parent’s inter-sectoral mobility (that is, from on-farm to non-farm work)
might reduce their involvement in household farm work altogether if children tend to work on
farms side-by-side with their parents. The model presented here focuses on the latter pathway. This
model borrows from Reggio (2011), but it is adapted to account for how parental education
shapes child labor participation decisions through non-farm employment.
34
   Ideally, and in keeping with the discussions above, we should employ a collective household model as the basis for
the theoretical micro-foundations. However, for expositional clarity and given that the intent of the model is not to
recover estimates for underlying parameters such as the intra-household bargaining weights, I rather use a unitary
household model.
          Consider a household comprising two agents: a parent (agent $p$) and a child (agent $c$).
To model the household’s child labor supply decision, we assume a setting
where only child and adult (parent) family labor can be used in the production process, with the
child serving as a source of on-farm labor.35 The total time available to the parent is 1. We assume
that the utility functions are twice continuously differentiable, strictly quasi-concave, and
increasing in consumption and leisure but decreasing in child labor. Another simplification of this
model is that the decision-making family member (the parent) derives utility from aggregate
household consumption, and we further assume that the cross partial derivative of the parent’s
utility function between consumption and child labor is non-negative, $u^1_{c_1,h} \geq 0$. This assumption
implies that the marginal utility of consumption is non-decreasing in child labor. That is, household
consumption and exempting the child from work are not complementary (Reggio, 2011). Given
that this assumption is a staple in the existing theoretical child labor literature, it is
standard to maintain it here (Dumas, 2020).
35
   A significant contribution of this study is the investigation of the effect of parental educational attainment on
different forms of child work including casual, part-time or “ganyu” labor. However, the theoretical model presented
here focuses on the child’s involvement in household farm work for the sake of brevity and tractability.
          Further, we also assume that the child cannot sell her labor hours on the labor market, but
the parent can. This assumption aligns with the type of child labor activities we consider in this
study given that most child laborers serve as contributing household workers without pay. Besides,
less than 0.3% of the children in the sample reported having engaged in any full-time salaried or
wage employment during the 2016 survey year.
The household solves the following optimization problem:
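$$\max_{c_1,\, c_2,\, l_p,\, s_c,\, h,\, b_1} \; u^1(c_1, l_p, h) + \beta u^2(c_2) \tag{1}$$
$$\text{s.t.} \quad c_1 + \rho s_c = b_1 + p_f F\big\{(1-\gamma)\big[(1 - l_p) + \lambda h\big]\big\} + \gamma w_p (1 - l_p) \tag{2}$$
$$c_2 + R b_1 = \underline{w} + (\bar{w} - \underline{w})\, s_c \tag{3}$$
$$h + s_c + l_c = 1 \tag{4}$$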
where $u^t$ denotes the parent’s utility at time $t \in \{1, 2\}$.36 The parent’s utility is a function of
aggregate consumption $c_t$ in each time period, and she derives utility from her own leisure ($l_p$)
and disutility from child labor, $h$, in period 1. In the first period, the household can allocate the
child’s total time endowment, 1, across schooling ($s_c$), work ($h$), or leisure ($l_c$). The parent, on the
other hand, can either work or enjoy leisure in period 1, but does not work in period 2 (that is, $l_p = 1$
in the second period). The household’s consumption expenditure in period 1 is met with farm
income from agricultural production ($F(\cdot)$), borrowing ($b_1$), and non-farm income if the parent’s
probability of engaging in non-farm work, $\gamma$, is non-zero. An important assumption underlying the
period 1 budget constraint is that child labor complements adult family labor in household
agricultural production. Hence, movement of the parent completely out of farm work drives the
child’s contribution to household farm work down to zero. Household expenditure in period 1
consists of direct consumption, $c_1$, and the cost of child education, $\rho$, if she attends school; $F(\cdot)$
denotes the household’s production function, which we assume is increasing in total household
labor but exhibits diminishing marginal returns (that is, $F'(\cdot) > 0$ and $F''(\cdot) < 0$); $\lambda$ is the labor
productivity ratio between adult and child labor, and $w_p$ is the prevailing non-farm employment
wage rate; $p_f$ is the price of the production good, and we normalize the price of the consumption
good to 1.
36
   Notice that all household decisions rest with the parent.
         In period 2, the child is now a working adult, and her income depends on her educational
status in the first period. Hence, the child’s income in period 2 is given by a base income, $\underline{w}$, plus
a schooling premium, $(\bar{w} - \underline{w})\, s_c$, depending on the amount of schooling she received in period 1.
The household’s consumption in period 2, $c_2$, is covered by the child’s earnings as a working adult,
and any outstanding household debt incurred in period 1 must be paid off at a gross interest rate,
$R$. Consolidating the budget constraints for the two time periods and the child time constraint
yields the following inter-temporal budget constraint:
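$$c_1 + \frac{c_2}{R} = p_f F\Big[(1-\gamma)\big((1 - l_p) + \lambda h\big)\Big] + \frac{\underline{w}}{R} - \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right)(1 - h - l_c) + \gamma w_p (1 - l_p) \tag{5}$$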
which indicates that the household’s total consumption (in period 1 units) across the two time
periods must equal the total amount of resources available to them (also expressed in period 1
units). Given this consolidated inter-temporal budget constraint, we can solve the household’s
problem above for the optimal levels of consumption, leisure, schooling, and child labor. The
generic functional form of our utility function precludes the derivation of closed form solutions
for these choice variables, though we can show that these variables can be derived as functions of
prices, wages, and the parameters, 𝛾, and 𝜆.
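The consolidation behind (5) can be sketched step by step. The notation below is an assumption made for illustration: $p_f$ stands for the price of the production good, $w_p$ for the parent's non-farm wage, $\underline{w}$ for the child's base income in period 2, and $\bar{w} - \underline{w}$ for the schooling premium. The sketch solves (3) for borrowing, substitutes into (2), and then uses the time constraint (4).

```latex
% Step 1: solve the period 2 constraint (3) for period 1 borrowing
b_1 = \frac{\underline{w} + (\bar{w} - \underline{w})\, s_c - c_2}{R}

% Step 2: substitute b_1 into the period 1 constraint (2) and collect the terms in s_c
c_1 + \frac{c_2}{R}
  = p_f F(\cdot) + \frac{\underline{w}}{R}
  - \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right) s_c
  + \gamma w_p (1 - l_p)

% Step 3: replace schooling using the child's time constraint (4)
s_c = 1 - h - l_c
```

Read this way, the term $\rho - (\bar{w} - \underline{w})/R$ is the net present cost of one unit of schooling: the direct cost today minus the discounted earnings premium tomorrow.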
         Next, we investigate the influence of parental education on child labor through non-farm
employment. Assuming we have an interior solution, the first order necessary condition with
respect to child labor yields:


$$u^1_h = -\mu \left[ p_f (1-\gamma) \lambda F'(\cdot) + \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right) \right] \tag{6}$$
where $\mu$ denotes the Lagrange multiplier. The FOC above states that the parent’s marginal
disutility from child labor must equal the marginal benefit of child work, which is captured
by the marginal revenue product of child labor in agricultural production plus the difference
between the direct cost of school attendance and the discounted premium from having an educated
child. Similarly, the first order condition with respect to period 1 aggregate consumption produces:
$$\mu = u^1_{c_1} \tag{7}$$
Substituting (7) into (6), it follows that,
$$-\frac{u^1_h}{u^1_{c_1}} = \left[ p_f (1-\gamma) \lambda F'(\cdot) + \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right) \right] \tag{8}$$
which represents the trade-off between child labor and household consumption.
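As a sketch of where conditions (6) and (7) come from, one can write the Lagrangian of the household's problem using the consolidated constraint (5) with multiplier $\mu$. The symbols $p_f$, $w_p$, $\underline{w}$ and $\bar{w}$, and the shorthand $x$ for effective farm labor, are notational assumptions of this illustration.

```latex
% Lagrangian built on the inter-temporal budget constraint (5);
% x is shorthand for the effective farm labor input
\mathcal{L} = u^1(c_1, l_p, h) + \beta u^2(c_2)
  + \mu \left[ p_f F(x) + \frac{\underline{w}}{R}
  - \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right)(1 - h - l_c)
  + \gamma w_p (1 - l_p) - c_1 - \frac{c_2}{R} \right],
\qquad x = (1-\gamma)\left[(1 - l_p) + \lambda h\right]

% Derivative with respect to child labor h, using \partial x / \partial h = (1-\gamma)\lambda
\frac{\partial \mathcal{L}}{\partial h}
  = u^1_h + \mu \left[ p_f (1-\gamma) \lambda F'(x)
  + \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right) \right] = 0

% Derivative with respect to period 1 consumption c_1
\frac{\partial \mathcal{L}}{\partial c_1} = u^1_{c_1} - \mu = 0
```

Rearranging the first derivative gives the child labor condition, and the second pins down the multiplier as the marginal utility of period 1 consumption.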
        To derive the effect of parental education on child labor, we can rewrite (8) as follows:
$$G = u^1_h + u^1_{c_1} \left[ p_f (1-\gamma) \lambda F'(\cdot) + \left(\rho - \frac{\bar{w} - \underline{w}}{R}\right) \right] = 0 \tag{9}$$
Recall that—as one of our assumptions stipulates—parental education affects child labor by
driving up the likelihood of non-farm employment. Therefore, by implicitly differentiating (9) with
respect to the probability of non-farm work participation, 𝛾, we have:
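$$\frac{dh}{d\gamma} = -\frac{\partial G / \partial \gamma}{\partial G / \partial h} \tag{10}$$
$$= -\frac{-\lambda p_f u^1_{c_1} \Big\{ F'(\cdot) - \lambda (1-\gamma) \big[(1 - l_p^*) + \lambda h^*\big] F''(\cdot) \Big\}}{u^1_{h,h} + u^1_{c_1,h} \left[ p_f (1-\gamma) \lambda F'(\cdot) + \rho - \left(\dfrac{\bar{w} - \underline{w}}{R}\right) \right] + p_f u^1_{c_1} \lambda^2 (1-\gamma)^2 F''(\cdot)} \tag{11}$$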


where $u^1_{c_1}$ denotes the parent’s marginal utility of period 1 consumption. Notice that given our
assumptions above, the numerator is unambiguously negative (that is, since $F'(\cdot) > 0$ and $F''(\cdot) < 0$).
By contrast, the sign of the denominator is ambiguous and will depend on the relative
magnitudes of the immediate benefits of not enrolling the child in school and the discounted
premium from having an educated child working in period 2, as well as the restriction on the cross
partial derivative between consumption and child labor. This result can be expressed
mathematically as follows:37
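$$\text{sign}\left(\frac{dh}{d\gamma}\right) = \begin{cases} - & \text{if } u^1_{c_1,h} = 0 \\[4pt] - & \text{if } u^1_{c_1,h} > 0 \text{ and } p_f (1-\gamma) \lambda F'(\cdot) + \rho \leq \dfrac{\bar{w} - \underline{w}}{R} \\[4pt] +/- & \text{if } u^1_{c_1,h} > 0 \text{ and } p_f (1-\gamma) \lambda F'(\cdot) + \rho > \dfrac{\bar{w} - \underline{w}}{R} \end{cases} \tag{12}$$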
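The case distinction in (12) can be traced through the denominator of (11). The sketch below labels the numerator $N$ and the denominator $D$ (labels introduced here for illustration) and additionally assumes $u^1_{h,h} \leq 0$ and $\gamma < 1$. The concavity condition in child labor is an assumption of this illustration and is not stated explicitly in the model.

```latex
% Write (11) as dh/d\gamma = -N/D, where N < 0 as argued in the text
D = \underbrace{u^1_{h,h}}_{\leq 0 \ \text{(assumed)}}
  + u^1_{c_1,h}
    \underbrace{\left[ p_f (1-\gamma) \lambda F'(\cdot) + \rho
      - \frac{\bar{w} - \underline{w}}{R} \right]}_{B}
  + \underbrace{p_f u^1_{c_1} \lambda^2 (1-\gamma)^2 F''(\cdot)}_{< 0}

% Case 1: u^1_{c_1,h} = 0            => middle term vanishes, D < 0, so -N/D < 0
% Case 2: u^1_{c_1,h} > 0,  B <= 0   => middle term <= 0,     D < 0, so -N/D < 0
% Case 3: u^1_{c_1,h} > 0,  B >  0   => middle term > 0, the sign of D
%                                       (and hence of dh/d\gamma) is ambiguous
```

The condition $B \leq 0$ is the second line of (12): the immediate benefit of child work, $p_f (1-\gamma) \lambda F'(\cdot) + \rho$, does not exceed the discounted schooling premium.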
         In sum, our theoretical model can be distilled into three alternative predictions. First, an
increase in the probability of non-farm employment induced by higher parental education reduces
child labor if the cross partial derivative between consumption and child labor, $u^1_{c_1,h}$, is zero.
Specifically, if a marginal increase (or decrease) in child labor has no effect on the marginal utility
of consumption for the parent, child labor should fall with an increase in 𝛾. Because an increase in
the likelihood of the parent’s non-farm engagement diminishes the child’s contribution to
household agricultural production, we expect child labor to diminish given that the parent derives
disutility from child labor and the marginal utility of consumption remains unaffected.
37
   Similarly, we can express the comparative statics in terms of the effect on child labor due to a change in parental
education as follows:
$$\underbrace{\frac{dh}{dE}}_{(-/+)} = \underbrace{\frac{dh}{d\gamma}}_{(-/+)} \cdot \underbrace{\frac{d\gamma}{dE}}_{(+)},$$
where $E$ represents parental schooling.


         Second, if we assume that the marginal utility of consumption increases with child labor,
then for a marginally higher propensity of parental non-farm employment, a reduction in child
labor incidence is still optimal if the future gain of having an educated child is at least as large as the
immediate benefit of child work. Put differently, given that the parent’s valuation of aggregate
consumption in period 2 offsets the benefit of additional child effort in agricultural production in
period 1, child labor will still decline. Third, if the marginal utility of consumption increases
with child labor and the immediate benefit of child work exceeds the discounted future gain
from schooling, then the effect of a change in 𝛾 on child labor will depend on the
magnitudes of the second order effects of child labor on the parent’s utility and household
agricultural production. The overall effect on child labor is therefore ambiguous in this case.
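A small numerical sketch can make the first prediction above concrete. Everything in it is a hypothetical assumption chosen for illustration rather than taken from the model's estimation: the production function is a square root, the parent's utility is linear in the present value of consumption (so the cross partial between consumption and child labor is zero, which also implies a discount factor equal to 1/R), the disutility of child labor is quadratic, leisure is held fixed, and all parameter values are made up. Under these assumptions, a grid search over child labor shows the chosen level falling as the probability of non-farm work rises.

```python
"""Illustrative grid search for the case where the cross partial u_{c1,h} is zero.

All functional forms and parameter values below are hypothetical assumptions.
"""
import math

# Hypothetical parameters (assumptions for illustration only)
P_F = 1.0      # price of the production good
LAMBDA = 0.5   # productivity weight on child labor in effective farm labor
L_P = 0.5      # parent's leisure, held fixed
L_C = 0.0      # child's leisure, held fixed
RHO = 0.3      # direct cost of schooling
W_LOW = 1.0    # child's base income in period 2
W_HIGH = 1.2   # child's period 2 income with full schooling
R = 1.1        # gross interest rate
W_P = 1.0      # parent's non-farm wage
DELTA = 1.0    # weight on the quadratic disutility of child labor


def lifetime_resources(h, gamma):
    """Right-hand side of the inter-temporal budget constraint with F = sqrt."""
    farm_labor = (1.0 - gamma) * ((1.0 - L_P) + LAMBDA * h)
    net_school_cost = RHO - (W_HIGH - W_LOW) / R
    return (P_F * math.sqrt(farm_labor)
            + W_LOW / R
            - net_school_cost * (1.0 - h - L_C)
            + gamma * W_P * (1.0 - L_P))


def parent_objective(h, gamma):
    """Quasi-linear utility: linear in resources, quadratic disutility of h."""
    return lifetime_resources(h, gamma) - 0.5 * DELTA * h ** 2


def optimal_child_labor(gamma, steps=1000):
    """Return the grid point in [0, 1 - L_C] that maximizes the objective."""
    grid = [(1.0 - L_C) * i / steps for i in range(steps + 1)]
    return max(grid, key=lambda h: parent_objective(h, gamma))


if __name__ == "__main__":
    for gamma in (0.0, 0.25, 0.5, 0.75, 1.0):
        h_star = optimal_child_labor(gamma)
        print(f"gamma = {gamma:.2f}  ->  h* = {h_star:.3f}")
```

With the net cost of schooling positive in this parameterization, some child labor remains even when the parent works entirely off the farm. Choosing a larger schooling premium would push the optimum toward zero.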