PREVENTING DIARRHEAL INFECTIONS WITH HOUSEHOLD WATER TREATMENT: LESSONS FROM SIMULATION MODELS

By
Kyle Scott Enger

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Fisheries and Wildlife

2012

ABSTRACT

Diarrheal disease kills two million children per year in developing countries. Diarrhea is controlled in industrialized countries by systems to remove sewage and distribute clean water. These systems are difficult to fund, build, and maintain in developing countries, so simpler technologies are promoted: e.g., household water treatment (HWT), handwashing, and latrines. These technologies, however, require consistent effort by individuals for proper use and maintenance, defined here as 'compliance'. Measuring compliance is difficult, and often neglected. It is important to understand how the extent and pattern of compliance within communities affect the prevention of diarrhea by HWT, while accounting for biases in field trials and characteristics of natural transmission systems, such as the presence of multiple pathogens, transient spikes of contamination, and multiple transmission routes. This question was answered by: reanalyzing and generalizing results from a HWT field trial, using a quantitative microbial risk assessment (QMRA) model to adjust for bias (chapter 3); examining the joint effects of HWT antimicrobial efficacy and compliance on prevention of diarrhea (chapter 4); and using a model of diarrheal infection transmission incorporating multiple routes of infection to further examine efficacy and compliance issues with HWT (chapter 5). The QMRA model of the field trial found that compliance greatly affected HWT effectiveness: with low compliance, 10% of diarrhea was prevented; with high compliance, 90% was prevented.
It also estimated source water pathogen concentrations that were consistent with measurements from other developing countries. The model found that the effect of an imperfect placebo device used during the field trial depended on the assumed level of compliance during the field trial.

The QMRA model was modified to examine how HWT compliance and antimicrobial effectiveness jointly altered diarrheal disease risk. Given perfect compliance, increasing antimicrobial effectiveness always lowered risk. If compliance was incomplete, increasing antimicrobial effectiveness eventually ceased to lower risk, except in a few scenarios with high incidence, high compliance, or large water contamination spikes. The pattern of compliance by communities also influenced risk; e.g., risk was lower if 90% of people used HWT perfectly and 10% never used HWT, than if 100% of people used HWT 90% of the time.

A preliminary transmission model simulated a community in which infected people shed pathogens, which were ingested by other people via exposure to land, drinking water, their household environment, or visits from other households. It found similar results to the QMRA models regarding compliance. Transmission of diarrheal pathogens by household visits appeared unimportant compared to other routes; however, visits consisted of exchanges of pathogens between household environments. Other types of visits (e.g., shared child care) might lead to greater transfer of pathogens and a greater influence of visits on illness. The model also inferred that viruses and protozoa were attenuated (removed from the system, e.g., by decay or sequestration) ~10 times faster than bacteria. Future sensitivity and uncertainty analyses will highlight important aspects of the model and its parameters that contribute to its results.

The amount and pattern of compliance strongly affect diarrhea prevention by HWT.
Further research should be conducted on improving compliance with HWT in developing countries.

© Copyright by Kyle Scott Enger 2012
Some rights reserved.

This work is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ or send a letter to Creative Commons, 444 Castro Street, Suite 900, Mountain View, California, 94041, USA.

Computer code in this work is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. Computer code in this work is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. The full GNU General Public License is also reproduced in the appendix of this work (chapter 10), with the computer code.

ACKNOWLEDGEMENTS

I appreciate the assistance and oversight of my graduate committee members, who reliably ask and answer difficult questions: Joseph N. S. Eisenberg, Anne E. Ferguson, and Cheryl A. Murphy, led by my major professor, Joan B. Rose. I thank Joseph N. S. Eisenberg and Kara L. Nelson for substantial guidance on model design and manuscript editing; Bryan T. Mayer for reviewing the program code for chapters 3, 4, and 5; Sophie Boisson and Thomas Clasen for discussion and interpretation of the LifeStraw® randomized controlled trial (Boisson et al., 2010) and access to original data; Mark H. Weir for discussions of modeling and programming; and Ian M. Dworkin, Joseph N. S. Eisenberg, Daniel B. Hayes, James D. A. Millington, and Carl P. Simon for their excellent instruction in graduate courses concerning modeling, programming, and mathematics.
I am also grateful to the Michigan State University High Performance Computing Center (HPCC) and the Institute for Cyber Enabled Research (iCER), for providing hardware and software to run the simulations described in chapters 4 and 5, as well as technical assistance (particularly Dirk J. L. Colbry).

My work was financially supported by: a University Distinguished Fellowship from the Michigan State University Graduate School; a research assistantship from Joan B. Rose and the Center for Advancing Microbial Risk Assessment (CAMRA); and a grant from the Vestergaard Frandsen Corporation, which produces the LifeStraw® Family filtration device; that grant's principal investigators were Joseph N. S. Eisenberg and Kara L. Nelson. Although this represents a potential conflict of interest, we attest that the Vestergaard Frandsen Corporation did not influence study design, results, or the decision to publish the results.

In conclusion, I am particularly grateful to Joan B. Rose for many insightful and thought-provoking conversations; research oversight; generous funding for my stipend, travel, and equipment; and a great deal of editing.

TABLE OF CONTENTS

LIST OF TABLES
LIST OF FIGURES
1. INTRODUCTION
1.1. Brief summary of diarrheal disease and control
1.2. Scientific questions
1.3. General assumptions within this dissertation
2.
REVIEW OF DIARRHEAL INFECTION: EPIDEMIOLOGY, INTERVENTIONS, AND MODELING
2.1. Introduction
2.2. Definition and measurement of diarrhea
2.2.1. Persistent diarrhea
2.3. Multiple infections
2.4. Asymptomatic infections
2.5. Burden of diarrheal disease
2.5.1. Morbidity
2.5.2. Mortality
2.5.3. Disability-adjusted life years (DALYs) attributable to diarrhea
2.6. Changes in diarrheal morbidity over time in individuals
2.7. Diarrhea-malnutrition vicious cycle
2.8. Cyclic or recurring changes ('seasonality') in diarrhea incidence
2.9. Crowding (urban vs. rural)
2.10. Bias in epidemiological studies of diarrhea
2.11.
Relative contributions of pathogens to diarrheal etiology
2.12. Survey of diarrheal pathogens
2.12.1. Viral diarrheal pathogens
    Rotavirus
    Norovirus
2.12.2. Bacterial diarrheal pathogens
    Pathogenic Escherichia coli
    Campylobacter species
2.12.3. Protozoan diarrheal pathogens
    Cryptosporidium parvum and Cryptosporidium hominis
    Giardia species
2.12.4. Metazoan pathogens (helminths)
2.13. Interventions to prevent transmission of diarrheal pathogens
2.14. Effectiveness trials of interventions
2.14.1. Measures of effect in intervention trials
2.14.2. Nature of bias in intervention trials
2.15. Compliance with interventions, and long-term sustainability
2.15.1.
Costs of compliance (monetary and otherwise)
2.15.2. Psychological, social, and cultural aspects of compliance
2.15.3. Examples of compliance measurements in the field
2.16. Interaction of intervention effects
2.17. Descriptions of individual interventions
2.17.1. Sanitation
2.17.2. Water supply improvement
    Water quantity improvement
    Water quality improvement
2.17.3. Hygiene
    Handwashing
    Diapering and open defecation
    Anal cleansing
    Food preparation
    Fly control
2.17.4.
Household water treatment (HWT), or point-of-use (POU) technology
    Safe storage
    Boiling
    Solar disinfection (SODIS)
    Chlorination
    Filtration
    Biosand filters
    Ceramic filters
    Advanced filter technologies
    Recent controversy surrounding HWT
2.17.5. Other interventions
2.17.6. Gaps in knowledge about diarrheal disease interventions
2.18. Infection transmission modeling and its application to diarrheal disease
2.18.1. Types of models
2.18.2. Model verification and validation
2.19. Modeling transmission of diarrheal infections
2.19.1. Simple mathematical example of a transmission model
2.19.2.
Environmental infection transmission models
2.19.3. Quantitative microbial risk assessment (QMRA) models
2.20. Conceptual model of diarrheal disease transmission
2.21. Theoretical issues regarding interventions
2.22. Tools and information useful for modeling diarrhea transmission
2.23. Published models relevant to endemic diarrhea transmission
2.23.1. Mechanistic model of diarrheal infection: Eisenberg et al., 2007
2.23.2. Empirical model of diarrheal disease: Schmidt et al., 2009
2.23.3. Modeling indirect effects of interventions: Halloran et al., 2002
2.23.4. Environmental infection transmission system (EITS) models: Li et al., 2009
2.24. Conclusion
3. LINKING A QUANTITATIVE MICROBIAL RISK ASSESSMENT MODEL TO A HOUSEHOLD WATER TREATMENT FIELD TRIAL
3.1. Abstract
3.2. Introduction
3.3. Materials and Methods
3.3.1. Conceptual framework linking QMRA models to epidemiological studies
3.3.2.
Model description
    Step 1: Parameter entry
    Step 2: Parameter values inferred through calibration
    Step 3: Initiating each run; establishing equilibrium waterborne infection
    Step 4: Calculating daily doses of marker pathogens
    Step 5: Dose response functions
    Step 6: Assignment of infection
    Step 7: Assignment of infection duration, recovery, and immunity
    Step 8: Surveying the population about reported diarrhea
    Step 9: Determining model outcomes corresponding to the Lifestraw RCT
    Step 10: Repetition of calibration runs
3.3.3. Example runs of the model
3.3.4. Analytical process
3.4. Results
3.4.1. Calibration step
3.4.2. Estimation step
3.5. Discussion
3.5.1. Predicting pathogen concentrations in drinking water sources
3.5.2.
Calibration of microbial risk assessment models
4. THE JOINT EFFECTS OF EFFICACY AND COMPLIANCE IN HOUSEHOLD WATER TREATMENT EFFECTIVENESS
4.1. Abstract
4.2. Introduction
4.3. Materials and methods
4.3.1. Compliance
4.3.2. Baseline incidence and etiologic fraction
4.3.3. Short-term contamination spikes
4.3.4. Calibration step
    Determination of etiologic fractions
4.3.5. Estimation step
4.3.6. Replication of the WHO model
4.4. Results
4.4.1. Calibration step
4.4.2.
Estimation step
    Comparison with the WHO QMRA model
    Effect of LRVs given imperfect compliance
4.4.3. Charts of median incidences from all estimation scenarios
4.4.4. Assessing impact of high LRVs: significance testing & classification trees
4.5. Discussion
4.5.1. Information needed to inform models of diarrheal infection transmission
    Pathogen concentrations in source waters
    Etiology of diarrheal disease
    Routes of transmission other than drinking water
4.5.2. Conclusions
5. TRANSMISSION MODEL OF DIARRHEAL INFECTION
5.1. Abstract
5.2. Introduction
5.3. Materials and methods
5.3.1. General description of the model
    Summary of the steps in each model run
5.3.2.
Technical description of the model
    Step 1: Enter parameters
    Step 2: Create random number tables and output logs
    Step 3: Set up the simulated community
    Step 4: Start daily loop and tally people in all states
    Step 5: Defecation
    Step 6: Transfer pathogens from land to surface water
    Step 7: Transfer pathogens between households via visits
    Step 8: Resupply stored water & apply household water treatment (HWT)
    Step 9: Pathogen transfer from household environment to drinking water
    Step 10: First attenuation of pathogens
    Steps 11 & 12: Decrement all status counters and apply status shifts
    Step 13: Calculation of daily pathogen doses and dose response
    Step 14: Assignment of baseline exposures
    Step 15: Second inactivation of pathogens
    Step 16: Continue to next day, or end simulation
5.3.3. Calibration and estimation
    Determination of calibration parameter ranges
5.4. Results
5.4.1.
Calibration
5.4.2. Estimation
5.5. Discussion
5.5.1. Calibration step
5.5.2. Estimation step
5.5.3. Limitations of the EITS model
5.5.4. Insights gained during the model construction process
5.5.5. Future applications of the model
    Readily achievable applications
    Applications requiring substantial modifications to the model
5.5.6. Conclusions
6. CONCLUSIONS
6.1. Summary of research
6.2. Implications for future diarrheal research and prevention efforts
6.2.1. Conduct and description of field trials
6.2.2. Compliance with interventions that prevent diarrheal infections
7. APPENDIX A: DISCUSSION OF PARAMETER VALUES USED IN THE MODELS
7.1.
Water ingestion rate
7.2. Hand-mouth contacts per day
7.3. Log10 reduction values (LRVs) attributable to interventions
7.3.1. LRVs attributable to sanitation
7.3.2. LRVs attributable to intervention and placebo filters in the Lifestraw RCT
7.4. Dose response functions
7.4.1. Dose response for E. coli infection
7.4.2. Dose response for rotavirus and Giardia
7.5. Incubation periods
7.6. Morbidity ratios
7.6.1. Morbidity ratios in young children
7.6.2. Morbidity ratios in adults
7.7. Durations of illness and infection
7.7.1. Duration of diarrheagenic E. coli infection and illness
7.7.2. Duration of rotavirus infection and illness
7.7.3. Duration of Giardia infection and illness
7.8.
Duration of immunity
7.9. Incomplete recall of diarrheal disease
7.10. Fecal excretion of pathogens
7.10.1. Concentrations of pathogens in feces
7.10.2. Amount of feces excreted
7.11. Inactivation, removal, or attenuation of pathogens in the environment
7.12. Pathogen movement from land to surface water
7.13. Demographic parameters
7.14. Summary table of all parameter values
8. APPENDIX B: ADDITIONAL INTERVENTIONS FOR MITIGATING DIARRHEA
8.1. Nutritional interventions
8.1.1. Breastfeeding
8.1.2. Zinc supplementation
8.1.3. Other nutrients
8.2. Treatment of diarrheal illness
8.2.1. Oral rehydration salts/solution/therapy (ORS)
8.2.2.
Access to care
8.3. Vaccination
9. APPENDIX C: SOURCE CODE FOR THE MODELS
9.1. Overview of the source code for the models
9.2. GNU General Public License
9.3. The QMRA model simulating the Lifestraw field trial (chapter 3)
9.4. The QMRA model investigating compliance and LRVs (chapter 4)
9.5. The EITS model (chapter 5)
10. APPENDIX D: GLOSSARY
11. REFERENCES

LIST OF TABLES

Table 2.1. Log10 reduction values (LRVs) for various interventions
Table 2.2. Log10 reduction values: standards for household water treatment (HWT)
Table 2.3. Terms commonly used to describe or categorize models
Table 3.1. Fixed parameter values used in the QMRA model of the Lifestraw RCT
Table 3.2. Ranges for stochastically varying parameters
Table 3.3. Longitudinal prevalence measures from the Lifestraw field trial
Table 4.1. Compliance among individuals in each model run, given compliance type β
Table 4.2.
Criteria for the calibration step of the QMRA model.................................................125 Table 5.1. Summary of calibration parameters............................................................................169 Table 5.2. Ranges over which calibration parameters were sampled..........................................169 Table 5.3. Description of matrices tracking households and people............................................173 Table 5.4. Criteria for calibrating the transmission model...........................................................182 Table 5.5. Association of calibration parameters with incidence.................................................188 Table 6.1. Important community characteristics for measurement in field trials........................210 Table 7.1. Summary table of all parameter values.......................................................................230 xi LIST OF FIGURES Figure 2.1. Diarrhea mortality by WHO region.............................................................................11 Figure 2.2. Diarrhea DALYs by WHO region...............................................................................13 Figure 2.3. Estimated etiology of childhood diarrhea worldwide.................................................20 Figure 2.4. Simple F-diagram of transmission of diarrheal pathogens..........................................49 Figure 2.5. Schematic of the Susceptible-Infectious-Removed (SIR) model................................72 Figure 2.6. Simple environmental infection transmission system model......................................74 Figure 2.7. Expanded diagram of diarrheal pathogen transmission...............................................77 Figure 3.1. Conceptual model for simulation of a randomized controlled trial.............................90 Figure 3.2. Simulation model flowchart........................................................................................92 Figure 3.3. 
Comparison of dose response functions......................................................................99 Figure 3.4. Example run of the model, with higher infection levels than the Lifestraw RCT.....104 Figure 3.5. Example run of the model, infection levels consistent with the Lifestraw RCT.......105 Figure 3.6. Distributions of LPRs consistent with the Lifestraw RCT........................................108 Figure 3.7. Distributions of simulated microbial concentrations.................................................109 Figure 3.8. LPR distributions for differing compliance assumptions and placebo behavior.......111 Figure 3.9. Longitudinal prevalence distributions under differing compliance assumptions......112 Figure 3.10. Longitudinal prevalence of waterborne infection in the estimation step.................113 Figure 4.1. Calibration results assuming medium incidence of diarrhea.....................................128 Figure 4.2. Mean baseline pathogen concentrations from calibration.........................................133 Figure 4.3. Comparison with WHO model..................................................................................135 Figure 4.4. Effect of compliance with HWT on the incidence ratio of diarrhea, by LRV...........136 Figure 4.5. Effect of compliance and spikes on the IR of childhood diarrhea, by LRVs............137 Figure 4.6. Effect of dose response function nonlinearity at high doses.....................................138 xii Figure 4.7 Incidence ratio of diarrhea by compliance level & type, spikes, and LRVs...............139 Figure 4.8. Effect of compliance on IR if large contamination spikes occur..............................141 Figure 4.9. Effect of compliance on IR with extreme pathogen concentrations..........................143 Figure 4.10. Detailed estimation results (low incidence)............................................................145 Figure 4.11. 
Detailed estimation results (medium incidence).....................................................146 Figure 4.12. Detailed estimation results (high incidence)...........................................................147 Figure 4.13. Detailed estimation results, WHO/EPA recommended LRVs.................................149 Figure 4.14. Detailed estimation results, extremely high baseline pathogen concentrations......150 Figure 4.15. Classification tree for incidence difference (ID) criterion.......................................152 Figure 4.16. Classification tree for incidence ratio (IR) criterion................................................153 Figure 5.1. Simplified overview of the simulated community....................................................162 Figure 5.2. Structure of each household within the simulated community.................................163 Figure 5.3. Daily progression of the EITS model of diarrheal infections....................................165 Figure 5.4. Flowchart of the operations of the EITS model........................................................170 Figure 5.5. Calibration output from EITS model, transfer parameters........................................185 Figure 5.6. Calibration output from EITS model, scatterplots & histograms..............................186 Figure 5.7. Simulated diarrhea incidence in children (calibration step)......................................187 Figure 5.8. Distributions of transfer calibration parameters........................................................190 Figure 5.9. Distributions of attenuation calibration parameters..................................................191 Figure 5.10. Estimation step, incidence by LRV of HWT...........................................................193 Figure 5.11. Comparison of EITS and QMRA results.................................................................195 xiii 1. INTRODUCTION 1.1. 
Brief summary of diarrheal disease and control

Diarrheal disease is a major cause of illness and death in developing countries, particularly among young children (World Health Organization, 2008). Diarrheal infections are generally transmitted by the fecal-oral route: ingesting water, food, or soil that has been contaminated by feces. However, transmission can be influenced (overtly or subtly) by a wide variety of factors, including: availability of water infrastructure (e.g., piped treated water systems, or other improved water sources such as carefully constructed wells); sanitation infrastructure (latrines or sewer systems); personal hygiene; types and quantities of pathogens present; nutritional status; climate; socioeconomic status; cultural factors; and many others. Therefore, the effectiveness of interventions to reduce diarrhea can vary greatly depending on the characteristics of the community that applies them.

Reduction of diarrheal disease is an important public health goal. This can be accomplished by many different interventions that improve or modify some of the factors listed above. Some examples include: water supply improvements (for quality, quantity, or both); hygiene education (particularly handwashing); a wide array of household water treatment (HWT) interventions (e.g., boiling, chlorination, filtration); and sanitation (latrine or sewer construction). Interventions are most effective if they are used and maintained independently by communities over many years; such sustained use is referred to as 'compliance', or sometimes 'adherence'. High compliance is difficult to attain in the developing world, where money, materials, skilled personnel, and good governance are often in short supply. It is unclear how best to measure and maintain compliance within communities.

The effectiveness of interventions that prevent diarrhea is difficult to measure.
Conducting research investigations in developing countries is inherently challenging due to lack of resources and infrastructure, and researchers are often faced with linguistic or cultural barriers. Field trials of interventions are usually impossible to blind; blinding prevents study participants from knowing whether they are receiving an active intervention or an inactive 'placebo' intervention. Blinding is difficult because interventions are usually visually obvious (e.g., a large filter unit, or people attending handwashing education sessions). Therefore, field trials are likely to be biased by the expectations of investigators or participants, or by other factors (known or unknown). Published field trials often lack key contextual information about the communities that participated in the study, which impedes interpretation of the results.

Although the body of scientific literature concerning diarrhea is enormous and continues to grow, many aspects of diarrheal disease in developing countries remain poorly understood. There is a need to synthesize the available information to determine where to prioritize future research. Chapter 2 summarizes key ideas in this body of literature. Although meta-analyses concerning various aspects of diarrheal disease continue to be published and are useful, they do not clarify interrelationships between these aspects. Mechanistic modeling is a useful tool for describing and simulating complex systems problems such as disease transmission within human communities. This dissertation uses modeling methods (coupled with published data) to simulate diarrheal disease transmission. The models allow conclusions to be drawn about appropriate diarrhea control methods, and identify aspects of diarrhea transmission and control that require further research.

1.2. Scientific questions

Goal 1: Develop a method to link information from epidemiologic studies with simulation models (chapter 3).
Field studies of interventions that prevent diarrhea are difficult and time-consuming to perform, and are subject to numerous biases (see page 17 for further discussion). Models based on field studies can be used to infer the effect of interventions on risk under circumstances that were not actually studied.

Hypotheses: Biases that were observed during a field trial can be corrected by constructing a model that simulates the trial as closely as possible, and then altering the model structure (or its parameters) to remove the bias. In a similar fashion, the outcomes of counterfactual field trials can also be estimated.

Method: Construct a model to simulate an actual field trial. Calibrate the model to the outcomes of the trial in order to infer the values of unobserved parameters. Use the sets of inferred parameters in subsequent estimation steps, in which the model has been modified to simulate desired situations (e.g., differing compliance levels, elimination of biases).

Goal 2: How does noncompliance with interventions affect diarrheal disease levels (chapters 4 and 5)?

Hypotheses:
a) Perfect compliance with an intervention by X% of the population is more effective than 100% of the population using the intervention X% of the time. Or: the level of effectiveness of an intervention drops more rapidly when consistent use of the intervention is decreased, as opposed to when overall adoption of the intervention is decreased.
b) What level of noncompliance renders a specific intervention ineffective (i.e., <10% decrease in longitudinal prevalence) under particular conditions? Trials are seldom powered to detect a decrease this small.
c) Linear QMRA models and more complex EITS models incorporating feedback loops show similar effects of imperfect compliance and differing compliance patterns on diarrheal disease risk.
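The two compliance patterns contrasted in hypothesis (a) can be made concrete with a small Python sketch. It is an illustration only and is not taken from the dissertation's source code: the function names, the deterministic staggering of skipped days, and the example community of 10 people followed for 10 days are all assumptions made here.

```python
def adoption_pattern(n_people, n_days, fraction):
    """A fraction of people use HWT every day; the rest never use it."""
    n_users = round(n_people * fraction)
    return [[person < n_users] * n_days for person in range(n_people)]


def consistency_pattern(n_people, n_days, fraction):
    """Everyone uses HWT on the same fraction of days."""
    n_used = round(n_days * fraction)
    matrix = []
    for person in range(n_people):
        # Shift the block of used days so skipped days differ between people
        # (a hypothetical scheduling choice, made only to keep this deterministic).
        row = [((day - person) % n_days) < n_used for day in range(n_days)]
        matrix.append(row)
    return matrix


def coverage(matrix):
    """Fraction of all person-days on which HWT was used."""
    person_days = sum(len(row) for row in matrix)
    return sum(sum(row) for row in matrix) / person_days


def days_used_per_person(matrix):
    """Number of days each person used HWT."""
    return [sum(row) for row in matrix]


# Hypothetical community: 10 people followed for 10 days, X = 90%.
adoption = adoption_pattern(10, 10, 0.9)
consistency = consistency_pattern(10, 10, 0.9)

print(coverage(adoption), days_used_per_person(adoption))
print(coverage(consistency), days_used_per_person(consistency))
```

Both patterns give the same overall fraction of treated person-days, so any difference in diarrheal risk between them must come from how use is distributed among individuals rather than from total coverage.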
Method: Construct QMRA and EITS models of diarrheal disease transmission, apply a simulated intervention, alter the levels of the two types of noncompliance, and observe the results in the form of their effects on diarrheal longitudinal prevalence. Compare the results returned by both model types.

1.3. General assumptions within this dissertation

Control of diarrhea requires sustainable use of interventions over many years by the communities (or individuals) that currently have high risk of diarrhea. Therefore, the effectiveness of an intervention is best described by changes in the average level of endemic diarrheal illness. Although diarrheal epidemics are undeniably important (particularly cholera epidemics), they represent short-term changes (days or weeks) in diarrheal disease risk. Any intervention that reduces endemic diarrhea consistently over several years is also likely to decrease the likelihood or severity of diarrheal epidemics during that time. Therefore, epidemics are given little consideration in this dissertation.

Although the scientific literature concerning diarrhea in developing countries is vast, the precise quantitative values of many important parameters for models that describe diarrhea are unclear. The detailed datasets necessary for validating such models are likewise lacking. Human communities are also idiosyncratic, and diarrheal disease transmission in a particular community might be greatly altered by characteristics that are unmeasured or difficult to measure (e.g., cultural practices, social structure, education, socioeconomic status, local geology, local climate, etc.). Therefore, the models described in this dissertation cannot make firm predictions for specific communities. However, they can still provide general insights about the transmission of diarrheal infections and how interventions can affect their transmission. They can also indicate aspects of diarrhea that require further scientific investigation.

2.
REVIEW OF DIARRHEAL INFECTION: EPIDEMIOLOGY, INTERVENTIONS, AND MODELING

2.1. Introduction

Diarrhea is a common disease, particularly among children. Although diarrheal infections are nearly always transmitted by ingestion of feces, the details of transmission and control are complicated. Diarrheal pathogens are extremely diverse, including many distinct bacteria, viruses, and protozoa. Furthermore, these pathogens can be transmitted by different routes, such as drinking water, food, soil, or household objects. Many different interventions are available, such as sanitation (latrines), handwashing, or household water treatment; these interventions impede transmission of diarrheal pathogens on different routes. The transmission and prevention of diarrheal disease have been extensively studied, giving rise to a vast body of published scientific literature. Nonetheless, many important gaps in our knowledge remain.

The central goal of this manuscript is to examine the issue of compliance with interventions to prevent diarrhea, particularly household water treatment (HWT). People in developing countries are often encouraged to adopt HWT methods, but they may use these methods inconsistently, or not at all. Although compliance affects estimates of diarrhea prevented by interventions in field trials, compliance is difficult to measure. Chapters 3, 4, and 5 use risk assessment models and infection transmission models to simulate field trials and examine the effect of differing levels and patterns of compliance on prevention of diarrhea. These models must incorporate existing information about diarrheal pathogens, their transmission, characteristics of infection and disease, and antimicrobial efficacy of HWT.

This chapter summarizes available knowledge regarding diarrheal infections and their prevention in developing countries. It begins by concentrating on the epidemiology of endemic diarrhea in young children (generally under 5 years of age).
Diarrheal illnesses which are strongly epidemic (e.g., cholera) are only briefly mentioned. Interventions for preventing transmission of diarrheal pathogens are described, beginning on page 38. Application of simulation models to understand transmission and prevention of childhood diarrhea is discussed beginning on page 68.

2.2. Definition and measurement of diarrhea

Acute diarrhea is often considered to be three or more loose or watery stools in 24 hours, not counting normal soft stools from breastfed babies (USAID et al., 2005). However, the number of episodes/day and the number of days that are considered to separate episodes vary. Diarrhea lasting for 14 or more days is generally considered ‘persistent’ (Ejemot et al., 2008). Dysentery is a diarrheal syndrome in which the stools are bloody (T. F. Clasen et al., 2006). Shigella species are a common cause of this type of diarrhea.

Investigation of childhood diarrhea usually relies on the recall of the parents, which may be incomplete. Recall of diarrhea in infants by Guatemalan mothers was found to be unchanged for the two days prior to the interview, but dropped by 37% on the third day (Zafar et al., 2010). Recall for the fourth and earlier days was about 50% of that for the first two days. Severe illness was recalled more reliably. These results were similar to those reported in prior studies (Zafar et al., 2010). In a review (Kosek et al., 2003) including 27 field studies of diarrhea, 10 interviewed caregivers weekly or monthly, which suggests that diarrhea may have been underestimated in those studies. Fortunately, the remaining 17 studies interviewed caregivers two or more times per week (Kosek et al., 2003). Use of daily diaries may mitigate some of the effects of infrequent follow-up.

Differing methods are used to measure changes in diarrheal risk.
Incidence rates (the number of disease episodes over a given time in a certain number of people) are most commonly used, which may be appropriate when disease transmission is the outcome under study (Morris et al., 1996). However, the association of longitudinal prevalence (the number of person-days ill over the total number of person-days observed) with growth faltering and death was stronger than the association of incidence with growth faltering and death (Morris et al., 1996) in a rural area of Ghana that was underserved by health services (Ghana VAST Study Team, 1993). Changes in longitudinal prevalence do not necessarily change incidence; for example, a treatment regime might reduce longitudinal prevalence by reducing the duration of illness, but incidence would remain unchanged. Longitudinal prevalence is also appealing because it more directly measures the extent of illness; a child with persistent diarrhea throughout most of the study period might contribute only 1 bout of illness to the community's diarrheal incidence, but is in fact very sick (Morris et al., 1996). Point prevalence (the proportion of individuals ill at a given point in time) is sometimes used, but is far from ideal because the burden of diarrheal disease changes with time. Ratios of the above measures with the intervention group in the numerator and the non-intervention group in the denominator are used to estimate the magnitude of effect of the intervention.

Odds ratios are also used to estimate the effectiveness of interventions. Unlike the previously discussed measures of risk, which are population-based and measure ill people as a proportion of total people, the odds is the number affected divided by the number not affected. The odds of illness in a treatment group can then be divided by the odds of illness in a control group to yield an odds ratio.
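The measures defined above can be sketched in a few lines of Python. This is an illustrative sketch, not code from the dissertation: the daily illness records, the two-day symptom-free gap used to separate episodes, the use of all observed days as the incidence denominator, and the function names are assumptions chosen for the example.

```python
def count_episodes(ill_days, gap=2):
    """Count episodes in one child's daily record (True = ill that day).

    An ill day starts a new episode if it follows at least `gap`
    symptom-free days; `gap` is an assumed study convention.
    """
    episodes = 0
    well_run = gap  # treat the start of follow-up as symptom-free
    for ill in ill_days:
        if ill:
            if well_run >= gap:
                episodes += 1
            well_run = 0
        else:
            well_run += 1
    return episodes


def incidence_per_child_year(records, gap=2):
    """Episodes per child-year, using all observed days as the denominator."""
    episodes = sum(count_episodes(r, gap) for r in records)
    person_days = sum(len(r) for r in records)
    return 365 * episodes / person_days


def longitudinal_prevalence(records):
    """Person-days ill divided by person-days observed."""
    ill = sum(sum(r) for r in records)
    person_days = sum(len(r) for r in records)
    return ill / person_days


def odds(proportion_ill):
    """Number affected divided by number not affected."""
    return proportion_ill / (1 - proportion_ill)


def ratio(intervention, control):
    """Effect estimate: intervention group over non-intervention group."""
    return intervention / control


# Two hypothetical children observed for 30 days each.
persistent = [day < 20 for day in range(30)]          # one 20-day bout
sporadic = [day in (2, 10, 20) for day in range(30)]  # three 1-day bouts

print(count_episodes(persistent), longitudinal_prevalence([persistent]))
print(count_episodes(sporadic), longitudinal_prevalence([sporadic]))

# Hypothetical proportions ill: 0.3 with the intervention, 0.6 without.
print(ratio(0.3, 0.6), ratio(odds(0.3), odds(0.6)))
```

In this example the child with one long bout contributes a single episode but is ill on two-thirds of observed days, whereas the child with three short bouts contributes three episodes and is ill on one-tenth of days. With the assumed proportions of 0.3 and 0.6, the ratio of proportions is 0.5 while the odds ratio is about 0.29.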
Although the odds of illness is similar to the risk of illness when the disease is rare (i.e., the number of non-ill people nearly equals the entire population), this is not the case for diarrheal illness in many developing countries, and ‘risk’ as measured using odds therefore appears larger. Odds ratios also tend to lie further from 1 than the corresponding risk ratios. Odds ratios are commonly used in regression analysis where the effect of an intervention is controlled for confounding factors, since the parameter estimates are convenient to represent mathematically as odds ratios. They are also used in case-control studies, where population-based risk cannot be determined because the study population is not necessarily representative of the actual population (Hennekens & Buring, 1987).

2.2.1. Persistent diarrhea

Persistent diarrhea is generally considered to last for 14 days or more. This is mainly based on convenience, and a better definition may be needed that takes nutrient deficiency and growth faltering into account (Bhutta et al., 2008). Enteroaggregative E. coli, Cryptosporidium, and Giardia are commonly implicated in persistent diarrhea, but their relative contributions are unknown (Bhutta et al., 2008). Factors associated with reduction of persistent diarrhea are mostly unknown, although early feeding of non-human milk to children and multiple acute diarrheal episodes are important risk factors (Bhutta et al., 2008).

A recent review (Abba et al., 2009) of 20 studies from the 1980s and 1990s in various developing countries describing pathogens associated with childhood persistent diarrhea found no particular differences in the types of pathogens isolated from children with persistent diarrhea and children with no diarrhea.
There was also no evidence to conclude that certain pathogens were more or less common in different regions of the world (most studies were from India and Bangladesh, though Latin America, southeast Asia, and sub-Saharan Africa were also represented), although individual studies varied. Enteropathic E. coli (particular types could not be distinguished) was detected in 25% to 33% of children, and no other pathogen was found in more than 10% of children. Children with persistent diarrhea were more likely to have at least 1 pathogen detected than children without diarrhea (75% vs. 43%). However, studies tested for different suites of pathogens, sample size was often small, laboratory procedures were often poorly documented, and studies commonly did not cover a whole number of years, so seasonality may have influenced some results.

2.3. Multiple infections

Simultaneous infections by multiple diarrheal pathogens are common. In a review focusing on norovirus (Patel et al., 2008), 4 of 5 studies of children in developing countries hospitalized for severe sporadic acute gastroenteritis showed that <7.5% had a mixed infection, although a Peruvian study was an outlier with 24%. It was not stated which organisms were tested for. A study focusing on diarrheal parasites of hospitalized patients of all ages in Kolkata, India (A. K. Mukherjee et al., 2009) showed that mixed infections were the norm. 101/147 patients infected with Giardia and 66/84 patients infected with Cryptosporidium were coinfected with at least 1 other pathogen. Vibrio cholerae was most commonly (25% of coinfections) associated with Giardia, and rotavirus and other parasites (33% and 30% respectively of coinfections) were most commonly associated with Cryptosporidium. E. coli was also identified in 13% of Giardia and Cryptosporidium cases.

2.4. Asymptomatic infections

Particularly in developing countries, asymptomatic infection with various diarrheal pathogens is common (Wennerås & Erling, 2004).
Asymptomatic infections can be described by morbidity ratios, i.e., the proportion of infections that yield illness. Morbidity ratios can be estimated by stool surveys, where a random sample of a community is chosen and one or more stools from each person are collected and analyzed, regardless of whether the person has diarrhea. Morbidity ratios vary greatly by pathogen and setting; morbidity ratios tend to be lower in developing countries than in developed countries due to more frequent exposure, since immunity is often developed to disease rather than infection (R H Gilman et al., 1988; Cravioto et al., 1990; Valentiner-Branth et al., 2003; A. H. Havelaar et al., 2009).

Because multiple infections and asymptomatic infections with diarrheal pathogens are common, it is often difficult or impossible to reliably attribute an episode of diarrhea to a particular pathogen, even if detailed laboratory results are available.

2.5. Burden of diarrheal disease

Diarrheal disease is common throughout the world. It is usually mild and self-limiting in healthy, well-nourished people in clean environments, but is an exceptionally serious problem among children in developing countries (World Health Organization, 2008).

2.5.1. Morbidity

In the late 1980s to mid-1990s, it was estimated that children under 5 years of age in developing countries suffered from a median of 3.2 diarrheal episodes per child-year of life (Kosek et al., 2003). Children aged 6-11 months were most at risk, with a median of 4.8 episodes per child-year. Diarrheal risk may vary drastically depending on location and the degree of (under)development of the community. Age-specific incidences of diarrhea in developing countries have not decreased from 1955 to 2000, according to three major reviews that were conducted sequentially during that time period (Jamison et al., 2006).
Gamma distributions can satisfactorily describe the number of episodes per child and the duration of episodes (Schmidt & Cairncross, 2009); the gamma distribution has a flexible shape (which can resemble a mound with a long right tail, or a monotonic decrease), and is defined by two parameters.

2.5.2. Mortality

The approximate mortality rate due to diarrhea in children under 5 years of age in developing countries is 4.9 per 1000 per year (Kosek et al., 2003), for a total of about 2 million deaths per year (World Health Organization, 2008; Boschi-Pinto et al., 2008). In contrast to morbidity, mortality due to diarrhea has substantially decreased since the 1950s-1970s, when the rate was about 13.6 per 1000 per year (Kosek et al., 2003). This trend appears to be continuing, at least in part due to promotion of oral rehydration solution and other good practices for care of children with diarrhea (Jamison et al., 2006). However, the relatively constant levels of incidence argue that the root problem of diarrheal disease transmission remains to be addressed, and diarrheal illness is still a major killer of children. In children under 5 years of age, 17% of deaths are due to diarrhea; for comparison, 17% of deaths are due to pneumonia, and malaria accounts for 7% of deaths (World Health Organization, 2008). About 35% of mortality in children under five years of age due to diarrhea is thought to be from acute diarrhea, 45% from persistent diarrhea, and 20% from dysentery (R E Black, 1993).

Huge disparities are seen when diarrhea mortality is separated by WHO region (Figure 2.1). The situation was worst in Africa by far (World Health Organization, n.d.).

Figure 2.1. Diarrhea mortality by WHO region. Chart depicts publicly available data (World Health Organization, 2012).

2.5.3. Disability-adjusted life years (DALYs) attributable to diarrhea

Diarrhea morbidity and mortality can also be measured using DALYs per person lost due to diarrhea (Figure 2.2).
DALYs describe years of healthy life that are lost to disease, and they are used to compare morbidity between widely varying diseases (Murray & A. D. Lopez, 1996). Africa loses the most DALYs per person to diarrhea, and this increased from 2000 to 2004 (WHO DALY estimates for 2008 were not yet available at this writing). Since diarrheal disease is often brief and self-limiting, mortality is responsible for nearly all of the DALYs; even though death is unlikely for a given diarrheal episode, it contributes many DALYs when it occurs (because fatalities nearly always occur in children, resulting in the loss of many future years of life). It has been argued that DALYs for diarrhea should consider long-term sequelae of chronic gastrointestinal infection (e.g., reduced intelligence and reduced physical fitness); this increases the number of DALYs lost to diarrhea by two to six times, depending on the assumptions used (Guerrant et al., 2002). However, such sequelae are not included in World Health Organization DALY calculations.

Figure 2.2. Diarrhea DALYs by WHO region. Chart depicts publicly available data (World Health Organization, 2012).

2.6. Changes in diarrheal morbidity over time in individuals

Risk for diarrheal morbidity approximately doubles in the 6th-11th month of age, compared with the first 6 months of life. This is largely because weaning is a critical time for breast-fed infants, increasing exposure to pathogens; when weaning commences, diarrheal infections and risk of death increase markedly (Motarjemi et al., 1993). The risk drops off sharply in one-year-olds and declines gradually thereafter (C Bern et al., 1992). Diarrheal mortality is highest in the first year of life and drops by roughly a factor of 4 among ages 1-4 years (C Bern et al., 1992).

Although immunity to certain diarrheal pathogens can be acquired (e.g., effective vaccines against rotavirus and poliovirus exist), its importance is somewhat unclear.
Diarrheal pathogens are very diverse, and immunity from one pathogen serotype does not necessarily confer immunity to other serotypes of the same pathogen. Furthermore, immunity to diarrheal disease does not necessarily confer immunity to infection (A. H. Havelaar et al., 2009). Infected people without diarrhea can still shed pathogens and may still suffer important health effects; for example, children with asymptomatic cryptosporidiosis grow more slowly than uninfected children (Checkley et al., 1997). Finally, although diarrheal risk decreases as children age, they are acquiring immunity at the same time that they are learning to behave more hygienically, and it is unclear whether improved hygiene or increased immunity is more important for preventing diarrhea.

Evidence for an approximately twofold increased risk of a diarrheal episode following recovery from a previous episode has been found in some datasets; this risk gradually declines over several weeks (Schmidt et al., 2009). A similar effect with risk declining more than twofold over about 13 weeks was seen in an observational study in urban Brazil, and remained even after adjusting for the age at the time of an episode and age upon enrollment in the study (Genser et al., 2006). This increased risk was only related to the timing of the prior episode, and not to the number of previous episodes, indicating that relapsing or intermittent symptoms from the same infection might be the cause (Genser et al., 2006). This general phenomenon has also been reported from rural Zaire (Tonglet et al., 1999).

2.7. Diarrhea-malnutrition vicious cycle

The evidence on how diarrhea and malnutrition compound each other has been recently reviewed (Guerrant et al., 2008). Malnourished children tend to suffer more and longer bouts of diarrheal illness, approximately doubling the amount of time spent ill.
This is partially explained by damage to the absorptive capacity of the small intestine by enteric illness, as well as by malnutrition itself, which further limits the resources available to repair that damage. Even asymptomatic gastrointestinal infections appear to compound this feedback loop. However, the pathophysiology of these relationships is not well understood. Increased burden of enteric infection in the first 2 years of life is associated with decreased work productivity and earning capacity, and with cognitive impairment, in later childhood and adulthood (Guerrant et al., 2008).

Exclusive breastfeeding in children aged < 6 months protects them from malnutrition if they develop diarrhea, but exclusive breastfeeding cannot be sustained past approximately 6 months of age (Motarjemi et al., 1993). Any breastfeeding is likely to mitigate the effect of diarrhea on malnutrition; however, under some circumstances, partially weaned children can have more diarrhea or slower growth than completely weaned children, which could be due to energy, protein, or micronutrient deficit from overreliance on breast milk (Tonglet et al., 1999; McDade & Worthman, 1998).

It is unclear how diarrhea and malnutrition are linked quantitatively. Both of these syndromes are broad aggregations of many factors. Although many studies have linked anthropometric measures of malnutrition to increased diarrheal risk, such associations can disappear after adjusting for age, sex, time of enrollment (seasonal factors), and the caretaker’s assessment of the child’s growth (Tonglet et al., 1999).

It may be possible to clarify the action of the diarrhea-malnutrition vicious cycle by calibrating an infection transmission model to data describing changes in diarrhea incidence among children who frequently suffer from diarrhea.
Children who have suffered a recent bout of diarrhea are more likely to suffer subsequent bouts (Schmidt et al., 2009), and children who are stunted or wasted due to repeated episodes of diarrhea are also more likely to suffer further from diarrhea (Guerrant et al., 1992). Increased susceptibility to new diarrheal infections after resolution of a previous infection might be modeled by an increase in the probability that a single pathogen causes disease, which is the interpretation of the k parameter in the exponential dose response equation (see Equation 3.2, page 98) (Haas et al., 1999). This increase could be estimated by calibrating a diarrhea transmission model to field data. The duration of time during which the probability of infection is increased would also need to be specified; it would probably be greater than the duration of disease, and might occur even for asymptomatic infection.

2.8. Cyclic or recurring changes ('seasonality') in diarrhea incidence

The word 'seasonality' is often used to describe regular changes in disease incidence over time, often on an annual scale. However, seasons themselves are often distal influences on a more proximal factor such as temperature or humidity that better predicts the ‘seasonal’ effect (Fisman, 2007). ‘Seasonal’ changes in disease can be tenuously related (or unrelated) to seasons or climate, such as children coming together at the beginning of a school term leading to increased measles transmission (Grassly & Fraser, 2006). The obvious nature of seasons might lead to overattribution of disease fluctuations to them, or failure to measure other potentially more informative factors (Fisman, 2007). Therefore, seasonality should be considered as regular changes in disease incidence or prevalence, rather than changes related to particular seasons of the year. Seasonality of diarrhea is variable and depends on local climate as well as the organism.
For example, rotaviral disease tends to increase in cooler, drier weather (K. Levy, A. E. Hubbard & J. N. S. Eisenberg, 2009), while enterotoxigenic E. coli (ETEC) tends to peak at warm, wet times (Estrada-Garcia et al., 2009). Seasonal effects can differ depending on the community; for example, a community whose preferred water source dries up during the dry season might experience increased diarrhea at that time due to use of poorer quality source water. In other areas, the onset of the rainy season might increase diarrheal disease for various reasons, such as rain washing pathogens from land into source water, or relatively poor nutritional status because food stores from the previous harvest have been depleted (S. Sutra et al., 1990).

Apparent ‘seasonal’ (or simply oscillatory) cycles can arise in infectious disease models that include acquired immunity. This can arise from accumulation of susceptible individuals as they are born, followed by an increase in disease mediated by an increased proportion of susceptibles, with a subsequent decline in illness until sufficient susceptible individuals are born (Grassly & Fraser, 2006). Combining this behavior with seasonal factors can lead to complex dynamics, including chaos (Grassly & Fraser, 2006).

2.9. Crowding (urban vs. rural)

Overcrowding is associated with increased risk of diarrhea, particularly when coupled with poor sanitation. This phenomenon has been widely reported during military campaigns and in refugee camps (Lim & Wallace, 2004). Crowding may occur within the household (many people in a small dwelling) or in the community as a whole (periurban slums). Crowding within households is easily measured (number of people sleeping in a room or occupying a hut) and is associated with diarrhea (Chacín-Bonilla et al., 2008); however, the effect of this type of crowding on diarrhea is likely to be confounded with many other risk factors, such as low socioeconomic status and malnutrition.
The precise effect on diarrheal risk by crowding at the community level is more difficult to measure, and also difficult to disentangle from other determinants of diarrheal risk, such as sanitation and malnutrition. Studies comparing diarrhea in districts that are more or less densely populated but have otherwise similar populations do not appear to be available in the published literature.

2.10. Bias in epidemiological studies of diarrhea

Many epidemiological studies concerning diarrheal disease have been published. A recent systematic review (T. Clasen, I. G. Roberts, et al., 2009) of water quality intervention studies found 68 studies. Unfortunately, methodological problems are frequent in such studies (V. A. Curtis & Cairncross, 2003; T. Clasen, I. G. Roberts, et al., 2009). This is partly because of the challenging nature of research in developing countries: cultural differences, language barriers, and infrastructure limitations complicate work in these communities. In field trials of interventions to prevent diarrhea, bias is a serious problem that is difficult to quantify. There are two broad types of bias that are particularly important in field trials: selection bias and information bias. Selection bias occurs when people who are selected to participate in a study are not representative of the larger population that the study is meant to describe (Last, 1995). For example, investigators might choose communities for an intervention trial that they think would be particularly likely to comply with the intervention; this would increase the apparent effectiveness of the intervention. Selection bias hinders generalization of results to larger populations. In contrast to selection bias, which generally occurs at the beginning of a study and is unrelated to the experimental groups within a study, information bias refers to informational error within the experimental groups. There are several types of information bias that are relevant to field trials.
Probably the least serious is nondifferential misclassification bias, where some participants are in the wrong experimental group. If misclassification is similar in all experimental groups (hence nondifferential), then the effect size cannot be exaggerated and a conservative estimate of the effectiveness of an intervention would be obtained (Rothman, 1986). However, there are many other types of information bias that are more serious because they yield different effect measurements depending on the experimental group. For example, courtesy bias may arise when participants tell the experimenters what they think they want to hear; therefore, users of a HWT device might report less diarrhea, even if the device is completely ineffective. This might be deliberate or unconscious behavior by the participant. Recall bias could occur if participants in different groups differ in how they remember (and report) diarrheal episodes; for example, presence of a HWT device in the home might prompt frequent consideration of clean water and diarrhea, and therefore diarrhea might be more completely recalled. Hawthorne effects, in which participant behavior is influenced by the knowledge that they are being observed (McCarney et al., 2007), are another type of information bias. This could act similarly to selection bias by boosting compliance with interventions, but it might also have different effects by experimental group. For example, recipients of an intervention might receive more follow-up during the study, which could alter their participation in the study or their perception of diarrheal illness. Interviewer bias (Last, 1995) can also arise when investigators know which experimental group they are studying; for example, interviewers might unconsciously interview control or intervention participants differently because of their expectations about the effectiveness of an intervention.
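To make the point about nondifferential misclassification concrete, the following is a minimal Python sketch rather than a method taken from the cited sources. It assumes a two-group trial in which the same fraction of each group effectively belongs to the other group; the function name, the 20% baseline risk, and the true relative risk of 0.5 are hypothetical values chosen only for illustration.

```python
def observed_relative_risk(risk_control, true_rr, misclassified):
    """Relative risk measured when the same fraction of each group is misclassified.

    All inputs are hypothetical illustration values, not data from any trial.
    """
    risk_treated = risk_control * true_rr
    # Each recorded group is a mixture of correctly and wrongly classified people.
    obs_treated = (1 - misclassified) * risk_treated + misclassified * risk_control
    obs_control = (1 - misclassified) * risk_control + misclassified * risk_treated
    return obs_treated / obs_control


for m in (0.0, 0.1, 0.3):
    rr = observed_relative_risk(risk_control=0.2, true_rr=0.5, misclassified=m)
    print(f"misclassified fraction {m:.1f}: observed relative risk {rr:.2f}")
```

Under these assumptions the observed relative risk moves from the true value toward 1 as the misclassified fraction grows, so the intervention appears less effective than it is, which is why this type of bias is described as conservative.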
Two important strategies to reduce bias are: 1) random assignment to study groups; and 2) blinding of the study group assignment to the participants (single-blind) or, preferably, to participants and investigators alike (double-blind) (Hennekens & Buring, 1987). However, these strategies are often problematic in field trials because the intervention is visually obvious and cannot be concealed. A meta-analysis of a wide variety of intervention studies indicates that lack of blinding can exaggerate a protective effect by about 30% (L. Wood et al., 2008). Therefore, an unblinded study of an intervention that in fact has no effect would be expected to show a relative risk of about 0.7, which is similar to the effect size reported by many household water treatment studies (Hunter, 2009).

2.11. Relative contributions of pathogens to diarrheal etiology

Although dozens of different pathogens can cause childhood diarrhea in developing countries, pathogenic E. coli, rotavirus, norovirus, Shigella, Campylobacter jejuni, Giardia, and Cryptosporidium are particularly important. They contribute differently to acute and persistent diarrhea. Enterotoxigenic E. coli (ETEC), rotaviruses, and noroviruses are particularly important in acute diarrhea, but enteroaggregative E. coli (EAEC), Cryptosporidium, and Giardia seem particularly important in persistent diarrhea (Guerrant et al., 2008). Estimates of the proportion of disease caused by particular pathogens are difficult to obtain. However, a review (Lanata & W. Mendoza, 2002) of 266 studies from 1990-2002 indicates that pathogenic Escherichia coli, Giardia, and rotavirus are probably the most common pathogens in community settings worldwide (Figure 2.3). Rotavirus is remarkable for the large numbers of inpatient and outpatient visits attributed to it; in contrast, Giardia accounts for few outpatient and inpatient cases compared to community cases.
Coinfections and diarrheal illnesses of unknown etiology accounted for a large proportion (25-35%) of all cases.

Figure 2.3. Estimated etiology of childhood diarrhea worldwide. Adapted from Lanata and Mendoza (2002), graphs 1-3.

Examination of four WHO subregions (AfroD, AfroE, AmroB, and SearoD) which had five or more community studies covering a wide array of pathogens showed reasonably consistent proportions of disease attributable to bacteria, protozoa, and viruses. Considering cases where an etiology was identified, the proportion of cases where bacteria are isolated is approximately 60% in those four regions (Lanata & W. Mendoza, 2002). Protozoa accounted for 16-34%, and rotavirus accounted for 9-22%. However, since only rotavirus was included, the contribution of viruses is probably underestimated. Also, it was common for no pathogen to be isolated (30-55% of cases in those four regions). There are limitations to the Lanata and Mendoza (2002) review; in particular, the studies examined might not be representative, since investigators may have chosen to work in regions with known problems or on their particular pathogen(s) of interest. Pathogens that are more difficult to detect might be underrepresented. A more recent review (Abba et al., 2009) examined the etiology of persistent diarrhea (lasting >14 days) in young children (age ranges varied, but all were under six years), using four studies from Bangladesh, six from India, five in Central & South America, two in Zambia, one in Thailand, and one in Viet Nam. Overall, pathogenic E. coli was found in 31%-41% of children with persistent diarrhea, and in 22%-30% of children without diarrhea. Rotavirus, enteric adenovirus, Campylobacter, Salmonella, Vibrio cholerae, enterohemorrhagic E. coli (EHEC), Giardia, Cryptosporidium, and Entamoeba were all found, but at 10% or less. Much less information is available concerning the etiology of diarrhea in people older than five years.
Although a recent review (Fischer Walker et al., 2010) found 22 studies, 15 were inpatient and 7 were outpatient. Of the outpatient studies, only two examined more than one pathogen. Only one study was community-based, and it examined ETEC only. Information about asymptomatic infection in older children and adults is similarly lacking, and information about coinfection was usually absent (Fischer Walker et al., 2010). However, the outpatient studies indicated similar proportions of diarrhea attributable to bacteria, protozoa, and viruses as the community studies described in Lanata and Mendoza (2002): 60%, 21%, and 19%, respectively (Fischer Walker et al., 2010).

2.12. Survey of diarrheal pathogens

Common pathogens causing endemic diarrhea are discussed below, with particular reference to developing countries. Rotavirus, diarrheagenic E. coli, and Giardia are given special attention because they are used as model organisms in the simulation models in chapters 3, 4, and 5 of this dissertation. Numerous parameters are used to describe these organisms' behavior in the models; they are briefly mentioned here, but are described in detail in the appendix.

2.12.1. Viral diarrheal pathogens

A wide variety of enteric viruses are known to cause diarrhea. In general, they are small (25-100 nm) nonenveloped RNA viruses that are environmentally stable, although they are usually vulnerable to chlorine. The combination of flocculation/sedimentation and chlorination in standard water treatment usually reduces enteric viruses by four log10 or more. Due to their small size, they cannot be removed by most filtration methods. Rotavirus and norovirus (formerly known as Norwalk virus) appear to be most important with regard to diarrhea.
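As a brief illustration of the four log10 reduction cited above for standard water treatment, the following Python sketch converts a log10 reduction into the fraction of virus remaining. The function name and the starting concentration are hypothetical; only the four log10 figure comes from the text.

```python
def fraction_remaining(log10_reduction):
    """Fraction of organisms left after a treatment with the given log10 reduction."""
    return 10.0 ** (-log10_reduction)


# Hypothetical source water concentration (viruses per liter), for illustration only.
initial_per_liter = 1000.0
remaining = initial_per_liter * fraction_remaining(4)
print(f"fraction remaining: {fraction_remaining(4):.4%}")
print(f"viruses per liter after treatment: {remaining:.2f}")
```

A four log10 reduction therefore corresponds to removal or inactivation of 99.99% of the virus present, leaving one in ten thousand.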
Among hospital-based studies of children in 11 different developing countries, rotavirus prevalence was 34.9% on average, and calicivirus (the family that includes norovirus) prevalence was 10.3% on average, while adenovirus accounted for 6.3% and astrovirus for 3.5% (Ramani & Gagandeep Kang, 2009). It is unclear whether these proportions also apply to disease in the community setting. Both rotavirus and norovirus tend to cause more severe disease than other enteric pathogens (Ramani & Gagandeep Kang, 2009); both are highly potent with extremely high shedding, and they are responsible for high proportions of severe childhood diarrheal illness in developing and developed countries, with 12% due to norovirus (Patel et al., 2008) and 33% due to rotavirus (Cook et al., 1990). Norovirus also accounted for 12% of mild to moderate diarrheal illness in all ages (Patel et al., 2008).

Rotavirus

Transmission and ecology

Rotavirus is primarily transmitted by the fecal-oral route but may also be transmitted person-to-person, and in some circumstances it might be inhaled (Heymann, 2004). Although improved sanitation and hygiene greatly inhibit transmission of many other diarrheal pathogens, rotavirus generally infected all children before four years of age in both developing and developed countries before rotavirus vaccine was available (Cook et al., 1990). Although there are several different serotypes of rotavirus, allowing repeated infection, risk of further episodes declines with each additional episode, and the severity of subsequent episodes also declines (Velázquez et al., 1996). Rotavirus disease is strongly seasonal in temperate regions, peaking in the winter in the Americas and in spring or fall in other areas, but is seen year-round with less seasonal change in tropical regions (Cook et al., 1990). Even within tropical areas, however, rotavirus incidence tends to be higher under cooler and drier conditions, although there is much heterogeneity (K. Levy, A. E.
Hubbard & J. N. S. Eisenberg, 2009). It is not clear why this is, but drier conditions might facilitate airborne suspension of droplet nuclei and promote inhalation of rotavirus, and differing sanitary conditions and episodic local events (such as floods) likely modify any effect of humidity or temperature (K. Levy, A. E. Hubbard & J. N. S. Eisenberg, 2009).

Dose response

Rotavirus is extremely potent, with an ID50 of approximately 6 focus-forming units (FFU) (Anon, 2012) as measured in a human feeding study (Ward et al., 1986).

Symptomology

The incubation period for rotavirus infection is short, 1-3 days (Blaser et al., 2002). Rotavirus disease consists of fever, vomiting, and watery diarrhea; fever and vomiting usually last 2-3 days, while diarrhea may continue for 8 days. However, rotavirus infection is frequently asymptomatic; approximately 60% of young rural children who were shedding rotavirus in Guinea-Bissau, Mexico, and Argentina did not have diarrhea (Cravioto et al., 1990; Vergara et al., 1996; Fischer et al., 2002).

Burden of disease

Rotavirus is responsible for about 6% of diarrhea cases in children aged less than five years in the developing world and 20% of the childhood diarrhea deaths according to a review (de Zoysa & Feachem, 1985) of 7 studies from the early 1980s. A subsequent review (Parashar et al., 2003) of studies published from 1986 to 2000 yields a similar estimate of rotavirus involvement in 8% (IQR 4 to 12) of in-home cases for children under five years of age, and approximately 20% for outpatient and inpatient cases. Approximately 20% of diarrhea deaths in low-income countries were due to rotavirus, similar to the earlier estimate. This corresponded to a 1/205 risk of dying from rotavirus by age 5 years, compared to a 1/49,000 risk in high-income countries (Parashar et al., 2003). The proportion of severe diarrheal disease attributed to rotavirus has increased over time (Harry B Greenberg & Mary K Estes, 2009).
This may be because of reductions in diarrhea caused by other pathogens; high shedding and high potency mean that rotavirus is particularly difficult to control (Harry B Greenberg & Mary K Estes, 2009).

Persistence in the environment

Rotavirus resists inactivation in the environment. On aluminum or ceramic at 4°C or 20°C in high or moderate relative humidity, approximately 90-99% is inactivated within the first week, but further inactivation is much slower; even at 60 days approximately 1% of the original amount of virus remained (Abad et al., 1994).

Control

Rotavirus disease is very difficult to control due to high levels of shedding and its extremely low ID50 of approximately 6 focus-forming units (Anon, 2012). Two live-virus vaccines have been developed, RotaTeq and Rotarix, which are both highly effective: 74% against diarrhea and 100% against severe diarrhea for RotaTeq, and 95% against severe diarrhea for Rotarix (Harry B Greenberg & Mary K Estes, 2009). It is unclear whether these vaccines will be as effective in severely underdeveloped environments, although trials are underway (Harry B Greenberg & Mary K Estes, 2009). Before rotavirus vaccine was available, rotavirus generally infected all children before the age of four years in both developing and developed countries (Cook et al., 1990). Correct use of oral rehydration solution (ORS) in diseased children is very effective in preventing rotavirus mortality (Blaser et al., 2002).

Norovirus

Transmission and ecology

Norovirus is highly potent and is shed in large quantities by infected individuals, and it can easily be transmitted by person-to-person contact as well as by contaminated food or fomites (Mattison, 2011). A substantial fraction of the population lacks a receptor necessary for infection with Norwalk virus (the first norovirus discovered); these 'secretor-negative' (Se-) individuals appear innately immune (Lindesmith et al., 2003).
In a study population of 77 individuals (49% male, 71% white, 23% black) 28.6% were Se- (Lindesmith et al., 2003). It is unclear how much Se+/- status varies among human populations. Even if an individual is Se+, they may still be resistant due to acquired immunity, if they have been exposed to norovirus previously. However, noroviruses are diverse, and important characteristics of immunity (duration and cross-protection among strains) are poorly understood (Lindesmith et al., 2003); furthermore, acquired immunity following norovirus infection only lasts a few months, after which a previously infected individual can be reinfected by the same serotype (Carter, 2005). Peak shedding of norovirus occurred during (31%) or after (69%) disease in 16 experimentally infected volunteers, at approximately 7×10¹⁰ genome copies / mL of stool (Atmar et al., 2008). Virus was detectable for 1 to 9 weeks in stool; however, peak levels only lasted for one to three days (Atmar et al., 2008). The duration of illness and the incubation period both averaged two days (Atmar et al., 2008).

Dose response

Like rotavirus, norovirus appears highly potent; however, dose response relationships are inconclusive. Although some dose response models have been published (Peter F M Teunis et al., 2008), much of the data used came from feeding studies using viral stocks that had been stored for long periods and were highly aggregated. The extent to which noroviruses are naturally aggregated in the environment is unclear.

Symptomology

Norovirus disease is unpleasant, but relatively mild and seldom dangerous (Blaser et al., 2002). It has an incubation period of about two days, and lasts about two days (Atmar et al., 2008). Vomiting or diarrhea may begin suddenly; however, the extent of these symptoms varies greatly among diseased persons (Blaser et al., 2002).
Burden of disease

Norovirus had been primarily considered a cause of epidemic gastroenteritis, but evidence is mounting that it is an important cause of sporadic gastroenteritis as well. The annual incidence of norovirus disease in children less than 5 years of age in developing countries is approximately 197 inpatient cases per 100,000 (Patel et al., 2008). This figure was 118 in industrialized countries; by contrast, the incidence of outpatient cases was estimated to be 1665 per 100,000 (Patel et al., 2008). It is unclear what the outpatient or community-level incidence would be in developing countries, but it is likely to be substantially larger than the inpatient incidence.

Persistence in the environment

Norovirus is very stable in the environment, although disinfection studies commonly use surrogate caliciviruses because it is not currently possible to grow human norovirus in cell culture (Mattison, 2011).

Control

Good hygiene and disinfection practices are essential. No vaccine is available.

2.12.2. Bacterial diarrheal pathogens

Many different bacteria can cause diarrheal illness, including numerous species and strains from the genera Salmonella, Vibrio (notably Vibrio cholerae, the cause of cholera), Clostridium, Streptococcus, Yersinia, and Bacillus. However, pathogenic Escherichia coli and Campylobacter appear to be responsible for particularly large shares of bacterial diarrhea (Figure 2.3, page 20), and these are discussed in detail below.

Pathogenic Escherichia coli

Important pathotypes of E. coli

Escherichia coli usually resides in the mammalian large intestine, benefiting itself as well as the host. However, there are several well-established pathotypes of diarrheagenic E. coli (James B Kaper et al., 2004; J P Nataro & J B Kaper, 1998). With respect to diarrhea, the most important are enteropathogenic E. coli (EPEC), enterotoxigenic E. coli (ETEC), and enteroinvasive E. coli (EIEC), all of which produce watery diarrhea. Although enterohemorrhagic E.
coli (EHEC) is much more potent than EPEC, ETEC, or EIEC and can cause severe or lethal disease, EHEC is relatively uncommon, and is not a major cause of diarrhea. Shigella species are highly potent, like EHEC, but they are less dangerous, although they are an important cause of dysentery; Shigella are very closely related to E. coli, and are probably technically conspecifics (James B Kaper et al., 2004). Although EIEC is the E. coli pathotype most closely related to Shigella, EIEC is much less potent (Blaser et al., 2002; Anon, 2012).

Transmission and ecology

Shedding of ETEC may vary by strain. For E. coli H10407, the geometric mean number of bacteria per gram of feces was 4.03×10⁷, and it did not differ greatly whether subjects were symptomatic or not (Levine et al., 1980). For E. coli 214-4, the geometric mean was 5.23×10⁸, over tenfold higher (Levine et al., 1980). ETEC can often be detected in apparently healthy people. In developing countries among healthy 0-11 month olds, and 1-4 year olds, 11.7% and 7.1%, respectively, are estimated to be colonized with ETEC (Wennerås & Erling, 2004). ETEC disease tends to be more prevalent in warm, wet weather, which aids its multiplication in the environment (Qadri et al., 2005; J P Nataro & J B Kaper, 1998). In Bangladesh, its seasonality is similar in shape and magnitude to that of cholera, with one peak at the beginning of the hot season (spring) and another in autumn just after the monsoons, when more fecal material is entering surface water (Qadri et al., 2005). A household-level case-control study (R E Black et al., 1981) of ETEC was conducted in the rural Matlab area of Bangladesh. Extensive culturing of water, food, domestic animals, and family members was carried out. Compared with households testing negative for ETEC, risk of infection was only slightly higher (10.0% vs. 8.3%) in households where a water source or a domestic animal was positive for ETEC.
In contrast, risk of infection increased 3.5-fold, to 29.0%, among households for which ETEC was found in stored water or cooked food, highlighting the importance of exposure within the household. Within households having an infected member, the proportion of household contacts becoming infected, and the proportion of those infected contacts actually acquiring disease, declined with age. This is consistent with development of immunity, since contamination of food and water within the household indicates that all members are likely to be similarly exposed.

Dose response

Pathogenic E. coli is much less potent than the other pathogens described in this chapter. Many studies with human volunteers have used disease as the outcome, and their ID50s range from 2×10⁵ to 1×10⁸. The only available experiment with infection as the outcome used EIEC (H L DuPont et al., 1971) and had an ID50 of 2×10⁶ CFU (Anon, 2012). Feeding studies of ETEC or EPEC in healthy volunteers typically give 2-3g of NaHCO3, which neutralizes stomach acid and reduces the infectious dose (Levine et al., 1977). However, it has been suggested that food as a vehicle would have a similar acid-neutralizing effect (Levine et al., 1977). ETEC and EPEC do not appear to be transmitted person-to-person, partly because of their low potency; a study of ETEC-infected volunteers co-housed with uninfected volunteers did not result in any transmission of infection (Levine et al., 1980). Food was all served individually to the volunteers over the course of the experiment, so there was no opportunity for ETEC to spread via that route (Levine et al., 1980).

Symptomology

ETEC disease generally consists of watery diarrhea without fever, although its severity can vary widely, depending partly on the set of toxins that the strain is carrying (Blaser et al., 2002).
The incubation period lasts from one to two days (Blaser et al., 2002), and ETEC diarrhea in 11 experimentally infected adult volunteers lasted 82.1 (SD 50.7) hours, with 12.0 (SD 9.29) stools over the course of the illness (R E Black et al., 1982). These were compared with other volunteers who were treated with antibiotics; immediate treatment cut diarrheal duration, stool volume, and hours ill by half. EPEC disease can be more severe than ETEC, and there are many documented examples of lethal epidemics in children (Blaser et al., 2002). However, experimental infections in adult volunteers showed a relatively mild syndrome of less than two days of watery diarrhea after an incubation period of less than 1 day (Blaser et al., 2002).

Burden of disease

In a recent review (Abba et al., 2009), pathogenic E. coli was by far the most common pathogen isolated from children with persistent diarrhea in various developing countries; it was detected in 31% to 41% of children, compared to <10% for other pathogens. Pathogenic E. coli also predominated similarly among pathogens isolated from asymptomatic cases (Abba et al., 2009). Enterotoxigenic Escherichia coli (ETEC) is the most common type of diarrheagenic E. coli (Qadri et al., 2005). It is probably also the most common cause of childhood diarrhea in the developing world, responsible for approximately 1/7 of diarrheal episodes in children aged less than 1y and almost ¼ of diarrheal episodes in 1-4 year olds (Wennerås & Erling, 2004). It can also cause severe dehydrating cholera-like disease in adults (Qadri et al., 2005). Diagnosis is complicated since many other Gram-negative bacteria produce similar toxins, so both toxins as well as the E. coli bacterium must be tested for in order to yield accurate results (Wennerås & Erling, 2004).

Persistence in the environment

Laboratory studies of E.
coli in unfiltered river water in the dark indicate exponential decay at a rate of 1.15 day⁻¹ (in water taken from above a sewage outfall) to 0.64 day⁻¹ (in water from below a sewage outfall) (Flint, 1987). Filtering or autoclaving the water before adding E. coli enhanced survival (Flint, 1987). A summary of inactivation rates of E. coli published by other workers indicates high variability, spanning over 2 orders of magnitude (0.009 to 2 day⁻¹) at 15-20°C (Pond et al., 2004).

Control

In addition to sanitation and hygiene, proper food preparation and handling (particularly of weaning foods) is important (Motarjemi et al., 1993). ETEC vaccines are being actively researched (Harro et al., 2011).

Campylobacter species

Transmission and ecology

In humans, Campylobacter jejuni and, less commonly, Campylobacter coli can cause gastroenteritides (A. H. Havelaar et al., 2009). Although Campylobacter can be spread by contaminated food and water, campylobacteriosis is mainly a zoonosis, being primarily associated with the gastrointestinal tract of birds, especially poultry (Dechesne et al., 2006). Campylobacter does not grow in water and (like E. coli) is an indicator of post-treatment contamination in water distribution systems. However, immunity appears to protect against disease rather than infection, and asymptomatic shedding is common (A. H. Havelaar et al., 2009). In a comparison of Mexican children aged less than four years and Swedish patients (ages not given), Swedish patients tended to carry only one Campylobacter serotype, while mixed serotypes were carried by 42% of Mexican children tested (Sjögren et al., 1989). Campylobacter epidemiology varies greatly between the developed and underdeveloped world, probably due to development of immunity early in life. Illness is rare after about five years of age in developing countries, but occurs among adults in industrialized countries, probably because they avoided exposure (and therefore immunity) in childhood (A.
H. Havelaar et al., 2009).

Dose response

In human volunteers, C. jejuni has an ID50 of approximately 900 CFU (R E Black et al., 1988; Anon, 2012). Neutralization of stomach acid by food or sodium bicarbonate could increase its potency (Miliotis & Bier, 2003).

Symptomology

Campylobacter can cause acute self-limited watery diarrhea lasting 2-6 days in healthy humans with an incubation period of about three days (range of 1-7 days), often with fever and nausea, but seldom with vomiting (Miliotis & Bier, 2003). In developing countries, campylobacteriosis is typically a disease of very young children; after about age two years, immunity has been acquired (A. H. Havelaar et al., 2009).

Burden of disease

Campylobacter is a substantial contributor to childhood diarrhea in developing countries, although it might not generally contribute as much morbidity as diarrheagenic E. coli (Blaser et al., 2002). Illness is rare after early childhood, due to development of immunity from early infections.

Persistence in the environment

Campylobacter survival increases with decreasing temperature, and it may survive for weeks in water samples in the laboratory at 4-10°C (Buswell et al., 1998). It is vulnerable to acid (although it can pass through the stomach if protected by food), but it can tolerate freezing (1-2 log10 reduction on freezing and thawing) and salting (about three weeks in 6.5% NaCl at 4°C) (Miliotis & Bier, 2003).

Control

Limiting contact with bird feces and proper preparation of meat and eggs are important. It is somewhat unclear how best to manage household poultry to limit Campylobacter infection in developing countries. A randomized trial (Oberhelman et al., 2006) of chicken corralling in a Peruvian periurban community showed similar incidences (about three infections per person-year) of asymptomatic infections in children less than six years of age between households with corrals and households without.
However, it also found 1/3 higher incidence of diarrhea in general and twofold higher incidence of symptomatic Campylobacter infection in households with corrals. 2.12.3. Protozoan diarrheal pathogens Protozoa are more difficult to culture than bacteria. However, they are larger, and are generally detected and counted under the microscope once they have been concentrated and stained with immunofluorescent dyes. They are more easily filtered from water due to their size, but tend to be more resistant to chlorine and are more potent than many bacterial pathogens. Cryptosporidium is used as a de facto standard for evaluating water filtration since it is smaller (5 microns) and more environmentally resistant than Giardia; if Cryptosporidium is undetectable, Giardia should be as well (American Water Works Association, 1999). Cryptosporidium parvum and Cryptosporidium hominis Transmission and ecology Cryptosporidium belongs to the phylum Apicomplexa and reproduces both sexually and asexually in the intestinal tract, where it is an obligate intracellular parasite. It has a broad host range, occurring in most (perhaps all) vertebrates, but it does not cause diarrhea in adult dogs, cats, or horses, perhaps meaning that these animals are low risk for transmission to humans. Its oocysts are the infective stage, which are transmitted by the fecal-oral route. Infections in ruminants seem to be the biggest source of environmental contamination (Miliotis & Bier, 2003). 33 Cryptosporidium hominis appears to only infect humans; it was recently shown to be a separate species from Cryptosporidium parvum, which primarily infects cattle but can also infect humans (Hunter & R. C. A. Thompson, 2005). Many other species of Cryptosporidium exist, but do not appear to infect humans (Hunter & R. C. A. Thompson, 2005). 
Cryptosporidium is an intracellular parasite within epithelial cells in the small intestine, which shields it from the host immune response and limits the effects of chemotherapies (Miliotis & Bier, 2003). Dose response Several dose response datasets are available for C. parvum infection in humans. They provide ID50s ranging from 12 to 455 oocysts (median 165) (Anon, 2012). The two largest datasets (8 doses each) fit the exponential dose response equation, with ID50s of 165 and 132 oocysts (H L DuPont et al., 1995; Messner et al., 2001; Anon, 2012). Only one feeding experiment has been done with C. hominis in humans (Cynthia L Chappell et al., 2006); the response measured was disease (rather than infection), and the ID50 was 17 oocysts (Anon, 2012). Therefore C. hominis may be roughly 10 times more potent than C. parvum in humans. Symptomology Cryptosporidiosis had a mean incubation period of 5 days and lasted for a mean of 6 days in 13 volunteers experimentally infected with C. hominis (Cynthia L Chappell et al., 2006). It is generally short and self-limited, causing watery diarrhea in healthy people, but is particularly dangerous to people with AIDS because there is no effective treatment (Miliotis & Bier, 2003). This can lead to infections lasting months or years, with severe damage to the gut. Infections can often be asymptomatic in apparently healthy children and adults (Blaser et al., 2002). Disease duration can be substantially longer in malnourished developing-country children compared with 34 people in industrialized countries; mean diarrhea durations of 21 and 13 days have been reported for C. hominis and C. parvum, respectively, in children under five years of age in a Brazilian shantytown (Bushen et al., 2007). Burden of disease Although Cryptosporidium is ubiquitous throughout the world (Blaser et al., 2002), cryptosporidiosis prevalence is relatively high (20-27%) (Miliotis & Bier, 2003) in children in certain underdeveloped contexts. 
However, it causes little or no symptomatic disease in older children and adults in developing countries.

Persistence in the environment
Cryptosporidium oocysts are about 5 microns in diameter, and are generally even hardier than the durable cysts of Giardia. Oocysts are inactivated more quickly at warmer temperatures (Erickson & Ortega, 2006). Ultraviolet light is effective, yielding 1 to 5 LRVs depending on the strain and the type/intensity of UV (Erickson & Ortega, 2006). They can survive for months in cold lakes or streams. They are highly resistant to chlorine. They can be inactivated by heat >64.2°C for 2+ minutes, or drying at 18-28°C for >4h.

Control
Effective control of Cryptosporidium is generally achieved in municipal drinking water treatment by filtration yielding nonturbid water (< 1 NTU) (USEPA, 2012). No effective anticryptosporidial medications exist. HWT is particularly important for HIV-infected people to strictly limit exposure to Cryptosporidium in drinking water, even in industrialized countries.

Giardia species

Transmission and ecology
The currently accepted name for the Giardia species that primarily affects humans is Giardia duodenalis, although many papers refer to it as G. lamblia or G. intestinalis. Even within G. duodenalis, certain assemblages are generally restricted to humans, and earlier beliefs that giardiasis is a single zoonosis now appear to be incorrect (Cacciò et al., 2005). Assemblages within G. duodenalis may later prove to be separate species (Cacciò et al., 2005). G. duodenalis is a flagellated protozoan that attaches to the small intestinal wall and absorbs nutrients from the gut lumen. Shedding of cysts varies greatly over time (0 to 2.5×10⁷ cysts/g of feces), even within the same symptomatic individual (Porter, 1916). Because of this, diagnosis is often based on three specimens collected over several days (Miliotis & Bier, 2003).
Reinfection with Giardia following drug treatment can be extremely rapid; 98% of 44 Peruvian children who were treated had reacquired infection within 6 months (R H Gilman et al., 1988). Immunity protects against symptomatic giardiasis rather than against Giardia infection.

Dose response
Giardia is a potent pathogen, with an ID50 of 35 cysts based on a human feeding study; it fits the exponential dose response function (Rendtorff, 1954; Anon, 2012).

Symptomology
Giardiasis often presents with diarrhea and flatulence, with foul-smelling foamy stools (Miliotis & Bier, 2003). In a study of experimentally infected humans (Rendtorff, 1954), it had a mean incubation period of 14 days; the mean duration of infection was 18 days, not counting two participants whose infections lasted > 100 days but who did not shed cysts during much of that time. Giardia infection is noninvasive, but can lead to villous atrophy and nutrient malabsorption (Miliotis & Bier, 2003). Partial immunity appears to develop, and infections are often asymptomatic (Miliotis & Bier, 2003); reports exist of 3% of infections being symptomatic among fathers and their children in Pakistan (Ensink et al., 2006) with a similarly low symptomatic proportion in urban Brazilian 6-45 month olds (M S Prado et al., 2005). However, a report (Peréz Cordón et al., 2008) from a periurban area of Peru with poor sanitation describes 60% of infections among children aged 1 month to 9 years as being symptomatic.

Burden of disease
Giardia infection is common (prevalence of 2-5%) in the developed world but is 20%-30% prevalent elsewhere; some communities have been documented with much higher prevalences (Blaser et al., 2002). Weaned children are more susceptible than adults; prevalence declines somewhat after adolescence (Blaser et al., 2002). Since asymptomatic infections are so common, the true health impact of Giardia remains unclear.
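The exponential dose response function cited above for Giardia and for the larger C. parvum datasets can be illustrated with a short sketch. It assumes the standard single-parameter form P(infection) = 1 − exp(−k × dose), with k chosen so that the probability at the ID50 is 0.5 (k = ln 2 / ID50). The function names and the example doses are illustrative choices, not taken from the cited studies; only the ID50 values (35 cysts, 165 oocysts) come from the text.

```python
import math

def exponential_rate_from_id50(id50):
    """Return k such that 1 - exp(-k * id50) equals 0.5."""
    return math.log(2) / id50

def probability_of_infection(dose, id50):
    """Exponential dose-response model: P = 1 - exp(-k * dose)."""
    k = exponential_rate_from_id50(id50)
    return 1.0 - math.exp(-k * dose)

# ID50 values quoted in the text (cysts or oocysts per dose).
ID50 = {"Giardia": 35, "C. parvum": 165}

# Example doses are arbitrary and chosen only for illustration.
for organism, id50 in ID50.items():
    for dose in (1, 10, 100):
        p = probability_of_infection(dose, id50)
        print(f"{organism}: dose={dose:>3} -> P(infection)={p:.3f}")
```

Under this model, risk at low doses is nearly proportional to dose, so even a single Giardia cyst carries roughly a 2% chance of infection.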
Persistence in the environment Pooling data from two studies (Wickramanayake et al., 1985; deRegnier et al., 1989) yields an estimate of inactivation at a rate of 0.55/day in water at 20ºC. Cysts degrade more quickly in soil or cattle feces than in water (Olson et al., 1999). The cysts do not tolerate freezing (Olson et al., 1999). Control Effective water treatment and good hygiene limit Giardia transmission. In addition, several curative chemotherapies for Giardia infection are available (Blaser et al., 2002). 2.12.4. Metazoan pathogens (helminths) Intestinal tapeworms or roundworms can aggravate the diarrhea-malnutrition vicious cycle, although light infections are thought to have negligible nutritional consequences. Worms themselves do not appear to cause much diarrhea (Hall et al., 2008). In addition, poor sanitary conditions are likely to simultaneously lead to infection with helminths and diarrheal pathogens. An observational study (Genser et al., 2006) in an urban Brazilian area with poor sanitation tested children once for Ascaris lumbricoides, Trichuris trichiura, and Giardia species, finding 37 that only Giardia was associated with diarrhea incidence. 2.13. Interventions to prevent transmission of diarrheal pathogens Diarrheal infections nearly always have a major fecal-oral component to their transmission, and this is the major target of interventions to control diarrhea. Nonetheless, many different types of interventions are being used and investigated throughout the world. Unfortunately, it is difficult to ascertain how effective most of these interventions are in actual practice. 2.14. Effectiveness trials of interventions The scientific literature is replete with results of effectiveness trials of various interventions in developing countries, with some assessing multiple interventions used simultaneously. In general, these trials usually take place over a few weeks or months, or perhaps a full year. 
They frequently show diarrheal disease reductions of approximately 30%, regardless of the type of intervention. Although children under 5 years of age are usually studied because they bear the greatest morbidity and mortality, other groups (all ages, <15 years of age, <3 years of age, etc.) are sometimes studied. Effectiveness may be measured with incidence ratios (number of new cases) or prevalence ratios (the number of persons infected at a given time); these two measures may differ because incidence and duration of illness are not tightly linked (Schmidt et al., 2009). Unlike clinical trials for determining drug effectiveness, intervention trials in communities are seldom blinded and may not be randomized, which can lead to bias. However, practical considerations (e.g., obvious visibility of intervention materials, expense of conducting research in multiple communities) make clinical trials of community interventions impractical. 2.14.1. Measures of effect in intervention trials Field trials of interventions commonly use some type of risk ratio to describe the effectiveness of the intervention. Many different measures of risks are used, with corresponding 38 risk ratios (RRs; also called relative risks), in which the risk in an intervention group is divided by the risk in a control group. An RR of 1 generally indicates no effect, while an RR < 1 suggests that the intervention prevents disease, and an RR > 1 would mean that there was more disease in the intervention group than in the control group. In general, the preventable fraction (PF) is given by 1 – RR, and describes the proportion of disease that the intervention prevents. Brief descriptions of common types of risk follow. Point prevalence (sometimes simply called prevalence) is the proportion of people who are affected at a single point in time. Its relative risk measure is the prevalence ratio (PR). 
Since the amount of diarrhea present in communities can vary greatly over time, point prevalence is not a very good measure of risk.

Incidence (sometimes called 'incidence rate') is the number of new cases of a disease that arise in a population over a period of time (often expressed in terms of 1 year). Its relative risk measure is the incidence ratio (IR).

The longitudinal prevalence (LP) is the number of person-days affected, divided by the number of person-days observed. Person-days are the product of the number of people observed and the mean time of observation per person. Its relative risk measure is the longitudinal prevalence ratio (LPR).

The odds is the number of instances where the disease occurred divided by the number of instances where the disease did not occur. Its relative risk measure is the odds ratio (OR). Odds are often used in case-control studies where the association of a particular factor with disease is being assessed.

All of these measures of risk are specific to a particular population (such as a particular age group studied in a certain community). The risk of diarrhea in general may be measured, or the risk of infection or diarrhea from a particular pathogen.

Although the primary goal of interventions is to lower the risk of diarrheal disease, it is also useful to measure the antimicrobial effectiveness of these interventions. If an intervention cannot remove microorganisms in the laboratory, there is no reason to conduct a field study of that intervention to assess its impact on diarrheal disease. The log10 reduction value (LRV) is often used to assess water treatment methods; it is calculated by taking the log10 of the number of a certain microorganism detected in a certain volume of untreated water, and subtracting the log10 of the number of those microorganisms detected in the same volume of treated water. Thus, an LRV of 1 removes 90% of microorganisms; an LRV of 2 removes 99%, and so on.
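The measures above reduce to simple arithmetic, sketched here for concreteness. The sketch takes the LRV as log10 of the count in untreated water minus log10 of the count in an equal volume of treated water, and the preventable fraction as 1 − RR. All function names and all numbers in the example are hypothetical and chosen only to illustrate the calculations.

```python
import math

def log_reduction_value(untreated_count, treated_count):
    """LRV = log10(untreated count) - log10(treated count), equal volumes."""
    return math.log10(untreated_count) - math.log10(treated_count)

def fraction_removed(lrv):
    """Fraction of microorganisms removed for a given LRV (LRV 1 -> 0.9)."""
    return 1.0 - 10.0 ** (-lrv)

def longitudinal_prevalence(person_days_ill, person_days_observed):
    """Person-days affected divided by person-days observed."""
    return person_days_ill / person_days_observed

def risk_ratio(risk_intervention, risk_control):
    """Risk in the intervention group divided by risk in the control group."""
    return risk_intervention / risk_control

def preventable_fraction(rr):
    """Proportion of disease prevented by the intervention: PF = 1 - RR."""
    return 1.0 - rr

# Hypothetical counts: 1000 organisms before treatment, 10 after.
lrv = log_reduction_value(1000, 10)
print(f"LRV = {lrv:.1f}, fraction removed = {fraction_removed(lrv):.2%}")

# Hypothetical trial: 30 vs. 50 person-days of diarrhea per 1000 observed.
lp_intervention = longitudinal_prevalence(30, 1000)
lp_control = longitudinal_prevalence(50, 1000)
lpr = risk_ratio(lp_intervention, lp_control)
print(f"LPR = {lpr:.2f}, preventable fraction = {preventable_fraction(lpr):.2f}")
```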
Very high LRVs (5 or more, 99.999% reduction; see Table 2.1, page 64 for examples) are often desirable because some pathogens are both extremely infectious and extremely abundant; thus it might be necessary to eliminate nearly all of a pathogen in order to reduce risk to an acceptable level. If the LRV is low (< 2), antimicrobial effectiveness is sometimes described as a simple percentage of a certain type of microorganism removed or inactivated. 2.14.2. Nature of bias in intervention trials Effectiveness of interventions as measured by trials is likely to be higher than effectiveness in actual practice for several reasons: • The intervention may be discontinued by the population after the study is completed and the investigators depart from the community. ◦ Cost to families or the community (in time or money) may become prohibitive (Stockman et al., 2007). ◦ The population may not believe that the intervention is worthwhile, or the intervention may not be culturally appropriate (Paul, 1955) ◦ The community might fail to maintain the intervention in good working order because 40 they lack the necessary skills, equipment, or money; or maintenance may simply be forgotten. • Blinding is difficult or impossible to implement in community-based trials (B. F. Arnold & Colford, 2007), which can introduce bias: ◦ Respondents, knowing that they are receiving an intervention, may consciously or unconsciously underreport diarrheal illness, or comply more effectively with the intervention, e.g., the Hawthorne effect, or ‘courtesy bias’ (Luby et al., 2006). ◦ Investigators’ observations may be biased by their expectation that the intervention will be effective. • Surveys themselves can alter behavior: ◦ Question-behavior effects: the survey causes the respondent to think about the topic more deeply than usual, potentially changing the response or the behavior (Spangenberg et al., 2008). 
Question-behavior effects may themselves be altered, for example, by differing frequency of surveying (Zwane et al., 2011). • The investigators conducting a trial may themselves introduce bias: ◦ A trial showing high efficacy might be more likely to be published than a trial that shows little/no efficacy (publication bias) (Hunter, 2009). ◦ Investigators receiving outside funds may (consciously or unconsciously) preferentially report findings that are aligned with the funder’s desires. This may occur through publication bias or through other means. Bias might also lead to understatement of effectiveness. For example, a family that has poor hygiene practices might nonetheless report regular handwashing because it is a socially desirable answer. In a hygiene education trial, this would overestimate apparent handwashing in 41 the absence of an effect on disease, understating effectiveness of the intervention. Although bias is seldom quantified, a recent meta-epidemiological study (L. Wood et al., 2008) compared a wide variety of intervention studies (e.g., caesarian sections, smoking cessation, cancer treatment outcomes, etc.) with good/poor blinding to determine how these factors influence the reported effect of an intervention. When considering trials with subjectively reported outcomes, comparison of 104 unblinded trials with 205 blinded trials showed that unblinded trials tended to have odds ratios that were 25% lower (indicating a bias toward greater perceived effectiveness) compared to blinded trials (L. Wood et al., 2008). There was a similar effect regarding proper randomization of participants to treatment groups (allocation concealment). However, unblinded trials with objective outcomes did not show evidence of bias (L. Wood et al., 2008). 
This would seem particularly relevant to interventions to prevent diarrhea in developing countries since diarrhea reporting can be highly subjective, although such interventions were not the focus of this meta-epidemiological study. A few trials report on microbiological outcomes, such as the log reduction of microorganisms attained by a particular water treatment method. However, most trials consider reduction of a diarrheal syndrome as the main outcome, without reference to particular pathogens. Since the ultimate goal of an intervention is to improve health, and it is easier to measure symptoms in people than to identify pathogens in feces, disease outcomes tend to be preferred in the literature. 2.15. Compliance with interventions, and long-term sustainability Compliance with an intervention refers to the extent that people or communities actually use an intervention. It is often measured in terms of the proportion of people who are using (or claim to use) an intervention at a particular time. The word 'adherence' is sometimes used in place of compliance (Aronson, 2007). Sustainability is a related concept that refers to 42 maintenance of an intervention in a community over many years (essentially forever) without outside assistance, and it is notoriously difficult to attain (Shediac-Rizkallah & Bone, 1998). Compliance is necessary, but not sufficient, for sustainability. An intervention that people comply with and initially seems sustainable may become unsustainable if conditions change, such as deterioration of the infrastructure, economy, environment, or political situation. Household water treatment (HWT) trials tend to be short, rendering sustainability unmeasurable; only 4 of 35 household interventions reviewed (T. Clasen et al., 2007) lasted one year or more, while 4 of 6 trials of interventions at the water source lasted three or more years. 
These water source interventions included well digging and installation of public taps, and did not include connection of individual households to the water distribution network (T. Clasen et al., 2007). 2.15.1. Costs of compliance (monetary and otherwise) Monetary cost to the user is a key issue in compliance, and this is somewhat controversial. Some nonzero price, however small, may reduce waste since families willing to use the intervention would presumably also be willing to pay for it. Establishment of local for-profit businesses that sell interventions (such as HWT devices) are likely to be sustainable if they are profitable (Joe Brown et al., 2007); however, even seemingly tiny prices (e.g., $0.08/month) can impede compliance among very poor people (Stockman et al., 2007). It is also often politically difficult to allot government resources to marginalized or underprivileged populations (Batterman et al., 2009). Like money, time is a limited resource that must be carefully managed, and competing demands on time (particularly for women) in developing countries mean that the time required for a new task must be taken from other tasks (Awumbila & Momsen, 1995). Since interventions to prevent diarrhea generally require daily effort to maintain (e.g., consistent handwashing, or 43 daily treatment of drinking water), interventions that require little time and effort to operate should have higher compliance and a better likelihood of sustainability. 2.15.2. Psychological, social, and cultural aspects of compliance Even if an intervention’s effectiveness can be significantly demonstrated in a trial and is adopted initially by the community, the intervention may still fail if the magnitude of the disease reduction is too small to be perceived easily by individuals, who may therefore fail to recognize its importance (E. M. Rogers, 2003; Mäusezahl et al., 2009). 
A mother might not notice any difference between three illnesses per year in her child as opposed to four illnesses per year, even though this represents a 25% reduction in incidence. Therefore she might not be motivated to comply with the intervention. Even though the basics of the germ theory of disease are widely understood in many developing countries, 'germs' are invisible and abstract, and may not effectively motivate behavior change (V. A. Curtis et al., 2009). Interventions that yield visible daily benefits, such as savings in time and energy due to improved water supply or construction of conveniently available and comfortable latrines, may be maintained better by the community (Waddington et al., 2009). Habits (a behavior that occurs 'automatically' following a particular stimulus) appear important for handwashing compliance, and habits are typically established in childhood (V. A. Curtis et al., 2009). If a habit has not been established, it might be learned or encouraged by particular motivators. In a review of structured observation studies of handwashing in several developing countries (V. A. Curtis et al., 2009), disgust from feces and dirty hands was a key motivator for handwashing. Affiliation (essentially, doing what everybody else is doing) was also important; although affiliation can promote handwashing if it is common, it can also discourage handwashing if it is uncommon. Fear was not an effective motivator, except in the context of a severe immediate threat like a cholera epidemic (V. A. Curtis et al., 2009). These motivations 44 might apply similarly to other interventions to prevent diarrhea, such as sanitation or HWT. Compliance with interventions also depends on cultural factors. Local beliefs about water or hygiene impact the acceptability of an intervention. For example, women’s cleanliness is often highly regarded socially, but in some settings cleanliness implies attempting to attract other women’s husbands (V. A. Curtis et al., 2009). 
Belief that disease is pre-ordained or that people are powerless to affect disease (fatalism) can also impede the acceptance of interventions. In addition, diarrhea is often perceived to be a normal part of child development rather than a disease (V. A. Curtis et al., 2009). Communities also have different capacities to adapt to changing conditions, which is based partly on available resources but also on cultural factors (Batterman et al., 2009).

2.15.3. Examples of compliance measurements in the field
In practice, high compliance is seldom attained. HWT chlorination trials provide informative illustrations of compliance, because measuring chlorine residuals in stored water during unannounced home visits is a rapid and objective measurement. As part of a large meta-analysis of HWT trials, 16 chlorination trials were reviewed (T. Clasen, I. G. Roberts, et al., 2009), finding mixed evidence for effectiveness (e.g., a significant rate ratio of 0.61 for all ages across four studies, but a nonsignificant longitudinal prevalence ratio [LPR] of 0.91 for children under 5 years of age across 5 other studies). Eleven of these trials measured chlorine residuals. Three of these trials estimated compliance at 49% (Chiller et al., 2006), 44% (Crump et al., 2005), and 61% (Crump et al., 2005). A fourth trial of 20 families had apparently perfect compliance because a health worker treated families' water daily, but found no protective effect (Kirchhoff et al., 1985); a possible explanation might be consumption of untreated water outside the home. A fifth trial (Doocy & Burnham, 2006) was an outlier that differed from the other trials by its remarkably low LPR of 0.09; it was carried out in refugee camps and reported 95% compliance. The compliance levels during the remaining trials were unclear. More accurate measurements of compliance are needed to improve quantitative understanding of its impact on effectiveness.
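A simple calculation suggests why incomplete compliance can dominate the outcome. The sketch below is a deliberately simplified illustration, not the QMRA model developed in later chapters: it assumes that a fraction `compliance` of the water a person drinks is treated at a nominal LRV, that the remainder is untreated water of the same quality, and that exposure is proportional to the number of organisms ingested. The function name and the scenarios are hypothetical.

```python
import math

def effective_lrv(nominal_lrv, compliance):
    """Overall log10 reduction in organisms ingested when only a fraction
    `compliance` (0 to 1) of the water consumed is treated.

    Assumes treated and untreated water come from the same source."""
    surviving_fraction = (compliance * 10.0 ** (-nominal_lrv)
                          + (1.0 - compliance))
    return -math.log10(surviving_fraction)

# Hypothetical scenarios: nominal LRVs of 2, 4, and 6 at three compliance levels.
for compliance in (0.5, 0.9, 0.99):
    row = ", ".join(
        f"LRV {lrv}: {effective_lrv(lrv, compliance):.2f}" for lrv in (2, 4, 6)
    )
    print(f"compliance {compliance:.0%} -> effective {row}")
```

Under these assumptions the effective LRV can never exceed −log10(1 − compliance): at 90% compliance it is capped at about 1, however efficacious the device.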
With respect to HWT, the amount of untreated water consumed appears particularly important.

The challenges in implementing sustainable interventions are illustrated by the following example. An evaluation (B. Arnold et al., 2009) examined a large HWT and handwashing promotion campaign in 90 Guatemalan villages, during which families with children under three years of age were visited for 30 minutes monthly or bimonthly by health educators who promoted various HWT methods, handwashing with soap, and good nutrition. At the end of the campaign it was estimated that 70% of participating households were using HWT regularly. However, in an evaluation of 600 households 6 months after the conclusion of the intervention, various health measures were no different than in control villages. Longitudinal prevalence of diarrhea, ‘highly credible gastroenteritis’, cough, or difficulty breathing were all no different, as were anthropometric measures of malnutrition, and only 37% of intervention households still self-reported as using HWT. Furthermore, hygienic conditions, soap use, and self-reported handwashing behavior were no different. Statistically significant, but small, positive differences were seen in confirmed HWT use, which was 26% using any HWT method.

2.16. Interaction of intervention effects
Interaction of interventions means that their joint effectiveness differs from their combined individual effectivenesses. Interaction between interventions is common, but may be positive or negative. A review (Fewtrell et al., 2005) including 5 studies combining water supply improvements with sanitation and hygiene education found that the effect on childhood diarrhea in those studies was similar to the effect seen in other studies of water supply, sanitation, or hygiene alone.
There was a similar effect when combining water treatment and 46 handwashing in Karachi squatter neighborhoods; there was no additional benefit to combining the interventions, although each intervention alone cut diarrhea prevalence by about 50% (Luby et al., 2006). However, a meta-analysis (Gundry et al., 2004) of 7 household water treatment (HWT) intervention studies found that the effectiveness of interventions increased as sanitation improved. Positive interaction between sanitation and source water quality has been observed for diarrhea (VanDerslice & Briscoe, 1995); furthermore, increased water use and latrine possession positively interacted to improve infant weight and length in rural Lesotho (Esrey et al., 1992) (presumably by reducing diarrhea). In general, though, interaction between interventions appears to be negative, with sanitation an occasional exception. Since good sanitation removes feces from the environment and therefore drastically reduces the amount of pathogens available, it may enable greater effectiveness of other interventions by allowing them to act on a region of the dose-response curve where disease risk is more rapidly declining. A possible explanation for negative interaction may be found in bias due to difficulty in blinding field trials. For example, the effectiveness of an intervention in a field trial may be overestimated if the trial is unblinded and the outcome is subjective (L. Wood et al., 2008). Such a bias would probably be similar whether a single intervention or two joint interventions are being applied, which would give results resembling negative interaction. Since diarrheal illness is transmitted by multiple pathogens over multiple routes, an intervention that can successfully impede transmission might not reduce pathogen exposure enough to detectably impact disease in a highly contaminated environment; enough transmission may occur by alternate routes to maintain high endemicity (Briscoe, 1984). 
Once conditions are somewhat improved, interventions which seemed initially to have low effectiveness might yield larger reductions in disease by further reducing the amount of pathogens into a zone of a dose-response curve where disease risk rapidly decreases. This is especially likely if the relationship between the number of pathogens ingested and the development of disease is nonlinear (Briscoe, 1984).

Unfortunately, preexisting characteristics of the area that might affect an intervention trial are often omitted or not clearly reported; many ‘single interventions’ might be considered ‘multiple interventions’ if prior community development is considered. Furthermore, meta-analyses attempt to generate overall effectiveness measures for particular interventions, but they generally lump trials together to obtain a pooled result with little regard for characteristics of the community before the intervention was implemented. Even if such characteristics are considered, power may be lacking to determine if heterogeneity between studies is due to community characteristics. A meta-analysis (Gundry et al., 2004) examined sanitation, urban/rural setting, type of water source, water storage, and blinding to see if they affected household water treatment effectiveness, but found no relationships other than increasing effectiveness with increasing sanitation. Sanitation explained 1/3 of the heterogeneity between those studies, but the remaining 2/3 could not be explained (Gundry et al., 2004).

Interventions may also interact simply by facilitating each other in the household. For example, water supply improvements facilitate handwashing simply by making more water available (Curtis 2000). They might also facilitate activities such as breastfeeding and food preparation by increasing the amount of time available to the mother, such as by eliminating the necessity of walking long distances to collect water. Natural conditions may also interact with interventions.
For example, HWT chlorination was ineffective during abnormally heavy July rains in Karachi, but was effective for most of the rest of the year (Luby et al., 2006). 48 2.17. Descriptions of individual interventions In general, diarrheal disease should be prevented by preventing feces from entering the mouth, as illustrated by the classic ‘F-diagram’ (Figure 2.4) (V. A. Curtis et al., 2000); sometimes 'fomites' are also added, denoting an object contaminated with pathogens. Anything that removes feces or pathogens from the environment has the potential to reduce infectious diarrhea; different interventions are possible, which act on different parts of the diagram. Interventions are described in detail below and will be referred to briefly in the context of particular pathogens. Figure 2.4. Simple F-diagram of transmission of diarrheal pathogens The 'F-diagram' (so called because nearly all compartments begin with 'F') illustrates how diarrheal pathogens can be transferred by different vectors within a community. HWT means household water treatment. Modified from V. A. Curtis et al. (2000). See Figure 2.7 (page 77) for a more detailed diagram. For interpretation of the references to color in this and all other figures, the reader is referred to the electronic version of this dissertation (at Michigan State University, proquest.com, Google Scholar, or openthesis.org). 2.17.1. Sanitation Sanitation appears particularly important because it can restrict all routes connecting feces to other compartments (Figure 2.4). Although developed-country urban sanitation infrastructure including flush toilets, piped sewer systems, and sewage treatment plants are a highly effective 49 ideal, they are prohibitively expensive in many locations. Nonetheless, in crowded urban areas, this may be the only feasible solution due to the lack of space for building latrines. 
Although sanitation seems important intuitively because of its key role in removing feces from the environment, there have been relatively few rigorous trials establishing its effectiveness on diarrhea; those available tend to be observational and unrandomized (Barreto et al., 2007; Henry & Rahim, 1990; Esrey, 1996). Large sanitation construction initiatives tend to be accompanied by other infrastructure improvements as well as hygiene education, making it difficult to isolate the effect of sanitation alone (Barreto et al., 2010). A large study (Barreto et al., 2007) comparing longitudinal prevalence of diarrhea in 0-3 year olds in the city of Salvador in Brazil before and after the implementation of a sewerage construction program showed a reduction of diarrhea by about 43% in areas of the city with prevalence above 8 days of diarrhea per child per year. Most areas with lower prevalence of diarrhea did not have significant reductions in diarrhea, leading to an overall decrease of about 22% in the longitudinal prevalence of diarrhea. A later study of 0-4 year olds in the same area (Barreto et al., 2010) found a reduction in prevalence of Giardia infection from 14.1% to 5.3%; the new sewerage connections alone appeared to account for the greatest share (about 25%) of this decline, with improved cleanliness in/near the house accounting for an additional 17% (Barreto et al., 2010).

Given available space, several inexpensive and effective latrines can be built with local materials, such as the ventilated improved pit (VIP) latrine, pour-flush latrines connected to a hand-dug septic pit, and composting toilets (e.g., SkyLoo). These are considerable improvements over crude pit latrines, which may be avoided due to stench and risk of collapse. Latrines must also be carefully located in order to prevent fecal contamination of groundwater.

2.17.2.
Water supply improvement

Improvement of the water supply at the source is an intuitively appealing intervention, perhaps because it often involves building large tangible objects like wells, water tanks, or distribution systems. Provision of a piped connection and tap for each household is ideal but very expensive, costing approximately $100 per person just for the initial investment (Hutton & Laurence Haller, 2004). If a tap is absent, water must be gathered outside the household and stored within the household, where it can be easily recontaminated even if source water quality is excellent. A recent comprehensive review (T. Clasen, I. G. Roberts, et al., 2009) found variable results among six source-based interventions (improved wells or public taps, not including private taps); four studies showed reductions in diarrhea, including an incidence ratio of 0.83 in 6-23 month olds in a rural Bangladeshi setting and a risk ratio of 0.45 in all ages in a rural Chinese setting. Although water supply improvements commonly improve both the quantity of water available to the family and its quality, these two improvements have also been studied separately. Either alone generally yields benefits similar to those of both together (Esrey et al., 1991).

Water quantity improvement

Improving the quantity of available water, even if it remains contaminated, can be beneficial by facilitating more frequent washing and therefore improved hygiene. Time and labor required to gather water indirectly reduce the amount of water available. Basic guidelines for disaster response (Sphere Project, 2011) have recommended 15 liters per person per day, with a water point < 500 meters from the house, and < 15 minutes waiting time at the water point. However, these standards are frequently unmet in developing countries.
Water quality improvement

Improvements in water quality primarily yield benefits through reduced ingestion of pathogens in drinking water, although they may also reduce the amount of pathogens introduced into food. This may be particularly important if incompletely cooked weaning foods are made with contaminated water (Motarjemi et al., 1993). The relationship between measured water quality and diarrhea risk is unclear; although the presence of common indicators of poor water quality (turbidity, thermotolerant coliforms, high heterotrophic plate count) strongly suggests that water is unsafe to drink, the levels of these indicators are poorly correlated with the amounts of actual pathogens that are present. Furthermore, apparently clean water may still be contaminated with pathogens. A meta-analysis of 11 studies measuring E. coli or thermotolerant coliforms found no relationship between childhood diarrhea risk and the level of the bacterial indicator (Gundry et al., 2004).

2.17.3. Hygiene

Approximately 2 to 6 liters of water are needed daily per person for basic hygiene practices (Sphere Project, 2011). The best-studied aspect of hygiene with respect to diarrhea is handwashing, although aspects such as diapering and food preparation are also important.

Handwashing

Failure (or inability due to lack of water) to wash hands provides a route for pathogens to be transmitted from person to person, as well as between people and their environment. Curtis and Cairncross (2003) describe 9 studies of handwashing in developing countries finding that 0 to 20% of people (median 13%) washed their hands after defecation or after cleaning a child who had defecated. Dirty hands can also introduce bacteria such as pathogenic E. coli into food, where they may multiply (Motarjemi et al., 1993), and mothers are usually responsible both for cleaning children after defecation and preparing food for the family (V. A. Curtis & Cairncross, 2003).
Handwashing may be particularly important in preventing transmission of highly potent pathogens, such as Shigella sp. (Motarjemi et al., 1993). Accordingly, promotion of handwashing with soap in developed country daycare facilities has shown significant disease reduction (relative risk of approximately 0.5), even though the environment would seem to be much less contaminated than in a developing country setting (V. A. Curtis & Cairncross, 2003). Handwashing works best when soap is used, which is fortunately inexpensive. Transient bacteria are removed equally well by handwashing with ordinary soap and water as with antiseptics, and resident bacteria on the skin are relatively unaffected by handwashing with soap and water (Lowbury et al., 1964). Diarrheal pathogens contaminating hands would generally be transient bacteria. Although antibacterial agents may have residual activity on bacteria that remain on the skin following handwashing (Lowbury et al., 1964), one formulation of antibacterial soap was no more effective than ordinary soap in preventing diarrhea or pneumonia in a Karachi shantytown (Luby et al., 2005). LRVs measured in studies of handwashing are found in Table 2.1 (page 64).

Highly variable (relative risk from 1 to about 0.25) effectiveness of handwashing on diarrhea has been reported in a meta-analysis of 20 studies of various age groups, both observational and intervention-based (V. A. Curtis & Cairncross, 2003). Overall, the relative risk of diarrheal incidence was 0.57, although many studies were of poor quality. A meta-analysis (Ejemot et al., 2008) considered five community-based, randomized, controlled intervention trials of hygiene education that included handwashing; an overall incidence rate ratio of 0.68 was estimated. This measure was consistent across studies of young children as well as children aged less than 15 years.
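The relative risks and log10 reduction values (LRVs) quoted above can be converted into more intuitive percentages. The following is a minimal sketch in Python, not taken from any of the cited studies; the function names are illustrative assumptions. It relies only on the standard definitions: a relative risk RR corresponds to a (1 - RR) x 100% reduction in risk, and an LRV of n means that a fraction 1 - 10^(-n) of organisms is removed or inactivated.

```python
def percent_reduction_from_rr(relative_risk):
    """Percent reduction in risk implied by a relative risk (or rate ratio)."""
    return (1.0 - relative_risk) * 100.0


def fraction_removed_from_lrv(lrv):
    """Fraction of organisms removed or inactivated for a given log10 reduction value."""
    return 1.0 - 10.0 ** (-lrv)


if __name__ == "__main__":
    # Pooled estimates quoted in the text above.
    pooled = [("Curtis & Cairncross (2003)", 0.57), ("Ejemot et al. (2008)", 0.68)]
    for label, rr in pooled:
        print(f"{label}: RR {rr} -> {percent_reduction_from_rr(rr):.0f}% reduction")

    # Illustrative LRVs (hypothetical values, not read from Table 2.1).
    for lrv in (0.5, 1, 2, 3):
        print(f"LRV {lrv} -> {fraction_removed_from_lrv(lrv):.1%} removed")
```

For instance, the pooled relative risk of 0.57 corresponds to a 43% reduction in diarrheal incidence, and an LRV of 2 corresponds to removal of 99% of organisms.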
Despite the usefulness of handwashing in preventing diarrhea and other diseases, perpetuating good handwashing behavior remains a challenge. A 53% reduction in longitudinal prevalence of diarrhea due to handwashing was estimated in squatter settlements in Karachi (Luby et al., 2006). However, no effectiveness was seen against childhood diarrhea during a ~13 month follow-up period that began 18 months after the earlier study concluded (Luby et al., 2009). Although intervention households were more likely to have a place to wash their hands and were more likely to demonstrate better handwashing technique, there was little difference in longitudinal prevalence of diarrhea between intervention and control households. Furthermore, intervention and control households had similar spending on soap and were equally likely to have soap in the house. The authors suggest that frequent reinforcement of handwashing behavior may be necessary for sustainability; although continued home visits would be prohibitively expensive, mass media messages might be helpful. However, the similarities in purchasing and having soap between the intervention and control groups indicate that the price of soap may be a barrier to sustaining effective handwashing behavior.

Even if water and soap are available, good handwashing practices are uncommon. In 11 developing countries, handwashing with water after defecation was only observed in 45% of caregivers, and just 17% used soap (V. A. Curtis et al., 2009). Handwashing is largely habit-driven, meaning an unplanned reaction to something in the environment, such as touching something unclean (V. A. Curtis et al., 2009). Since habits are largely acquired in childhood, changing handwashing behavior in a community might require many years; once a community improves handwashing, it can be very stable over time, because conforming to social norms is a powerful influence on behavior.
However, if the social norm is lack of handwashing, as it is in many areas, this impedes improvement of handwashing behavior; increased promotion via broadcast media or visual cues like posters may help redefine social norms while providing cues that reinforce habits (V. A. Curtis et al., 2009).

Diapering and open defecation

Fecal contamination around the home has repeatedly been linked to increased diarrheal disease risk (V. A. Curtis et al., 2000), although this relationship disappeared in a study in Myanmar when adjusted for education and socioeconomic status (Han & K. Moe, 1990). Latrines do not necessarily remove all fecal contamination, and diarrhea reductions attributed to latrines are likely due to removing stools from the environment (V. A. Curtis et al., 2000). Many cultures do not consider infant stools to be hazardous (Motarjemi et al., 1993), and diapering was associated with absence of diarrheal illness in children under two years of age in a case-control study in Nicaragua (Gorter et al., 1998). Even if diapers or potties are available, feces may still enter the environment if they are not properly disposed of, such as in a latrine. Yeager et al. (1999) describe conditions in a dry shantytown area of Peru. Potty-training can be a long and difficult process, during which defecation in clothes or on the ground is common (Yeager et al., 1999). Even if child feces are noticed on the floor or near the house, the mother is often too busy to dispose of them immediately, and disposal in places such as the garbage dump or the street is common; nonetheless, feces in potties are more likely to be disposed of, and even incomplete disposal may be helpful (Yeager et al., 1999). Effective potty-training may therefore yield important reductions in fecal contamination of the environment.

Anal cleansing

Few published scientific papers have studied anal cleansing.
Effective soft materials similar to toilet paper may be unaffordable or scarce, which can lead to increased exposure to feces (McMahon et al., 2011). Focus groups of Kenyan schoolchildren reported lack of instruction from their parents and teachers regarding proper anal cleansing, and parents did not perceive the benefits of toilet paper as worth the cost (McMahon et al., 2011). A study of school toilets and hygiene in Colombia suggested that simply providing toilet paper, towels, and soap could be an important method for preventing diarrhea, since that composite factor was associated with more cases of diarrhea than the number of toilets available (Koopman, 1978).

Food preparation

Diarrheal risk substantially increases following 6 months of age, when infants who were exclusively (or mostly) breastfed begin consuming weaning foods, which are often more heavily contaminated than other foods eaten by the family (Motarjemi et al., 1993). Weaning foods are often thin porridges which are not thoroughly heated since long cooking makes a porridge that is too thick for infants, and organisms such as Bacillus cereus and pathogenic E. coli can multiply in such foods at ambient temperature (Motarjemi et al., 1993). The necessity of feeding infants several times per day means that food may be prepared in advance and stored at ambient temperature to save time (Motarjemi et al., 1993). Storage of hot food in vacuum flasks (which are durable and relatively inexpensive) reduces the rate of cooling, allowing food to be stored safely for up to 12 hours (Mensah & A. Tomkins, 2003). Certain traditional grain food preparations that are fermented with lactobacilli can limit growth of, or actually kill, pathogenic bacteria, principally through the creation of acidic conditions (pH 3.5 to 4.5) (Adams & Nicolaides, 1997).
Production of other compounds, such as bacteriocins (polypeptide ‘antimicrobials’), hydrogen peroxide, and CO2, may also aid pathogen inhibition (Adams & Nicolaides, 1997). It is unclear to what extent fermentation inactivates viruses; based on limited research, rotavirus appears to persist in fermented foods (Mensah & A. Tomkins, 2003).

Fly control

Although measures such as insecticide treatment can yield reductions in diarrheal illness due to fly killing, such interventions are not likely to be sustainable (V. A. Curtis et al., 2000). Sanitation interventions are probably more effective, since they remove feces from the environment, preventing flies from accessing them (V. A. Curtis et al., 2000).

2.17.4. Household water treatment (HWT), or point-of-use (POU) technology

This is a particularly active area of investigation. Since HWT can be carried out by individual families, it can be effective even if infrastructure or community cooperation is limited. Water is frequently collected outside the home and stored in the home, and even if the source water is perfectly clean, the stored water may become contaminated when things fall into it, or when dirty hands or cups are placed in it (A. J. Pickering et al., 2010). HWT can therefore be more effective than interventions at the water source (T. F. Clasen et al., 2006). HWT can also be implemented far more cheaply per person than more infrastructure-intensive interventions like latrines and wells (Hutton 2004). Although HWT seems appealing, it can require substantial behavioral change or monetary investment, which impede sustainability. There are several different methodologies, which are described below. LRVs for various HWT methods are given in Table 2.1, and antimicrobial standards for HWT are given in Table 2.2.

Safe storage

Safe storage refers to use of a storage container that does not allow anything to touch the water inside, preventing recontamination.
This commonly takes the form of a vessel with a covered top (or a narrow neck) and a spigot at the bottom. It is often combined with other HWT methods. A less expensive alternative to safe storage is the ‘two-cup method’, which attempts to prevent contamination of stored water by using a single clean cup, which is supposed to be the only object that touches the stored water. Water is decanted from the storage vessel with the cup and then poured into another cup to drink. However, it may be difficult (or impossible) to ensure that the cup remains clean, or to prevent other objects from entering the water.

Boiling

Boiling is the most widely used HWT method, and it is the only method that can reliably destroy all pathogens (T. Clasen, 2009). However, it requires fuel, which is frequently expensive or unavailable. Depending on how it is used, boiled water may require cooling, which is time-consuming. Stored boiled water is also subject to recontamination in the household.

Solar disinfection (SODIS)

Solar disinfection is accomplished by filling clear 1 to 2 liter polyethylene terephthalate (PET) bottles with water and placing them in the sun for at least 5 hours, or for two days under cloudy conditions (EAWAG, 2004). Oxygenating the water (e.g., by shaking the bottles) increases effectiveness (EAWAG, 2004). The method is primarily suited for treating drinking water; it is difficult to produce enough water by SODIS for washing or other purposes (EAWAG, 2004). Pathogens in the water are inactivated by a combination of heating and irradiation by ultraviolet A (UV-A). Relatively clear water of <30 NTU is recommended for irradiation to be effective (EAWAG, 2004); nonetheless, highly turbid water (>300 NTU) can still be made safer by SODIS due to solar heating of the water. Simply heating water using solar energy, for example in a solar cooker or using black or metal vessels, is a similar method to SODIS.
Under sunny conditions temperatures over 60°C can be attained, which essentially pasteurizes the water, inactivating most microorganisms (Sobsey, 2002). Even if a solar cooker is used, at least three hours in full sun is required to inactivate 99% of viruses in 3.8 L of water (Sobsey, 2002). Disadvantages of these methods include the substantial time and effort needed to fill bottles and place them in the sun, as well as having to wait for several hours for the method to work. The method is much less effective on cloudy days.

Although some community trials have shown effectiveness of SODIS in reducing diarrhea (T. Clasen, I. G. Roberts, et al., 2009; A. Rose et al., 2006), a recent community-randomized trial in rural Bolivia (Mäusezahl et al., 2009) found no effect of SODIS on diarrhea incidence, longitudinal prevalence, severe diarrhea, or dysentery, despite provision of bottles and repeated educational sessions demonstrating the method. However, the study was only powered to detect a 33% reduction in diarrhea incidence, and the communities lacked sanitation, so other infection pathways remained open and may have swamped any effect of SODIS. Furthermore, only about 32% of intervention households were observed to be using SODIS on any given day, and its use tended to be lower during the cultivation season when families were busy. These observations highlight the difficulty of implementing SODIS sustainably.

Chlorination

Chlorination can be performed in the home by adding sodium hypochlorite (household bleach) to water (attaining about 5 mg/L free residual chlorine) (Centre for Affordable Water and Sanitation Technology, 2008), shaking the container to mix the water, and letting it stand for 30 minutes. This will kill most microorganisms, although Cryptosporidium is a common pathogen that is resistant. Commercial bleach may be used, or a custom solution may be produced and sold.
The dose of such a solution usually corresponds to one capful for the particular size of storage vessel that is common in the community; the solution bottles are custom-designed to accomplish this. Increasing the pH of the solution above 11.9 with NaOH lengthens the shelf life to 12-18 months, though it should be used within 60 days of opening (Lantagne & Gallo, 2008). A double dose of hypochlorite solution is often recommended if the water is turbid (Lantagne & Gallo, 2008), although this may increase the risk of exposure to carcinogenic disinfection byproducts from the reaction of hypochlorite with organic material in the water. Chlorination is sometimes combined with a flocculant (e.g., PuR packets) which facilitates settling of particulate matter, improving the water's appearance and aiding removal of pathogens. Certain plants such as Moringa oleifera and Opuntia species can also provide flocculants (S. M. Miller et al., 2008). Some people find the taste of chlorinated water unappealing, although properly treated water should not taste strongly of chlorine. CDC’s Safe Water System (SWS) combines chlorination with safe storage (Lantagne & Gallo, 2008). Devices containing polymer beads that bind chlorine or bromine and release it into contaminated water are more expensive than adding chlorine solution to water, but may be easier to use because they do not require measurement of a dose; furthermore, bromine has a milder taste than chlorine (Dunk, 2007). Such devices are effective at inactivating bacteria and bacteriophages (McLennan et al., 2009; Coulliette et al., 2010).

In a meta-analysis of the effectiveness of HWT chlorination against childhood diarrhea, a pooled relative risk estimate of 0.71 (0.56-0.89) was determined for children under five years of age (B. F. Arnold & Colford, 2007). Three studies in urban or peri-urban areas showed a larger risk reduction (0.63: 0.50-0.80) than the 5 rural studies (0.89: 0.71-1.13).
However, the trials were not carried out over multiple years (the longest one covered 87 weeks), and longer trials appeared to show lower effectiveness (although this trend was nonsignificant), raising questions about sustainability.

Recurring costs and interruption in chlorine supply are major obstacles to sustainability of HWT chlorination. For example, in Malawi, where a chlorination solution (WaterGuard) has been marketed nationwide, 64% of a nationwide sample (Stockman et al., 2007) of mothers had heard of the solution, and 12% of those said they were currently using it. Among mothers who had used WaterGuard in the past but were not using it at the time of the survey, 39% said that they couldn’t afford it and 34% said that it was “currently unavailable”. The price of WaterGuard was about $0.08 for one month’s supply. Even if rural families are willing to pay for chlorination solution, distance (or the cost of transportation) may make it impossible for them to obtain it.

Filtration

The two major types of filters are ceramic filters and sand filters. They are relatively expensive (initial cost of $20 or more) (Sobsey et al., 2008) and require frequent maintenance. Ceramic filters need to be scrubbed to remove trapped material that slows flow through the filter; sand filters require cleaning of the upper layer of sand to remove sediment and improve water flow through the filter. Filtration trials have shown relative risks for diarrhea of about 0.4 when comparing households with filters to those without (Sobsey et al., 2008; T. Clasen, I. G. Roberts, et al., 2009). Ceramic filters appear more effective at removing bacterial and protozoan pathogens from drinking water than biosand filters (Sobsey et al., 2008). Filters have the important advantage of improving the appearance (and often the taste) of the treated water. Chlorination may also be used on stored filtered water to further improve disinfection and prevent recontamination.
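The potential benefit of chlorinating filtered water can be illustrated with a rough calculation. The sketch below assumes, purely for illustration, that the two treatment steps act independently, so that the surviving fractions of organisms multiply and the log10 reduction values (LRVs) add; real combined performance may differ. The function name and the example LRVs are hypothetical and are not taken from the cited studies.

```python
import math


def combined_lrv(*lrvs):
    """Overall LRV for treatment steps in series, ASSUMING independent action."""
    surviving = 1.0
    for lrv in lrvs:
        surviving *= 10.0 ** (-lrv)  # fraction of organisms surviving this step
    return -math.log10(surviving)


if __name__ == "__main__":
    filter_lrv = 2.0    # hypothetical LRV for a filter
    chlorine_lrv = 3.0  # hypothetical LRV for chlorination
    total = combined_lrv(filter_lrv, chlorine_lrv)
    print(f"Combined LRV under the independence assumption: {total:.1f}")
```

Under this assumption, a filter achieving an LRV of 2 followed by chlorination achieving an LRV of 3 would give an overall LRV of 5, i.e., one surviving organism per 100,000.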
Biosand filters

The biosand filter (BSF) is a recent improvement on older slow sand filter designs. It consists of a tall bucket containing a layer of gravel with several feet of sand atop it. The outlet pipe originates at the bottom of the filter (in the gravel layer), and the mouth of the pipe (where filtered water exits) is slightly higher than the top of the sand layer. This ensures that the entire filter column remains wet, allowing an aerobic biofilm to establish itself near the top of the sand. The biofilm is believed to enhance deactivation of pathogens (Manz, 2009); this is consistent with results showing increasing deactivation of pathogens (including some viruses) while the filter ‘matures’ over several weeks (Elliott et al., 2008), but could also be explained by increased residence time in the filter due to decreased flow rate as pores close over time (Elliott et al., 2008). Inactivation is drastically lowered (by 1-3 LRV of E. coli) if the amount of water passed through it daily exceeds its pore volume (Elliott et al., 2008). Although BSFs can often be constructed in developing-country communities according to instructions that are freely available, manufacture of properly functioning BSFs requires great attention to detail and consideration of local conditions (e.g., the characteristics of the sand/gravel available, quality of cement/concrete, etc.) (Manz, 2007). Although BSFs can produce water on demand, a separate container is usually needed to store filtered water (Manz 2007), which is subject to recontamination if not stored safely.

Ceramic filters

Ideally, ceramic filters have pores that are too small (about 0.2 μm) to allow most bacteria and protozoa to pass through, although they cannot block viruses (Joe Brown et al., 2007). They can sometimes be made out of local materials, though quality control can be challenging (Sobsey et al., 2008). The filters need to be cleaned, and cracks in the filter reduce (or eliminate) its effectiveness.
The two main ceramic filter designs are ‘pot’ (a large ceramic pot-shaped filter nested inside another vessel that captures the filtered water) and ‘candle’ (a cylindrical filter, often used inside a plastic receptacle to capture filtered water). An evaluation of a program to produce, distribute, and promote ceramic pot filters in Cambodia (Joe Brown et al., 2007) showed that the number of filters in use decreased linearly over time, with about half of 506 households still using the filter after 24 months. Breakage accounted for 65% of the disused filters, so replacement filters must be purchased frequently (they cost $2.50 to $4.00 to produce). A pilot project in a small Bolivian community (T. F. Clasen et al., 2006) evaluating imported candle filters indicated that they were effective in reducing diarrhea, but identified several problems: many users reported inadequate quantity or flow of water, replacement filters were unavailable, and approximately 8% of the filters were broken 9 months after they were provided.

Advanced filter technologies

Nanofilters with a pore size small enough to remove viruses have also been built into HWT devices for use in developing countries (T. Clasen, Naranjo, et al., 2009; Boisson et al., 2010). Such filters require advanced manufacturing techniques and cannot be produced locally like ceramic pot filters. However, they are effective at removing pathogens, and in general should perform comparably to ceramic filters against bacteria and protozoa, while removing more viruses than ceramic filters.

Table 2.1. Log10 reduction values (LRVs) for various interventions
Laboratory* Intervention Actual community* BactProto- BactProtoViral Viral Citations erial zoan erial zoan HWT chlorination 6+ 6+ 5+ 3 3 3 (Sobsey et al., 2008) HWT coagulation & chlorination (PUR® sachets) 9 6 5 7 2-4.5 3 Notes Cryptosporidium resists Cl; LRV doesn't apply to it.
(Sobsey et al., 2008) Chlorinated polymer beads (HaloPure®) 6 3 (McLennan et al., 3 LRVs against 2009; Coulliette Clostridium. et al., 2010) (McLennan et al., 3 LRVs against 2009; Coulliette Clostridium. et al., 2010) Brominated polymer beads (HaloPure®) 6 5 Ceramic filtration 6 4 6 2 0.5 4 (Sobsey et al., 2008) Biosand filtration 3 3 4 1 0.5 2 (Sobsey et al., 2008) Sari filtration (4 layers of ordinary cloth) 2 V. cholerae (Huq et al., 1996) attached to particulates. 6 4 3 (T. Clasen, Naranjo, et al., 2009) 5.5+ 4+ 3+ Nanofiltration (LifeStraw Family®) Solar disinfection (SODIS) Handwashing with soap Handwashing without soap 2 1 (Sobsey et al., 2008) 0.5 3 1 3 1 Pore size of 20 nm. (Lowbury et al., 1964; Luby et al., 2001) 0.3 (Ansari et al., 1989; A. J. Pickering et al., 2011)
*LRVs are commonly higher in the laboratory (careful implementation and maintenance) compared with actual communities (might be improperly used or poorly maintained).

Table 2.2. Log10 reduction values: standards for household water treatment (HWT)
Standard | Bacterial | Viral | Protozoan | Citation
WHO 'highly protective' target | 4+ | 5+ | 4+ | (Sobsey & Joe Brown, 2011)
WHO 'protective' target | 2+ | 3+ | 2+ | (Sobsey & Joe Brown, 2011)
WHO 'interim' target | Meets 2/3 of 'protective' targets above. | (Sobsey & Joe Brown, 2011)
USEPA HWT standard | 6+ | 4+ | 3+ | (USEPA, 1987)
USEPA primary drinking water standard for treatment facilities | * | 4+ | 3+† | (USEPA, 2012)
*<500 colonies on heterotrophic plate count; ≤5% positive samples for total coliforms monthly.
†LRV of 2+ for Cryptosporidium.

Recent controversy surrounding HWT

Three recent reviews (Schmidt & Cairncross, 2009; Waddington et al., 2009; Hunter, 2009) have questioned whether there is really enough evidence to warrant widespread promotion and scaling up of HWT, despite substantial investment by WHO, the Gates Foundation, and others to expand it (Schmidt & Cairncross, 2009).
WHO has referred to chlorination and safe storage as “consistently the most cost-effective” water, sanitation, and hygiene intervention (World Health Organization, 2002). However, Schmidt and Cairncross (2009) argue that the decision of whether to promote HWT primarily rests upon demonstration of a health effect, since other benefits from HWT are small and the scalability and acceptability (which are critical to sustainability) of HWT remain unclear. The health effect is difficult to establish because results from trials vary greatly, and publication bias may be a large problem, particularly regarding smaller trials (Hunter, 2009). These reviews also conclude that the published health effect estimates attributed to HWT might be wholly explained by bias (reporter or observer), since it is difficult or impossible to blind studies of HWT interventions. They therefore call for larger, long-term, blinded studies to be carried out to establish whether beneficial health effects truly exist, before making large investments in expansion of these interventions.

Hunter’s recent review (Hunter, 2009) also showed that intervention effectiveness was lower among studies with longer follow-up; similar findings have been reported (B. F. Arnold & Colford, 2007) regarding household water chlorination. This suggests difficulty with sustainability. However, other explanations might include a decline in reporting bias over time or a tendency for longer-term studies to be better designed and therefore less subject to bias. Hunter’s analysis (Hunter, 2009) also shows that effect sizes shown in most HWT trials over longer periods are roughly within the range that might be explained by bias in unblinded studies with subjective outcomes (risk ratio of ~ 0.7) (L. Wood et al., 2008). Ceramic filtration is a notable exception, showing substantial reductions in diarrhea even in long-term trials (risk ratio of 0.44, 95% CI 0.28 to 0.70) (Hunter, 2009).
However, Wood’s (L. Wood et al., 2008) quantification of bias due to lack of blinding drew on a widely varying set of clinical trials, nearly all of which were drug trials or therapeutic intervention trials. Even so, there seems little reason to believe that HWT trials would be any less susceptible than drug trials to bias due to lack of blinding.

These assertions have been contested by several noted researchers in the HWT field, who want current HWT promotion efforts to continue even while additional needed research is carried out (T. Clasen, Bartram, et al., 2009). Patterns of publication suggesting publication bias in HWT have been noted elsewhere, but have been interpreted as possibly being caused by trials of different methods in drastically differing settings (T. Clasen, I. G. Roberts, et al., 2009).

2.17.5. Other interventions

Many other important methods are available to mitigate the impact of diarrheal disease, such as: nutritional interventions (e.g., encouragement of breastfeeding, zinc supplementation); vaccination against rotavirus; and treatment methods (particularly oral rehydration solutions). However, these interventions are not directly addressed in the research described in chapters 3, 4, and 5 of this dissertation. The reader may refer to page 235 in the appendices for a discussion of these methods.

2.17.6. Gaps in knowledge about diarrheal disease interventions

There is relatively little information about real-world effectiveness of interventions outside of field trials, and high-quality field trials lasting for a year or more are scarce. Use of an intervention may decline as equipment or infrastructure deteriorates and outside encouragement of healthy behaviors decreases. People may also forget how to apply the intervention effectively. This is difficult to investigate because the same communities must be studied over many years.

Intervention trials commonly consider diarrhea as the outcome, without regard to etiology.
Given the differing characteristics of common diarrheal pathogens, it is likely that interventions will affect different pathogens in distinct ways. The relative abundances of different pathogens in different areas may therefore affect intervention effectiveness, as well as the interactions of interventions with each other. However, information concerning antimicrobial effectiveness of some interventions (particularly HWT methods) is available (Table 2.1, page 64).

The effectiveness of interventions is likely to differ depending on a community's level of development, because improvements in sanitation, water supply, or hygiene in an area are likely to restrict certain routes that pathogens travel in that community. Therefore, an intervention might prevent a smaller or larger fraction of disease depending on the interventions that have preceded it (Briscoe, 1984; VanDerslice & Briscoe, 1995). Existing intervention effectiveness estimates from field trials are likely to be incomplete because they do not take into account previous interventions (or protective characteristics) that are impacting the community at the time of the study. Effectiveness studies should carefully describe the status of the community in detail before the intervention took place, in order to place the resulting effectiveness estimates in the context of the community. This would yield a large number of effectiveness estimates narrowly defined to particular settings. Although they would be difficult to summarize, they would be useful for validating diarrheal transmission models and the effect of interventions on the transmission network.

2.18. Infection transmission modeling and its application to diarrheal disease

Classical epidemiological methods (e.g., regression analysis of risk factors associated with a particular health outcome) are powerful within certain contexts, such as determining factors associated with illness in a point-source outbreak of disease.
However, these methods often incorporate inappropriate assumptions with respect to infectious disease transmission. For example, classical statistical methods often assume that outcomes experienced by different individuals are independent (i.e., are not influenced by other individuals). This is clearly violated when pathogens are passed from person to person (Koopman, 2004). Modeling allows formalization of relationships between individuals, as well as between individuals and their environment, avoiding the false assumption of independence.

Models of infectious disease transmission have been used to guide public health efforts, particularly with regard to choosing among different control strategies. These include polio eradication by vaccination (K. M. Thompson et al., 2006), smallpox preparedness planning (Keeling & Rohani, 2008), and slowing pandemic influenza spread (Cooper et al., 2006). However, modeling methods have seldom been applied to common childhood diarrhea in developing countries (but see Eisenberg et al., 2007), despite the magnitude of the problem.

2.18.1. Types of models

In general, models attempt to represent the behavior of a real system in a simplified fashion according to a defined set of criteria. Models and simulations are used in many disciplines and the nomenclature can be confusing and inconsistent. The general classification in Table 2.3 is based on Haefner (2005) with some modifications:

Table 2.3. Terms commonly used to describe or categorize models

Model descriptor: Mechanistic (yes) / Empirical (no)
Key question: Does the model explicitly include the inner workings of the system being studied?
Notes: Empirical models are sometimes called descriptive or phenomenological models, and often constitute simple equations.

Model descriptor: Dynamic (yes) / Static (no)
Key question: Does the modeled system change over time?

Model descriptor: Continuous (yes) / Discrete (no)
Key question: Is time (or other quantities of interest, e.g., population size) measured continuously?
Notes: Systems of differential equations usually describe time continuously, while a system of difference equations could measure time as a discrete integer number of days.

Model descriptor: Spatially heterogeneous (yes) / Spatially homogeneous (no)
Key question: Are spatial relationships explicitly represented?
Notes: A spatially heterogeneous model must also represent space in a continuous or a discrete manner (discrete space is divided into cells or zones).

Model descriptor: Stochastic (yes) / Deterministic (no)
Key question: Are random events included?
Notes: Models based on matrices of probabilities (Markov chain models) are sometimes called stochastic models because they track expected long-run behavior of systems where behavior of individual units is uncertain, even if they do not directly incorporate random events.

Model descriptor: Compartmental
Definition: Quantities (e.g., pathogens, water) flow between storage compartments (e.g., people, tanks).
Notes: Commonly used in models of infection transmission.

Model descriptor: Agent-based
Definition: The model represents discrete units whose characteristics change; relationships between these units are explicitly defined.
Notes: Also called individual-based models.

Model descriptor: Network
Definition: Describes links between units ('nodes'); e.g., person A might contact persons B & C, but B & C do not contact each other directly.

Any model will incorporate some combination of the characteristics described in Table 2.3. Examples might be a mechanistic dynamic discrete-time spatially homogeneous deterministic model, or a mechanistic dynamic continuous-time discrete-space stochastic model. A larger model might also contain a sub-model of a different type. An advantage of mechanistic models is their ability to explicitly describe interdependence between individuals and groups. The ability to represent nonlinearities, such as feedback loops, is an important feature as well.

2.18.2. Model verification and validation

Any model must be carefully verified.
Model verification means ascertaining that the model does what its designer intends; i.e., there are no bugs in the algorithm/program (Haefner, 2005). This is separate from whether the designer's intentions are ill-informed or mistaken. Once the model has been verified, validation statistically compares the behavior of the model to real systems to assess whether the model can explain or predict the behavior of those systems (Haefner, 2005). Thorough validation thus requires detailed data collected by carefully designed experimental or observational studies.

Although model validation may be possible for very simple systems, it is probably impossible to thoroughly validate a model of a complex or complicated system (Oreskes et al., 1994). It is unlikely that we can observe everything of importance within a system; furthermore, any measurement we make of a system contains error (Oreskes et al., 1994). Many aspects of diarrheal infections are poorly understood. For example, although we know that malnutrition and diarrhea exacerbate each other, we do not know exactly how; the mechanism is likely to vary depending on the particular pathogen, and it is unclear how much diarrhea is attributable to particular pathogens in particular communities. Nonetheless, models of complex, poorly understood systems can still provide insight about how the real systems operate, and models can always be tested to determine whether they approximate certain essential features of the system; this might be termed 'weak validation' or 'confirmation'. However, confirmation does not necessarily mean the model accurately reflects the workings of the real system; further investigation of the system might reveal inconsistencies (Oreskes et al., 1994).

Although models have limitations (as do all forms of human knowledge), constructing and assessing them remains useful because they allow us to formalize available knowledge and apply it to understanding and addressing problems.
The process of model building often provides guidance about the kind of information that needs to be gathered by studies in order to more fully understand the system. This can occur even if scant data are available for validating or confirming the model. Modeling is a process that attempts to improve understanding of a complex system through rigorous description of the system's characteristics and their interrelationships, followed by comparison of the model against reality, followed by revision of the model, and so on in a repeating cycle. Analysis of the model and its results may suggest experiments that could allow more rigorous validation, or provide information to improve the structure of the model (e.g., by carefully measuring a particular value that greatly alters the results of the model).

2.19. Modeling transmission of diarrheal infections

Infection transmission models are typically dynamic models, describing the behavior of a system of susceptible and infected hosts over time. They explicitly describe changes in transmission depending on the number of people that are infectious. These models are commonly represented by box-and-arrow diagrams, in which boxes (sometimes called 'compartments') represent a 'stock' and solid arrows represent a 'flow'. For example, the classic SIR (Susceptible – Infectious – Removed) model (Keeling & Rohani, 2008) represents hosts (e.g., people) flowing in and out of 'susceptible', 'infectious', and 'removed' stocks, where 'removed' could mean immunity or death (Figure 2.5).

Figure 2.5. Schematic of the Susceptible-Infectious-Removed (SIR) model
Solid arrows represent hosts flowing between stocks (which represent infection states); dashed arrows represent the influence of a stock upon a flow. Red and blue arrows represent infection and recovery, respectively. All letters represent terms in equation 2.1 below. Modified from Keeling and Rohani, 2008.

2.19.1. Simple mathematical example of a transmission model

Simple infection transmission models can be described deterministically by systems of differential equations, where the rate of change of the number of persons in a particular stock is described using first-order rates of individuals entering or leaving the stock. For example, in the SIR model (Figure 2.5), S, I, and R represent the proportion of the population in the susceptible, infectious, or removed stocks at any given time (thus, the three stocks sum to 1) (Keeling & Rohani, 2008):

dS/dt = -βSI
dI/dt = βSI - γI
dR/dt = γI        (2.1)

Transmission is often described by a parameter (β) that indicates how likely it is for a contact between an infected and uninfected person to result in a new infection. Assuming that every member of the population is equally likely to contact every other member, the product of S and I is proportional to the number of possible contacts where an infectious person can infect a susceptible person. The rate of recovery (γ) is the reciprocal of the infectious period and governs the recovery of infected people (i.e., the rate of their movement from I to R). The SIR model structure can be modified to represent many different systems, e.g.: susceptible or infected (SI) if immunity is not an issue; susceptible, exposed, infected, or removed (SEIR) where the ‘exposed’ state accounts for the incubation period, etc.

The basic reproduction ratio, R0, is a particularly important concept in infection transmission modeling. It is the average number of individuals that an infected individual directly infects if it enters a completely susceptible population (Anderson & May, 1991). The infection must die out if R0 is less than 1, because the pathogen would not replace itself (Anderson & May, 1991). Since populations are seldom completely susceptible, an R0 substantially greater than 1 might be necessary for the disease to remain endemic in a population.
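Equation 2.1 can be explored numerically. The following is a minimal sketch, assuming a simple forward-Euler integration; the function name, the step size, and the parameter values used in the usage note are illustrative assumptions and are not taken from the cited sources. For this particular model (no births or deaths), R0 equals β/γ.

```python
def simulate_sir(beta, gamma, s0, i0, days, dt=0.01):
    """Integrate equation 2.1 with forward-Euler steps.

    S, I and R are proportions of the population (they sum to 1).
    Returns the final (S, I, R) and the peak value of I.
    """
    s, i, r = s0, i0, 1.0 - s0 - i0
    peak_i = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i * dt   # flow from S to I
        new_recoveries = gamma * i * dt      # flow from I to R
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak_i = max(peak_i, i)
    return s, i, r, peak_i


# Hypothetical values: beta = 0.5 per day, gamma = 0.25 per day, so R0 = 2.
s, i, r, peak = simulate_sir(beta=0.5, gamma=0.25, s0=0.999, i0=0.001, days=200)
print(f"final S = {s:.3f}, final R = {r:.3f}, peak I = {peak:.3f}")
```

Rerunning the sketch with β below γ (R0 < 1) gives an infectious proportion that only declines from its starting value, consistent with the statement that the infection must die out when R0 is less than 1.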
The general concept of R0 can also be applied to subpopulations within a larger population, describing the number of new infections in one subgroup due to transmission from one other subgroup (M. G. Roberts & Heesterbeek, 2003). Epidemics may still occur if R0 < 1 at the community level if conditions become favorable for transmission within some subgroup, effectively establishing a local R0 > 1; this can be simulated in stochastic models (Halloran et al., 2002). Stochastic models can also allow random extinction of the disease in the population, rendering the population disease-free unless the pathogen is reintroduced.

2.19.2. Environmental infection transmission models

The SIR model and related models described above assume that pathogens are only transmitted through person-to-person contact. However, many pathogens remain viable in the environment, and can be transmitted between hosts in many ways (e.g., food, water, or fomites). Mechanistic models describing the transmission of pathogens through the environment and their effects on hosts are called environmental infection transmission systems (EITS) models (Li et al., 2009). These models facilitate the simulation of disease prevention interventions because they explicitly represent the numbers of pathogens in the system, which can then be directly modified by an intervention. The action of an intervention to reduce a particular flow of pathogens reduces risk to the people who ingest the pathogens.

A simple EITS model that could represent transmission of diarrheal infections can be created (Figure 2.6) from the SIR model described previously (Figure 2.5), by adding an 'environment' compartment that describes pathogens in the environment; infectious hosts release pathogens, which can be picked up by other hosts. Since diarrheal infections seldom confer complete immunity, the 'removed' compartment has been omitted.
Note that the two blue stocks represent susceptible or infectious hosts, and the yellow environment stock represents pathogens. The number of pathogens in the environment stock influences the rate by which susceptible hosts become infectious.

Figure 2.6. Simple environmental infection transmission system model
Solid arrows represent hosts or pathogens flowing between stocks (S and I represent susceptible and infected hosts; E represents pathogens in the environment); dashed arrows represent the influence of a stock upon a flow. Red, blue, magenta, and grey arrows represent infection, recovery, pathogen shedding, and pathogen inactivation, respectively. All letters represent terms in equation 2.2 below. Modified from Li et al., 2009.

The EITS model in Figure 2.6 is represented mathematically as follows (modified from Li et al., 2009):

dS/dt = -piSE + γI
dI/dt = piSE - γI
dE/dt = sI - mE        (2.2)

As in the SIR model, γ is the rate of recovery of infectious people, although in this model they become susceptible again instead of becoming 'removed'. The rate of shedding into the environment (pathogens per infectious host per day) is represented by s; the inactivation rate of pathogens in the environment is represented by m; the pickup rate of pathogens by susceptible people is p; and the probability of infection per pathogen is i. Interventions to reduce transmission could be simulated by increasing m, or by reducing p. This EITS model is highly simplified; it can be made more realistic by using quantitative microbial risk assessment (QMRA) techniques to represent the relationship between environmental pathogens and infection.

2.19.3. Quantitative microbial risk assessment (QMRA) models

QMRA models are a widely used method for understanding the risk of disease (Haas et al., 1999).
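As a brief numerical aside on equation 2.2 (section 2.19.2), the following minimal sketch integrates the simple EITS model with forward-Euler steps and compares a baseline with an intervention that doubles the inactivation rate m. All parameter values, function names, and the scaling of E are hypothetical and chosen only for illustration; the equilibrium expression in `endemic_prevalence` is obtained by setting the derivatives in equation 2.2 to zero and is not taken from Li et al. (2009).

```python
def simulate_eits(pickup, infectivity, shed, inactivation, gamma,
                  i0=0.01, e0=0.0, days=365, dt=0.01):
    """Forward-Euler integration of equation 2.2; returns the final (S, I, E).

    pickup = p, infectivity = i, shed = s, inactivation = m in equation 2.2.
    S and I are proportions of hosts; E is in arbitrary pathogen units.
    """
    s, i, e = 1.0 - i0, i0, e0
    for _ in range(int(days / dt)):
        infections = pickup * infectivity * s * e * dt   # piSE
        recoveries = gamma * i * dt                      # gamma * I
        e_change = (shed * i - inactivation * e) * dt    # sI - mE
        s += recoveries - infections
        i += infections - recoveries
        e += e_change
    return s, i, e


def endemic_prevalence(pickup, infectivity, shed, inactivation, gamma):
    """Equilibrium infectious proportion implied by equation 2.2 (0 if none)."""
    s_star = gamma * inactivation / (pickup * infectivity * shed)
    return max(0.0, 1.0 - s_star)


# Hypothetical per-day parameter values, for illustration only.
baseline = dict(pickup=0.001, infectivity=0.01, shed=1e5,
                inactivation=1.0, gamma=0.2)
treated = dict(baseline, inactivation=2.0)   # intervention: larger m

_, i_base, _ = simulate_eits(**baseline)
_, i_treated, _ = simulate_eits(**treated)
print(round(i_base, 3), round(endemic_prevalence(**baseline), 3))
print(round(i_treated, 3), round(endemic_prevalence(**treated), 3))
```

Under these assumed values the equilibrium expression gives an infectious proportion of 0.8 at baseline and 0.6 with doubled m, and the simulated values settle close to these. Reducing the pickup rate p can be examined in the same way by changing `pickup`.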
Typically, these models use an exposure step to estimate the dose of pathogens ingested, followed by a dose response step in which the mean dose of pathogens entering the host is translated into a probability of infection (or illness, or death) by a dose response equation. These equations (often called dose response models) assume that a single pathogen has some probability of causing infection; several possible equations exist, and the most commonly used are the exponential equation and the beta-Poisson equation (Haas et al., 1999). The parameters of dose response equations are determined by fitting them to data from experimental studies in which hosts (sometimes humans) are given differing doses of pathogens by a particular route (e.g., oral, inhaled, or parenteral), and the proportion of hosts who become infected or diseased at each dose is recorded.

However, it is unclear whether these dose-response relationships apply to (possibly malnourished or ill) children, or to developing country settings. For ethical reasons, dose response studies in humans must include only healthy volunteers, who are nearly always adults. Although some live attenuated rotavirus vaccine trials with healthy children have provided dose response data (Vesikari et al., 1985; Pichichero et al., 1990), the vaccine strains of these pathogens probably behave differently than wild-type pathogens. It is not clear (and perhaps unknowable) how dose response equations might differ in malnourished (or otherwise unhealthy) children.

QMRA models are often relatively simple to construct and use (e.g., in an Excel spreadsheet). However, they cannot fully describe secondary transmission of infection, such as when a person who becomes infected from the initial exposure passes the infection to additional people. Therefore, they are particularly good for describing risk from exposures to pathogens where secondary transmission of infection to additional uninfected people is low.
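The exposure and dose response steps described above can be sketched as follows. This is a minimal illustration, assuming the standard forms of the exponential equation and the approximate beta-Poisson equation (parameterized by α and the median infectious dose N50); the function names and all numerical values are hypothetical and do not describe any particular pathogen.

```python
import math


def mean_dose(concentration_per_litre, litres_ingested):
    """Exposure step: mean number of pathogens ingested."""
    return concentration_per_litre * litres_ingested


def exponential_risk(dose, r):
    """Exponential dose response: each pathogen infects with probability r."""
    return 1.0 - math.exp(-r * dose)


def beta_poisson_risk(dose, alpha, n50):
    """Approximate beta-Poisson dose response, written in terms of N50."""
    return 1.0 - (1.0 + dose * (2.0 ** (1.0 / alpha) - 1.0) / n50) ** (-alpha)


# Hypothetical exposure: 5 pathogens per litre, 2 litres of water per day.
dose = mean_dose(concentration_per_litre=5.0, litres_ingested=2.0)
print(exponential_risk(dose, r=0.01))
print(beta_poisson_risk(dose, alpha=0.3, n50=50.0))
```

By construction, the beta-Poisson form returns a probability of 0.5 when the dose equals N50, which is a convenient check on any implementation.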
QMRA models can describe a static response to a particular dose in a particular population, or they can dynamically describe changes in risk with changing exposure over time. QMRA techniques can also be used as components of infection transmission models incorporating transmission of pathogens through the environment.

2.20. Conceptual model of diarrheal disease transmission

Diarrheal disease is characterized by diverse, simultaneous, interdependent modes of transmission that differ among many different organisms. This makes it a challenging syndrome to model compared to other infectious diseases. Transmission routes for diarrheal infections can be described by using stocks to represent pathogens moving between different locations. A good starting point for this is the ‘F-diagram’ (V. A. Curtis et al., 2000) shown earlier (Figure 2.4, page 49). The F-diagram can be expanded to more faithfully represent the complexities of transmission and control points for interventions (Figure 2.7). The boxes in each diagram represent stocks of pathogens, and the arrows represent flow of pathogens between them. Stocks marked with a + denote areas where some pathogens can multiply. Colored lines indicate where interventions remove pathogens, limiting transmission to new hosts. These lines are broken because no intervention can always inactivate all pathogens using any particular route.

Figure 2.7. Expanded diagram of diarrheal pathogen transmission
The symbol '+' denotes places where some pathogens might multiply. HWT: household water treatment. Produced by the author, using V. A. Curtis et al. 2000 as a starting point (see Figure 2.4, page 49).

Figure 2.7 shows that diarrheal infections are transmitted in a variety of complicated ways. An EITS model with so much detail would be extremely difficult to construct and interpret.
However, the diagram remains useful for considering which routes to include or omit from the model, and it was used to guide construction of the EITS model described in chapter 5 of this dissertation.

2.21. Theoretical issues regarding interventions

When an intervention is applied to a disease transmission system, it has a direct effect on the individual receiving it by reducing their risk of disease. In addition, it has an indirect effect on individuals not receiving the intervention by reducing disease transmission within the community. This has been clearly described (Halloran et al., 2002) from the point of view of communities Y and N, in which some community Y residents are immunized, but no community N residents are immunized. Unimmunized people in community Y still benefit indirectly from immunization because reduction of disease spread lowers their chance of contacting a diseased person. The overall difference between the two communities is the community-level effect of the immunization intervention. The difference between the immunized and unimmunized in community Y is the direct effect of immunization, while the difference between the unimmunized in community Y and the unimmunized in community N is the indirect effect. The total effect is the difference between the immunized persons in community Y compared with the unimmunized in community N.

The intervention need not be an immunization; interventions where adoption is variable in the community, such as handwashing or latrine use, should operate in a similar way. A comparison of two similar rural Zimbabwean villages, one of which had partial latrine coverage and the other having no latrines, showed decreased diarrhea among households lacking latrines in the village with latrines, compared with the village without latrines (Root, 2001).
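The four comparisons described above (direct, indirect, total, and overall effects) can be illustrated with a small worked example. This is a sketch under stated assumptions: the effects are expressed as relative reductions in risk (one of several possible effect measures), and the risks and coverage below are invented for illustration and are not results from Halloran et al. (2002) or Root (2001).

```python
def relative_reduction(risk_group, risk_reference):
    """Proportion of risk in the reference group that is absent in the group."""
    return 1.0 - risk_group / risk_reference


def intervention_effects(risk_y_treated, risk_y_untreated, risk_n, coverage_y):
    """Direct, indirect, total and overall effects for communities Y and N.

    Community Y has partial coverage; nobody in community N is treated.
    """
    # Average risk across everyone in community Y, weighted by coverage.
    risk_y_all = coverage_y * risk_y_treated + (1.0 - coverage_y) * risk_y_untreated
    return {
        "direct": relative_reduction(risk_y_treated, risk_y_untreated),
        "indirect": relative_reduction(risk_y_untreated, risk_n),
        "total": relative_reduction(risk_y_treated, risk_n),
        "overall": relative_reduction(risk_y_all, risk_n),
    }


# Hypothetical annual risks of illness: 0.10 (treated, Y), 0.25 (untreated, Y),
# 0.40 (community N); half of community Y receives the intervention.
effects = intervention_effects(0.10, 0.25, 0.40, coverage_y=0.5)
for name, value in effects.items():
    print(f"{name}: {value:.3f}")
```

With relative reductions, the total effect combines the other two multiplicatively: 1 - total = (1 - direct) × (1 - indirect).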
Systems modeling allows explicit representation of the mechanisms of indirect effects, and allows investigation of how incomplete participation (termed 'compliance' or 'adherence') in an intervention decreases the indirect effects of that intervention.

Interaction of control measures against diarrheal illness is common, but may be positive or negative. A review (Fewtrell & Colford, 2005) considering five studies combining water supply improvements with sanitation improvements and hygiene education found that the effect on childhood diarrhea (32% reduction in diarrhea) was similar to the effect seen from other interventions alone: water supply (two studies, 33% reduction), sanitation (one study, 24% reduction), HWT (eight studies, 34% reduction) or hygiene (seven studies, 46% reduction). A similar effect was found when combining water treatment and handwashing in Karachi squatter neighborhoods; there was no additional benefit to combining the interventions, although each intervention alone cut diarrhea prevalence by about 50% (Luby et al., 2006). However, a meta-analysis (Gundry et al., 2004) of seven point-of-use intervention studies found that the effectiveness of point-of-use interventions increased as sanitation improved. Increased water use and latrine possession positively interacted to improve infant weight and length in rural Lesotho (Esrey et al., 1992), and positive interaction has been observed between sanitation and source water quality (VanDerslice & Briscoe, 1995).

Sanitation is one of the best studied interventions with regard to interaction with other interventions, and given its key role in removing feces from the environment at the point where they are produced, it seems likely to interact particularly strongly with other interventions. In particular, it seems likely that improved sanitation is necessary to realize further benefits from other interventions.
By mechanistically modeling transmission of diarrheal infection and the effects of interventions, it should be possible to predict conditions under which interaction between two interventions is positive or negative.

2.22. Tools and information useful for modeling diarrhea transmission

Intervention trials in communities throughout the world have shown how effective interventions can be under favorable conditions, measured against diarrhea in children under 5 years of age without reference to particular pathogens. Important characteristics of the communities that could impact diarrhea transmission, such as the use of latrines in the village, nutritional status of the population, or the prevalence of breastfeeding, are often not given, which makes it difficult to interpret the results. Nonetheless, the available trials provide indications of the effectiveness of various interventions.

Much useful work has been done within the framework of quantitative microbial risk assessment. Dose response equations that translate dose of a pathogen into the probability of developing disease have been developed for many diarrheal pathogens (Haas et al., 1999). Distributions have also been developed that describe likely exposure to pathogens on the basis of how they adhere differentially to different surfaces (e.g., hands compared with a cloth), which could theoretically allow estimation of a dose.

Basic infection transmission models like the examples described above (Figures 2.5 and 2.6, page 72) commonly assume that individuals mix evenly, i.e., every person has the same chance of contacting any other person. However, even mixing is unrealistic. People most frequently contact other people within their own household, people mix preferentially within their own age group (Mossong et al., 2008), and the nature of the contact affects its intensity, e.g., contacts at home are more likely to be physical than contacts at work (Mossong et al., 2008).
Home, school, workplace, and leisure contacts accounted for over 80% of all contacts in a survey carried out in several European countries (Mossong et al., 2008). Even if such factors are accounted for, people are also likely to have stable (i.e., nonrandom) patterns of connections to other people over time (Keeling & Eames, 2005). Several other studies (Wallinga et al., 2006; Bates et al., 2007; Ogunjimi et al., 2009) of how frequently people contact each other in various ways have been conducted recently, providing information on opportunities for pathogen transmission. However, with the exception of Bates et al. (2007), those studies were carried out in industrialized countries.

Epidemiological data concerning diarrheal pathogens also assist model construction. Factors such as incubation period, period of communicability, immunity, asymptomatic carriage, and persistence of the organism in the environment are important for describing disease transmission. Diarrheal disease risk is also known to change greatly with age, with greatest risk after weaning and decreasing risk thereafter.

Demographic characteristics are also important since susceptibility to diarrhea varies with age and young children have poor hygiene and sanitation habits. Transmission within households is particularly important due to frequent contacts between individuals sharing a household (Mossong et al., 2008). Information about household composition is available for various countries through the Demographic and Health Surveys (DHSs) (USAID, 2012).

Many meta-analyses have summarized the effects of drinking water interventions, hygiene interventions, and nutritional interventions using relative risks of disease comparing groups with an intervention to groups without an intervention (Gundry et al., 2004; B. F. Arnold & Colford, 2007; Waddington et al., 2009; Ejemot et al., 2008; Lazzerini & Ronfani, 2008; Hunter, 2009).
However, these meta-analyses and the studies that they summarize seldom report useful information for mechanistic modeling, such as pathogen concentrations in drinking water or the interventions already used in communities before the studies began.

2.23. Published models relevant to endemic diarrhea transmission

Although transmission models are widely used in infectious disease epidemiology, few consider diarrhea in humans. Several that model diarrhea directly or provide useful insights into modeling infectious diarrhea and its prevention are discussed below.

2.23.1. Mechanistic model of diarrheal infection: Eisenberg et al., 2007

A model of the transmission of a single diarrheal pathogen within a hypothetical community, with complete immunity following infection, considered five transmission routes: within-household, between-household, household-to-water, water-to-household, and introduction of pathogens from outside the community (J. N. S. Eisenberg et al., 2007). This allowed interdependencies between transmission routes; for example, reduction of transmission between water and households would also secondarily reduce within-household and between-household transmission. Parameters governing the strength of the transmission routes were varied, and the effect of a hypothesized water treatment intervention that eliminated all risk from contaminated drinking water varied accordingly, giving the following results from the model:

1. Low water contamination levels resulted in low preventable fractions of diarrhea from water treatment.
2. If between-household transmission was low, the preventable fraction increased as within-household transmission increased, since that route created secondary cases within households, and these secondary cases could then be prevented by water treatment.
3. If within-household transmission was low and between-household transmission was increased, the preventable fraction increased at first but decreased again at high levels of between-household transmission (which created an important alternate route for disease transmission aside from contaminated water). If within-household transmission was then increased (i.e., both within- and between-household transmission were high), the preventable fraction decreased further. In this case transmission could be sustained by contacts between people, and water treatment therefore prevented little diarrhea.

Analysis of this model showed that the effectiveness of an intervention could vary drastically depending on transmission characteristics within a community. Since additional interventions could preferentially alter different transmission routes, it also partially explains how interactions between various interventions might arise.

2.23.2. Empirical model of diarrheal disease: Schmidt et al., 2009

Schmidt et al. (2009) constructed a model describing diarrheal illness in a population by drawing the number of illness episodes from a distribution for each individual and then assigning an onset day and a duration to each episode. Individuals had a subject-specific error assigned to them to simulate differential susceptibility by individuals. Assignment of episodes was partly determined by increasing risk during the days following a previous episode, to simulate autocorrelation (‘clumping together’) of disease episodes. Individuals in the model were independent, and transmission between individuals was not modeled. The model’s primary intended use was to generate simulated datasets to assess different disease surveillance schemes and explore effects of likely sources of error in surveillance data.
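The general approach of such an empirical model (draw a number of episodes for each person, then assign each episode an onset day and a duration) can be sketched as follows. This is a deliberately simplified illustration and not a reimplementation of Schmidt et al. (2009): the gamma parameters are hypothetical placeholders, onset days are drawn uniformly, episodes may overlap, and the subject-specific error and the autocorrelation between episodes are omitted.

```python
import random


def simulate_person(rng, days=365, episodes_shape=1.0, episodes_scale=3.0,
                    duration_shape=2.0, duration_scale=1.5):
    """Return a sorted list of (onset_day, duration_in_days) for one person."""
    # Number of episodes: a gamma draw rounded to a whole number (assumption).
    n_episodes = int(round(rng.gammavariate(episodes_shape, episodes_scale)))
    episodes = []
    for _ in range(n_episodes):
        onset = rng.randrange(days)  # uniform onset day, 0 .. days - 1
        # Duration: a gamma draw rounded to whole days, at least one day.
        duration = max(1, int(round(rng.gammavariate(duration_shape, duration_scale))))
        episodes.append((onset, duration))
    return sorted(episodes)


def simulate_population(n_people, seed=1):
    """Simulate independent individuals (no transmission between them)."""
    rng = random.Random(seed)
    return [simulate_person(rng) for _ in range(n_people)]


def days_ill(episodes, days=365):
    """Count distinct days with illness, truncating at the end of follow-up."""
    ill = set()
    for onset, duration in episodes:
        ill.update(range(onset, min(onset + duration, days)))
    return len(ill)


population = simulate_population(n_people=50, seed=1)
mean_days = sum(days_ill(p) for p in population) / len(population)
print(f"mean days ill per person-year: {mean_days:.1f}")
```

A simulated dataset of this kind could then be subsampled (e.g., weekly visits with limited recall) to compare surveillance schemes, which is the type of use the original authors describe.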
Although it is an empirical model that does not mechanistically simulate disease transmission, the authors provide gamma distribution parameters drawn from real datasets for episode duration and number of episodes per year.

An important issue raised by Schmidt et al. (2009) is the likely existence of autocorrelation of disease episodes within individuals. Some datasets show evidence for increased risk of an episode during the few weeks following a previous episode. This is reasonable given the fact that it may take several weeks for the full nutrient absorptive capacity of the small intestine to regenerate after an episode of acute diarrhea (Chen et al., 1983). In addition, people living in unhygienic environments often have chronic alterations (lasting for years) of the small intestine (termed tropical sprue, tropical malabsorption, or tropical enteropathy), which are associated with diarrhea and similar intestinal abnormalities that lead to malabsorption of nutrients; the condition is thought to have a bacterial root cause, partially because it can be treated with antibiotics (Lunn, 2000; Blaser et al., 2002). Autocorrelation of diarrheal episodes could also be explained by the existence of intermittent or relapsing infections (Schmidt et al., 2009).

2.23.3. Modeling indirect effects of interventions: Halloran et al., 2002

Modeling non-diarrheal infections can still provide important insights into diarrhea prevention. A detailed stochastic model of influenza transmission by contact between individuals in simulated communities has been described (Halloran et al., 2002). Contacts between individuals within their communities depended on neighborhood and level of school (and therefore also age). To simulate an intervention, the proportion of individuals vaccinated was then increased in some communities. The model was used to investigate the direct and indirect effects of increased immunization, finding large indirect benefits to unvaccinated individuals.
For example, if half of the population in the intervention community was given a vaccine that was 70% effective, the unvaccinated population received benefit equivalent to directly receiving a vaccine that was 40% effective. Indirect benefits to unvaccinated people increased as the proportion of vaccinated people increased. This scenario is analogous to many other interventions, in which a particular behavior (e.g., handwashing) is already practiced in a community at a certain level, and is increased by means of an intervention. However, it is unclear whether adoption of interventions preventing diarrhea by some individuals would similarly protect noncompliant individuals indirectly.

2.23.4. Environmental infection transmission system (EITS) models: Li et al., 2009

Much infectious disease modeling work has examined person-to-person transmission, but diarrheal illness is largely mediated by the environment (Figure 2.4, page 49; Figure 2.7, page 77). An SIR model of influenza transmission with an added environmental (E) compartment, modeled as a system of differential equations, has been described (Li et al., 2009); a simplified version of this model is shown in Figure 2.6, page 74. The model used susceptible, infected, and removed (i.e., immune or dead) compartments to track hosts, while an environment compartment contained pathogens shed into the environment by infected hosts. Hosts can then pick up pathogens from E and become infected; pathogens in E are also inactivated at a particular rate. The model used three different parameter sets to simulate differing transmission modalities: frequently touched fomites, infrequently touched fomites, and airborne transmission. However, the R0s from the three parameter sets were identical; this allowed comparison of the intervention effectiveness under differing infection transmission conditions.
Two interventions were considered: 1) decontamination, increasing the inactivation rate of pathogens in the environment by 25%; or 2) an 'avoidance measure' (perhaps analogous to improved hygiene) decreasing the pick-up rate of pathogens from the environment by 25%. When transmission occurred via frequently touched fomites, neither intervention was effective. However, if transmission was airborne or via infrequently touched fomites, each intervention reduced the incidence of infection by about a third, with decontamination being more effective than 'avoidance'. Thus the effectiveness of an intervention depends on the nature of the transmission route it affects, even if the overall level of transmission (R0) is identical before the intervention.

2.24. Conclusion

Diarrheal disease in developing countries is characterized by diverse pathogens, multiple routes of transmission, and numerous interventions to prevent it. A clearer understanding of the action of interventions on the transmission of diarrhea would be helpful for planning and implementing diarrhea control programs. Modeling techniques are useful for synthesizing the large amount of published information regarding transmission and control of diarrhea, and they can yield further insight about diarrhea transmission and control. Chapters 3, 4, and 5 of this dissertation describe the development and application of such models.

3. LINKING A QUANTITATIVE MICROBIAL RISK ASSESSMENT MODEL TO A HOUSEHOLD WATER TREATMENT FIELD TRIAL

This chapter consists of previously peer-reviewed and published content (including supplementary material) that has been reformatted and reorganized: Enger, K.S., Nelson, K.L., Clasen, T., Rose, J.B., Eisenberg, J.N.S., 2012. Linking quantitative microbial risk assessment and epidemiological data: informing safe drinking water trials in developing countries. Environmental Science and Technology 46, 5160–5167.

3.1. Abstract

Intervention trials are used extensively to assess household water treatment (HWT) device efficacy against diarrheal disease in developing countries. Using these data in policy, however, requires addressing issues of generalizability (relevance of one trial in other contexts) and systematic bias associated with design and conduct of a study. A published randomized controlled trial (RCT) of the LifeStraw® Family Filter in the Congo was used as the basis for a quantitative microbial risk assessment (QMRA) model, to demonstrate the application of models to water safety and health issues. The QMRA model accounted for bias due to 1) incomplete compliance with filtration, 2) unexpected antimicrobial activity by the placebo device, and 3) incomplete recall of diarrheal disease. Effectiveness was measured using the longitudinal prevalence ratio (LPR) of reported diarrhea. The Lifestraw RCT observed an LPR of 0.84 (95% CI: 0.61, 1.14). The model predicted LPRs, assuming a perfect placebo, ranging from 0.50 (2.5-97.5 percentile: 0.33, 0.77) to 0.86 (2.5-97.5 percentile: 0.68, 1.09) for high (but not perfect) and low (but not zero) compliance, respectively. The calibration step provided estimates of the concentrations of three pathogen types (modeled as pathogenic E. coli, Giardia, and rotavirus) in drinking water that were consistent with the longitudinal prevalence of reported diarrhea measured in the trial. The QMRA model demonstrated the importance of compliance in HWT efficacy, the need for pathogen data from source waters, the effect of quantifying biases associated with epidemiological data, and the usefulness of generalizing the effectiveness of an HWT trial to other contexts.

3.2. Introduction

The randomized controlled trial (RCT) is considered the gold standard study design in epidemiology; it is the study design with the least systematic bias, and therefore the highest internal validity.
Two important components of RCT design for internal validity are: the randomization of subjects to the intervention and the non-intervention groups; and blinding of the subject and investigator to group assignment. It is difficult to blind HWT interventions because these devices are visually obvious and cannot be concealed from participants or investigators. It is also difficult to develop a placebo HWT filter that does not remove pathogens, but improves the appearance of water like an effective filter (Boisson et al., 2010). Other biases may also affect the internal validity of an estimate derived from the trial, such as recall bias, incomplete compliance with the intervention, or unexpected difficulties conducting the trial (Boisson et al., 2010).

In a recent RCT (Boisson et al., 2010) in rural communities in the Democratic Republic of the Congo (DRC) using the LifeStraw® Family Filter (LFF; Vestergaard Frandsen Corporation, Lausanne, Switzerland), investigators attempted to blind the intervention. The LFF is an ultrafilter with a 20 nm pore size that was shown to remove 99.99999% of Escherichia coli, 99.998% of MS2 coliphage, and 99.97% of Cryptosporidium oocysts from challenge water in the laboratory (T. Clasen, Naranjo, et al., 2009). For the Lifestraw RCT, investigators developed a placebo filter resembling the LFF in appearance, weight, operation, and flow rate (Boisson et al., 2010). The placebo was tested in the laboratory for three weeks against the same three organisms, and no removal was observed. In the field, however, the intended placebo removed on average 91% (95% CI: 88-93%) of thermotolerant coliform bacteria (TTC), a group that includes E. coli and indicates fecal contamination, from source water (Boisson et al., 2010). Therefore, the study could only compare a highly effective filter with a poorly effective filter.
Although 65% of people reported using the filter, most filter users also reported drinking unfiltered water (Boisson et al., 2010). The proportion of unfiltered water that people consumed was not quantified. The Lifestraw RCT did not find a statistically significant (P < 0.05) effect of the LFF against diarrhea (Boisson et al., 2010).

Quantitative microbial risk assessment (QMRA) models can examine and account for biases associated with environmental intervention trials (e.g., imperfect compliance, recall bias, or an imperfect placebo) and can explore risks associated with contexts different from those observed in empirical studies. Such models can provide a conceptual framework for understanding systems that are difficult to explore in the real world. QMRA models have been used to quantify disease risk in many contexts (J. N. S. Eisenberg et al., 2008; Haas et al., 1999; Parkin, 2008). The analytic framework for linking QMRA and epidemiological data described here consists of: 1) a calibration step using a QMRA model to produce results consistent with the epidemiological study; and 2) an estimation step that examines counterfactual scenarios that adjust for biases within the study and explores how altered contexts affect risk. The effectiveness of an intervention in those contexts can then be estimated, even if it was never directly studied under such conditions.

This chapter describes a counterfactual causal inference framework using a QMRA model to evaluate the impact of biases on estimates of intervention efficacies. This is illustrated by simulating the Lifestraw RCT (Boisson et al., 2010) and adjusting for some of its biases, to estimate the effectiveness of the LFF compared with a perfect placebo under differing levels of LFF compliance.

3.3. Materials and Methods

3.3.1. Conceptual framework linking QMRA models to epidemiological studies

Quantitative microbial risk assessment (QMRA) uses environmental contamination data as inputs to models used to predict risk of infection and/or disease. Epidemiological studies provide data on patterns of disease measured by incidence or prevalence, and measures of relative risk. This chapter illustrates a framework for the calibration of risk models by using epidemiological data from a particular study that describes the risk in a particular context, where the context is defined by a particular time in a particular geographic setting (Figure 3.1). The calibration process involved simulating a risk model many times using different input and parameter values. The parameter sets (or parameter distributions) representing those simulations that were consistent with the epidemiological study comprised the calibrated model; the parameter distributions represented the context in which the epidemiological study was conducted. Using this calibrated model, the epidemiological study was generalized to other contexts in the estimation step. The estimation step consisted of a set of simulations in which specific parameter values were varied to describe different contexts, such as alternative intervention strategies or different ecological or social settings.

Figure 3.1. Conceptual model for simulation of a randomized controlled trial
The results from an actual epidemiology study defined a context (e.g., the LifeStraw® Family Filter randomized controlled trial [Lifestraw RCT] in rural Congo). The calibration phase provided a set of simulated studies that were consistent with this defined context. Calibration also estimated values for parameters that were not observed during the real study, thus inferring unobserved context of the real study. The estimation phase provided simulated studies that were generalized to other contexts (e.g., higher or lower compliance than was observed during the Lifestraw RCT).
3.3.2. Model description

For the research described in this chapter, a QMRA model was developed that simulates the following chain of events:

1. Determination of the concentrations of three pathogen types (bacteria, protozoa, and viruses) in drinking water, sampled from gamma distributions
2. Calculation of daily doses of pathogens based on their concentrations and the amount of water consumed
3. Use of dose response functions: convert daily pathogen doses to probabilities of infection
4. Assignment of infection to individuals, based on the probability of infection
5. Assignment of diarrheal illness, based on morbidity ratios

The same conceptual approach illustrated in Figure 3.1 could also be applied to more complex models including processes such as transmission dynamics (J. N. S. Eisenberg et al., 2005; J. N. S. Eisenberg et al., 2007) or environmental fate and transport dynamics (J. N. S. Eisenberg et al., 2006).

The model describing the Lifestraw RCT conducted in the Congo (Boisson et al., 2010) follows a simulated population of children under five years of age for 12 months using a time unit of 1 day. The population was surveyed about their diarrheal symptoms every four weeks, similar to the Lifestraw RCT. The simulated children ingested bacteria, protozoa, and viruses in their drinking water, respectively represented by diarrheagenic Escherichia coli, Giardia cysts, and rotavirus. These three pathogens were chosen because they are major causes of diarrheal disease in much of the developing world, and they represent the three main taxa of waterborne pathogens (Lanata & W. Mendoza, 2002). A child was either susceptible to, immune to, or infected by each of these three pathogens; we assumed that the infective processes of each pathogen were independent of each other, and a child could therefore be infected with 0, 1, 2, or 3 types of pathogens simultaneously.
Children were divided into two groups; the first group received the intervention filter, whose log10 removal values (LRVs) for E. coli, Giardia, and rotavirus were 6.9, 3.6, and 4.7 respectively based on laboratory testing (T. Clasen, Naranjo, et al., 2009). The other group received the placebo filter, whose LRVs were set to 1.05 for all three pathogens based on thermotolerant coliform removal measured in the field trial (Boisson et al., 2010). The model is more completely described by the flowchart in Figure 3.2; the steps of the flowchart are explained in detail below.

Figure 3.2. Simulation model flowchart

Step 1: Parameter entry

A simulation run begins by reading 28 model parameters (Table 3.1), which were estimated from the published scientific literature and are discussed in greater detail in chapter 7. They remain constant for every simulation run.

Table 3.1. Fixed parameter values used in the QMRA model of the Lifestraw RCT

| Description of parameter values | Value | Reference |
|---|---|---|
| Morbidity ratios (proportion of infected who are symptomatic) | | |
| Escherichia coli | 0.214 | (Vergara et al., 1996) |
| Giardia | 0.590 | (Peréz Cordón et al., 2008) |
| Rotavirus | 0.397 | (Fischer et al., 2002) |
| Duration of infection | | |
| Escherichia coli (gamma distribution, mean 3 days) | shape = 1.775, scale = 1.690 | (Estrada-Garcia et al., 2009) |
| Giardia (gamma distribution, mean 11 days) | shape = 3.206, scale = 3.431 | (Kent et al., 1988) |
| Rotavirus (uniform distribution; mean 2.5 days) | Range 1-4 days | (Kapikian et al., 1983) |
| Shape parameter for all gamma distributions of pathogen type concentrations (a) | 1.85 | (Boisson et al., 2010) |
| Period of immunity for all pathogens | 7 days | |
| No. children under 5 years of age, intervention group | 85 | (Boisson et al., 2010) |
| No. children under 5 years of age, placebo group | 105 | (Boisson et al., 2010) |
| Longitudinal prevalence of reported diarrhea for each group | | |
| Intervention (LPIrad) | 0.0749 | (Boisson et al., 2010) |
| Placebo (LPPrad) | 0.0896 | (Boisson et al., 2010) |
| Longitudinal prevalence ratio of reported diarrhea (LPRrad) | 0.836 | (Boisson et al., 2010) |
| Water ingestion | 1.178 L/day | (Akpata, 2004) |
| Log10 reduction values (LRVs), intervention group | | |
| Escherichia coli | 6.9 | (T. Clasen, Naranjo, et al., 2009) |
| Giardia | 3.6 | (T. Clasen, Naranjo, et al., 2009) |
| Rotavirus | 4.7 | (T. Clasen, Naranjo, et al., 2009) |
| LRVs, placebo group, all 3 pathogens | | |
| Calibration step | 1.05 | (Boisson et al., 2010) |
| Estimation step | 0 | |
| Dose response function parameters (Anon, 2012) | | |
| Rotavirus; beta-Poisson parameters | α = 0.2531, N50 = 6.171 | (Haas et al., 1993; Ward et al., 1986) |
| Giardia; exponential k parameter | 0.0198 | (Rendtorff, 1954; J B Rose et al., 1991) |
| E. coli (enteroinvasive); beta-Poisson parameters | α = 0.155, N50 = 2.11×10^6 | (H L DuPont et al., 1971) |
| Chance of remembering diarrhea >2 days in the past | 0.54 | (Zafar et al., 2010) |
| Compliance with device use: chance of using device on a given day | | |
| Calibration step | 0.65 | (Boisson et al., 2010) |
| Estimation step | 0, 0.65, or 1.00 | |
| Compliance with device use: if using device on a given day, proportion of water treated | | |
| Calibration step | 2/3 or 1/3 | |

(a) The scale parameters for the gamma distributions of pathogen types are determined by the mean concentration of pathogen types, which is randomly sampled from a uniform distribution during calibration (Table 3.2).

Step 2: Parameter values inferred through calibration

There are 4 other parameters (Table 3.2) which are unknown and were therefore inferred during the calibration process:

1. Baseline longitudinal prevalence of reported non-waterborne diarrheal disease in 'children' (under five years of age; LPrNW). This value was assumed to be the background LP of diarrhea due to the sum of all transmission pathways except drinking water, as well as all non-communicable causes of diarrhea;
2. Mean concentration of diarrheagenic Escherichia coli (representing bacteria) in untreated drinking water (bacteria/L);
3. Mean concentration of Giardia (representing protozoa) in untreated drinking water (cysts/L);
4. Mean concentration of rotavirus (representing viruses) in untreated drinking water (virions/L).

These values are necessary for the model, but no pathogen data exist for these water sources, nor are there clinical data describing the etiology of diarrheal disease where the Lifestraw RCT was conducted. The drinking water sources used in the Lifestraw RCT study area were unimproved, and consisted mainly of surface water and unprotected springs; however, the water was abundant and naturally clear (Boisson et al., 2010). Diarrheagenic E. coli, Giardia, and rotavirus were chosen to represent bacteria, protozoa, and viruses because they are common causes of diarrheal disease in developing countries (Lanata & W. Mendoza, 2002), and they represent the three major taxa of diarrheal pathogens.

The prior distributions for the mean concentrations of these pathogen types were defined as uniform distributions, with ranges given in Table 3.2. Posterior distributions for these values were obtained by running multiple simulations with varying mean concentrations. Model runs returned three key outcomes: the longitudinal prevalence of reported diarrhea for the intervention (LPIrad) and placebo (LPPrad) groups, and their ratio, the longitudinal prevalence ratio (LPRrad). These three outcomes were also estimated by the Lifestraw RCT (Table 3.3). If the outcomes from a model run fell within the 95% confidence intervals for the outcomes estimated by the Lifestraw RCT, the mean pathogen type concentrations and the LPrNW were retained for use in the estimation step.

Table 3.2. Ranges for stochastically varying parameters
Uniform distributions used to determine the values of the stochastically varying parameters for each simulation model run during calibration

| Description | Lower limit | Upper limit (low calibration compliance) | Upper limit (medium calibration compliance) |
|---|---|---|---|
| Mean concentration per L, pathogenic E. coli in untreated drinking water | 0 | 7.0×10^4 | 8.0×10^4 |
| Mean concentration per L, Giardia cysts in untreated drinking water | 0 | 0.95 | 1.3 |
| Mean concentration per L, rotavirus in untreated drinking water | 0 | 0.14 | 0.18 |
| Baseline non-waterborne diarrhea longitudinal prevalence (LPrNW)* | 0 | 0.0972 | 0.0972 |

* The upper limit for baseline diarrhea longitudinal prevalence is the upper limit of the 95% CI for LP in the <5 year old intervention group in the Lifestraw RCT (Boisson et al., 2010).

Step 3: Initiating each run; establishing equilibrium waterborne infection

Once all parameters were available, the simulation run could begin. It was assumed that at the beginning of a simulated intervention study, each participant was in one of three states (susceptible, infected, or immune), and that the proportions of participants among these states were in equilibrium. In the model, this equilibrium was established by initially infecting every child with all three pathogen types and assigning infection durations randomly from a uniform distribution ranging from 0 to 50 days ('equilibration infections'). This range was chosen in order to prevent extreme oscillations in the proportion infected. In earlier versions of the model, these oscillations arose because the entire population was initially susceptible, leading to immediate infection of much of the population, followed by simultaneous recovery, followed by simultaneous infection. By using a wide range of infection durations for the equilibration infections, children became susceptible at different times during the equilibration period, thereby preventing large oscillation artifacts. After children recovered from an equilibration infection, they were immune for 7 days, after which they could be reinfected through exposure to contaminated drinking water; infection durations were subsequently assigned from the distributions described in Table 3.1.
Equilibrium was reached when the temporal trend of the infection prevalence was flat; the infection prevalence at equilibrium stochastically oscillated around the mean prevalence. The model reached equilibrium in approximately 60 days (e.g., Figure 3.4 and Figure 3.5); the precise time depended on the mean concentrations of pathogens. The simulated intervention study began on day 128.

Step 4: Calculating daily doses of marker pathogens

The model included 85 children in the intervention group and 105 in the placebo group, consistent with the Lifestraw RCT (Boisson et al., 2010). The concentration of each pathogen type in the source water was sampled from a gamma distribution (Table 3.1) for each child on each day. The shape parameter of these distributions was always 1.85; the scale parameter was the randomly selected mean concentration for a particular pathogen, divided by the shape parameter. The shape parameter was obtained by fitting a gamma distribution to the thermotolerant coliform (TTC) counts measured by the Lifestraw RCT in untreated source water, excluding high outliers (over the detection limit of 30,000 CFU / 100 mL; 3.8% of the data). Each child’s daily dose of a particular pathogen type was determined using the LRVs (Table 3.1) attributable to the device that child was using:

$$\text{Daily dose} = c\,d\,\left[(1 - w) + w \cdot 10^{-r}\right] \tag{3.1}$$

where c is the concentration per liter of a pathogen type in untreated water (sampled from a gamma distribution), w is the proportion of water treated (which varies depending on compliance), r is the LRV (which varies depending on whether the intervention filter, the placebo filter, or no filter is being used), and d is the liters of water consumed daily.

Step 5: Dose response functions

The daily doses of pathogen types were converted to responses (i.e., daily probabilities) per susceptible person of becoming infected using dose response functions (Anon, 2012; Haas et al., 1999).
These functions were obtained using results from studies in which adult volunteers were fed widely varying doses of pathogens, and monitored for development of infection. An exponential dose response function was used for Giardia, and a beta-Poisson dose response function was used for E. coli and rotavirus:

Exponential:
$$\text{Response} = 1 - e^{-kd} \tag{3.2}$$

Beta-Poisson:
$$\text{Response} = 1 - \left[1 + \frac{d}{N_{50}}\left(2^{1/\alpha} - 1\right)\right]^{-\alpha} \tag{3.3}$$

where d is the dose (number of pathogens ingested per day in drinking water), k is the parameter for the exponential model, and α and N50 are the parameters for the beta-Poisson model (N50 is the dose at which 50% of the population exhibits the response). As α approaches infinity, the beta-Poisson dose response function approaches the exponential dose response function (Haas et al., 1999). A graph of the dose response functions is found in Figure 3.3 (Figure 4.6a, page 138, presents the same information in log-log scale).

Figure 3.3. Comparison of dose response functions

Step 6: Assignment of infection

New infections were randomly assigned to susceptible children each day, according to the response probabilities obtained from the dose response functions.

Step 7: Assignment of infection duration, recovery, and immunity

If a child was infected, they were assigned an infection duration (Table 3.1, page 93) sampled from a gamma distribution (E. coli or Giardia) or a uniform distribution (rotavirus). Infections due to the 3 pathogens were tracked independently within each child. Functionally, this was done using a matrix with 1 row per child and 1 column per pathogen. The entries of the matrix were numbers of days; positive numbers denoted time remaining until recovery and negative numbers denoted elapsed time since recovery. Each day, 1 was subtracted from all entries, and an individual recovered when an entry reached 0. Following recovery, the individual was immune for 7 days; after this, they were once again susceptible and could be reinfected.
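Steps 4 through 7 can be illustrated with a short Python sketch. It is not the Octave program used for the dissertation; it only restates Equations 3.1-3.3 and the day-counter bookkeeping of Step 7, using the values in Table 3.1. The function names, the rounding of infection durations up to whole days, the exact immunity boundary (counter values 0 to -6), and the order of operations within a day are assumptions made for illustration.

```python
import math
import random

WATER_L_PER_DAY = 1.178      # Table 3.1, water ingestion
GIARDIA_K = 0.0198           # Table 3.1, exponential k parameter
BETA_POISSON = {             # Table 3.1, (alpha, N50)
    "ecoli": (0.155, 2.11e6),
    "rotavirus": (0.2531, 6.171),
}
IMMUNE_DAYS = 7              # Table 3.1, period of immunity


def daily_dose(c, w, r, d=WATER_L_PER_DAY):
    """Equation 3.1: c = pathogens/L, w = proportion treated, r = LRV, d = L/day."""
    return c * d * ((1.0 - w) + w * 10.0 ** (-r))


def exponential_response(dose, k=GIARDIA_K):
    """Equation 3.2."""
    return 1.0 - math.exp(-k * dose)


def beta_poisson_response(dose, alpha, n50):
    """Equation 3.3."""
    return 1.0 - (1.0 + (dose / n50) * (2.0 ** (1.0 / alpha) - 1.0)) ** (-alpha)


def infection_probability(pathogen, dose):
    """Daily probability of infection for one susceptible child."""
    if pathogen == "giardia":
        return exponential_response(dose)
    alpha, n50 = BETA_POISSON[pathogen]
    return beta_poisson_response(dose, alpha, n50)


def draw_duration(pathogen, rng):
    """Infection duration in days (Table 3.1); gammavariate takes (shape, scale)."""
    if pathogen == "ecoli":
        days = rng.gammavariate(1.775, 1.690)
    elif pathogen == "giardia":
        days = rng.gammavariate(3.206, 3.431)
    else:
        days = rng.uniform(1.0, 4.0)
    return max(1, math.ceil(days))  # assumption: round up to whole days


def state(timer):
    """Interpret one matrix entry; the 0 to -6 immune window is an assumption."""
    if timer > 0:
        return "infected"
    if timer > -IMMUNE_DAYS:
        return "immune"
    return "susceptible"


def step_child(timers, doses, rng):
    """Advance one child's row by one day; both arguments are dicts keyed by pathogen."""
    for pathogen in timers:
        timers[pathogen] -= 1
        if state(timers[pathogen]) == "susceptible":
            if rng.random() < infection_probability(pathogen, doses[pathogen]):
                timers[pathogen] = draw_duration(pathogen, rng)
```

A useful check on Equation 3.3 is that a dose equal to N50 gives a response of 0.5 regardless of α, which follows directly from the definition of N50.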
The children in the placebo group were identical to the children in the intervention group, except that the log10 reductions attributable to the placebo device were lower (Table 3.1, page 93), and they therefore ingested higher doses of pathogens.

Step 8: Surveying the population about reported diarrhea

Although the model explicitly tracked infection, the Lifestraw RCT measured reported disease. Since some infections are asymptomatic or unreported, the model had to simulate the process of people reporting diarrhea in order to output disease measures comparable to the Lifestraw RCT. Reporting of diarrhea was simulated via 12 monthly surveys of the population. The mean of the 12 monthly estimates of prevalence during the year-long study period estimated the longitudinal prevalence (LP) of diarrhea. More generally, an LP is the proportion obtained by dividing the person-time affected by the total person-time observed. Each simulated survey estimated the prevalence of diarrhea by allowing each person to report whether any diarrheal illness occurred over the previous 7 days, corresponding to the actual survey process during the Lifestraw RCT. Reporting of diarrhea by people in the simulated community followed these rules:

1. the most recent infection for a particular child was determined, i.e., the infection with the longest duration remaining, or if not currently infected, the infection that resolved most recently;
2. that infection was stochastically determined to be symptomatic or asymptomatic, using the morbidity ratio (Table 3.1, page 93) as the probability of symptoms given infection;
3. asymptomatic infections were never reported;
4. symptomatic illness on that day or the previous 2 days was always reported;
5. symptomatic illness during the previous 3-7 days had a 54% chance of being remembered (Table 3.1, page 93); if the illness was remembered, it was reported.

The 5th rule came from published recall bias measurements (Zafar et al., 2010).
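The five reporting rules can be sketched in Python as follows. This is an illustrative reading of the rules, not the dissertation's Octave code: each child is represented by a dictionary of day counters (positive = days of infection remaining, zero or negative = days elapsed since recovery), and the mapping from a counter value to "days since last illness" is an assumption. The function names and the `draw` argument (any callable returning a uniform random number in [0, 1)) are hypothetical.

```python
import random

RECALL_PROB = 0.54           # Table 3.1, chance of remembering diarrhea >2 days past
MORBIDITY = {                # Table 3.1, morbidity ratios
    "ecoli": 0.214,
    "giardia": 0.590,
    "rotavirus": 0.397,
}
ALWAYS_REPORTED_DAYS = 2     # rule 4
RECALL_WINDOW_DAYS = 7       # each survey asks about the previous 7 days


def reports_diarrhea(timers, draw=random.random):
    """Apply rules 1-5 to one child; timers maps pathogen -> day counter."""
    # Rule 1: the largest counter is the infection with the most time
    # remaining, or failing that the one that resolved most recently.
    pathogen = max(timers, key=timers.get)
    days_ago = max(0, -timers[pathogen])  # assumption: 0 means ill today
    if days_ago > RECALL_WINDOW_DAYS:
        return False                      # outside the survey window
    # Rules 2 and 3: asymptomatic infections are never reported.
    if draw() >= MORBIDITY[pathogen]:
        return False
    # Rule 4: illness today or in the previous 2 days is always reported.
    if days_ago <= ALWAYS_REPORTED_DAYS:
        return True
    # Rule 5: older illness is reported only if remembered.
    return draw() < RECALL_PROB


def survey_prevalence(population, draw=random.random):
    """Proportion of children reporting diarrhea in one survey."""
    reports = [reports_diarrhea(timers, draw) for timers in population]
    return sum(reports) / len(reports)


def longitudinal_prevalence(survey_prevalences):
    """Mean of the survey prevalences (12 surveys in the simulated trial)."""
    return sum(survey_prevalences) / len(survey_prevalences)
```

Passing the random-number source as an argument keeps the rules easy to check by hand: with a fixed draw of 0.55, for example, a Giardia infection is symptomatic (0.55 < 0.590) but an illness 5 days in the past is not remembered (0.55 ≥ 0.54).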
The prevalence of diarrhea reported by a particular survey was the proportion of people reporting diarrhea according to the rules above. Averaging the results from the 12 simulated surveys gave an estimate of the LP.

Step 9: Determining model outcomes corresponding to the Lifestraw RCT

To clarify this process, some terminology is described here. There were several different longitudinal prevalences (LPs) which were tracked or output by the model. LPs were determined (and subscripted) according to 2 categories: 1) intervention group (I) or placebo group (P); and 2) waterborne infection (wi), or waterborne diarrhea (wd), or any diarrhea (ad). In addition, reported diarrhea was prefixed with r. For example, the longitudinal prevalence in the intervention group of any reported diarrhea was LPIrad. Another parameter, LPrNW, was defined as the longitudinal prevalence of reported diarrhea acquired by non-waterborne routes. LPrNW was not affected by water treatment.

Total longitudinal prevalence of reported diarrhea for the intervention group (LPIrad) or the placebo group (LPPrad) was obtained by combining LPrNW with the longitudinal prevalence of reported waterborne diarrhea (LPIrwd or LPPrwd). Since LPs are proportions and a person might simultaneously carry infection acquired from waterborne or non-waterborne routes, addition was carried out in this manner to avoid double-counting of the intersection between the two types of routes:

$$\mathrm{LPIrad} = \mathrm{LPIrwd} + \mathrm{LPrNW}\,(1 - \mathrm{LPIrwd}) \quad\text{or}\quad \mathrm{LPPrad} = \mathrm{LPPrwd} + \mathrm{LPrNW}\,(1 - \mathrm{LPPrwd}) \tag{3.4}$$

The effectiveness of the LFF was measured by a longitudinal prevalence ratio (LPR). To correspond with the Lifestraw RCT, the primary outcome measure was the LPR of any reported diarrhea (LPRrad), calculated as follows:

$$\mathrm{LPRrad} = \mathrm{LPIrad} / \mathrm{LPPrad} \tag{3.5}$$

Analogous LPs and LPRs could also be calculated for waterborne infection, waterborne diarrhea, or reported waterborne diarrhea.
For example, since the model tracked waterborne infection daily, the LPs for waterborne infection in the intervention (LPIwi) and placebo (LPPwi) groups could be used to generate LPRs for waterborne infection (LPRwi), as if the entire population was observed with perfect accuracy (Figure 3.10, page 113).

Step 10: Repetition of calibration runs

During the calibration step, the simulation model was run 100,000 times. Each run randomly selected different pathogen concentrations and a background diarrheal longitudinal prevalence (LPrNW) from a uniform distribution (Table 3.2, page 96). The upper limits of these uniform distributions were obtained from a simplified calibration process carried out before the actual calibration step. This process used the idea that the maximum possible concentration of a particular pathogen type is the concentration that yields the maximum LPPrad consistent with the Lifestraw RCT if the other two pathogen types are absent. For a particular pathogen type, these values were estimated by examining results from 12 successive model runs. Each of the 12 runs progressively increased the concentration of a single pathogen type, with concentrations of the other 2 pathogens set to 0. The smallest concentration yielding an LPPrad above the 95% confidence limit for the placebo group (i.e., an LPPrad > 0.11; Table 3.3) provided the upper limit of the uniform distribution for that pathogen type's concentration (Table 3.2, page 96). The appropriateness of these upper limits was checked by visually examining scatterplots of pathogen concentration by LPPrad after the full calibration step had completed (not shown). Parameter values and the results from all runs in the calibration step were saved. The time course of infection and reported diarrhea for two example simulation runs are in Figures 3.4 and 3.5 (page 104).
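The calibration loop amounts to sampling from the uniform priors, running the model, and keeping the parameter sets whose outcomes fall inside the trial's 95% confidence limits. A minimal Python sketch follows, using the low calibration compliance upper limits from Table 3.2, the limits from Table 3.3, and Equations 3.4 and 3.5. The `run_model` argument is a hypothetical stand-in for the full QMRA simulation (assumed here to return the reported waterborne diarrhea LPs for the intervention and placebo groups); all function and key names are illustrative and do not come from the dissertation's code.

```python
import random

# 95% confidence limits from Table 3.3.
LIMITS = {
    "LPIrad": (0.0526, 0.0972),
    "LPPrad": (0.0673, 0.112),
    "LPRrad": (0.61, 1.14),
}

# Upper limits of the uniform priors, low calibration compliance (Table 3.2).
PRIOR_UPPER = {
    "ecoli": 7.0e4,
    "giardia": 0.95,
    "rotavirus": 0.14,
    "lp_nw": 0.0972,
}


def combine_lp(lp_rwd, lp_nw):
    """Equation 3.4: combine waterborne and non-waterborne reported diarrhea."""
    return lp_rwd + lp_nw * (1.0 - lp_rwd)


def within(value, limits):
    low, high = limits
    return low <= value <= high


def is_consistent(lpi_rwd, lpp_rwd, lp_nw):
    """True if LPIrad, LPPrad, and LPRrad all fall within the Table 3.3 limits."""
    lpi_rad = combine_lp(lpi_rwd, lp_nw)
    lpp_rad = combine_lp(lpp_rwd, lp_nw)
    if lpp_rad == 0.0:
        return False  # ratio undefined; cannot match the trial
    lpr_rad = lpi_rad / lpp_rad  # Equation 3.5
    return (within(lpi_rad, LIMITS["LPIrad"])
            and within(lpp_rad, LIMITS["LPPrad"])
            and within(lpr_rad, LIMITS["LPRrad"]))


def sample_parameters(rng):
    """Draw the four unknown parameters from their uniform priors."""
    return {name: rng.uniform(0.0, upper) for name, upper in PRIOR_UPPER.items()}


def calibrate(run_model, n_runs, seed=0):
    """Keep the parameter sets whose simulated outcomes match the trial.

    run_model(params, rng) is a hypothetical callable returning
    (LPIrwd, LPPrwd) for one simulated trial.
    """
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_runs):
        params = sample_parameters(rng)
        lpi_rwd, lpp_rwd = run_model(params, rng)
        if is_consistent(lpi_rwd, lpp_rwd, params["lp_nw"]):
            accepted.append(params)
    return accepted
```

As a worked instance of Equation 3.4, a waterborne LP of 0.05 combined with an LPrNW of 0.04 gives 0.05 + 0.04 × 0.95 = 0.088, slightly less than the simple sum of 0.09 because the overlap is not counted twice.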
Parameter combinations are selected that yield LPIrad, LPPrad, and LPRrad values within the 95% confidence intervals reported in the RCT trial (Table 3.3). These parameter combinations are then reused in the estimation step of the process. Among the three pathogen types, high concentrations of one pathogen were associated with lower concentrations of the other pathogens. This occured because the model was calibrated to match the level of reported disease seen in the Lifestraw RCT, without reference to the particular pathogen types. Therefore higher levels of disease from one pathogen type must be balanced by lower levels of disease from other pathogen types. Table 3.3. Longitudinal prevalence measures from the Lifestraw field trial Lower limit Measure Estimate (95% CI) Upper limit (95% CI) Longitudinal prevalence, intervention group (LPIrad) 0.0749 0.0526 0.0972 Longitudinal prevalence, placebo group (LPPrad) 0.0896 0.0673 0.112 0.84 0.61 1.14 Longitudinal prevalence ratio, intervention/placebo (LPRrad) * During calibration, for a simulation model run to be considered consistent with the Lifestraw RCT, all 3 measures must fall between the lower and upper limits. 3.3.3. Example runs of the model Graphical output from two runs of the model is shown in Figure 3.4 and Figure 3.5. They display the time courses of two calibration runs of the QMRA model. Figure 3.4 has higher 103 waterborne infection and reported waterborne disease levels than observed in the Lifestraw RCT, and Figure 3.5 is consistent with the Lifestraw RCT. The simulated surveys of diarrhea are shown by the purple × symbols; equilibration occurred during the first 128 days, before the first simulated survey. Each survey asked whether any diarrhea was remembered during the previous 7 days. In contrast, the lines signify daily prevalence of infection, as simulated daily over the course of the model run. Proportion affected (placebo) Proportion affected (intervention) Figure 3.4. 
Example run of the model, with higher infection levels than the Lifestraw RCT
[Figure: two panels (intervention, placebo) showing proportion affected (0 to 1) against time in simulated days (0 to 600). Series: any infection, E. coli infection, Giardia infection, rotavirus infection, and reported diarrhea in the prior week (LPIrwd or LPPrwd).]

Figure 3.5. Example run of the model, infection levels consistent with the Lifestraw RCT
[Figure: same panels, axes, and series as Figure 3.4.]

3.3.4. Analytical process

The simulation model described above was implemented in two steps: calibration and estimation (Figure 3.1, page 90). The calibration step estimated the four unknown parameter values (Table 3.2, page 96) by constraining the model outputs to the results of the Lifestraw RCT (Table 3.3, page 103). The calibration step included 100,000 runs. It was conducted for two calibration compliance conditions, low and medium, because the proportion of water treated by filter users during the Lifestraw RCT was unknown but believed to be substantially less than 1. Additionally, the parameters in Table 3.2 (page 96) were randomly sampled from uniform distributions for each simulation run. If a model run yielded results consistent with the Lifestraw RCT, its set of four parameter values (Table 3.2) was used in the estimation step. A run was considered consistent if the LPIrad, LPPrad, and LPRrad all fell within the 95% confidence limits reported from the Lifestraw RCT (Table 3.3, page 103). The estimation step determined effectiveness given: 1) a perfect placebo, and 2) low, medium, high, or perfect estimation compliance.
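The accept-or-reject logic of the calibration step can be sketched as below. The interval limits are those reported in Table 3.3; everything else is an assumption made for illustration: `simulate` is a hypothetical stand-in for one full model run, the parameter names are invented, and a lower bound of zero is assumed for each uniform distribution because only the upper limits are described in the text.

```python
import random

# 95% confidence limits from Table 3.3 (Lifestraw RCT).
LIMITS = {
    "LPIrad": (0.0526, 0.0972),
    "LPPrad": (0.0673, 0.112),
    "LPRrad": (0.61, 1.14),
}


def is_consistent(outputs, limits=LIMITS):
    """True if every simulated measure lies within its 95% limits."""
    return all(lo <= outputs[name] <= hi for name, (lo, hi) in limits.items())


def calibrate(simulate, upper_bounds, n_runs, seed=0):
    """Draw each parameter from Uniform(0, upper bound), run the model,
    and keep only the parameter sets whose outputs match the trial.
    The lower bound of 0 is an assumption of this sketch."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_runs):
        params = {k: rng.uniform(0.0, hi) for k, hi in upper_bounds.items()}
        if is_consistent(simulate(params)):
            accepted.append(params)
    return accepted
```

In this framing the accepted parameter sets are the ones carried forward to the estimation step, so a low acceptance rate simply means that the trial data exclude most of the sampled parameter space.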
The estimation step consisted of ten model runs for each parameter set that was consistent with the Lifestraw RCT (totaling >2000 runs) for each of four estimation compliance levels, assuming a perfect placebo. The estimation step simulated measurements of LPIrad and LPPrad, and their ratio LPRrad, which were calculated in the same way as in the calibration step (number of monthly person-surveys reporting diarrhea during the previous 7 days, divided by the total number of person-surveys). Differing compliance values were used in the calibration and estimation steps. In the calibration step, ‘calibration compliance’ referred to a set of compliance values describing what probably occurred during the actual Lifestraw RCT; these were necessary to calibrate the model to the four parameter values in Table 3.2 (page 96). In the estimation step, ‘estimation compliance’ referred to a larger set of compliance values that allow the model to make predictions for several different scenarios. Calibration compliance and estimation compliance were considered simultaneously in the results because different calibration compliance levels led to different results in the estimation step. The QMRA model was programmed in Octave 3.2; the code also runs in MATLAB 7.11. The program code is in the appendices (chapter 9, page 239). Results were analyzed using R 2.11; the two-tailed Wilcoxon rank sum test (α = 0.05) was used to compare distributions.

3.4. Results

3.4.1. Calibration step

Out of the 100,000 simulation runs in the calibration step, 210 were consistent with the Lifestraw RCT based on the criteria in Table 3.3 (page 103) and assuming low calibration compliance with water filtration. Repeating the calibration step assuming medium calibration compliance yielded 258 consistent runs. Calibration estimated distributions for two outputs:

1. The longitudinal prevalence ratio (LPRrad) distributions were similar by level of calibration compliance (Figure 3.6).
The estimate from the Lifestraw RCT falls within the central 95% of the distributions, suggesting consistency between the model and the Lifestraw RCT. The median LPRrad estimated by the model differed from the Lifestraw RCT because the Lifestraw RCT was a single experiment, whereas each distribution of LPRrad represented over 200 simulated experiments.

2. Simulated concentrations of pathogen types in untreated water (Figure 3.7) were higher for medium calibration compliance compared with low calibration compliance, which was necessary to produce LPIrad and LPPrad values consistent with the RCT. In individual calibration runs, higher concentrations of one pathogen type were associated with lower concentrations of the other two pathogen types. The median diarrheagenic E. coli concentration predicted by the model is lower than the median thermotolerant coliform (TTC) concentration measured in untreated drinking water in the Lifestraw RCT (Figure 3.7). This is plausible since E. coli are a subset of TTC, and not all E. coli are pathogenic.

Figure 3.6. Distributions of LPRs consistent with the Lifestraw RCT
[Figure: boxplots of the longitudinal prevalence ratio (LPRrad) for low and medium calibration compliance, with reference lines for the estimated LPR and its 95% CI from the RCT.]
Distributions of longitudinal prevalence ratios from simulation runs consistent with the Lifestraw RCT from the calibration step for low (65% of children treat 1/3 of their drinking water) and medium (65% of children treat 2/3 of their drinking water) calibration compliance. These distributions differed significantly (Wilcoxon rank sum test, p = 0.02). Boxplots include: median (heavy line), 25th and 75th percentiles (lower and upper limits of the box), 2.5th and 97.5th percentiles (X symbols), and range (whiskers).

[Figure: three panels on log10 axes showing simulated E. coli or actual TTC per liter, simulated Giardia per liter, and simulated rotavirus per liter, each for low and medium calibration compliance.]
Figure 3.7.
Distributions of simulated microbial concentrations
Simulated distributions of microbial concentrations per liter of untreated water, consistent with the Lifestraw RCT. Distributions were obtained from the calibration step assuming low (65% of children treat 1/3 of their drinking water) or medium (65% of children treat 2/3 of their drinking water) calibration compliance. Thermotolerant coliforms (TTC) measured by the Lifestraw RCT are also shown for comparison with simulated E. coli. For all three pathogen types, the concentration distributions differed by calibration compliance (Wilcoxon rank sum test, p < 0.001).

3.4.2. Estimation step

This step estimated LFF effectiveness compared to a perfect placebo for low, medium, high, and perfect estimation compliance, given low or medium calibration compliance. Estimation compliance was a major driver of effectiveness. For example, under low, medium, high, and perfect estimation compliance, the median LPRrad was 0.86, 0.70, 0.50, and 0.13 respectively, regardless of calibration compliance (Figure 3.8). Additionally, LPIrad was significantly greater with medium calibration compliance (compared to low calibration compliance), for all levels of estimation compliance except perfect (Figure 3.9). This difference occurred because both calibration steps (low and medium calibration compliance) were constrained to the same RCT result; if calibration compliance decreases, the pathogen concentrations must also decrease for the model to remain consistent with the Lifestraw RCT. During the estimation step, the higher LPPrad for medium calibration compliance was due to lack of protection from the perfect placebo; therefore, the LPIrad values were also higher.
The differences between low and medium calibration compliance decrease as estimation compliance increases.

Figure 3.8. LPR distributions for differing compliance assumptions and placebo behavior
[Figure: boxplots of longitudinal prevalence ratios of reported diarrhea (LPRrad, axis 0.0 to 1.5) in two panels, low and medium calibration compliance. Within each panel the x-axis shows the calibration step (marked *) followed by low, medium, high, and perfect estimation compliance with a perfect placebo. Reference lines show the estimated LPR and its 95% CI from the Lifestraw RCT.]
Distributions of longitudinal prevalence ratios of reported diarrhea (LPRrad), estimation step for the intervention group, by calibration compliance, estimation compliance, and placebo type. *: LPRs from the calibration step, and therefore with an imperfect placebo. a, b: These pairs of distributions illustrate the effect of the imperfect placebo on LPRrad, depending on low (a) or medium (b) calibration compliance. They differed significantly by imperfect vs. perfect placebo (Wilcoxon rank sum test, p < 2×10^-16).

Figure 3.9. Longitudinal prevalence distributions under differing compliance assumptions
[Figure: boxplots of longitudinal prevalence of reported diarrhea (LPIrad, LPPrad; axis 0.00 to 0.15) for low and medium calibration compliance, by estimation compliance (perfect placebo, low, medium, high, perfect). Reference lines show the estimated LP and its 95% CI from the RCT with the imperfect placebo.]
Distributions of reported longitudinal prevalence (LP) of diarrhea in the estimation step for the intervention group, by calibration compliance and estimation compliance. a, b, c, d: Each of these pairs of distributions differed significantly by calibration compliance (Wilcoxon rank sum test, p < 2×10^-16).
Compliance levels: Perfect placebo (100% of children treat 0% of their drinking water), low compliance (65% of children treat 1/3 of their drinking water), medium compliance (65% of children treat 2/3 of their drinking water), high compliance (65% of children treat 100% of their drinking water), perfect compliance (100% of children treat 100% of their drinking water).

Calibration compliance altered the estimated effect of a perfect placebo (Figure 3.8). Adjustment for the imperfect placebo increased the estimated preventable fraction of disease by 8 percentage points assuming low calibration compliance (median LPRrad: 0.94 and 0.86 for imperfect and perfect placebo, respectively). Assuming medium calibration compliance, the preventable fraction increased by 22 percentage points (median LPRrad: 0.92 and 0.70 for imperfect and perfect placebo, respectively). All three pathogen types contributed substantially to infection and disease (Figure 3.10). Mixed infections accounted for about 2% of infections.

Figure 3.10. Longitudinal prevalence of waterborne infection in the estimation step
[Figure: longitudinal prevalence of waterborne infection (LPwi, axis 0.00 to 0.20) by estimation compliance (0, L, M, H, P), in paired panels for low and medium calibration compliance, for each pathogen category: any, virus, protozoa, bacteria, and mixed.]
Longitudinal prevalence of waterborne infection (LPwi) during the estimation step with a perfect placebo, by pathogen type. LPwi is higher for medium calibration compliance, compared with low calibration compliance. Compliance levels: None (0; 0% of children treating 0% of their drinking water, i.e., perfect placebo), low compliance (L; 65% of children treat 1/3 of their drinking water), medium compliance (M; 65% of children treat 2/3 of their drinking water), high compliance (H; 65% of children treat 100% of their drinking water), perfect compliance (P; 100% of children treat 100% of their drinking water).
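The percentage-point changes reported for the imperfect placebo follow from treating the preventable fraction as one minus the longitudinal prevalence ratio. The short Python sketch below reproduces that arithmetic from the median LPRrad values quoted in this section; the function names are illustrative and are not taken from the dissertation's code.

```python
def preventable_fraction(lpr):
    """Fraction of reported diarrhea prevented, taken as 1 - LPR."""
    return 1.0 - lpr


def placebo_adjustment_points(lpr_imperfect, lpr_perfect):
    """Change in preventable fraction, in percentage points, when the
    imperfect placebo is replaced by a perfect one."""
    return 100.0 * (preventable_fraction(lpr_perfect)
                    - preventable_fraction(lpr_imperfect))


low = placebo_adjustment_points(0.94, 0.86)     # low calibration compliance
medium = placebo_adjustment_points(0.92, 0.70)  # medium calibration compliance
```

Under this definition the adjustment is about 8 points for low and 22 points for medium calibration compliance, matching the values stated in the text.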
3.5. Discussion

Results from an epidemiological study may only be relevant to the ecological and social conditions of the communities studied. However, quantitative microbial risk assessment (QMRA) models that are calibrated to epidemiologic data can predict risk under scenarios that were not actually studied, known as counterfactual scenarios (J. J. Kim et al., 2007; Tuite et al., 2010). Calibrating to epidemiological data can facilitate the use of QMRA models where the environmental contamination data required by those models are difficult to measure. There are many situations where epidemiological data can be used to provide parameters to QMRA models, such as in a developing country context where direct measures of disease risk are frequently measured but environmental contamination data are rare. In this context, the modeling framework described herein was used to generalize results from the Lifestraw RCT (Boisson et al., 2010). Generalizing across different compliance scenarios, this model quantified the relationship between compliance and HWT effectiveness. Our analysis suggests that perfect compliance in the Lifestraw RCT communities would yield an LPRrad of 0.13, suggesting that 87% of reported diarrhea could be prevented by consumption of treated water. This result, suggesting that only 13% of diarrhea in the Lifestraw RCT community was caused by non-waterborne transmission, is consistent with a HWT trial in a refugee camp in which there was 95% compliance and an 83% reduction in diarrhea prevalence (Doocy & Burnham, 2006), as well as numerous field trials of ceramic filters indicating risk ratios < 0.5 (T. Clasen, I. G. Roberts, et al., 2009; Hunter, 2009). Noncompliance will result in an underestimate of protective effect compared to a situation in which there is perfect compliance. Additionally, these trials and our risk assessment model do not account for the interdependency of other transmission routes.
Thus, it is unclear whether this model accurately assesses the proportion of diarrhea associated with drinking water. The reduction in effectiveness due to the imperfect placebo also depended on the assumed calibration compliance level during the Lifestraw RCT. Assuming low calibration compliance, the imperfect placebo decreased effectiveness (preventable fraction) by 8 percentage points. Effectiveness decreased 22 percentage points assuming medium calibration compliance (Figure 3.8, page 111). Compliance has several components. This analysis used a simple formulation in which each child had a daily probability of using the device; if the device was used, a fixed proportion of water was treated. In reality, some people may be highly consistent users or nonusers, while others might use the device occasionally (e.g., drinking untreated water while working outside the home). Effectiveness might differ given perfect use by 50% of a population, compared to an entire population treating 50% of their water. These distinctions are further explored in chapter 4. These results are consistent with the reasonable expectation that increasing compliance should increase effectiveness. Recent systematic reviews suggest a positive but not statistically significant relationship between compliance and effectiveness, perhaps due to difficulty in measuring compliance (B. F. Arnold & Colford, 2007; T. Clasen, I. G. Roberts, et al., 2009; Waddington et al., 2009). Modeling has shown how even occasional treatment failures by a water treatment plant could cause high levels of diarrheal disease in populations (Hunter et al., 2009). Collectively, these results show the importance of consistent treatment of drinking water, by well-managed municipal plants or by high compliance with HWT. Future studies should examine the joint effects of compliance and log10 removals by a HWT device.
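One way to see why compliance and log10 removal need to be examined jointly is the simple mixing relation that chapter 4 gives as equation 4.1, in which the ingested dose is the sum of pathogens from the untreated and treated fractions of one liter of water. The Python sketch below evaluates that relation; the function name is illustrative, and the inputs (10,000 pathogens per liter, 99% of water treated) are the example values used in chapter 4.

```python
def ingested_dose(u, c, lrv):
    """Pathogens ingested in one liter of drinking water.

    u: pathogens per liter of untreated water
    c: compliance, the proportion of drinking water treated (0 to 1)
    lrv: log10 reduction value of the treatment method
    """
    untreated = u * (1.0 - c)          # pathogens from the untreated fraction
    treated = u * c * 10.0 ** (-lrv)   # survivors in the treated fraction
    return untreated + treated


# 10,000 pathogens per liter, 99% of water treated, LRVs from 5 down to 1.
doses = {lrv: ingested_dose(10_000, 0.99, lrv) for lrv in (5, 4, 3, 2, 1)}
```

With these inputs the dose is dominated by the untreated 1% of water once the LRV reaches about 3, which is why raising the LRV further changes the dose very little unless compliance also improves.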
QMRA models can be used to extend the results shown in Figure 3.8 (page 111) by providing estimates for risk reductions as a function of both compliance and device efficacy. For example, for a given compliance level, what are the expected risk reductions when using a device that provides 5 log10 removal, versus a device that provides 3 log10 removal (see section 4.4.4, page 150)? Such analyses could provide performance metrics and standards that address not only microbiological efficacy but also the correct, consistent and sustained use of a device by the target population.

3.5.1. Predicting pathogen concentrations in drinking water sources

Few data exist on pathogen concentrations that individuals ingest via drinking water in developing countries. Calibration of this model used the epidemiological data from the Lifestraw RCT to predict concentrations of pathogens in untreated water (Figure 3.7, page 109). The predicted Giardia concentrations are consistent with measurements from southeastern Brazilian raw water sources (< 0.1 to 3.4 cysts/L) (Razzolini et al., 2010), but lower than other Brazilian (2.5 to 120 cysts/L) (Neto et al., 2010) or Honduran source waters (2.4 to 21 cysts/L) (Solo-Gabriele et al., 1998). The predicted diarrheagenic E. coli concentrations are much lower than the 2.5×10^5 to 1.6×10^7 CFU of enterotoxigenic E. coli detected in sewage-impacted Indian rivers (Singh et al., 2010). Furthermore, the predicted rotavirus concentrations are substantially lower than measurements from polluted creeks in Sao Paulo, Brazil (geometric mean, ~2.7 focus-forming units/L) (Mehnert & Stewien, 1993). It is reasonable that the clear source water in the Lifestraw RCT site would have lower pathogen concentrations than the above water sources, many of which were from polluted urbanized environments.
Measuring viable pathogen concentrations in drinking water (and other exposure routes) in several epidemiological studies would increase the accuracy and precision of risk estimates, as indicators of fecal contamination (e.g., coliform bacteria or E. coli) correlate poorly with presence or concentrations of actual pathogens (American Water Works Association, 1999; Leclerc et al., 2002; Toranzos et al., 1988).

3.5.2. Calibration of microbial risk assessment models

Microbial risk models, like many environmental models, can never be fully validated (Oreskes et al., 1994). However, they can be confirmed through a calibration process using epidemiological data. We present a framework that first calibrates using epidemiological data, and second estimates risks under differing counterfactual scenarios. The calibration process transforms uninformed priors to informed posteriors by constraining the model using the epidemiological outcome data from the Lifestraw RCT. The low percentage of calibration runs that were consistent (<0.3%) indicates that the trial data imparted substantial information to the model by constraining the acceptable parameter space. Source water pathogen data would be useful to further calibrate the model for the Congo field site. Risk models informed by epidemiological data are powerful tools to generalize beyond the context in which epidemiological studies are conducted. Models can be used to inform study design and intervention strategies; epidemiological studies can calibrate these models. Few published examples take this approach, but see J. J. Kim et al. (2007) and Tuite et al. (2010). The framework provided here can facilitate such future research activities. Results from our examination of HWT field trials indicate that compliance and pathogen concentrations in source water are particularly important processes to characterize.
Data from these processes would enhance the calibration step, providing the opportunity to describe other unobserved aspects of the system. Additionally, pathogen measurements from other environmental sources (e.g., hands, food, feces) would facilitate the extension of our model system to consider transmission by multiple environmental pathways (Li et al., 2009); such transmission models would allow investigation of interdependency of multiple transmission routes, and ultimately multiple interventions. Although additional data on pathogen concentration and compliance and the extension to considering transmission are important steps forward in informing intervention strategies, our analysis provides the important and robust conclusion that effectiveness is highly sensitive to compliance, suggesting that trials of household level interventions should measure compliance as carefully and as effectively as possible. Compliance guidelines should be developed for HWT interventions, in addition to the microbial reduction guidelines for HWT devices recently published by WHO (Sobsey & Joe Brown, 2011).

4. THE JOINT EFFECTS OF EFFICACY AND COMPLIANCE IN HOUSEHOLD WATER TREATMENT EFFECTIVENESS

This chapter consists of a manuscript and its supplementary material that has been peer-reviewed and accepted for publication. It has been reformatted and reorganized: Enger, K.S., Nelson, K.L., Rose, J.B., Eisenberg, J.N.S. The joint effects of efficacy and compliance: a study of household water treatment effectiveness against childhood diarrhea. Publication forthcoming in the journal Water Research.

4.1. Abstract

The effectiveness of household water treatment (HWT) at reducing diarrheal disease is related to the intrinsic efficacy of the HWT method at removing pathogens, how people comply with HWT, and the relative contributions of other pathogen exposure routes.
Although many HWT methods are efficacious at removing or inactivating pathogens, their effectiveness within actual communities is decreased by imperfect compliance. However, the quantitative relationship between compliance and effectiveness is poorly understood. To assess the effectiveness of HWT on childhood diarrhea incidence via drinking water for three pathogen types (bacterial, viral, and protozoan), a quantitative microbial risk assessment (QMRA) model was developed. The model allowed examination of the relationship between log10 removal values (LRVs) and compliance with HWT for scenarios varying by: baseline incidence of diarrhea; etiologic fraction of diarrhea by pathogen type; pattern of compliance; and size of contamination spikes in source water. Benefits from increasing LRVs strongly depend on compliance. For perfect compliance, diarrheal incidence decreases as LRVs increase. However, if compliance is incomplete, there are diminishing returns from increasing LRVs in most of the scenarios considered. Higher LRVs are more beneficial if: contamination spikes are large; contamination levels are generally high; or some people comply perfectly. The effectiveness of a HWT intervention at the community level may be limited by low compliance, such that the benefits of high LRVs are not realized. Patterns of compliance with HWT should therefore be measured during HWT field studies and HWT dissemination programs. Studies of pathogen concentrations in a variety of developing-country source waters are also needed. Guidelines are needed for measuring and promoting compliance with HWT, in addition to recently published WHO HWT efficacy guidelines.

4.2. Introduction

An effective intervention can be defined as one that reduces disease (i.e., is efficacious) and one that people use (i.e., they comply).
For example, a drug or vaccine must be protective, and people must take the drug or receive the vaccine; contaminated water must be correctly treated, and people must drink the treated water. Both efficacy and compliance must be evaluated when assessing the ability of an intervention to reduce illness; both are dynamic factors that can vary over time. Household water treatment (HWT) interventions are an interesting example that illustrates these two factors, where pathogen removal characterizes efficacy and behavior characterizes compliance. This chapter examines the joint effects of 1) pathogen removal by a HWT device, and 2) the degree to which communities use the device. It focuses on the protective effects of HWT against diarrhea in developing countries, a leading cause of morbidity and mortality (Kosek et al., 2003). Household water treatment (HWT) is a common strategy for reducing diarrhea in developing countries. HWT technologies include chlorination, filtration, solar disinfection (SODIS), and boiling. Systematic reviews of field trials suggest that HWT is effective in preventing diarrhea (B. F. Arnold & Colford, 2007; T. Clasen, I. G. Roberts, et al., 2009). However, lack of blinding and publication bias are important issues in the HWT literature that may exaggerate effectiveness (Schmidt & Cairncross, 2009; Waddington et al., 2009; Hunter, 2009); see also pages 17 and 65. Antimicrobial effectiveness of HWT is commonly measured by log10 reduction values (LRVs) from laboratory testing. Such tests use indicator organisms to represent the three main classes of waterborne pathogens: viruses, bacteria, and protozoan cysts. LRVs are a common metric for assessing different HWT methods (Sobsey & Joe Brown, 2011; Sobsey et al., 2008). The United States standard for HWT “microbiological water purifiers” is LRVs of 6 for bacteria (99.9999% inactivation), 4 for viruses, and 3 for protozoa (USEPA, 1987).
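The correspondence between an LRV and a percentage inactivation, as in the 99.9999% figure quoted for an LRV of 6, can be written as a one-line conversion. The Python sketch below is only an illustration of that arithmetic; the function name is not drawn from any standard.

```python
def percent_reduction(lrv):
    """Percentage of organisms removed or inactivated for a given
    log10 reduction value (LRV)."""
    return 100.0 * (1.0 - 10.0 ** (-lrv))


# LRVs named in the United States standard: bacteria, viruses, protozoa.
us_standard = {name: percent_reduction(lrv)
               for name, lrv in (("bacteria", 6), ("viruses", 4), ("protozoa", 3))}
```

Each additional unit of LRV removes 90% of whatever organisms remained, so the percentage changes only in later decimal places even though the surviving fraction falls tenfold.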
The World Health Organization (WHO) recommends that “highly protective” devices have LRVs of 4 for bacteria, 5 for viruses, and 4 for protozoa (Sobsey & Joe Brown, 2011); see also Table 2.2, page 65. The WHO recommendations use a quantitative microbial risk assessment (QMRA) assuming perfect compliance and an acceptable risk level of 10^-6 disability-adjusted life-years (DALYs) for diarrheal disease from each pathogen type (Sobsey & Joe Brown, 2011). In contrast, compliance, the extent to which persons (or a population) use a HWT method, is often poorly defined and poorly measured. Compliance (sometimes referred to as adherence) has many dimensions. Individuals might reject a HWT method because of cost, difficulty using HWT, or taste of treated water. Well-established theory regarding adoption of new technologies indicates that 10%-20% of a community will not use a new technology, even after acceptance by most of the community (E. M. Rogers, 2003). Furthermore, preventive practices (such as HWT) that require consistent individual effort to reduce the probability of an adverse effect have difficulty spreading. This is because the benefit (e.g., bouts of diarrhea averted) is a ‘non-event’ that is distant in time from adopting the practice; therefore, the benefit gained is not obvious to the user (E. M. Rogers, 2003). HWT devices might simultaneously be used frequently and inconsistently. For example, someone might drink treated water at home, but untreated water while working. During a HWT field trial in rural Congo, nearly all households sometimes drank untreated water (Boisson et al., 2010). Although the variable and incomplete nature of compliance is widely recognized, it is often unmeasured or incompletely measured by field trials. A review of 30 relatively well-conducted field trials of water quality interventions found that 7 did not report compliance, and 9 measured compliance by “occasional observation” only (T. Clasen, I. G. Roberts, et al., 2009).
Furthermore, consumption of treated water was never directly measured (T. Clasen, I. G. Roberts, et al., 2009). Studies that report compliance find that communities rarely use HWT methods 100% of the time. For example, a meta-analysis of HWT chlorination studies indicated a median of 78% of samples having detectable free chlorine (range 36-100% over 12 studies) (B. F. Arnold & Colford, 2007). Compliance is difficult to measure and is subject to Hawthorne effects (where people's behavior changes because they know that they are being observed) and other biases. Participants in a trial might report that they use the intervention more frequently than they actually do. Compliance might increase during a trial because study personnel remind people to use HWT (deliberately or not). Field trials over longer periods show lower HWT effectiveness against diarrhea; decreasing compliance over time is one explanation (Hunter, 2009). It is particularly difficult to determine the amount of untreated water that HWT users consume outside the home. Despite not being well measured, compliance clearly influences HWT effectiveness, because HWT can only prevent diarrhea if people use it (Duflo et al., 2007). Field measurements of LRVs tend to be lower than laboratory-measured LRVs for many reasons, such as differing water quality or suboptimal maintenance of HWT devices (Sobsey et al., 2008). Nonetheless, the benefits from HWT might be eroded by slight noncompliance. For example, a risk assessment of diarrheal infection from intermittent treatment by a Ugandan water treatment plant estimated that water treatment failure for one day per year increased the annual probability of enterotoxigenic Escherichia coli (ETEC) infection via drinking water from 0.001 to 0.1 (Hunter et al., 2009).
The relationship between compliance and LRVs (which measure efficacy) can be illustrated mathematically:

$$ d = u(1 - c) + u c \, 10^{-L} \qquad (4.1) $$

where d is the dose of pathogens consumed, u is pathogens per liter of untreated water, c is compliance (the proportion of drinking water treated), L is the LRV of the HWT method, and one liter of water is consumed. Using equation 4.1 and assuming that source water contains 10,000 pathogens per liter, the HWT method has an LRV of 5, and 1% of drinking water is untreated (c = 0.99), about 100 pathogens are ingested. For LRVs of 4, 3, 2, and 1, the numbers of pathogens consumed are, respectively: 101, 110, 199, and 1090. The dose (and therefore the infection risk) is very similar for LRVs of 5, 4, and 3 (100 to 110 pathogens), which leads to the hypothesis tested in this chapter: incomplete compliance results in marginal reductions in diarrheal disease as LRVs increase.

4.3. Materials and methods

To test the hypothesis, a QMRA model was used to simulate waterborne transmission of diarrheal infection (bacteria, protozoa, and viruses) in children aged less than five years. This model was based on the model in chapter 3 (Enger et al., 2012) that simulated a randomized controlled trial of the LifeStraw® Family filter (LFF; a HWT device) in rural Congo (Boisson et al., 2010). The model was programmed in MATLAB 7.12 and Octave 3.2; results were analyzed with R 2.11. The model only considered diarrhea transmitted by drinking water, omitting other transmission routes (e.g., contaminated food, objects, or hands). Parameter values for the model are summarized in chapter 7 and Table 7.1 (page 230). Four important concepts in the model are: compliance, baseline incidence, etiologic fractions, and short-term contamination spikes. They are described in the following paragraphs.

4.3.1.
Compliance

Compliance with HWT within a community was modeled considering three groups of children: 1) children who exclusively consumed treated water (“perfect compliance”); 2) children who never consumed treated water (“no compliance”); and 3) children who consumed fixed proportions of treated and untreated drinking water (“partial compliance”). Overall compliance (c) was calculated as follows:

$$ c = \bigl(1 - (a + n)\bigr)\,p + a \qquad (4.2) $$

where a is the proportion of children who always use HWT, n is the proportion of children who never use HWT, and p is the proportion of water treated by partial compliers. For a given value of c, three types of compliance at the community level were defined:
α) a proportion c of children with perfect compliance and the remainder with no compliance;
β) a proportion c/2 of children with perfect compliance, a proportion (1 – c)/2 with no compliance, and the remainder partially complying, treating a fraction c of their daily water intake (Table 4.1);
γ) all children partially comply by treating a fraction c of their water.
If c = 1 or 0, only compliance type α is possible.

Table 4.1. Compliance among individuals in each model run, given compliance type β

Overall compliance   Proportion of simulated children who are:                Proportion of water treated
(a proportion)       Perfect compliers   Noncompliers   Partial compliers     by partial compliers
1                    1                   0              0                     Not applicable
0.99                 0.495               0.005          0.5                   0.99
0.95                 0.475               0.025          0.5                   0.95
0.8                  0.4                 0.1            0.5                   0.8
0                    0                   1              0                     Not applicable

4.3.2. Baseline incidence and etiologic fraction

To further generalize the results, differences in the baseline incidence of diarrhea and the relative contributions of viruses, bacteria, and protozoa to diarrheal incidence (etiologic fractions) were considered. The average incidence categories were: 0 – 2 (low); 2 – 6 (medium); and 6 – 12 (high) episodes per child-year (Kosek et al., 2003). Three sets of etiologic fractions were used (Table 4.2), based on reviews of etiologic studies of childhood diarrhea (Lanata & W.
Mendoza, 2002; Ramani & Kang, 2009); the determination of these etiologic fractions is discussed in detail on page 129.

Table 4.2. Criteria for the calibration step of the QMRA model: midpoint and range of etiologic fractions for childhood diarrhea

| Description | Bacteria | Protozoa | Viruses |
|---|---|---|---|
| Etiologic fractions A. High bacteria, medium protozoa, low viruses | 55% (47.5 to 62.5%) | 30% (22.5 to 37.5%) | 15% (7.5 to 22.5%) |
| Etiologic fractions B. High bacteria, medium viruses, low protozoa | 55% (47.5 to 62.5%) | 15% (7.5 to 22.5%) | 30% (22.5 to 37.5%) |
| Etiologic fractions C. Bacteria slightly predominating over protozoa and viruses | 40% (32.5 to 47.5%) | 30% (22.5 to 37.5%) | 30% (22.5 to 37.5%) |

The incidence ranges were: low, 0-2 episodes per child-year; medium, >2-6 episodes per child-year; and high, >6-12 episodes per child-year.

4.3.3. Short-term contamination spikes

Measurements from surface waters indicate that concentrations of indicator organisms are highly variable (Boehm, 2007; Levy et al., 2009). The variability of pathogen concentrations is expected to be similar or greater, and abnormally large spikes of contamination might occur occasionally. Spikes of pathogen concentrations were simulated on random days, assuming that each spike lasted exactly one day, there were n spikes per year, and the spike height was x-fold higher than the mean baseline concentration on days lacking a spike. For this analysis, x = 1 (no spikes), 10, $10^3$, or $10^5$, and n = 5. To aid comparison between spike scenarios, the mean number of pathogens in t daily 1-liter samples of source water was held constant regardless of spike height: $b_0$ is the mean baseline concentration in a scenario without spikes, and $b_s$ is the mean baseline concentration in a scenario with spikes. Solving for $b_s$ gave the appropriate mean baseline concentration during spike scenarios:

$$b_0 t = b_s(t - n) + n x b_s \quad\Rightarrow\quad b_s = \frac{b_0 t}{nx + t - n} \tag{4.3}$$

4.3.4.
Calibration step

The simulation was implemented in two steps: calibration and estimation. The calibration step simulated transmission of diarrheal infection by drinking water in the absence of HWT, and is described in detail below. It estimated concentrations of bacteria, viruses, and protozoa that were consistent with: 1) assumptions of low, medium, or high incidence of diarrhea; and 2) assumptions about the relative importance of these pathogen types to diarrheal etiology (Table 4.2, page 125). The estimation step used these pathogen concentrations to estimate the risk of diarrhea under various HWT scenarios, defined by different LRVs and different levels of compliance; estimation is described in more detail on page 130, in section 4.3.5.

The calibration step modeled waterborne diarrheal infection and disease in a simulated community prior to the introduction of an HWT intervention. The calibration step was run 12 times (3 incidence levels × 4 spike heights) and each of these yielded 3 sets of pathogen concentrations (since there were 3 sets of etiologic fractions), for a total of 36 calibration scenarios. Each calibration step consisted of 100,000 model runs. Each model run began by randomly selecting a mean pathogen concentration for each of the three pathogen types independently. The pathogen concentrations in simulated drinking water were randomly drawn daily from a gamma distribution, with the scale parameter determined by the mean pathogen concentration, and the shape parameter determined by the distribution of thermotolerant coliforms measured in source water from a rural area of the Congo (Boisson et al., 2010; Enger et al., 2012). The central 95% of that distribution spanned 1.4 log10. Therefore, concentrations could vary 25-fold from day to day, even in the absence of spikes. Each calibration run followed 100 simulated children over 1 year with no HWT use.
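The baseline adjustment in Equation 4.3 can be checked numerically. The following is a minimal sketch, not the MATLAB/Octave code used for the model: the function names are illustrative, t = 365 days is an assumption (one simulated year of daily samples), and the no-spike mean of 100 pathogens per liter is a hypothetical value chosen only for the check.

```python
def spike_adjusted_baseline(b0, t=365, n=5, x=1000.0):
    """Equation 4.3: mean baseline b_s on non-spike days.

    b0 is the no-spike mean (pathogens/L), t the number of daily samples
    (assumed 365 here), n the number of spike days, x the spike height.
    """
    return b0 * t / (n * x + t - n)


def yearly_total(b_s, t=365, n=5, x=1000.0):
    """Expected pathogens in t daily 1-liter samples: (t - n) baseline days
    at b_s plus n spike days at x * b_s."""
    return b_s * (t - n) + n * x * b_s


b0 = 100.0  # hypothetical no-spike mean concentration, pathogens/L
for x in (1, 10, 1e3, 1e5):
    b_s = spike_adjusted_baseline(b0, x=x)
    total = yearly_total(b_s, x=x)
    print(f"spike height {x:g}: b_s = {b_s:.4g}, yearly total = {total:.6g}")
```

Under these assumptions the yearly total is the same for every spike height, while the baseline on non-spike days falls as the spikes grow, which is the compensation the calibration relies on.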
The output of each run yielded a community incidence of diarrheal disease and etiologic fractions for the three pathogen types. The mean pathogen concentrations used by a calibration run were retained for use in the estimation step if: 1) the incidence of diarrhea estimated from the model run fell into the appropriate range (Figure 4.1d); and 2) the proportions of diarrhea episodes attributable to bacteria, protozoa, or viruses fell into one of the three sets of etiologic fractions (Table 4.2, page 125; Figure 4.1e). Thus the calibration process yielded sets of pathogen concentrations consistent with 9 distinct calibration scenarios for each of the 4 spike scenarios, for a total of 36 separate calibration scenarios.

Figure 4.1. Calibration results assuming medium incidence of diarrhea
Black lines denote acceptable ranges for incidence and etiology. Panels: a. Diarrheal incidence (episodes/child-year) by bacterial concentration (mean E. coli concentration per L); b. Diarrheal incidence by protozoan concentration (mean Giardia concentration, cysts/L); c. Diarrheal incidence by virus concentration (mean rotavirus concentration per L); d. Histogram of incidence, $10^5$ calibration runs; e. Etiology of runs with medium incidence (A = 160 runs; B = 160 runs; C = 121 runs), plotted by proportions of bacteria, protozoa, and viruses; f. Density plots of pathogen concentrations (pathogens/L in untreated water) consistent with medium incidence, for mixes A, B, and C. See text immediately before & after this panel of charts for further explanation.
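The two retention criteria amount to a simple acceptance filter applied to each calibration run. The sketch below illustrates that filter using the ranges stated in the text and in Table 4.2; the function name, the dictionary layout, and the treatment of values falling exactly on a range boundary are assumptions made for illustration, not details taken from the model code.

```python
# Incidence ranges (episodes per child-year) as stated in the text.
INCIDENCE_RANGES = {"low": (0.0, 2.0), "medium": (2.0, 6.0), "high": (6.0, 12.0)}

# Etiologic-fraction ranges from Table 4.2, expressed as proportions.
ETIOLOGIC_RANGES = {
    "A": {"bacteria": (0.475, 0.625), "protozoa": (0.225, 0.375), "viruses": (0.075, 0.225)},
    "B": {"bacteria": (0.475, 0.625), "protozoa": (0.075, 0.225), "viruses": (0.225, 0.375)},
    "C": {"bacteria": (0.325, 0.475), "protozoa": (0.225, 0.375), "viruses": (0.225, 0.375)},
}


def matching_mixtures(incidence, fractions, level):
    """Return the mixtures (A, B, C) that a calibration run is consistent with.

    The run is rejected (empty list) if its incidence is outside the range
    for `level`. Ranges are treated as lower-exclusive and upper-inclusive,
    except that an incidence of exactly 0 counts as low (an assumption).
    """
    lo, hi = INCIDENCE_RANGES[level]
    in_range = (lo < incidence <= hi) or (incidence == lo == 0.0)
    if not in_range:
        return []
    return [
        name
        for name, ranges in ETIOLOGIC_RANGES.items()
        if all(ranges[k][0] <= fractions[k] <= ranges[k][1] for k in ranges)
    ]


# Hypothetical run output: 4 episodes/child-year, 55% bacterial etiology.
run = {"bacteria": 0.55, "protozoa": 0.30, "viruses": 0.15}
print(matching_mixtures(4.0, run, "medium"))  # ['A']
```

Only the mean concentrations from runs that return a non-empty list would be carried forward, which is why the number of usable runs per scenario is a small fraction of the 100,000 attempted.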
For this analysis, calibration of the model required consideration of the largest possible ranges of three joint pathogen concentrations that could conceivably yield an incidence value in the desired range. Consequently, all calibration steps used ranges of pathogen concentrations that began at 0 and ended at an empirically determined concentration where the minimum incidence of diarrhea rose above the highest acceptable incidence value. The resulting distributions of pathogen concentrations that were consistent with the calibration process had negative skew (Figure 4.1f) because a scatterplot of randomly selected pathogen concentrations by incidence forms a gradually increasing band of points (Figure 4.1a, b, & c), and pathogen concentrations that fall within a narrow horizontal band are accepted.

Determination of etiologic fractions

The etiologic fractions were obtained from a review (Lanata & Mendoza, 2002) of 266 etiology studies of inpatients, outpatients, and communities, grouped by WHO subregion. Only community studies were considered because the inpatient and outpatient cases are a more severe subset of the full range of diarrhea cases in a given community. Furthermore, only studies from developing WHO subregions AfroD, AfroE, AmroB, and SearoD were considered, because those subregions had 5 or more community-based studies covering most major diarrheal pathogens (Salmonella, Shigella, Campylobacter, enterotoxigenic Escherichia coli [ETEC], enteropathogenic E. coli [EPEC], Giardia, Cryptosporidium, and rotavirus). When the pathogens were collapsed into the categories of bacteria, protozoa, and viruses, the four regions broadly agreed on the proportions of diarrhea cases attributed to these three categories. Among cases of diarrhea for which an etiologic agent was identified, bacteria predominated (60%), followed by protozoa (15-30%) and viruses (10-20%).
However, the contribution of viruses was probably underestimated by Lanata and Mendoza (2002), since rotavirus was the only virus examined by the studies they reviewed. A more recent review (Ramani & Kang, 2009) of hospital-based studies of viral gastroenteritis in children in developing countries indicated that about 64% of cases were attributable to rotavirus (other viral gastroenteritides were attributed to caliciviruses, adenoviruses, and astroviruses, or had no etiology determined). Adjusting the conclusions drawn from Lanata and Mendoza’s (2002) work with this information suggests that bacteria accounted for approximately 55% of episodes and that protozoa and viruses each accounted for 15-30% of episodes, providing a basis for the ranges of etiologic fractions used for calibration (Table 4.2, page 125).

4.3.5. Estimation step

The estimation step modeled waterborne diarrheal infection and disease in a simulated community where a HWT intervention was being used. Each estimation scenario used marker pathogen concentrations from one of the 36 calibration scenarios, for 5000 simulated children over 50 years. Estimation scenarios were defined by the treatment efficacy of the device and the level of compliance. Specifically, combinations of three factors were used: 1) LRVs of the HWT device against all three pathogen types (1, 2, 3, 4, or 5); 2) overall compliance by the community (c = 1, 0.99, 0.95, 0.80, or 0); 3) type of compliance by the community (α, β, or γ; described in detail on page 124). An estimation step was run once for every possible combination of these three factors, for each of the 36 sets of pathogen concentrations from calibration. Each estimation step for a given scenario had 70 to 150 model runs. If a particular calibration step supplied more than 150 sets of pathogen concentrations, 150 sets were randomly sampled for use in the corresponding estimation scenarios.
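The scenario grid described above can be enumerated directly. This is an illustrative sketch only: the variable names are not from the model code, and it assumes that compliance levels of 0 and 1 were paired only with compliance type α (as section 4.3.1 states) and were still crossed with every LRV, which the text does not spell out.

```python
from itertools import product

# Factors defining the calibration scenarios (sections 4.3.2 to 4.3.4).
INCIDENCE_LEVELS = ("low", "medium", "high")
SPIKE_HEIGHTS = (1, 10, 10**3, 10**5)
MIXTURES = ("A", "B", "C")

# Factors defining the HWT scenarios in the estimation step.
LRVS = (1, 2, 3, 4, 5)
COMPLIANCE_LEVELS = (1.0, 0.99, 0.95, 0.80, 0.0)
COMPLIANCE_TYPES = ("alpha", "beta", "gamma")


def allowed_types(c):
    """With c = 0 or c = 1, only compliance type alpha is possible."""
    return ("alpha",) if c in (0.0, 1.0) else COMPLIANCE_TYPES


calibration_scenarios = list(product(INCIDENCE_LEVELS, SPIKE_HEIGHTS, MIXTURES))

hwt_scenarios = [
    (lrv, c, ctype)
    for lrv in LRVS
    for c in COMPLIANCE_LEVELS
    for ctype in allowed_types(c)
]

print(len(calibration_scenarios))  # 36
print(len(hwt_scenarios))          # 55
```

The first count reproduces the 36 calibration scenarios given in the text. The second (and the product of the two) depends on the assumption about how c = 0 and c = 1 were handled, so it should be read as a reading of the design rather than a reported figure.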
Incidences and incidence ratios (IRs) were determined for various combinations of compliance and device effectiveness; IRs were relative to scenarios in which no HWT was used.

4.3.6. Replication of the WHO model

For comparison, the QMRA model from the WHO HWT recommendations (Sobsey & Brown, 2011) was replicated in R 2.11. The model was slightly modified to output incidence instead of DALYs, and its assumption of 94% immunity to rotavirus was eliminated, since the model developed in this chapter only considers young children. To obtain total diarrhea incidence from the WHO model, the bacterial, protozoan, and viral diarrhea incidences from its output were summed.

4.4. Results

4.4.1. Calibration step

Each calibration step (100,000 runs) yielded 70 to 1164 runs consistent with each of the 36 calibration scenarios. An example of typical calibration output is shown in Figure 4.1 (page 128); if a run was consistent with the incidence criterion (Figure 4.1d), it was tested for consistency with pathogen mixtures A, B, or C (Figure 4.1e; Table 4.2, page 125). The resulting pathogen concentrations in untreated drinking water, as suggested by the model, are shown in Figure 4.2; without spikes, median estimates ranged from 1100 to 120,000 bacteria/L, 0.06 to 1 protozoa/L, and 0.003 to 0.04 viruses/L (Figure 4.2a, b, and c). It is not clear whether these concentrations are reasonable because pathogen concentrations in source waters have seldom been measured in developing countries. These estimated concentrations (Figure 4.2) are generally lower than the few published measurements available from developing countries (Enger et al., 2012); see also chapter 3, page 116. Baseline bacterial concentrations during spike scenarios were relatively insensitive to the size of the spike (Figure 4.2, a, d, g, and j). This was because the dose response functions for the three pathogen types are such that the probability of E.
coli infection increased relatively slowly with log10 increases in dose, compared to Giardia or rotavirus (Figures 3.3 and 4.6, pages 99 and 138). In addition, the morbidity ratio (probability of diarrhea, given infection) was much lower (0.214) for bacterial infection, in contrast to protozoan (0.59) or viral (0.40) infection. Therefore, the maximum contribution to diarrheal morbidity was reached proportionally sooner for bacteria than for protozoa or viruses (note plateauing of the scatterplot in Figure 4.1a on page 128, compared with 4.1b and 4.1c). Since all three pathogen types were increased proportionally during a spike, the impact on overall diarrheal morbidity from protozoa and viruses was therefore larger during a spike.

Figure 4.2. Mean baseline pathogen concentrations from calibration
Mean baseline pathogen concentrations from calibration step, by mixture (A, B, C) and incidence (low, medium, high). Panels: a. Bacteria, no spikes; b. Protozoa, no spikes; c. Viruses, no spikes; d. Bacteria, spike height 10; e. Protozoa, spike height 10; f. Viruses, spike height 10; g. Bacteria, spike height 1000; h. Protozoa, spike height 1000; i. Viruses, spike height 1000; j. Bacteria, spike height 100000; k. Protozoa, spike height 100000; l. Viruses, spike height 100000. Vertical axes: bacteria, protozoa, or viruses per L of untreated water (log scale).

4.4.2. Estimation step

Each estimation step consisted of 70 to 150 model runs, representing distributions of incidences and incidence ratios (IRs) of diarrheal disease given a particular scenario. The medians of all incidence distributions are shown in Figures 4.10-4.13, page 145. Figures 4.4-4.5 and Figures 4.7-4.9 present subsets of those results for clarity, expressed as IRs.

Comparison with the WHO QMRA model

This model and the WHO model (Sobsey & Brown, 2011) produced reasonably consistent estimates of diarrhea risk reduction, assuming 100% compliance (Figure 4.3). This occurred despite differing pathogens and parameter values in each model, as well as substantial differences in model structure (this model considers community-level risk; the WHO model considers individual risk). The WHO model (modified to assume no viral immunity) best resembled this model assuming high incidence. However, the WHO model did not account for repeated episodes of diarrhea in one year, hence its incidence could not be greater than one episode/child-year for each pathogen type. The WHO model indicated that bacteria contribute less childhood diarrhea than protozoa or viruses. This model assumed the opposite, based on reviews of childhood diarrheal etiology (Lanata & Mendoza, 2002; Ramani & Kang, 2009). Consequently, although the WHO model results were somewhat similar to the results presented here with respect to total incidence of diarrhea, they differed greatly in the contribution of particular types of pathogens.
Furthermore, because the WHO model does not account for repeated episodes of diarrhea, it cannot return incidences higher than 1 for each marker pathogen, and it therefore differs substantially from this model when the LRV is 0.

Figure 4.3. Comparison with WHO model
Diarrhea incidence (episodes/child-year) by LRVs, assuming etiologic fractions A and 100% compliance. Horizontal axis: log10 reduction values for all 3 pathogen types (0 to 5). Vertical axis: risk of diarrhea (incidence, episodes/child-year; log scale). Series: bacterial diarrhea, WHO; protozoan diarrhea, WHO; viral diarrhea, WHO; viral diarrhea, WHO, no viral immunity; total diarrhea, WHO; total diarrhea, WHO, no viral immunity; diarrhea, assuming high incidence; diarrhea, assuming medium incidence; diarrhea, assuming low incidence. Colors apply to symbols in the same way as lines; for example, bacterial diarrhea in the two models can be compared by comparing the red lines to the red symbols.

Effect of LRVs given imperfect compliance

If compliance decreased slightly to 99% and there were no pathogen spikes, this model predicted little or no additional benefit from LRVs above 3 in many scenarios (e.g., Figure 4.4). If compliance was 80%, there was little benefit from increasing LRVs beyond 2. This behavior was similar regardless of compliance type (α, β, γ), pathogen mixture (A, B, C), or incidence level (low, medium, high) (Figures 4.10-4.13, page 145).

Figure 4.4. Effect of compliance with HWT on the incidence ratio of diarrhea, by LRV
Horizontal axis: log10 reduction values for all 3 pathogen types (0 to 5). Vertical axis: incidence ratio of childhood diarrhea (log scale). Series: 80%, 95%, 99%, and 100% compliance. Assuming medium incidence, compliance type β, no spikes, and pathogen mixture A. More scenarios are in section 4.4.3.

If pathogen spikes were included, the incidence ratio increased as spike height increased, for all LRVs (Figure 4.5).
This effect was not due to an overall increase in incidence, because the model was calibrated to maintain the same incidence (baseline pathogen concentrations were also reduced to compensate for spike height; Equation 4.3). Rather, the increase in IR was due to the nonlinearity of the dose-response functions at high doses; during a large spike, reducing dose x-fold might only reduce risk by a factor less than x (Figure 4.6). Diminishing returns from LRV increases were still seen when spikes were introduced. Spikes 10 times above baseline gave results similar to those with no spikes (Figures 4.10-4.12, page 145).

Figure 4.5. Effect of compliance and spikes on the IR of childhood diarrhea, by LRVs
Horizontal axis: log10 reduction values for all 3 pathogen types (0 to 5). Vertical axis: incidence ratio (IR) of childhood diarrhea (log scale). Series: 80% compliance with no spikes, spike height $10^3$, and spike height $10^5$; 99% compliance with no spikes, spike height $10^3$, and spike height $10^5$. Assuming medium incidence, compliance type β, and pathogen mixture A. More scenarios are in section 4.4.3.

Figure 4.6. Effect of dose response function nonlinearity at high doses
Example with etiologic fraction A, medium incidence, and spike height 1000. Since dose response functions are linear when doses are relatively low, reducing dose by a factor x also reduces the probability of infection by x. If spikes were absent, the pathogen concentrations used in the model were generally in the linear region. However, at high doses (such as during spikes), reducing the dose by x might only reduce the probability of infection by some factor y, where y < x. For example, assuming medium incidence and spike height (sh) of 1000, the daily dose of bacteria ingested during a spike was ~$10^7$ (chart b). Reducing the dose by 1 LRV, from $10^7$ to $10^6$, reduced the probability of infection by ~1/4.
This effect was less marked for the other two pathogen types (charts c & d), but it is still present. See Figure 3.3 (page 99) for a semilog plot of chart a.

Compliance type changed the relationship between LRVs, IR, and spikes (Figure 4.7). When there were no spikes, the results for compliance types α, β, and γ were similar. However, as spike height increased, α had the lowest IRs and γ had the highest IRs. Figure 4.7 also shows additional benefit from LRVs 4 and 5 for the highest spike scenario ($10^5$-fold baseline). The benefits were greatest for compliance type α, in which children either complied perfectly or not at all. The benefits were smaller but still evident for β, in which children complied perfectly, partially, or not at all (see also Table 4.1, page 124). In contrast, under γ every child complied partially, always consuming some untreated water.

Figure 4.7. Incidence ratio of diarrhea by compliance level & type, spikes, and LRVs
Horizontal axis: log10 reduction values for all 3 pathogen types (0 to 5). Vertical axis: incidence ratio (IR) of childhood diarrhea. Series: compliance types α, β, and γ, each with no spikes, spike height $10^3$, and spike height $10^5$. Assuming medium incidence, compliance of 0.8, and pathogen mixture A. See page 124 for a detailed description of compliance types.

Benefits from higher LRVs were more pronounced under conditions of high incidence, large spikes, compliance type α, and high compliance (Figure 4.8). The benefits decreased as compliance decreased.

Figure 4.8. Effect of compliance on IR if large contamination spikes occur
Assuming medium incidence, compliance type α, and pathogen mixture C.
There is little information available regarding pathogen concentrations in source waters in developing countries; this is why the model was calibrated to obtain these pathogen concentrations in the first place. If the pathogen concentrations obtained from calibration are too low, that might make high-LRV HWT appear less effective than it actually is. This was evaluated with additional estimation runs using pathogen concentrations obtained by calibrating to high incidence (the three rightmost panels of Figure 4.2, page 133), and multiplying them by 10 or 100. An example is shown in Figure 4.9, and all such runs are shown in Figure 4.14 (page 150). Diminishing returns from increasing LRVs remained apparent. However, LRVs of 4 sometimes represented an improvement over LRVs of 3, particularly if incidence was high (Figure 4.14b) or if compliance was 99%.

Figure 4.9. Effect of compliance on IR with extreme pathogen concentrations
Horizontal axis: log10 reduction values for all 3 pathogen types (0 to 5). Vertical axis: incidence ratio (IR) of childhood diarrhea (log scale). Series: 80%, 95%, 99%, and 100% compliance. Estimation results using mean pathogen concentrations 100-fold the calibrated values for high incidence. Pathogen mixture A, compliance type β.

Pathogen mixture C tended to give lower IRs than A or B if incidence was high or spike height was high (Figures 4.10-4.12). As incidence was increased from low to high, the effects of pathogen mixture and compliance type increased.

4.4.3. Charts of median incidences from all estimation scenarios

Median incidence values from all estimation scenarios are shown below in Figures 4.10-4.14. Figures 4.4, 4.5, 4.7, and 4.8 above are subsets of Figures 4.10-4.12, expressed as incidence ratios (IRs) and charted as lines.
Figure 4.13 is similar, but uses various combinations of LRVs for bacteria, viruses, and protozoa that are consistent with recent WHO guidelines (Sobsey & Brown, 2011) and USEPA guidelines for HWT (USEPA, 1987). Figures 4.10-4.14 display all combinations of overall compliance (0, 0.8, 0.95, 0.99, 1), compliance type (α, β, γ), and pathogen mixture (A, B, C) for a particular combination of incidence level (low, medium, high) and spike scenario (5 spikes per year that are 1×, 10×, 1000×, or 100,000× baseline mean pathogen concentration). Each symbol represents the median value of a distribution of estimation results from a scenario defined by a particular combination of these variables. Note that if compliance equals 0 (complete noncompliance, equivalent to LRV=0) or 1 (perfect compliance), then only compliance type α is possible.

Figure 4.10. Detailed estimation results (low incidence)
a. Low incidence, 0 spikes/year 1x baseline; b. Low incidence, 5 spikes/year 10x baseline; c. Low incidence, 5 spikes/year 1000x baseline; d. Low incidence, 5 spikes/year 100,000x baseline

Figure 4.11. Detailed estimation results (medium incidence)
a. Medium incidence, 0 spikes/year 1x baseline; b. Medium incidence, 5 spikes/year 10x baseline; c. Medium incidence, 5 spikes/year 1000x baseline; d. Medium incidence, 5 spikes/year 100,000x baseline

Figure 4.12. Detailed estimation results (high incidence)
a. High incidence, 0 spikes/year 1x baseline; b. High incidence, 5 spikes/year 10x baseline; c. High incidence, 5 spikes/year 1000x baseline; d. High incidence, 5 spikes/year 100,000x baseline

Estimation steps were also run for LRV values that were consistent with existing recommendations from the WHO and the USEPA (Sobsey & Brown, 2011; USEPA, 1987) (Figure 4.13). The recommendations consist of sets of three LRVs: bacterial, viral, and protozoan. These runs did not include spikes.
They were divided into 5 groups: I) complete noncompliance, i.e., 0:0:0; II) LRVs consistent with the WHO interim target, where two of the three marker pathogens met the WHO protective target but the third had an LRV of 0; III) as II, but the third marker pathogen had an LRV of 1; IV) a series of LRVs including the WHO protective target of 2:3:2 and the WHO highly protective target of 4:5:4; and V) the USEPA standard of 6:4:3. The WHO protective target was generally more protective than the WHO interim targets, but there was little difference between the WHO protective target (2:3:2), the WHO highly protective target (4:5:4), and the USEPA target (6:4:3) unless compliance was 95% or higher.

Figure 4.13. Detailed estimation results, WHO/EPA recommended LRVs

Figure 4.14. Detailed estimation results, extremely high baseline pathogen concentrations

4.4.4. Assessing impact of high LRVs: significance testing & classification trees

The Wilcoxon rank sum test was used to assess whether incidence significantly differed depending on LRVs. This was done by considering all scenarios in Figures 4.10-4.12 except those with overall compliance of 0 or 1, and comparing pairs of scenarios that had identical compliance levels, compliance type, etiologic fractions, and spike height, but had differing LRVs (3 vs. 5 for all pathogen types). The Wilcoxon test detected a difference between the two incidence distributions at p < 0.05 in 153 of the 324 comparisons (47.2%). However, because a statistically significant difference does not necessarily mean an important difference, two measures of importance (which were chosen a priori) were further considered: an incidence difference (ID) > 0.2 diarrheal episodes per child-year, and an incidence ratio (IR) < 0.9.

Classification trees were then constructed (rpart package version 3.1. for R) to describe which scenarios showed improvement in diarrheal incidence if LRVs were increased from 3 to 5.
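The two importance criteria can be expressed as a small check on a pair of incidence distributions. The sketch below is illustrative rather than the analysis code: it assumes that each distribution is summarized by its median, that ID is the LRV 3 median minus the LRV 5 median, and that IR is the LRV 5 median divided by the LRV 3 median; the incidence values are hypothetical. It also recounts the 324 paired comparisons from the scenario factors.

```python
from statistics import median

ID_THRESHOLD = 0.2  # episodes per child-year (criterion: ID > 0.2)
IR_THRESHOLD = 0.9  # criterion: IR < 0.9


def importance(inc_lrv3, inc_lrv5):
    """Compare incidences at LRV 3 and LRV 5 for one matched scenario pair.

    Assumes medians summarize each distribution and that the LRV 3 median
    is greater than zero.
    """
    m3, m5 = median(inc_lrv3), median(inc_lrv5)
    incidence_difference = m3 - m5
    incidence_ratio = m5 / m3
    return {
        "ID": incidence_difference,
        "IR": incidence_ratio,
        "meets_ID": incidence_difference > ID_THRESHOLD,
        "meets_IR": incidence_ratio < IR_THRESHOLD,
    }


# Hypothetical incidences (episodes/child-year) from two estimation steps.
result = importance([1.0, 1.2, 1.1], [0.8, 0.9, 0.85])
print(result["meets_ID"], result["meets_IR"])  # True True

# Matched pairs: 3 compliance levels (0.8, 0.95, 0.99) x 3 compliance types
# x 3 mixtures x 3 incidence levels x 4 spike heights.
n_comparisons = 3 * 3 * 3 * 3 * 4
print(n_comparisons)  # 324
```

Because the IR criterion is relative and the ID criterion is absolute, a low-incidence scenario can meet the IR criterion without meeting the ID criterion, which is consistent with the ID criterion being the more restrictive of the two.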
The ID criterion was more restrictive (fewer scenarios met it than the IR criterion) and therefore the tree was simpler (Figure 4.15). Most of the scenarios meeting the criterion (32/43) had a spike height of 100,000, and those also had the following 2 characteristics: 1) medium or high incidence; and 2) some perfect compliers in the community (i.e., compliance types α or β). The remaining 11 scenarios that met the criterion had a spike height < $10^5$; all of these had high incidence and a spike height of 1000, and 10 of them were compliance types α or β. In addition, 18 of the 43 scenarios that met the criterion had an overall compliance of 99%. The IR criterion was more permissive (94/324 scenarios met it) and its tree was structured somewhat differently (Figure 4.16). However, 71/94 had spike heights of 1000 or $10^5$, and 61/94 had overall compliance of 99%. As with the ID tree, compliance types α and β tended to meet the criterion more frequently than γ, and high incidence also met the criterion more frequently than medium or low incidence.

Figure 4.15. Classification tree for incidence difference (ID) criterion

Figure 4.16. Classification tree for incidence ratio (IR) criterion

4.5. Discussion

The model developed in this chapter indicated that the risk of diarrhea decreases linearly (on a log-log scale) with pathogen removal by HWT under perfect compliance conditions. This is a direct consequence of the fact that the dose response relationships are linear in the range of pathogen doses that individuals usually receive (Figure 4.6, page 138). These results are somewhat consistent with those reported in the WHO guidelines on health-based targets for HWT devices (Sobsey & Brown, 2011). Although the model used in this analysis takes a QMRA approach similar to that of the WHO model, there are some important distinctions. For example, the three indicator pathogens used here were pathogenic E.
coli, rotavirus, and Giardia, whereas the WHO guidelines used Campylobacter, rotavirus, and Cryptosporidium. Time-varying pathogen concentrations were calibrated to reflect realistic diarrhea incidence levels, as well as realistic etiologic fractions of pathogens. The WHO guidelines assumed constant concentrations of the pathogens in sewage, and that drinking water was contaminated with 0.01% sewage.

Most importantly, this model relaxed the assumption of perfect compliance, examining the joint effects of varying compliance and LRVs. Assuming imperfect compliance, its results differed greatly from the WHO model, particularly for higher LRVs. For many of the scenarios with imperfect compliance, diminishing health improvements from increasing LRVs were observed; similar conclusions from a differently structured QMRA model were published recently (Brown & Clasen, 2012). Specifically, when the variation in pathogen concentration was limited to 25-fold (i.e., no spikes) and compliance was 99%, little additional diarrhea was prevented for LRVs above 3 (Figure 4.4, page 136). Assuming 80% compliance, LRVs above 2 prevented little additional diarrhea. If spikes occurred and some of the population complied perfectly (compliance types α or β), LRVs above 3 sometimes prevented additional episodes of diarrhea.

These results indicate the importance of including compliance in risk estimations and in policy development, and also emphasize the importance of understanding the different dimensions of compliance. For example, some people may never comply, others may comply when they are home but not when they are away from home, and yet others may comply only during periods of perceived high risk. These simulations suggest that, given a particular overall compliance level within a community (i.e., a proportion of person-time spent complying), HWT scenarios that include more perfectly complying individuals prevent more diarrhea.
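The effect of the compliance pattern can be illustrated with Equation 4.1 and a saturating dose-response curve. The following is a minimal sketch under stated assumptions: the exponential dose-response function and its parameter r are hypothetical stand-ins, not the pathogen-specific functions used in the model (Figures 3.3 and 4.6), and the function names are illustrative. It compares the mean single-day infection risk under compliance type α (a proportion c always treats, the rest never treat) with type γ (everyone treats a fraction c of their water).

```python
import math


def dose(u, c, lrv):
    """Equation 4.1: pathogens ingested in 1 L when a fraction c is treated."""
    return u * (1 - c) + u * c * 10 ** (-lrv)


def p_infection(d, r=1e-3):
    """Hypothetical exponential dose-response; r is an illustrative value."""
    return 1 - math.exp(-r * d)


def risk_alpha(u, c, lrv):
    """Type alpha: proportion c drinks only treated water, 1 - c only untreated."""
    return c * p_infection(dose(u, 1.0, lrv)) + (1 - c) * p_infection(dose(u, 0.0, lrv))


def risk_gamma(u, c, lrv):
    """Type gamma: every child treats a fraction c of daily intake."""
    return p_infection(dose(u, c, lrv))


# Doses from the worked example after Equation 4.1 (u = 10,000/L, c = 0.99).
print([round(dose(10_000, 0.99, lrv)) for lrv in (5, 4, 3, 2, 1)])

# Low concentration (linear range) versus a spike-like concentration.
for u in (10, 1e5):
    print(u, risk_alpha(u, 0.8, 3), risk_gamma(u, 0.8, 3))
```

With these illustrative values the two patterns give nearly identical risks at the low concentration, where the dose-response curve is approximately linear. At the spike-like concentration, type γ gives a risk close to 1 while type α gives roughly 0.28, because under γ every child receives a dose on the saturated part of the curve. This is the same mechanism, in simplified form, that separates the compliance types in Figure 4.7 as spike height increases.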
Although the implications of these different dimensions of compliance are not well understood, it is clearly difficult to obtain long-term, high compliance with household interventions in developing countries (Makutsa et al., 2001; Arnold et al., 2009; Luby et al., 2009). Difficulty in achieving high compliance with intervention strategies also extends to sanitation and hygiene interventions. Handwashing compliance is incomplete in both industrialized (Bischoff et al., 2000) and developing countries, especially with soap (Curtis & Cairncross, 2003). Despite the obvious importance of sanitation in removing pathogens from the environment and breaking the fecal-oral cycle of transmission, approximately half the population of southern Asia and sub-Saharan Africa openly defecates or has an unimproved latrine (World Health Organization & UNICEF, 2010). Even if latrines are available, they might not be used consistently (Arnold et al., 2010; Banda et al., 2007; Montgomery et al., 2010).

4.5.1. Information needed to inform models of diarrheal infection transmission

Although many of the modeled scenarios had diminishing health improvements beyond 3 LRVs (and sometimes beyond 2 LRVs), scenarios were identified where LRVs above 3 were beneficial (e.g., Figures 4.7 and 4.8, page 139). Understanding which scenarios are most realistic requires a better characterization of the variability in pathogen concentrations in source waters, the relative proportions of pathogens in contaminated water, and the extent to which these pathogens are also transmitted through other environmental pathways. These issues are further discussed below.

Pathogen concentrations in source waters

Little information is available on the variability of pathogen concentrations in source waters. Even point measurements of pathogen concentrations are scarce (Enger et al., 2012), and it is unclear what reasonable spike concentrations would be.
However, contamination spikes are plausible due to various mechanisms, including stormwater runoff, defecation directly into source waters, or washing of contaminated items like diapers. In the absence of spikes, we assumed that the daily concentration of pathogens varied over a 25-fold range, consistent with measurements of thermotolerant coliforms in source water in rural Congo (Boisson et al., 2010) and E. coli in a rural Ecuadorian stream (K. Levy, A. E. Hubbard, K. L. Nelson, et al., 2009). Etiology of diarrheal disease The contribution of different pathogens to diarrheal disease is also uncertain and depends upon ecology, sociology, and infrastructure. Published etiologic fractions include diarrhea from all transmission routes, not only drinking water; pathogen profiles for different routes, for example food versus water, will differ. In addition, the true distribution of etiologies may differ from the distribution of reported etiologies. For example, certain bacteria may be more frequently identified because they are easier to culture. For this analysis, three broad mixtures of etiologic fractions were chosen, based on the most comprehensive information available. Future research, particularly from the Global Enterics Multicenter Study (University of Maryland, 2012), will further clarify diarrheal etiology. Routes of transmission other than drinking water Finally, this model only accounts for infection via drinking water. Additional routes of transmission (e.g., contaminated hands, objects, or food) operate in underdeveloped 156 communities. Considering these routes would decrease the apparent effectiveness of a HWT device, since these routes would affect users and nonusers of HWT alike. This model also does not account for infection transmission between individuals. Effective HWT would reduce the number of infected people, thus reducing pathogen shedding, thus indirectly preventing infection in people not using HWT. 
This would increase the apparent effectiveness of HWT, assuming imperfect compliance (Halloran et al., 1991). Although examining effectiveness in the context of multiple transmission pathways is important, it probably would not affect our general conclusions about the joint effects of compliance and LRVs on the effectiveness of HWT interventions.

4.5.2. Conclusions

Recent WHO guidelines (Sobsey & Joe Brown, 2011) provide an important framework for evaluating the health benefits of HWT devices resulting from their LRVs. However, the simulation results presented here indicate that prevention of diarrhea by HWT is limited by compliance. Thus, the classification system in the WHO guidelines incompletely informs HWT users and promoters regarding effectiveness of devices if less than 100% of drinking water is treated. For promoters of HWT, these simulation results emphasize that, when deciding which devices to use for a program, facilitating consistent, sustained use is extremely important, in addition to the antimicrobial efficacy of the device. HWT cannot greatly reduce transmission of diarrheal disease by drinking water unless compliance is high and sustained. More research is necessary to understand the full complexities of compliance, to explicitly measure compliance in intervention trials, and to incorporate compliance in development policy. This chapter provides a modeling framework that examines the impact of compliance on the effectiveness of interventions, as an initial step toward more complete consideration of compliance by researchers, policymakers, and development workers.

5. TRANSMISSION MODEL OF DIARRHEAL INFECTION

5.1. Abstract

Although quantitative microbial risk assessment (QMRA) models are useful tools for describing infectious disease risks in many different circumstances, they ignore two characteristics of particular importance for diarrheal infections in developing countries: 1) secondary transmission effects, and 2) multiple transmission routes. To further investigate 1) whether diminishing returns are seen with increasing HWT log10 reduction values (LRVs) under incomplete compliance, and 2) whether the pattern of compliance within a community affects diarrhea prevention, an environmental infection transmission system (EITS) model was constructed. The model incorporated household structure, multiple routes of transmission (including drinking water, environmental exposure, and contacts between households), and pathogen shedding by infected individuals. It also included simultaneous transmission of bacterial, viral, and protozoan pathogens. Although the model can simulate the effects of additional interventions, including sanitation, handwashing, and safe storage of treated drinking water, this work concentrates upon HWT only. A calibration step determined values for seven parameters in the EITS model that were consistent with high diarrheal incidence in developing countries. Subsequently, those parameter values were reused in estimation steps that applied HWT interventions with varying levels of antimicrobial effectiveness and varying patterns of compliance to simulated communities. The results resembled previous conclusions from QMRA models: 1) LRVs above 3 seldom prevent additional diarrhea, and 2) the pattern of compliance alters HWT effectiveness, even if overall compliance was held constant. In contrast to the QMRA models, the EITS model indicated that LRVs above 3 prevented little additional diarrhea, even when compliance was perfect.

5.2. Introduction

Infection transmission models have often been used to better understand infectious diseases, with the goal of improving the health of the population by controlling or stopping transmission (Keeling & Rohani, 2008). However, transmission models have seldom been applied to diarrheal disease in developing countries, a particularly severe and widespread public health problem. Although it is clear that clean drinking water, effective sanitation, and hygienic behavior can effectively control diarrheal infections, the best ways to achieve these goals are far from clear, given the serious resource constraints in developing countries. Modeling systems of transmission of diarrheal infections would allow simulation of various interventions, providing insight on which interventions, or combinations of interventions, might yield the largest reductions in diarrheal disease.

Environmental infection transmission system (EITS) models (Li et al., 2009) are particularly suited to modeling diarrheal infections because they can explicitly describe the diverse pathways that pathogens can take through the environment (e.g., Figure 2.7, page 77). Furthermore, substantial information is available regarding pathogen inactivation or removal by various interventions (Table 2.1, page 64), which can be simulated in an EITS model by removing the appropriate proportion of pathogens from particular compartments at appropriate times.

The QMRA model in chapter 4 described diminishing returns from increasing household water treatment (HWT) log10 reduction values (LRVs) when compliance with HWT is imperfect. However, that model did not account for any secondary transmission effects of HWT.
If an intervention is used by some households in a community, those households will benefit because some infections will be prevented; however, because complying households have fewer infections, they release fewer pathogens into the environment, and consequently non-complying households benefit also. These secondary transmission effects would prevent additional disease beyond what was measured in the QMRA model in chapter 4. They might also affect the point at which diminishing returns are seen from increasing LRVs when compliance is imperfect.

The primary goal of the EITS model described below was to determine whether increasing LRVs from HWT still leads to diminishing returns given imperfect compliance, in the context of a model with secondary transmission and multiple transmission routes. In addition, the EITS model is well-suited to exploring additional questions (which time considerations did not permit including in this chapter), such as:

• The nature of interaction between two joint interventions: under what circumstances is it positive, negative, or absent? Previously published research (see page 46 for a summary) suggests two hypotheses:
  ◦ Sanitation usually interacts positively with other interventions.
  ◦ Two non-sanitation interventions applied jointly usually interact negatively.
• How much diarrhea prevented by an intervention is due to immediate effects on those who use it, and how much is due to indirect effects, since healthier people shed fewer pathogens, reducing risk to the community as a whole?

5.3. Materials and methods

5.3.1. General description of the model

An EITS model was programmed using MATLAB software (version 7.13, R2011b) that simulated a small isolated community in a developing country (parameter values for the model are summarized in chapter 7 and Table 7.1, page 230). The community consisted of 200 households with an average of 5 people per household.
The community gathered their drinking water from a surface water source (Figure 5.1). The system had four types of compartments containing pathogens: 1) the land where the community resided; 2) the surface water from which the community obtained drinking water; 3) stored drinking water within each household; and 4) the household environment, which represented the hands of household members as well as fomites and surfaces within the household. Since each household (Figure 5.2) contained a stored drinking water compartment and a household environment compartment, and there was only one land compartment and one surface water compartment in the community, each model run contained 2h + 2 distinct compartments, where h is the number of households in the community. Each compartment contained pathogens of three distinct types: bacteria, viruses, and protozoa.

Each household contained variable numbers of people, who could be either young children less than five years old (18% of the population), or children/adults aged 5 years or more (82% of the population). These two types of people are called 'children' and 'adults' for brevity. The only source of pathogens was infected people, who contaminated the community's land and their household environment through defecation. Pathogens gradually moved from land into surface water; randomly scheduled rainfall events increased the rate of this transfer. Households were randomly connected with one another, allowing exchange of pathogens between households (Figure 5.1). All pathogens were attenuated exponentially over time in all compartments, representing a variety of processes that can remove pathogens completely from the system (e.g., inactivation; sedimentation; transport in flowing water; percolation below the soil surface; etc.).

Figure 5.1. Simplified overview of the simulated community
Each simulated community contained about 200 households. Each household contained a group of people and two compartments for pathogens: 1) a container for stored water, and 2) a more abstract household environment compartment representing pathogens that were available for ingestion on hands, objects, and surfaces (Figure 5.2). Stored water was collected at the beginning of each day from the community's surface water; the stored water could subsequently be contaminated by pathogens in the household environment. Four types of interventions could operate: household water treatment (HWT) destroyed pathogens in the stored water immediately after collection, safe storage prevented recontamination of stored water by hands, handwashing prevented contamination of hands after defecation, and sanitation prevented contamination of land by defecation. Households could comply perfectly, incompletely, or not at all with these interventions. Each day, each person ingested variable doses of three pathogen types (bacteria, viruses, and protozoa, represented by diarrheagenic Escherichia coli, rotavirus, and Giardia duodenalis) from their household's stored drinking water, their hands, and the land outside the household.

Figure 5.2. Structure of each household within the simulated community
The four transfer calibration parameters marked by '*' influence movement of pathogens between compartments. There are also three attenuation calibration parameters describing attenuation of the three pathogen types in all compartments; they are not shown in this figure, but see Table 5.1 (page 169).

The model used discrete timesteps of one day.
Each day included four types of events: 1) contamination events, which represented defecation and were the only source of new pathogens in the system; 2) pathogen transfers, which described movement of pathogens between compartments; 3) exposure events, which described ingestion of pathogens by people and the subsequent possibility of new infections (i.e., transfers of pathogens from compartments to people); and 4) pathogen attenuation, which described removal of infectious pathogens from the system in each compartment over time (corresponding to inactivation, sequestration, or any other means that could render pathogens unable to ever contact a host). Since many of the parameter values describing these events are highly uncertain, seven abstract 'calibration parameters' (see Figure 5.2, page 163, and Tables 5.1 and 5.2, page 169) were varied between model runs in a calibration step consisting of many thousands of model runs; values of these parameters that yielded acceptable results were retained for use in subsequent estimation steps.

The events occurred in a particular sequence, as shown in Figure 5.3. To avoid arbitrarily biasing the doses of pathogens ingested by people, exposure events happened at a variable time each day. This meant that pathogens introduced into the system at the beginning of each day would be attenuated for some random fraction of a day before they were ingested, introducing additional variability into the doses that people received.

Figure 5.3. Daily progression of the EITS model of diarrheal infections

Summary of the steps in each model run

The events marked in Figures 5.1-5.3 are explained in more detail below; events that include calibration parameters are marked with a *.

1. There were two contamination events representing defecation. The total number of pathogens excreted per infected person is determined by the grams of feces excreted per person per day multiplied by the number of pathogens per gram of feces. Adults excrete more pathogens than children due to higher fecal output, and symptomatic people likewise excrete more pathogens than asymptomatic people due to higher fecal output. The pathogens excreted by a person are distributed as follows:
   a) Contamination of the land (Fl)
      • Most of the pathogens excreted by infected individuals follow this route.
   b) Contamination of the person's household environment (Fh); summarizes hand, surface, and fomite contamination related to anal cleansing
      • A small proportion of the pathogens excreted by infected individuals follow this route.
2. There are four main pathogen transfers:
   a) Land to surface water (R); summarizes runoff due to rainfall events, soil erosion, people washing or bathing in the water, etc.
      • This event is defined by daily rates of pathogen transfer from land to water. There are two rates: a lower one for non-rainy days and a higher one for rainy days (rain events occur randomly, 14 days apart on average).
   b) * Surface water to stored drinking water (Se and Sf), representing daily resupply of stored water by each household
      • All pathogens remaining in each stored drinking water compartment at the beginning of each day are transferred onto the land (Se); this is a very small transfer, but is included for completeness.
      • Each household's stored drinking water is refilled directly from surface water at the beginning of each day (Sf). The amount of pathogens transferred is determined by multiplying the pathogens in surface water by a calibration parameter (CPSf); this is analogous to dilution.
   c) * Household environment to stored water (H)
      • The transfer of pathogens from each household's environment to its stored drinking water was determined through multiplication by a calibration parameter (CPH&Dh); event 3b (below) also used CPH&Dh.
   d) * Transfer of pathogens by inter-household visits (V), representing pathogens carried between households on hands, fomites, food, etc.
      • A set of two-way links between households was randomly created at the beginning of each model run. Each day, a subset of these links was randomly selected, signifying visits occurring that day. Each visit represented a two-way transfer of pathogens between the households' environment compartments. Transfer of pathogens from household A to household B was: (number of pathogens in household A's environment) / (number of people in household A) × (calibration parameter CPV denoting proportion of pathogens transferred). Household B simultaneously transferred pathogens to household A in the same manner.
3. There were four exposure events, where susceptible people may become exposed and subsequently develop infection or disease.
   a) Drinking water ingestion (Dw)
      • Determined by: (liters of water consumed daily) × (number of pathogens in stored drinking water) / (volume of stored drinking water container).
   b) * Ingestion of pathogens from the contaminated household environment (Dh)
      • The number of pathogens in the household environment multiplied by a calibration parameter (CPH&Dh; see also event 2c above).
   c) * Ingestion of pathogens from land (Dl); summarizing people playing or working in soil, consumption of locally grown food, etc.
      • The number of hand-mouth events per day multiplied by a calibration parameter (CPDl). Each person's dose was subtracted from the outside environment.
   d) Baseline exposure (B)
      • Each day, people were randomly chosen to become exposed to each of the three pathogen types, independently from the doses they had received. This simulated importation of infection.
4. Exponential attenuation of all pathogens in all compartments:
   a) * Decay of pathogens in all compartments was based on published rates in raw water, multiplied by 3 independent calibration parameters (one for each pathogen type; CPatten).

Table 5.1. Summary of calibration parameters

| Calibration parameter | Description |
|---|---|
| Transfer calibration parameters | |
| CPSf | Transfer of pathogens from surface water to stored drinking water (analogous to dilution of pathogens in surface water) |
| CPH&Dh | Transfer of pathogens from the household environment to: 1) stored water; or 2) people, who then ingest them |
| CPV | Transfer of pathogens between the household environment compartments of a pair of households |
| CPDl | Transfer of pathogens from land to people (who then ingest them) |
| Attenuation calibration parameters | |
| CPatten (bacteria) | Daily exponential inactivation/removal rate in all compartments, bacteria |
| CPatten (viruses) | Daily exponential inactivation/removal rate in all compartments, viruses |
| CPatten (protozoa) | Daily exponential inactivation/removal rate in all compartments, protozoa |

Different calibration parameter values were sampled from uniform distributions (Table 5.2) for each calibration run of the model.

Table 5.2. Ranges over which calibration parameters were sampled

| Calibration parameter | Lower limit | Upper limit |
|---|---|---|
| Transfer calibration parameters (proportions of pathogens transferred) | | |
| CPSf | $10^{-6}$ | $10^{-2.4} \approx 4.0 \times 10^{-3}$ |
| CPH&Dh | $10^{-5}$ | $10^{-1.5} \approx 0.032$ |
| CPV | $10^{-5}$ | $10^{-0.5} \approx 0.32$ |
| CPDl | $10^{-11.5} \approx 3.2 \times 10^{-12}$ | $10^{-6}$ |
| Attenuation calibration parameters (daily rates of inactivation, removal, etc.) | | |
| CPatten (bacteria) | $10^{-0.5} \approx 0.32$ | $10^{2}$ |
| CPatten (viruses) | $10$ | $10^{3}$ |
| CPatten (protozoa) | $10^{0.6} \approx 4.0$ | $10^{3}$ |

Calibration parameters were sampled from uniform distributions on the log10 scale. These ranges were determined empirically before calibration; see page 182 for further explanation.

5.3.2. Technical description of the model

The mechanics of the model are described in detail in the flowchart below (Figure 5.4).

Figure 5.4. Flowchart of the operations of the EITS model
Each numbered step in this flowchart is described in detail below.
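Returning briefly to the calibration parameters of Tables 5.1 and 5.2, the sampling scheme (uniform on the log10 scale, one fresh draw per calibration run) can be sketched in a few lines. The sketch below is written in Python purely for illustration and is not the MATLAB code used for this work; the dictionary keys and the function name are hypothetical, and the exponent ranges for the virus and protozoa attenuation parameters reflect one reading of Table 5.2 that should be checked against the table itself.

```python
import random

# Exponent ranges (log10 scale) read from Table 5.2 as (lower, upper).
# Assumption: the virus and protozoa rows follow our reading of the table layout.
LOG10_RANGES = {
    "CP_Sf": (-6.0, -2.4),
    "CP_H&Dh": (-5.0, -1.5),
    "CP_V": (-5.0, -0.5),
    "CP_Dl": (-11.5, -6.0),
    "CP_atten_bacteria": (-0.5, 2.0),
    "CP_atten_viruses": (1.0, 3.0),
    "CP_atten_protozoa": (0.6, 3.0),
}


def sample_calibration_parameters(rng):
    """Draw one value per parameter, uniform in the exponent (log-uniform)."""
    return {
        name: 10.0 ** rng.uniform(low, high)
        for name, (low, high) in LOG10_RANGES.items()
    }


# One hypothetical calibration run; the seed is arbitrary.
rng = random.Random(1)
params = sample_calibration_parameters(rng)
for name, value in params.items():
    print(f"{name}: {value:.3g}")
```

Sampling the exponent rather than the value itself spreads the draws evenly across orders of magnitude, which matters for ranges such as that of CPDl, which spans more than five orders of magnitude.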
Step 1: Enter parameters

Each simulation run began by reading in 53 parameters, 7 of which were calibration parameters that were randomly varied during the calibration step (Tables 5.1 and 5.2, page 169), and 37 of which were obtained from published scientific literature. The remaining parameters were based on expert opinion. Further discussion regarding choice of particular (non-calibration) parameters is in chapter 7, and the parameter values are summarized in Table 7.1, page 230.

Step 2: Create random number tables and output logs

All random numbers needed for the model run were pre-generated when the model run began (which reduced processing time). Matrices were also generated to store output from the model.

Step 3: Set up the simulated community

Assignment of children and adults to households

Simulated communities contained n households (n ≈ 200), each containing a certain number of people randomly drawn from a Poisson distribution whose mean was 5 people/household. If a household was assigned 0 people, it was discarded; thus, the number of households in the simulated community varied slightly between runs. Children and adults were randomly assigned to households as follows:

1. The total number of adults in the community was determined by multiplying the number of people in the community (~1000) by the proportion of the community consisting of people aged five years or older (0.82) (Ayad et al., 1994); see page 228 for further discussion.
2. One person in each household was designated an adult.
3. Each remaining adult was randomly assigned to a household that was not already filled with adults.
4. Once all adults had been assigned to households, the remaining people were considered children under five years of age.

Assignment of compliance to households

Compliance with HWT, handwashing, and sanitation was considered to be a characteristic of the household, not a characteristic of each person.
The community had a particular compliance level and compliance type that were designated at the beginning of each model run. Compliance was described by overall compliance (proportion of person-time spent complying) as well as compliance type (α, households complied perfectly or not at all; β, households complied perfectly, partially, or not at all; γ, all households complied partially; for a more detailed description, see section 4.3.1, page 124). Using these compliance parameters for the community, each household was randomly assigned a compliance level for each intervention, describing the proportion of time that household used it. In some simulations that included HWT, safe storage was applied to all households complying perfectly or partially with HWT, since safe storage is a particular characteristic of some HWT interventions.

Data structures for tracking household and person characteristics

Household composition and compliance were stored using a household matrix with one row per household, and a people matrix with one row per person (Table 5.3). Children were at the top of the people matrix, and adults were at the bottom. These matrices also explicitly tracked the amounts of pathogens in household compartments, as well as infection status of individual people. The household matrix contained information about household-level compartments of microbes. The people matrix contained information about infection state (-1 = susceptible; 0 = immune; 1 = exposed; 2 = infected; 3 = diseased), the number of days remaining in the infection state (if susceptible, these are negative), and the row in the household matrix that each person belonged to. Community-level compartments of microbes were tracked separately (by 3-element vectors); 'nMw' is the number of microbes in the water reservoir, and 'nMl' is the number of microbes on the land.

Table 5.3. Description of matrices tracking households and people

| Column number | Household matrix ('HHs') | People matrix ('People') |
|---|---|---|
| 1 | Number of people in the household | Infection state (bacteria) |
| 2 | Number of adults (aged >= 5 years) | Infection state (viruses) |
| 3 | Number of children (aged < 5 years) | Infection state (protozoa) |
| 4 | Number of bacteria in household environment* | Time counter† (bacteria) |
| 5 | Number of viruses in household environment* | Time counter† (viruses) |
| 6 | Number of protozoa in household environment* | Time counter† (protozoa) |
| 7 | Number of bacteria in household's water | Household that the person belongs to (row number in 'HHs') |
| 8 | Number of viruses in household's water | Number of people in the person's household |
| 9 | Number of protozoa in household's water | Unused‡ |
| 10 | Unused‡ | Unused‡ |
| 11 | Unused‡ | Unused‡ |
| 12 | Unused‡ | Unused‡ |
| 13 | Compliance with sanitation (proportion) | Unused‡ |
| 14 | Compliance with HWT (proportion) | Unused‡ |
| 15 | Compliance with handwashing (proportion) | Unused‡ |
| 16 | Unused‡ | Unused‡ |
| 17 | Whether household has safe storage (binary) | Unused‡ |

Infection states: -1 = susceptible; 0 = immune; 1 = exposed; 2 = infected; 3 = diseased
* The only sources of microbes on hands were from a) defecation; or b) inter-household visits.
† Each time counter was an integer denoting the number of days remaining in a particular person's infection state. Each day, 1 was subtracted from all time counters. When the time counter reached 0, the infection state changed. If the infection state was 'susceptible', the time counter was negative and denoted the number of days a person had spent susceptible to a particular pathogen. See page 178 for more detail.
‡ Unused rows were reserved for future expansion of the model, or were emptied as the design of the model changed during its development.

Generating connections between households

Households were connected by a randomly generated network to allow pathogens to be directly exchanged between households.
An Erdős–Rényi random network was generated using the 'erdrey' function from the CONTEST toolbox for MATLAB (A. Taylor & Higham, 2008), using an estimate of mean network degree (number of connections per household) observed in rural Ecuadorian villages (Zelner et al., 2012).

Initial infection status

Initially, all persons in the village were classified as exposed; this exposure lasted for a randomly determined period from 0-18 days for each person, after which they developed infection or disease. This prevented the development of large oscillations in infection prevalence, in the same fashion as in the QMRA models in the two preceding chapters (see page 96).

Step 4: Start daily loop and tally people in all states

Once the community had been generated, the first simulated day in the community could begin. The numbers of people in each infection state for each pathogen (exposed, infected, diseased, immune, or susceptible) were tallied and stored. If the tallies did not agree with the number of people in the simulated community, an error would occur and the simulation would end.

Step 5: Defecation

People who were infected with a pathogen excreted pathogens into the environment. For each pathogen type and each infected person:

$$P_e = f d n \qquad (5.1)$$

where $P_e$ is the number of pathogens excreted, $f$ is grams of feces excreted daily by a person without diarrhea, $d$ is the number of defecation events per day (which depends on whether the person is symptomatic [3 events] or asymptomatic [1 event]), and $n$ is the number of pathogens per gram of feces. Most of these pathogens were deposited on the land, but a small proportion entered the hands compartment for that person's household:

$$P_h = P_e h \qquad (5.2)$$

where $P_h$ is the number of pathogens added to hands by a particular person, and $h$ is the proportion of feces that remain on the hands (1/1000). The number of pathogens deposited on land by a particular person is then $P_l = P_e - P_h$.
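Equations 5.1 and 5.2 amount to a simple partition of each infected person's daily excretion between hands and land, which the following Python sketch illustrates. It is not the model's MATLAB code: the function and constant names are hypothetical, and the example values for fecal mass and pathogen density are placeholders rather than values from Table 7.1. Only the hand fraction (1/1000) and the numbers of defecation events (3 or 1) come from the text.

```python
HAND_FRACTION = 1 / 1000  # h in Equation 5.2, as stated in the text
EVENTS_PER_DAY = {"symptomatic": 3, "asymptomatic": 1}  # d in Equation 5.1


def excretion_split(feces_g, pathogens_per_g, status):
    """Return (P_e, P_h, P_l) for one infected person and one pathogen type."""
    d = EVENTS_PER_DAY[status]
    p_e = feces_g * d * pathogens_per_g  # Equation 5.1
    p_h = p_e * HAND_FRACTION            # Equation 5.2: share left on hands
    p_l = p_e - p_h                      # remainder deposited on land
    return p_e, p_h, p_l


# Placeholder inputs: 100 g of feces and 1e6 pathogens per gram are illustrative only.
p_e, p_h, p_l = excretion_split(feces_g=100.0, pathogens_per_g=1e6, status="symptomatic")
print(p_e, p_h, p_l)
```

Because $P_l$ is computed as a remainder, the two deposits always sum to $P_e$, so no pathogens are created or lost at this step.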
If sanitation or handwashing interventions were included in the model run, LRVs were then applied to $P_l$ and $P_h$ according to sanitation and handwashing compliance, respectively, in the same way as HWT interventions (see step 8 below, Equation 5.4).

Step 6: Transfer pathogens from land to surface water

Each day, a small proportion (0.001) of pathogens was transferred from land to surface water, representing movement of pathogens by processes such as soil erosion, groundwater movement, laundering or defecating directly into water, etc. If a rain event occurred on a given day (probability of 1/14), a higher proportion (0.05) of pathogens was transferred from land to surface water. These proportions probably vary greatly depending on climate and hydrogeology, and were chosen by expert opinion, due to the absence of data; see page 228 for further discussion.

Step 7: Transfer pathogens between households via visits

Conceptually, each visit constitutes a contact between 2 people from different households, each of whom gives a proportion of their pathogens to the other. A subset of the possible connections between households (as determined in step 3, page 171) was randomly chosen, based on a daily probability (2/7, i.e., twice weekly on average) of a contact occurring for each connection. It was possible for a household to visit more than one other household per day. For each visit and each pathogen type, the number of pathogens transferred from one particular household to one other household ($P_t$) was:

$$P_t = CP_V \times P_b / N \qquad (5.3)$$

where $CP_V$ is a calibration parameter representing the proportion of pathogens transferred between households during a visit, $P_b$ is the number of pathogens in the household environment before the contact, and $N$ is the number of people in the household.

Step 8: Resupply stored water & apply household water treatment (HWT)

The first event to occur each day was resupplying the stored water within each household. For simplicity, any pathogens remaining in the stored water compartment were transferred to the land, and then a proportion of pathogens in the surface water (calibration parameter CPSf) was transferred into each household's stored water compartment. Log10 reduction values (LRVs) were then applied to the pathogens in the stored water of each household, depending on the household's compliance with HWT:

$$P_t = P_b (1 - c) + P_b \, c \, 10^{-L} \qquad (5.4)$$

where $P_t$ is the number of pathogens remaining after treatment, $P_b$ is the number of pathogens before treatment, $c$ is the proportion of pathogens treated by the household, and $L$ is the LRV of the treatment method. The value of $c$ for each household was determined at the beginning of each model run, based on the overall compliance level and the compliance type within the simulated community (see step 3, page 171). Equation 5.4 was also used to calculate pathogen removal or inactivation from handwashing or sanitation (described in step 5, page 174).

Step 9: Pathogen transfer from household environment to drinking water

Pathogens within each household environment compartment could be transferred to its stored drinking water compartment, simulating recontamination of water when people remove it from the storage container:

$$P_t = CP_{H\&Dh} \times P_b S \qquad (5.5)$$

where $P_t$ is the number of pathogens transferred, $CP_{H\&Dh}$ is a calibration parameter describing the proportion of pathogens transferred from hands to water, $P_b$ is the number of pathogens in the household environment before the transfer, and $S$ equals 0 if the household uses safe storage, and 1 otherwise. Thus safe storage completely blocked the transfer of pathogens from the household environment to stored drinking water.

Step 10: First attenuation of pathogens

Pathogens in all compartments were exponentially attenuated (i.e., completely removed from the system) according to a particular rate r for each pathogen type.
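Before turning to how attenuation was timed, the treatment and recontamination rules of steps 8 and 9 (Equations 5.4 and 5.5) can be illustrated numerically. The following Python sketch is not the model's MATLAB code, and the function names are hypothetical; it simply evaluates the two equations for a unit quantity of pathogens across a few compliance levels and LRVs chosen for illustration.

```python
def pathogens_after_hwt(p_before, compliance, lrv):
    """Equation 5.4: the untreated share passes through; the treated share is cut by 10**-lrv."""
    untreated = p_before * (1.0 - compliance)
    treated = p_before * compliance * 10.0 ** (-lrv)
    return untreated + treated


def recontamination(cp_h_dh, p_household_env, safe_storage):
    """Equation 5.5: transfer from household environment to stored water (S = 0 with safe storage)."""
    s = 0 if safe_storage else 1
    return cp_h_dh * p_household_env * s


# Fraction of pathogens remaining for illustrative compliance levels and LRVs.
for c in (0.8, 0.99, 1.0):
    row = [pathogens_after_hwt(1.0, c, lrv) for lrv in (1, 2, 3, 4, 6)]
    print(c, ["%.6f" % frac for frac in row])
```

With 99% compliance, the fraction remaining is about 0.011 at an LRV of 3 and about 0.010 at an LRV of 6: the untreated share $1 - c$ sets a floor that no LRV can lower. This arithmetic underlies the diminishing returns examined in this chapter, although the health outcomes in the model also depend on dose response and on transmission between people.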
Since attenuation affected the doses of pathogens received by people (Figure 5.3, page 165), it was initially applied over a random proportion $x$ of the day, after which people ingested doses of pathogens:

$$P_r = P_b e^{-rx} \qquad (5.6)$$

where $P_r$ is the number of pathogens remaining in a particular compartment after decay, and $P_b$ is the number of those pathogens in that compartment before decay. The daily attenuation rate $r$ was given by three values of a calibration parameter (CPatten), one value for each pathogen type. CPatten was the only calibration parameter that took different values for each pathogen type.

Steps 11 & 12: Decrement all status counters and apply status shifts

At all times, each person had a particular infection status for each pathogen type: susceptible, exposed, infected, diseased, or immune. A status counter measured the number of days remaining for that status. Every day, 1 was subtracted from all status counters. When a status counter reached 0, that person transitioned to the next state: exposed → infected (possibly diseased) → immune → susceptible (Table 5.3). When a person transitioned from exposed to infected or diseased, a duration of infectiousness was randomly chosen from a distribution. The durations of the immune and exposed states were fixed (for parameter values, see Table 7.1, page 230; but recall that the initial exposure duration varied; step 3, page 171).

Step 13: Calculation of daily pathogen doses and dose response

All people ingested daily doses of pathogens, removing those pathogens from the system. If a person was susceptible to a pathogen, they might become exposed to that pathogen as a result of that day's dose. The dose of each pathogen type was the sum of three component doses: pathogens from stored drinking water, pathogens from land, and pathogens from the household environment.
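The two-stage attenuation introduced in step 10 (Equation 5.6, completed over the remainder of the day in step 15) can be checked with a short numerical sketch. The Python code below is illustrative only and is not the model's MATLAB code; the function name, the starting count, and the rate of 2 per day are hypothetical choices, and the sketch ignores the pathogens that people remove by ingestion between the two stages.

```python
import math
import random


def attenuate(p_before, rate_per_day, fraction_of_day):
    """Equation 5.6 applied over a fraction of one day."""
    return p_before * math.exp(-rate_per_day * fraction_of_day)


rng = random.Random(42)   # arbitrary seed
x = rng.random()          # random proportion of the day before exposure

p_start = 1.0e6           # hypothetical pathogen count in one compartment
r = 2.0                   # hypothetical daily attenuation rate

p_at_exposure = attenuate(p_start, r, x)             # first attenuation (step 10)
p_end_of_day = attenuate(p_at_exposure, r, 1.0 - x)  # second attenuation (step 15)

print(x, p_at_exposure, p_end_of_day)
```

Whatever value $x$ takes, the two stages multiply to $e^{-r}$, so (ignoring ingestion) the end-of-day count is unaffected by the timing of exposure; only the dose available at the moment of exposure varies with $x$.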
Pathogens from stored water

Each person's dose of each pathogen type from stored water ($D_w$) was determined as follows:

$$D_w = \frac{P_w\,I}{V} \tag{5.7}$$

where $P_w$ is the number of pathogens in the household's stored water, $I$ is the amount of stored water ingested in liters per day (which is higher for adults and lower for children), and $V$ is the volume (in liters) of the household's stored drinking water container.

Pathogens from land

Each day, people ingest a dose $D_l$ of pathogens from land, with children ingesting more pathogens than adults:

$$D_l = \begin{cases} \mathrm{CPDl} \times P_l\,(Z_c/Z_a) & \text{for children} \\ \mathrm{CPDl} \times P_l & \text{for adults} \end{cases} \tag{5.8}$$

where CPDl is a calibration parameter representing the proportion of pathogens on the land ingested daily; $P_l$ is the number of pathogens on the land; and $Z_c$ and $Z_a$ are the numbers of daily hand-mouth contacts for children and adults (respectively 328/day and 130/day; USEPA, 2011). $D_l$ could be considered to include pathogens from the environment outside the household, including soil and locally grown food.

Pathogens from the household environment

A proportion of the pathogens in a household's environment at the time of dose calculation was ingested by its occupants. Children ingested more of those pathogens than adults, as described immediately above. The dose $D_h$ received in this manner by a particular occupant was:

$$D_h = \begin{cases} \mathrm{CPH\&Dh} \times P_h\,(Z_c/Z_a) & \text{for children} \\ \mathrm{CPH\&Dh} \times P_h & \text{for adults} \end{cases} \tag{5.9}$$

where CPH&Dh is a calibration parameter representing the proportion of pathogens in the household environment ingested daily; and $P_h$ is the number of pathogens in the household environment.

Dose response: converting pathogen doses to exposure and infection

Each of the three doses $D_w$, $D_l$, and $D_h$ was then removed from the appropriate compartment (stored drinking water, land, and household environment), and the three doses were summed for each person and each pathogen to obtain the total dose of each pathogen received by each person.
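The three component doses in Equations 5.7 to 5.9 can be sketched in Python as follows. The function names and all numerical inputs in the example (pathogen counts, intake, container volume, and calibration parameter values) are hypothetical; only the hand-mouth contact frequencies come from the text. The sketch computes the doses but does not remove them from the compartments, which the model also did.

```python
Z_CHILD = 328  # daily hand-mouth contacts, children (USEPA, 2011)
Z_ADULT = 130  # daily hand-mouth contacts, adults (USEPA, 2011)


def contact_multiplier(is_child):
    """Children ingest Zc/Za times the adult dose from land and household."""
    return Z_CHILD / Z_ADULT if is_child else 1.0


def dose_water(p_water, intake_l, container_l):
    """Equation 5.7: dose from stored drinking water."""
    return p_water * intake_l / container_l


def dose_land(p_land, cp_dl, is_child):
    """Equation 5.8: dose from land."""
    return cp_dl * p_land * contact_multiplier(is_child)


def dose_household(p_house, cp_hdh, is_child):
    """Equation 5.9: dose from the household environment."""
    return cp_hdh * p_house * contact_multiplier(is_child)


def total_dose(p_water, p_land, p_house, intake_l, container_l,
               cp_dl, cp_hdh, is_child):
    """Sum of the three component doses for one person and one pathogen type."""
    return (dose_water(p_water, intake_l, container_l)
            + dose_land(p_land, cp_dl, is_child)
            + dose_household(p_house, cp_hdh, is_child))


if __name__ == "__main__":
    # Hypothetical inputs: 2000 pathogens in a 20 L container, 1e8 on the land,
    # 1e5 in the household environment; illustrative calibration values.
    for is_child, intake in ((True, 1.0), (False, 2.0)):
        print(is_child, total_dose(2000, 1e8, 1e5, intake, 20.0,
                                   1e-7, 1e-4, is_child))
```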
Dose response functions were then used to convert the doses into probabilities of infection, in the same manner as in chapters 3 and 4 (Equations 3.2 and 3.3, page 98). Based on the resulting probabilities of infection, it was randomly determined whether each person who was susceptible to a particular pathogen type became exposed to that pathogen. Any person who became exposed would eventually develop infection (see steps 11 and 12, page 178), provided the model run did not end first.

Step 14: Assignment of baseline exposures

In order to force the model to have a minimum non-zero incidence of diarrhea, baseline exposures were assigned. All people had an equal daily probability per pathogen type $S_p$ of being selected for a baseline exposure. If a person was selected and was also susceptible, their status was changed to exposed. If a non-susceptible person was selected, nothing occurred.

$$S_p = \frac{T_p\,B_c}{365\,M_p} \tag{5.10}$$

where $T_p$ is the proportion of the baseline incidence in children attributable to that pathogen type (0.5 for bacteria, 0.25 each for viruses and protozoa, based on etiologic fractions discussed in chapter 4, page 129), $M_p$ is the morbidity ratio in children for that pathogen type, and $B_c$ is the baseline incidence of diarrhea in children from all pathogen types. The value for $B_c$ was assumed to be 0.5 episodes per child-year, since measurements of diarrheal incidence in developed countries are often roughly 0.5 to 1.5 episodes per person-year (M. E. Wilson, 2005), and the best possible outcome of various interventions in a developing country community would be to reduce diarrheal incidence to levels found in developed countries.

Step 15: Second inactivation of pathogens

Pathogens in all compartments were then inactivated over the remainder of the day (1 - x) in the same manner as in step 10 (Equation 5.6, page 177).

Step 16: Continue to next day, or end simulation

Each model run was scheduled to terminate after a certain number of days.
Once pathogens had been inactivated over the final portion of the day, the simulation terminated if the required number of days had elapsed, or continued to the next day (see step 4, page 174) if not.

5.3.3. Calibration and estimation

All of the parameters used in this model are substantially uncertain, and communities vary across many characteristics that are difficult to measure, but could nonetheless influence the transmission of diarrheal infections. In the calibration step, the model was run many times while varying the values of the calibration parameters (Table 5.2, page 169). If a model run's output was consistent with certain calibration criteria chosen a priori (Table 5.4), its calibration parameter values were retained and used for estimation later.

The calibration criteria (Table 5.4) considered incidence and etiologic fractions in children in developing countries aged less than five years. Observational studies of diarrhea in developing countries show that the incidence of diarrhea has a wide range, depending on the country and the community (Kosek et al., 2003); a relatively high incidence of childhood diarrheal disease (6-12 episodes per child-year) was chosen because the calibration step is meant to simulate conditions in a very poorly developed community with essentially no hygienic or sanitary infrastructure. Studies of diarrheal etiology also indicate that bacterial diarrhea is generally more common than viral or protozoan diarrhea; see page 19 for further discussion (Lanata & W. Mendoza, 2002). However, the importance of differing transmission routes is unknown; therefore, model runs consistent with the first two criteria were subdivided by the relative importance of transmission via stored drinking water, so that estimation could be conducted for varying scenarios of high, medium, and low waterborne transmission.
The sets of calibration parameter values that fit the calibration criteria (Table 5.4) can be considered to represent a variety of differing communities. In the estimation step, differing interventions can be applied to these communities at a variety of compliance levels to determine the likely effect of these interventions on diarrheal disease.

Table 5.4. Criteria for calibrating the transmission model

| Description | Units | Criteria |
|---|---|---|
| Incidence of childhood* diarrheal disease | episodes per child-year* | 6 to 12 |
| Etiologic fractions | proportion of childhood* diarrheal episodes by pathogen type | 32.5 to 62.5% bacterial; 7.5 to 37.5% viral; 7.5 to 37.5% protozoan |
| Route importance | proportion of childhood* diarrheal episodes from drinking water | Mostly water: >2/3 to 100%; Water & nonwater similar: 1/3 to 2/3; Mostly nonwater: 0 to <1/3 |

See Table 5.1 (page 169) for descriptions of the 7 calibration parameters that were varied to obtain these values during the calibration process.
*Children aged less than 5 years.

Determination of calibration parameter ranges

The calibration process used a single set of ranges for the calibration parameters (Table 5.2, page 169). The ranges were determined by running a series of mock calibration processes consisting of about 30,000 runs each. The first mock calibration process used extremely wide ranges of calibration parameter values (7 to 9 orders of magnitude). For subsequent processes, these ranges were gradually narrowed, one calibration parameter at a time, guided by the distributions of calibration runs within each calibration parameter range that were consistent with the calibration criteria (Table 5.4). For example, if CPSf was varied from $10^{-9}$ to $10^{-2}$, and no model runs whose results were consistent with the calibration criteria were seen from $10^{-9}$ to $10^{-8}$, CPSf might be varied from $10^{-8}$ to $10^{-2}$ next. Random sampling of all calibration parameters was carried out as if they were uniformly distributed on the log10 scale.
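Sampling a calibration parameter uniformly on the log10 scale can be sketched in Python as follows. The function names, the fixed seed, and the `consistent_span` helper are assumptions made for illustration; in particular, the helper reduces the narrowing step to the smallest and largest values among consistent runs, whereas the actual narrowing was guided by the whole distribution of consistent runs.

```python
import math
import random


def sample_log_uniform(low, high, rng):
    """Draw one value uniformly distributed on the log10 scale between low and high."""
    return 10.0 ** rng.uniform(math.log10(low), math.log10(high))


def consistent_span(values):
    """Smallest and largest sampled values among runs that met the criteria."""
    return min(values), max(values)


if __name__ == "__main__":
    rng = random.Random(1)  # fixed seed so the sketch is repeatable
    # About 30,000 draws, as in one mock calibration process, over 10^-9 to 10^-2.
    draws = [sample_log_uniform(1e-9, 1e-2, rng) for _ in range(30000)]
    # On the log10 scale, about half of the draws fall below the midpoint 10^-5.5.
    below_midpoint = sum(d < 10.0 ** -5.5 for d in draws) / len(draws)
    print(min(draws), max(draws), below_midpoint)
```

Sampling on the log10 scale gives each order of magnitude the same chance of being drawn, which matters when a range spans 7 to 9 orders of magnitude: uniform sampling on the natural scale would place almost every draw in the top decade.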
The calibration process was done once, producing three mutually exclusive sets of parameters: 1) a set consistent with predominantly (> 2/3) waterborne transmission; 2) a set with similar waterborne and nonwaterborne transmission; and 3) a set consistent with predominantly (> 2/3) non-waterborne transmission. Those sets of parameters were then used as the basis for estimation scenarios in which HWT interventions were applied to the simulated community.

All runs of the EITS model were carried out in MATLAB 7.13 (R2011b; 64 bit). The program code for the EITS model was originally developed using Octave 3.2.3, but it runs on either Octave or MATLAB without modifying the code. Each run of the model took about 16 seconds in MATLAB; for comparison, each run required about 30 seconds in Octave. Output from the model was analyzed with R 2.15.1.

5.4. Results

5.4.1. Calibration

The calibration step included 228,480 model runs. Calibration runs whose output was consistent with the criteria in Table 5.4 were considered to represent distinct communities with differing characteristics that were summarized by their calibration parameter values. There were 1728 (0.756%) consistent runs; of these, 963 (55.7%) had < 1/3 waterborne transmission, 228 (13.2%) had 1/3 to 2/3 waterborne transmission, and 537 (31.1%) had > 2/3 waterborne transmission.

Charts of calibration output are presented in Figures 5.5 and 5.6. Incidence of diarrhea in children was highly variable for all transfer calibration parameter values, ranging from roughly 0.5 to 40 episodes per child-year (Figure 5.5). The proportion of cases of diarrhea attributable to waterborne transmission increased as CPSf increased or as CPDl decreased (Figure 5.5a and d, and Figure 5.6d). Bacterial pathogens generally had lower attenuation rates (CPatten) than viral or protozoan pathogens, though this was not universal (Figure 5.6a).
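The screening of calibration runs against the criteria in Table 5.4, and their division into the three route categories, can be sketched in Python as follows. The function names are hypothetical, and treating the range limits as inclusive is an assumption; the group sizes in the example are the values of n reported in Table 5.5.

```python
def meets_criteria(incidence, frac_bacterial, frac_viral, frac_protozoan):
    """Incidence and etiologic fraction criteria from Table 5.4."""
    return (6.0 <= incidence <= 12.0
            and 0.325 <= frac_bacterial <= 0.625
            and 0.075 <= frac_viral <= 0.375
            and 0.075 <= frac_protozoan <= 0.375)


def route_category(frac_waterborne):
    """A: > 2/3 waterborne; B: 1/3 to 2/3 waterborne; C: < 1/3 waterborne."""
    if frac_waterborne > 2 / 3:
        return "A"
    if frac_waterborne >= 1 / 3:
        return "B"
    return "C"


if __name__ == "__main__":
    total_runs = 228480
    groups = {"A": 537, "B": 228, "C": 963}  # n per category, from Table 5.5
    consistent = sum(groups.values())
    # Share of all calibration runs that met the criteria, in percent.
    print(consistent, round(100 * consistent / total_runs, 3))
    # Share of consistent runs in each route category, in percent.
    for name, n in groups.items():
        print(name, n, round(100 * n / consistent, 1))
```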
The distribution of diarrheal incidence in children over all the calibration runs was multimodal (Figure 5.6b) with peaks near 0, 9, and 20 episodes per child-year; the last two peaks roughly correspond with the maximum incidence levels attainable by each pathogen type, visible on Figure 5.6a where the point clouds approach their maxima as CPatten decreases. Of the 40,230 (17.6%) calibration runs that met the incidence criterion, it was common for a single pathogen type to predominate (Figure 5.6c, note densely clustered dots near each corner of the triangle).

Figure 5.5. Calibration output from EITS model, transfer parameters. For further explanation, see text immediately before & after this figure.

Figure 5.6. Calibration output from EITS model, scatterplots & histograms

Incidence of diarrhea in children was unrelated to the proportion of diarrhea in children that was waterborne (Figure 5.7). Although incidence of diarrhea appeared slightly lower if 1/3 to 2/3 of diarrhea was waterborne for all pathogens and for bacteria (Figure 5.7, a and b), the distributions were not significantly different after adjusting for multiple comparisons (Holm method, n=12).

Figure 5.7. Simulated diarrhea incidence in children (calibration step). Incidence of diarrhea, by etiology and route. Panels: a. All pathogens; b. Bacteria; c. Viruses; d. Protozoa. Vertical axis: incidence, diarrheal episodes per child-year. A, > 2/3 of incidence from waterborne route; B, 1/3-2/3 of incidence from waterborne route; C, < 1/3 of incidence from waterborne route.

In the calibration runs that were consistent with the calibration criteria (Table 5.4, page 182), six of the seven calibration parameters were associated with diarrheal incidence among children by multiple linear regression (Table 5.5; for descriptions of the calibration parameters, see Table 5.1, page 169).
In contrast with all other calibration parameters, CPV (describing transfer of pathogens between households) was not significantly associated with diarrheal incidence, or with the proportion of incidence that was waterborne (Table 5.5). A substantial portion of the variation in incidence was explained by the calibration parameters, with adjusted R2 values from 0.3 to 0.8 depending on the dependent variable that was used.

Table 5.5. Association of calibration parameters with incidence

Linear regression parameter estimates* and p-values

Dependent variable Child incidence, all runs Child incidence, consistent runs, A† Child incidence, consistent runs, B† n Intercept CPSf 67.8 1.67 - 2×10228,480 2×10 16 537 41.3 3.88 2×10- 2×1016 228 Child incidence, consistent runs, C† 963 Proportion of incidence that is waterborne, consistent runs 1728 16 16 CPH& DH CPatten CPV CPDl Bacteria Virus- Protoes zoa R2‡ 1.17 1.52 -5.51 -9.88 -0.010 - 0.0303 2×10- 2×10- 2×10- 2×10- 0.806 2×10 2×10-6 16 16 16 16 16 0.190 0.005 0.291 -9.00 -1.31 -1.32 -0.012 5×10- 2×100.551 0.7 2×10-9 4×10-9 10 16 2.65 32.5 2×103×10-4 16 0.121 0.0135 0.252 0.4 0.8 0.004 28.9 0.373 2×10- 2×10- 0.400 0.890 -3.81 -1.54 - 0.0458 2×10- 2×10- 5×10- -0.687 0.284 1×10 0.2 0.002 12 16 16 13 16 0.209 0.1 10 -6.47 2×1016 -1.06 -0.901 0.411 0.006 0.01 -0.107 -0.107 -0.110 -0.146 0.066 -0.043 -0.001 2×10- 2×102×100.762 0.7 4×10-7 0.0006 0.02 16 16 16

* The log10 of the calibration parameters were the independent variables in the linear models.
† A, > 2/3 of incidence was waterborne; B, 1/3-2/3 of incidence was waterborne; C, < 1/3 of incidence was waterborne.
‡ The adjusted R2 was calculated by the lm() function in R 2.15.1.
For descriptions of the calibration parameters, see Table 5.1, page 169.
The influence of the calibration parameter values on the proportion of childhood diarrhea that was waterborne was particularly apparent for the transfer calibration parameters, which were proportions describing daily movement of pathogens between compartments. Lower values of CPSf (related to dilution of pathogens in surface water) were associated with less waterborne transmission, and similarly, lower values of CPDl (related to dispersion of pathogens on land) were associated with less non-waterborne transmission (Figure 5.8, a and d). Lower values of CPH&Dh (describing transfer of pathogens out of the household environment) also favored more waterborne transmission (Figure 5.8b), though there was a significant tendency (p = 0.009, Wilcoxon rank sum test, Figure 5.8b) for CPH&Dh values to be higher if 1/3 to 2/3 of transmission was waterborne (category B), compared with < 1/3 of incidence being waterborne (category C). There was also some weaker statistical evidence that CPV (describing transfer of pathogens between households) values were lower if 1/3 to 2/3 of transmission was waterborne (category B, Figure 5.8c), compared with > 2/3 of incidence being waterborne (category A; p = 0.03) or < 1/3 of incidence being waterborne (category C; p = 0.08).

Figure 5.8. Distributions of transfer calibration parameters, for runs meeting calibration criteria. Panels: a. CPSf; b. CPH&Dh; c. CPV; d. CPDl. Vertical axis: transfer calibration parameter value. A, > 2/3 of incidence was waterborne; B, 1/3-2/3 of incidence was waterborne; C, < 1/3 of incidence was waterborne. Grey horizontal lines represent the ranges over which the calibration parameters were sampled.
The values of the attenuation calibration factors varied little in relation to the proportion of incidence that was waterborne, although there was a slight trend for their values to increase as the proportion of waterborne transmission decreased (Figure 5.9). In general, the values of the calibration parameters that were consistent with the calibration criteria spanned the ranges over which they were sampled during the calibration step (Table 5.2, page 169), although the attenuation parameter values did not quite meet the lower bound. The value of CPatten for bacteria was approximately 20, compared with approximately 200 for viruses and protozoa. The model therefore suggested that protozoa and viruses are attenuated roughly 10 times faster than bacteria.

Figure 5.9. Distributions of attenuation calibration parameters, for runs meeting calibration criteria. Panels: a. CPatten, bacteria; b. CPatten, viruses; c. CPatten, protozoa. Vertical axis: calibration parameter value. A, > 2/3 of incidence was waterborne; B, 1/3-2/3 of incidence was waterborne; C, < 1/3 of incidence was waterborne. Grey horizontal lines represent the ranges over which the calibration parameters were sampled.

5.4.2. Estimation

A random sample of 100 parameter sets was taken from each of the three groups of calibration runs consistent with the calibration criteria. Using these parameter sets, a HWT intervention with LRVs of 1, 2, 3, 4, or 5 against all pathogen types was applied with community-wide compliance of 50%, 80%, 95%, 99%, or 100%, and compliance types of α, β, or γ (see page 124 for further discussion of compliance types). The HWT intervention prevented little or no diarrhea where < 1/3 of childhood diarrhea was waterborne (Figure 5.10c), even if compliance was perfect.
If 1/3-2/3 of childhood diarrhea was waterborne (Figure 5.10b), the median incidence decreased from 8 episodes per child-year to 3 episodes per child-year if compliance was perfect and LRVs were 3 or higher; however, if compliance was 50%, median incidence was approximately 7 episodes per child-year for compliance type γ and 6 episodes per child-year for compliance type α, regardless of LRV. If > 2/3 of childhood diarrhea was waterborne (Figure 5.10a), median incidence was further decreased, particularly for higher compliance levels; there were approximately 1.3 episodes per child-year if compliance was perfect. As found in chapter 4, LRVs higher than 3 generally did not prevent additional diarrhea. There was substantial variation around each of the median estimates displayed in Figure 5.10; interquartile ranges generally spanned about 4 episodes per child-year (data not shown). At the extremes, incidence levels ranging from 1 to 20 episodes per child-year were obtained.

Figure 5.10. Estimation step, incidence by LRV of HWT. Panels: a. Median diarrhea incidence: >2/3 waterborne transmission; b. Median diarrhea incidence: 1/3 to 2/3 waterborne transmission; c. Median diarrhea incidence: <1/3 waterborne transmission. Horizontal axis: log10 reduction values (LRVs), 0 to 5. Vertical axis: incidence, episodes/child-year. Symbols for all 3 charts distinguish compliance types α, β, and γ, and compliance levels of 0%, 50%, 80%, 95%, 99%, and 100%.

Although the EITS model is not directly comparable with the QMRA model described in chapter 4, it is possible to make a rough comparison (Figure 5.11). Both models indicate decreasing incidence ratios as compliance increases. However, increasing compliance in the QMRA model leads to greater risk reductions than in the EITS model.
Furthermore, in the EITS model, there is little or no improvement in IR from LRVs above 3 even if compliance is perfect. This differs from the QMRA model, where risk continues to decrease as LRVs increase if compliance is perfect.

Figure 5.11. Comparison of EITS and QMRA results. Panels: a. Incidence ratios with HWT by LRV, compliance type α: EITS model (50%, 80%, 95%, 99%, and 100% compliance); b. Incidence ratios with HWT by LRV, compliance type α: QMRA model (chapter 4) (80%, 95%, 99%, and 100% compliance). Horizontal axis: log10 reduction values (LRVs). Vertical axis: incidence ratio (IR) of diarrhea, log scale from 1 to 1E-4.

5.5. Discussion

5.5.1. Calibration step

All calibration parameters were varied over wide ranges; the transfer calibration factors (describing the proportion of pathogens transferred daily between compartments) varied over 4 or more orders of magnitude (Figure 5.8), and the attenuation calibration factors (describing inactivation or removal of the three pathogen types) varied over about 2 orders of magnitude (Figure 5.9). Although 1728 sets of calibration parameters were found that were consistent with childhood diarrheal incidence values and etiologic fractions that appear common in developing countries (Table 5.4, page 182), it is not certain that these calibration parameter sets reflect realistic conditions in developing country communities, because of the uncertainty in parameter values that necessitated calibration in the first place.

All of the calibration parameters strongly influenced the outcome of the model, except for the calibration parameter describing pathogen exchanges during inter-household visits (CPV). The routes influenced by CPV and CPDl (which describes ingestion of pathogens from land) are parallel transmission routes; they both allow pathogens to move from one household to another.
However, the inter-household route influenced by CPDl is more direct (feces from household A → land → new host in household B) than the inter-household route influenced by CPV (feces from household A → household environment A → household environment B → new host in household B). Furthermore, the CPDl route links all households with all other households every day, while the CPV route links only a subset of households daily. Although removing inter-household visits might simplify the model without reducing its utility, household structure remains a useful component of the model (shown by the effect of CPH&Dh on diarrheal incidence and the proportion of incidence that is waterborne, Figure 5.8b, page 190).

The values of the daily attenuation rates (CPatten) of the three pathogen types were very high, with median estimates of ~20/day for bacteria, and ~200/day for viruses and protozoa. These correspond to half-lives of 50 minutes and 5 minutes, respectively, and are much faster than decay rates of ~0.6/day (half-life of 1700 minutes, or 1.2 days) for E. coli (Flint, 1987), norovirus (Pancorbo et al., 1987), and Giardia (Wickramanayake et al., 1985; deRegnier et al., 1989) measured in unfiltered natural waters at ~20ºC (see page 227 for further discussion). However, attenuation includes many processes in addition to decay that prevent pathogens from contacting hosts, such as: percolation or burial in soil; sedimentation or settling in water; ingestion by animals; and removal from the community by flowing water.

5.5.2. Estimation step

The EITS model corroborates the QMRA model in chapter 4 in several respects. Both models agree that increasing HWT LRVs beyond 3 is unlikely to prevent much additional diarrhea, although it might provide some benefit for intermediate levels of compliance when waterborne transmission predominates.
Furthermore, in both models, compliance type α (where people either comply perfectly with HWT or don't comply at all) tends to prevent more diarrhea than compliance type γ (where everyone complies partially with HWT); compliance type β is intermediate between α and γ. The EITS model indicates that HWT can prevent very little diarrhea if waterborne transmission is low in relation to other transmission routes, even if LRVs are high and compliance is perfect (Figure 5.10c, page 193).

5.5.3. Limitations of the EITS model

Wherever possible, published measurements from peer-reviewed publications were used for parameter values (see chapter 7 for detailed discussion). However, reliable information from developing countries was not available for all parameter values. Furthermore, it is possible that appropriate values for some parameters were not found during the literature review, despite being documented in the literature. Ideally, a rigorous meta-analysis would be conducted for each parameter in this model, but this was not practical given the resources available for the research and the large number of parameters. Nonetheless, multiple documented measurements were sought for each parameter; discussions of the decision process for various parameter values are in chapter 7.

Most parameters used in the model had fixed values. Because few measurements have been reported for many parameters, it is possible that some of them are improperly specified, possibly affecting the model's results. However, the seven calibration parameters introduced variation along all possible routes that pathogens could take within the model. This potentially allowed the model to self-adjust for missing or misspecified parameter values in the calibration step, but would also confound sensitivity analysis of the fixed model parameters.
The model could be refined in the future by incorporating updated information regarding its parameter values, which would mean that the aspects of each route that are covered by each calibration parameter would decrease, yielding a more completely specified model requiring calibration over fewer variables or narrower ranges of values.

Although the model suggests that visits between households are a relatively unimportant route of transmission, this conclusion might depend upon the way visits are simulated. The model describes visits by an exchange of pathogens between two household environment compartments (Equation 5.3, page 176), which might represent a brief visit or an exchange of food or items. However, children are frequently cared for by friends or relatives in different households, and the entry of an infected child into an otherwise uninfected household could be a particularly important exposure route. The entry of an infected person into an uninfected household for a long period, particularly a small child who defecates in the new household, could represent a much more intense exposure. In principle, the model could be modified to better represent such a situation, which might increase the importance of visits to infection transmission.

This EITS model does not include zoonotic transmission of diarrheal pathogens. Domestic animals live in close proximity with humans (particularly in rural areas), and can be major reservoirs of pathogenic E. coli, Campylobacter (C. R. Young et al., 1999), and Cryptosporidium parvum (Xiao, 2010); however, giardiasis is probably not a zoonosis under most circumstances (Cacciò et al., 2005). The lack of animal sources of transmission in this model may inappropriately penalize transmission of bacterial and protozoan infections relative to viral infections.
This may partially explain the relatively low attenuation rates for bacteria (CPatten ≈ 20), compared to viruses and protozoa (CPatten ≈ 200), because selecting lower values of CPatten for bacteria than for viruses or protozoa is the only way for the calibration step to favor bacterial transmission. If the model omits a route that bacteria use preferentially, the calibration process must select lower values of CPatten for bacteria than for viruses or protozoa in order to meet the calibration criteria (Table 5.4, page 182), essentially compensating for the absence of the bacterial route.

The model also does not explicitly include exposure to pathogens through contaminated food, although doses of pathogens ingested from the household environment compartments and the land compartment could be considered to include such exposures. Many bacterial pathogens can grow in food, and this is not explicitly incorporated in the model either, although the lower attenuation rates (CPatten) of bacteria compared to viruses or protozoa (Figure 5.5e) could also be considered to reflect this.

Transmission of rotavirus in this model might also be inappropriately high because the model considers adults and children to have the same concentration of virus in their feces, regardless of whether they are infected or diseased. However, it has been reported that rotavirus particles are roughly 10 times as concentrated in child feces as in adult feces (Vollet et al., 1979). Furthermore, rotavirus infections in adults tend to be shorter and milder than in children (Hrdy, 1987); thus adults might also have a shorter period of communicability than children.

5.5.4. Insights gained during the model construction process

The initial formulation of this model used a Gillespie algorithm framework (Keeling & Rohani, 2008) to model many discrete events occurring at particular rates over time using short timesteps of variable length separating each event; this resulted in a series of randomly selected events, separated by time periods of random (but very short) length. This framework was chosen in an attempt to build upon the previously published model described on page 81, which used similar methodology (J. N. S. Eisenberg et al., 2007). Although the Gillespie algorithm version of the model had similar compartments and flows (resembling Figures 5.1 and 5.2, page 162) to the EITS model described here, there were many thousands of events occurring daily, even within a very small community (~20 households). In order for the model to run in a reasonable amount of time, it was necessary to simplify it into a framework with discrete daily time steps.

An original ambitious goal of this work was to construct a highly mechanistic model that was thoroughly grounded in published theory and measurements. However, in the course of reviewing the literature, it became clear that many important parameter values were highly uncertain or completely unknown. Furthermore, there is likely to be large variation in many aspects of real diarrheal infection transmission networks; for example, hydrogeologic characteristics affecting contamination of groundwater, or differing cultural practices in areas of hygiene, food preparation, food cultivation, etc. (for further discussion of uncertainty regarding particular parameter values, see chapter 7). In order to incorporate this variation, the calibration parameters were added to key routes and varied during the calibration process to determine sets of calibration parameters leading to realistic childhood diarrheal incidence levels (page 181).
It is not clear how to interpret these calibration parameter values, since they encompass many different processes. For example, the calibration parameter CPDl is a daily proportion of pathogens from the land that are ingested by each person. It is a crude way of summarizing many different processes that are poorly understood, for example: dispersal of pathogens over an area; the likelihood of contacting feces that are clustered in space in unknown ways; ingestion of pathogens along with soil; transfer of pathogens from soil to skin to mouth; etc.

All models represent a compromise between realism and practicality, since highly detailed models are difficult to construct and analyze. In the case of this EITS model, some simplifications were necessary due to lack of information. The model was originally intended to be a closed system, in which people could only be infected by ingesting pathogens excreted by other people in the community. However, this led to frequent stochastic pathogen extinctions during the calibration step, particularly of bacteria. The probability of extinction was reduced by increasing the size of the community. However, extinctions remained frequent even if the community had 4000 households, and the model ran too slowly when simulating such a large community. Since extinction of pathogen types in an underdeveloped community is unlikely, random exposures of people to all pathogen types were added to the model, in such a way that a mean baseline incidence rate of 0.5 diarrheal episodes per child-year was obtained (step 14, page 180). The baseline incidence represents importation of infection from outside the community, or infections arising from pathogen sources that are not explicitly modeled (e.g., animal feces or imported food).

5.5.5. Future applications of the model

Readily achievable applications

At this writing, it has not yet been possible to thoroughly explore many aspects of this complicated EITS model.
This section briefly describes additional aspects of the transmission and prevention of diarrheal infections that could be explored without modifying the model code.

The model is not limited to HWT interventions; it can also simulate the action of sanitation and hygiene interventions, and assess scenarios in which different interventions are applied together. Furthermore, it can simulate the effect of safe storage of drinking water, by making the assumption that each household owning a safe storage container completely blocks transmission of pathogens from the household environment to stored drinking water. Although many HWT interventions include a safe storage container, this is not always the case.

The model could be used to assess different interventions simultaneously, or in series. Field trials report widely varying efficacies for particular interventions; one explanation might be that a community's response to an intervention might depend upon other interventions that are already in place. For example, if a community with predominantly non-waterborne diarrhea transmission (e.g., Figure 5.10c) acquires latrines, diarrheal incidence would decrease, but the importance of the waterborne route would be larger for the remaining disease transmission; this would be expected to increase the effectiveness of HWT interventions (Figure 5.10a). Furthermore, two interventions may interact positively or negatively when they are applied simultaneously (see page 46 for further discussion). Such scenarios can be simulated using this model.

The time required for the full benefits of an intervention to be realized may be of interest. Because the model is dynamic, it can be run for a period of time without an intervention; once an intervention is applied, diarrheal incidence could be tracked over time to determine how rapidly a lower equilibrium is reached. Since giardiasis in particular has an incubation period and disease duration of roughly two weeks (Rendtorff, 1954; A.
M. Jokipii & L. Jokipii, 1977), it might take months for the full preventive effect of an intervention to be attained. A long time period between distribution of an intervention and the appearance of its benefits might lead to reduced compliance by the community because it initially appears ineffective, even if it would have been effective in the long term.

Although the burden of diarrheal infections and disease in people aged five years or more is poorly understood, particularly in developing countries, this model explicitly includes them. Although it is impossible to say whether the model properly accounts for infections in such people, examination of infection and disease status in the simulated population could generate hypotheses that future studies might test.

A sensitivity analysis would provide further information about the impact of the 46 fixed parameter values on the results from the model. Such an analysis would take the form of repeated estimation steps, modifying each fixed parameter in turn to determine its effect on the incidence of diarrhea in children, as well as the relative contribution of the waterborne route of transmission. Although an extremely thorough sensitivity analysis would need to consider alterations to the calibration parameter sets resulting from changing the fixed parameter values, it is not practical to recalibrate the model repeatedly in order to explore this effect.

Applications requiring substantial modifications to the model

Additional aspects of diarrheal infection transmission and prevention could be explored through further modifications to the program code of the model, and subsequent analysis of its behavior.
Households that are not using a particular intervention might nonetheless benefit because other households in the community are using it; participating households thus prevent diarrhea for themselves directly, and prevent diarrhea in the rest of the community indirectly, since fewer people are infected and excreting pathogens. Incidence in noncompliers could be measured before and after a simulated intervention to determine the magnitude of indirect protective effects of interventions, in a manner similar to Halloran et al. (2002; see page 83).

Bouts of diarrheal disease, or asymptomatic infections with gastrointestinal pathogens, are believed to degrade resistance to further bouts of diarrhea, probably through malnutrition and consequent impairment of immunity (see page 14 for further discussion). Although it is unclear precisely how this occurs, information is available regarding correlation of diarrheal episodes within individuals (Schmidt et al., 2009). Calibration parameters could be added that modify the dose response relationship(s), signifying reduced resistance to infection or disease following a previous infection; values of these parameters could then be selected that yield distributions of cases within individuals similar to those observed in the field. Since repeated diarrhea episodes are an important factor in mortality, this could facilitate estimation of the mortality benefit attributable to particular interventions.

The model might further be used to examine the emergence of persistent diarrhea. Since coinfections with diarrheal pathogens are common, a single diarrheal episode might consist of multiple overlapping infections. Processes by which diarrheal episodes are reported and measured in the field could be incorporated into the model, in a similar fashion to the incorporation of imperfect recall in the QMRA model in chapter 3.
By calibrating the model such that the distributions of the durations of modeled diarrheal episodes match observed distributions of diarrheal episode durations, insight could be gained regarding the development of persistent diarrhea in developing country communities.

5.5.6. Conclusions

The results from the EITS model corroborate the conclusions from the QMRA model regarding compliance (chapter 4): increasing LRVs beyond (approximately) 3 is unlikely to prevent additional diarrhea, and more diarrhea is prevented if more households comply perfectly (i.e., compliance type α is superior to compliance types β or γ). By incorporating multiple transmission routes, the EITS model also showed that improving compliance with HWT can greatly improve diarrhea prevention if waterborne transmission accounts for at least 1/3 of diarrheal incidence. This finding holds even for relatively low HWT LRVs of 1 or 2. More attention should be paid to improving compliance with HWT, rather than improving the antimicrobial efficacy of HWT.

The relative importance of the various transmission routes for diarrheal infections is unknown; in the EITS model, it depends upon the values of the calibration parameters. The calibration parameters themselves represent a variety of processes which are poorly understood, such as: transfer of pathogens from soil or objects to hands or mouth; decay or attenuation of pathogens in many different media, temperatures, or humidities; and many others. However, transfers of pathogens via households visiting each other (affected by the calibration parameter CPV) did not greatly affect diarrheal incidence or the proportion of incidence attributable to the waterborne route, in contrast to the other 6 calibration parameters.

The EITS model is a simplification of an extremely complicated infection transmission system incorporating multiple pathogens using multiple transmission routes.
In addition, many aspects of diarrheal infection transmission through the environment are poorly understood. The EITS model attempts to account for this uncertainty through the calibration process; calibration produces sets of calibration parameter values that could be considered to represent distinct communities with differing environmental characteristics. However, it is unknown which sets of parameter values best represent actual communities. Important aspects of transmission may also be improperly specified or missing (e.g., conceptualizing visits to represent shared child care might increase the importance of visits; incorporating domestic animals or foodborne transmission might favor bacterial infections). Nonetheless, the results regarding compliance are robust across widely varying sets of calibration parameter values, increasing confidence that they should also apply in real communities.

Future descriptive studies and field trial results will allow refinement of EITS models for diarrhea by clarifying appropriate values for parameters and reducing the need for calibration. Future models will provide guidance for public health and development professionals regarding effective ways to prevent diarrhea in developing countries.

6. CONCLUSIONS

6.1. Summary of research

Despite the large body of literature regarding the transmission of diarrheal infections and the various interventions used to prevent those infections, much remains to be learned. Constructing models simulating the transmission and control of these infections can yield conclusions useful for designing diarrhea prevention programs, and also highlights aspects of diarrheal infections that require further research.
This dissertation describes three such models: a quantitative microbial risk assessment (QMRA) model simulating waterborne transmission during a published household water treatment (HWT) field trial (chapter 3); a generalization of that QMRA model to examine the effect on diarrheal incidence of varying HWT compliance and antimicrobial effectiveness (measured by log10 reduction values [LRVs]) under different scenarios (chapter 4); and a more complex environmental infection transmission system (EITS) model that further examines compliance and LRVs, as well as multiple routes of transmission (chapter 5).

By simulating a published field trial (Boisson et al., 2010) of a HWT device (chapter 3), the model estimated that similar trials with perfect compliance would have found a longitudinal prevalence ratio (LPR) for childhood diarrhea of about 0.1, in contrast to an LPR of 0.9 with low compliance (an LPR of 0.8 was estimated by the actual trial, but was not statistically significant). Although a goal of the model was to adjust for bias caused by an imperfect placebo HWT device, uncertainty about the level of compliance greatly influenced the effect of the imperfect placebo. The model also predicted concentrations of diarrheagenic bacteria, viruses, and protozoa in the source water of the field study communities that were consistent with the limited published measurements of pathogen concentrations in other developing country source waters.

Since the importance of compliance was highlighted by the first QMRA model, it was modified to determine how childhood diarrhea incidence changed if HWT compliance and LRVs were varied under differing scenarios (chapter 4).
These scenarios were defined by various combinations of: 1) baseline incidence of childhood diarrhea; 2) the pattern of compliance within the community (put simply, the proportion of people who complied perfectly was varied while holding compliance constant at the community level); 3) the size of randomly scheduled spikes of pathogens in untreated drinking water; and 4) etiologic fractions of childhood diarrhea cases attributable to bacteria, viruses, and protozoa. In general, LRVs of 5 prevented little or no additional diarrhea compared to LRVs of 3. However, LRVs of 5 sometimes prevented additional diarrhea if incidence was high, there were large contamination spikes, or many people complied perfectly with HWT.

Although the two QMRA models were informative, they included only the waterborne route of transmission. Although that route is important, diarrheal pathogens can also be transmitted by other routes, e.g., contaminated hands, soil, or objects. The relative importance of these routes is unknown. Furthermore, it was unclear whether findings from the QMRA models would hold in a more complicated and realistic model. An EITS model was programmed (chapter 5), using many functions and parameters from the QMRA models, but incorporating additional functionality such as pathogen shedding by infected people, household structure, multiple routes of transmission, and attenuation (e.g., decay or sequestration) of pathogens in the environment.

The EITS model suggested that viruses and protozoa are attenuated roughly 10 times faster than bacteria in the environment. This may be partially explained by the absence of certain aspects of pathogen transmission that favor bacteria, such as the ability of bacteria to multiply in food or within nonhuman animals. The calibration process may have selected lower attenuation rates for bacteria in order to meet the calibration criteria, essentially compensating for these missing pathways.
It also suggested that visits between households are a relatively minor route for transmission of pathogens, though this might differ if the intensity of exposure were greater during visits (e.g., an infected child being cared for in a different household while the parents were working). Further analysis of the EITS model will yield additional results, in particular the assessment of interaction between two differing interventions applied to the same community.

6.2. Implications for future diarrheal research and prevention efforts

6.2.1. Conduct and description of field trials

Field trials of public health interventions in developing countries are expensive and difficult to perform, and subject to biases that are difficult to avoid. Models simulating existing field trials can be used to draw inferences regarding what the trial might have measured under different conditions. If biases present during a field trial are well understood, the biases themselves can be simulated within the model, allowing inference of what the trial might have measured in the absence of bias. Although it is impossible to be sure that a model perfectly represents what happened, models can nonetheless be useful for generalizing additional knowledge from field trials.

It is often unclear how to interpret results from a single field trial of a public health intervention, and modeling field trials requires detailed information about the conduct of the trial and the study site. Human communities are extremely diverse, and the health impact from an intervention in one community might differ from the health impact of the identical intervention in a different community. However, detailed characteristics of study communities are seldom published, though there are exceptions (Mata, 1978). Table 6.1 lists information that should be published (likely in supplemental material) about any community that participates in a field trial.
Although it may not always be possible to gather detailed information about each of these aspects, qualitative or semi-quantitative assessments of many of them could be quickly obtained from focus group discussions with several community members. Establishment of multi-year research relationships between study communities and diverse teams of researchers (e.g., epidemiologists, anthropologists, microbiologists, and others) would facilitate collection, analysis, and publication of this information.

Table 6.1. Important community characteristics for measurement in field trials

Sanitation & hygiene | Water source & treatment | Microbiology‡ | Social & community | Nutritional | Epidemiology | Climate & geography
--- | --- | --- | --- | --- | --- | ---
Open defecation | Water source(s) | Untreated water† | Community cohesion | Stunting | Diarrheal incidence† | Temperature†
Type and usage of latrines* | Distance to water source | Treated water† | Child care & schooling | Wasting† | Diarrheal longitudinal prevalence† | Humidity†
Handwashing practices* | Drinking of boiled water* | Human feces† | Household size | Pattern of breastfeeding | Diarrheal mortality† | Precipitation frequency†
Domestic animals | Other HWT methods* | Animal feces† | Cultural practices | Staple foods | All-cause mortality† | Precipitation amount†
Anal cleansing | Water consumption† | Food† | Government or NGO programs* | Weaning foods | Diarrhea in adults† | Solar irradiation†
Attitudes toward child feces | Water usage† | Hands† | Population density | Dietary fiber | HIV/AIDS | Roads & rivers
 | | | Socioeconomic status | Protein source(s) | Locally important diseases | Urban/rural

* Should include repeated measurements of compliance over at least 1 year.
† Should be measured repeatedly over at least 1 year.
‡ Ideally, pathogens should be directly quantified, though this is difficult.
HWT: household water treatment. NGO: non-governmental organization. HIV: human immunodeficiency virus.

6.2.2. Compliance with interventions that prevent diarrheal infections

HWT methods are typically assessed based on their antimicrobial efficacy, assuming perfect compliance. However, perfect compliance is unattainable. HWT methods would be more holistically assessed, and public health would be better protected, if HWT guidelines were based on compliance as well as LRVs. Results from these models also indicate that the currently recommended LRVs for HWT (particularly 6 for bacteria, 4 for viruses, and 3 for protozoa (USEPA, 1987), which are sought after in order to advertise that a device 'meets US guidelines') are higher than necessary, since LRVs of 5 prevented little or no additional diarrhea in most scenarios studied, compared with LRVs of 3. Nonetheless, LRVs above 3 might provide additional benefit in certain situations, such as when transmission is primarily waterborne, compliance is high, and large spikes of drinking water contamination are known to occur.

The pattern of compliance within communities is also important. The QMRA and EITS models agree that, for a given value of compliance at the community level, more diarrhea is prevented if more people within the community comply perfectly. For example, it is better for 80% of the population to comply perfectly and 20% to not comply at all, than for the entire population to treat 80% of their daily water intake.

Despite its importance, compliance is difficult to measure. At a minimum, the female head-of-household can be asked if they comply; however, this will probably overestimate compliance because some noncompliers will claim they comply out of politeness or a desire to give the 'right' answer. Less biased methods include structured observation (which can still give biased results due to Hawthorne effects) or directly measuring drinking water to determine if it has been treated (this is relatively easy for HWT chlorination methods). Electronic devices have also been attached to HWT devices to record usage data.
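The interplay between compliance and LRVs discussed in this section can be illustrated with simple arithmetic. The sketch below is not the QMRA or EITS model; it assumes, purely for illustration, that a fixed fraction of a person's drinking water is treated at a stated LRV, that the rest is untreated, and that both come from source water with the same pathogen concentration. The function name `effective_lrv` is hypothetical.

```python
import math


def effective_lrv(compliance: float, lrv: float) -> float:
    """Overall log10 reduction in ingested pathogens for one person.

    Illustrative assumption: a fraction `compliance` of the water drunk is
    treated with the stated LRV, and the remainder is untreated water from
    the same source.
    """
    if not 0.0 <= compliance <= 1.0:
        raise ValueError("compliance must be between 0 and 1")
    # Fraction of the untreated-dose pathogens that is still ingested.
    remaining = (1.0 - compliance) + compliance * 10.0 ** (-lrv)
    return -math.log10(remaining)


# Compare device LRVs of 1, 3, and 5 at several levels of compliance.
for c in (0.8, 0.9, 0.99, 1.0):
    row = ", ".join(f"LRV {l}: {effective_lrv(c, l):.2f}" for l in (1, 3, 5))
    print(f"compliance {c:.2f} -> {row}")
```

Under these assumptions, a person who treats 90% of their water cannot achieve an overall reduction greater than 1 log10, whatever the LRV of the device, because the untreated 10% dominates the ingested dose.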
However, it is not sufficient to know whether households are using a HWT method; it is also important to know how much untreated water people drink. Even small amounts of untreated water can greatly increase the risk of diarrhea, but carefully quantifying intake of untreated water would require following people throughout their community to observe when, where, and how much water they drank. However, it might be possible to roughly estimate a community's untreated water intake with focus group discussions about water consumption behavior, perhaps combined with a quantitative estimate of total water intake (e.g., Akpata, 2004). Every field trial should measure compliance as carefully as possible, since compliance greatly affects the measured effectiveness of interventions.

APPENDICES

7. APPENDIX A: DISCUSSION OF PARAMETER VALUES USED IN THE MODELS

The models described in this dissertation use a variety of parameter values, most of which were obtained from peer-reviewed scientific publications. Many of the same parameter values are used in all three models. Only children aged less than five years were considered in the two QMRA models described in chapters 3 and 4; both children and 'adults' (defined as people aged 5 years or older) were considered in the environmental infection transmission system (EITS) model described in chapter 5. Wherever possible, parameter values were chosen to reflect conditions in developing countries with warm climates. However, some parameter values only appear to have been measured in industrialized countries. Parameter values are discussed roughly in order of progression from exposure to infection to development of disease to shedding of pathogens.

7.1. Water ingestion rate

The daily water ingestion rate for young children was obtained from a study (Akpata, 2004) which provided bottled water to mothers of 50 rural Nigerian children aged one to three years, and measured the amount remaining in the bottles at the end of the day.
This study successfully cross-validated itself by obtaining similar estimates when asking mothers to estimate water intake based on common household measures. Relative humidity during the study was approximately 87% and mean maximum ambient temperature was approximately 31 °C. For adults, a study of Kenyan distance runners reported a mean daily water intake of 2.3 L/day (Fudge et al., 2008). This value also agrees with U.S. Army planning parameters of 2.3 L/day for the sum of urine and sweat losses during physical work in hot climates (USEPA, 2011), as well as the water intake of 2.0 L among children aged 7 to 9 years in the Nigerian study described above (Akpata, 2004).

The ingestion rates for drinking water described above are much higher than other standard values measured in industrialized countries, e.g., 317 mL/day for two year olds and 1043 mL/day for people over 20 years of age (USEPA, 2011). However, people in industrialized countries are more likely to drink bottled beverages, reducing their exposure to possibly contaminated drinking water; they are also more likely to reside in cooler (possibly air-conditioned) environments, which reduces their demand for fluids.

7.2. Hand-mouth contacts per day

Young children mouth hands and objects more frequently than older children or adults, which indicates that they should have more exposure to pathogens in the environment. Children aged 0 to 5 years contact their mouths with their hands or an object about 15.8 times per hour, while children aged 6-10 years have about 8.1 such contacts per hour (USEPA, 2011). Assuming that this behavior in 6-10 year olds is similar to adults, and further assuming that young children are awake 12 hours per day while adults are awake 16 hours per day, young children have 330 daily contacts with their mouth while adults have 130. The ratio of these (2.54) was used to weight environmental exposure to pathogens in the EITS model more heavily for children.

7.3. Log10 reduction values (LRVs) attributable to interventions

LRVs are particularly relevant to HWT interventions, but handwashing can also be considered to apply LRVs to pathogens adhering to hands. Such LRVs, as well as USEPA and WHO recommendations for antimicrobial effectiveness of HWT devices, are summarized in Tables 2.1 and 2.2, page 64.

7.3.1. LRVs attributable to sanitation

Although sanitation clearly removes pathogens from the environment, it is not straightforward to assign a simple measure of antimicrobial efficacy to it. The effectiveness of sanitation depends on the manner in which it is constructed and located; pathogens may be carried out of latrines by groundwater or runoff, subsequently contaminating water wells or surface water sources. Removal of feces from latrines for use as fertilizer is another possible way for people to be exposed.

Nonetheless, to simulate sanitation in the EITS model, it was necessary to assign some level of antimicrobial efficacy to it. Since anal cleansing materials are often disposed of in a wastebasket or on the ground rather than in the toilet in developing countries, it could be considered that a well-constructed and well-situated latrine would remove all feces from the community, except for those involved in anal cleansing. The amount of feces involved in anal cleansing was determined as follows: a daily average of 4.4 mg of fecal nitrogen has been measured on toilet paper (Calloway et al., 1971); feces are approximately 1.9% nitrogen (Rivero-Marcotegui et al., 1998); thus approximately 0.0044 g / 0.019 = 0.23 g of feces would remain on anal cleansing materials from a single defecation event, which occurs approximately once per person per day (Weaver, 1988). This figure is roughly corroborated by a report of an average of 0.1 g of feces per set of underwear in university students (Gerba, 2001).
Considering that an adult on a high-fiber diet excretes approximately 225 g of feces daily (Davies et al., 1986), approximately one thousandth (0.23/225) of daily fecal output would not enter the latrine; this corresponds to an LRV of 3.

7.3.2. LRVs attributable to intervention and placebo filters in the Lifestraw RCT

The first QMRA model (chapter 3) used log10 reduction values (LRVs) for the LifeStraw Family Filter (LFF) from a laboratory study of the device (T. Clasen, Naranjo, et al., 2009). In contrast to the LRV of 6.9 for E. coli reported in the laboratory study, the mean LRV from functioning LFFs for thermotolerant coliforms (TTC) reported by the Lifestraw RCT was only 2.98; removal of other organisms was not determined during the RCT (Boisson et al., 2010). However, the mean LRV of 2.98 for TTC was determined by assuming a count of 1 CFU per 100 mL where TTC were not detected in the filtered water (64% of samples); the median LRV was therefore technically > 3.1, and the maximum LRV measurable in any sample taken during the RCT was 4.5. Therefore, the average LRV for TTC in the RCT was greater than the value of 2.98 that was reported; it is impossible to know how much greater.

In addition, given the size exclusion treatment mechanism by the ultrafilter membrane in the LFF (20 nm pore size), it seems unlikely that the treatment performance in the field was much lower than what was observed in the laboratory. Although TTC might have passed through some devices due to non-visible leaks in the gaskets or defects in the membrane material, the presence of TTC in outlet samples was not correlated with specific devices, nor were there trends in time over the one-year field trial. In addition, intervention and placebo devices were monitored and repaired as necessary by the study team (Boisson et al., 2010). Thus, a more likely explanation for the lower LRVs measured in some intervention LFFs is that contamination of the outlet tube with TTC occurred.
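The detection-limit issue in this subsection can be illustrated with a short sketch. It computes per-sample LRVs as log10(influent / effluent) and, as in the RCT analysis described here, substitutes 1 CFU per 100 mL when nothing is detected in the filtered water. The function name and all counts below are made up for illustration; they are not data from Boisson et al. (2010).

```python
import math
from statistics import mean


def sample_lrvs(influent, effluent, substitute=1.0):
    """Return (LRV, censored) pairs for paired water samples.

    Counts are in CFU per 100 mL. An effluent value of None means that no
    organisms were detected; `substitute` is then used in its place, so the
    resulting LRV is only a lower bound for that sample.
    """
    results = []
    for c_in, c_out in zip(influent, effluent):
        censored = c_out is None
        c_used = substitute if censored else c_out
        results.append((math.log10(c_in / c_used), censored))
    return results


# Hypothetical paired samples (NOT data from the field trial).
influent = [1000.0, 5000.0, 200.0, 30000.0]
effluent = [None, 10.0, None, None]

lrvs = sample_lrvs(influent, effluent)
reported_mean = mean(value for value, _ in lrvs)
n_censored = sum(1 for _, censored in lrvs if censored)
print(f"mean LRV with substitution: {reported_mean:.2f} "
      f"({n_censored} of {len(lrvs)} samples are lower bounds)")
```

Because each censored sample contributes only a lower bound, a mean computed this way understates the true mean LRV whenever any non-detect had a true concentration below the substituted value. The per-sample ceiling is also set by the influent concentration, which is why a maximum measurable LRV exists at all.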
Based on this reasoning as well as the detection limit issues described above, it was determined that the most reasonable LRVs for the intervention LFF against the three pathogen types were the values determined in the laboratory study (T. Clasen, Naranjo, et al., 2009).

For the placebo device, the Lifestraw RCT reported a mean LRV of 1.05 (i.e., 91% removed) for thermotolerant coliforms (TTC) (Boisson et al., 2010). Further analysis of the data revealed that the LRV varied greatly between -1 (indicating a tenfold higher concentration in filtered water) and 3 (99.9% removed). There was no evidence that removal by the placebo increased significantly over time, or was restricted to particular defective devices. It is likely that some of the observed variability in removal was due to the inherent variability in water quality and measurement associated with indicator organisms in water sources (K. Levy, A. E. Hubbard, K. L. Nelson, et al., 2009). Based on the observed performance and the design, mechanisms of removal may have included: (1) straining by the prefilter of the LFF, especially if TTC were attached to larger particles; and (2) adhesion to the biofilm that likely developed on surfaces, in particular the rubber tubing that replaced the filter membrane cartridge. These mechanisms might not have occurred during laboratory testing because of the short testing period (3 weeks) and simpler challenge water quality (no removal of the three test organisms by the placebo device was measured in the laboratory). Since the mechanism of this reduction is unclear, it is difficult to predict removal of viruses and protozoan cysts by the placebo device in the field, but it seems likely that if some bacteria were removed, some viruses and protozoa were also removed. Thus, the simplest assumption was chosen: application of the same LRV of 1.05 to all three marker pathogens.

7.4. Dose response functions

Dose response functions are critical for translating estimated dose into a probability of infection. Such functions can be obtained from studies in which volunteers are fed widely varying doses of pathogens, and monitored for development of the response (e.g., infection or disease). These studies have been conducted on healthy adults because it would be unethical to conduct such studies on children. It is likely that dose response relationships in developing-country children differ in unknown ways; ill or malnourished children could have decreased resistance to infection, or they could have increased immunity due to more frequent exposure. The use of infection as the response, combined with morbidity ratios from developing countries to determine disease, constitutes a crude means of adjustment for this uncertainty. The dose response functions used in this model are graphed in Figure 3.3 (page 99, semilog plot) and Figure 4.6 (page 138, log-log plot), and the equations are given on page 98. The same dose response functions and parameters were used in all three models.

7.4.1. Dose response for E. coli infection

Although there are many dose response models describing E. coli disease (Haas et al., 1999; P F M Teunis et al., 1996), few data are available regarding dose response of E. coli infection. In published E. coli feeding studies, data regarding E. coli dose and infection are concentrated in the top half of the dose response curve (i.e., > 50% of the participants become infected), meaning that the response at low dose levels is very uncertain (Anon, 2012; P F M Teunis et al., 1996). Only one feeding study (H L DuPont et al., 1971) appears to exist that uses infection as the response and also includes data in the lower half of the dose response curve; it had 3 dose levels and 19 participants, who were fed enteroinvasive E. coli (EIEC).
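For readers unfamiliar with beta-Poisson dose response functions, the sketch below shows the widely used approximate form, P(d) = 1 - (1 + d/β)^(-α), together with the binomial log-likelihood that a maximum likelihood fit would maximize over α and β. Whether this is exactly the form given on page 98 is an assumption, and the function names are hypothetical; no parameter values or feeding-study data from the dissertation are reproduced here.

```python
import math


def beta_poisson(dose: float, alpha: float, beta: float) -> float:
    """Approximate beta-Poisson probability of response at a given dose."""
    if dose < 0 or alpha <= 0 or beta <= 0:
        raise ValueError("dose must be >= 0; alpha and beta must be > 0")
    return 1.0 - (1.0 + dose / beta) ** (-alpha)


def binomial_log_likelihood(doses, n_exposed, n_responded, alpha, beta):
    """Log-likelihood of feeding-study data under the beta-Poisson model.

    The binomial coefficients are omitted because they do not depend on
    alpha or beta, so they do not affect where the maximum lies.
    """
    total = 0.0
    for d, n, k in zip(doses, n_exposed, n_responded):
        p = beta_poisson(d, alpha, beta)
        # Keep p strictly inside (0, 1) so that the logarithms are defined.
        p = min(max(p, 1e-12), 1.0 - 1e-12)
        total += k * math.log(p) + (n - k) * math.log(1.0 - p)
    return total


# Illustrative parameter values only (not fitted values from any study).
for dose in (1e1, 1e3, 1e5):
    print(f"dose {dose:.0e}: P(response) = {beta_poisson(dose, 0.2, 1000.0):.3f}")
```

A fit would search for the α and β that maximize `binomial_log_likelihood` for the observed dose groups, for example with a numerical optimizer.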
The data from that study were used to fit a beta-Poisson dose response function in R version 2.12, using maximum likelihood methods (Anon, 2012). Although dose response models of infection are available for enterohemorrhagic E. coli and Shigella species, they are not appropriate for these models because they are much more infectious than diarrheagenic E. coli (J P Nataro & J B Kaper, 1998; Anon, 2012).

7.4.2. Dose response for rotavirus and Giardia

We used previously published analyses to parameterize the dose response functions (J B Rose et al., 1991; Haas et al., 1993). The feeding data for these fits have also been published (Rendtorff, 1954; Ward et al., 1986).

7.5. Incubation periods

Incubation periods were only used in the EITS model (chapter 5), not the two QMRA models (chapters 3 and 4). The incubation period (time from exposure to development of disease) was assumed to be the same as the prepatent period (time from exposure to detection of the pathogen in the host, i.e., shedding of the pathogen), although this is not always the case; Giardia disease tends to precede excretion of cysts by roughly one week (A. M. Jokipii & L. Jokipii, 1977). The median incubation period across nine ETEC outbreaks in the USA was 1.75 days (Dalton et al., 1999). The mean incubation period for rotavirus disease was approximately 3.2 days in 5 adults who were voluntarily exposed (Kapikian et al., 1983). For Giardia infection, the prepatent period was used; a study where Giardia cysts were fed to imprisoned adult men gave a mean of 13.5 days (Rendtorff, 1954), while a study of Finnish giardiasis patients who had recently visited an area of Russia with endemic giardiasis reported a median prepatent period of 14 days (A. M. Jokipii & L. Jokipii, 1977).

7.6. Morbidity ratios

The morbidity ratio for a pathogen is the number of symptomatic infections in a community divided by the total number of infections by that pathogen.
Morbidity ratios can be obtained by systematically examining stools from a population of children, regardless of their diarrhea status. Asymptomatic infection with diarrheal pathogens is common, particularly in developing countries. Morbidity ratios for young children were used in all three models; morbidity ratios for adults were only used in the EITS model.

7.6.1. Morbidity ratios in young children

The morbidity ratio for E. coli (0.21) was taken from a study (Vergara et al., 1996) of children under 5 years of age in a subtropical, rural region of Argentina, in which ETEC and EPEC were measured in feces quarterly over two years. The morbidity ratio for rotavirus (0.36) was provided by a cohort study (Fischer et al., 2002) from birth to age two years in periurban children in Guinea-Bissau, where stool specimens were collected weekly and 116 rotavirus infections were identified in 94 children. The morbidity ratio for Giardia (0.59) was obtained from 210 children aged 1 month to 9 years in Peruvian periurban districts with poor sanitation; each child contributed one stool sample over the course of a year (Peréz Cordón et al., 2008). Another study (M S Prado et al., 2005) suggests a much lower morbidity ratio (~0.03) for Giardia in Brazilian children aged 6-45 months, but was not used for these reasons: 1) stool samples were collected repeatedly for each child over four months, so multiple samples may have been taken during the same infection episode; 2) information about individual episodes of diarrhea was not provided; 3) its low morbidity ratio is inconsistent with higher morbidities for Cryptosporidium (Kirkpatrick et al., 2008; Bushen et al., 2007; Peréz Cordón et al., 2008), which Giardia also represents by proxy in these models.

7.6.2. Morbidity ratios in adults

Suitable morbidity ratios for diarrheagenic E. coli infection in adults appear to be unavailable. Although two studies of diarrheagenic E.
coli infection in adult Kenyan food handlers have been published (Oundo et al., 2008; Onyango et al., 2009), they considered diarrhea only in terms of loose stools, without regard for stool frequency. Since three or more stools per day is commonly used as a criterion for diarrhea (USAID et al., 2005), these studies probably overestimated the morbidity ratio by counting 1-2 loose stools per day as diarrhea (they reported 0.34 and 0.62, respectively). Therefore, the morbidity ratio for E. coli in children described above (Vergara et al., 1996) was also used for adults.

Morbidity ratios for rotavirus infections in adults are also difficult to find, but they should be substantially less than morbidity ratios in children due to acquired immunity. In developing countries, frequent rotavirus exposure throughout life probably contributes to maintenance of immunity in adults (Bishop, 1996). Morbidity ratios of 5/16 among adults and 6/28 among children were measured in a prospective study of rotavirus infection in middle-class families served by a particular pediatric practice in the USA (Rodriguez et al., 1987); although the morbidity ratio of 0.36 among children in Guinea-Bissau (Fischer et al., 2002) is similar to the above ratio of 5/16 in adults, it is rather higher than the morbidity ratio among children in that pediatric practice. The morbidity ratios from Rodriguez et al. (1987) were not used because they did not appear representative of rotavirus disease in developing countries, where children are more likely to acquire serious rotavirus disease and adults may have stronger immunity due to more frequent exposure. In the absence of better information, a value of 4/18 was used in the EITS model for the rotavirus morbidity ratio in adults, based on a study of 18 adult volunteers ingesting rotavirus from 0.2 g of stool of an ill child, four of whom became ill (Kapikian et al., 1983).
However, only twelve volunteers showed a serologic response, and only five volunteers shed the virus, four of whom had diarrhea (Kapikian et al., 1983). The volunteers were chosen partly based on their low antibody titers against rotavirus. The morbidity ratio for Giardia was obtained from a study of Pakistani men and their children aged two to twelve years, who lived in an area where untreated wastewater was used for agricultural irrigation. Although 67.2% of the study population was infected, only 2.8% of the infected people reported diarrhea during the week before a stool sample was collected (Ensink et al., 2006), similar to the Brazilian study of young children described above (M S Prado et al., 2005). Although the number of study participants who were children aged under five years was not reported, the population is likely to be dominated by persons aged five years or older. 7.7. Durations of illness and infection Determination of the duration of infection in underdeveloped settings is difficult, since immunity is incomplete for most diarrheal pathogens, and reinfection is common. What appears to be a long or intermittent period of infection (or illness) may in fact be multiple reinfections. Also, when childhood diarrheal disease is studied in developing countries, it is usually done in a hospital or clinic setting; measurements of the duration of infection or disease are biased upward in these situations since more severely ill children are more likely to seek care. Also, the duration 222 of infection as measured by detection of a pathogen in stool does not necessarily coincide with the duration of illness. Since information on asymptomatic infection is difficult to obtain, the duration of illness was assumed to be the same as the duration of infectiousness. The same durations of illness and infectiousness were used for adults and children. 7.7.1. Duration of diarrheagenic E. coli infection and illness The duration of E. 
coli diarrhea (mean of 3.0 days) was obtained from a study (Estrada-Garcia et al., 2009) of Mexican children aged less than two years, who were monitored prospectively for the development of infection or disease from diarrheagenic E. coli. A gamma distribution was fit to these data for use in the three models. A feeding study of ETEC and EPEC in healthy United States adults found a similar mean duration of illness of 3.4 days (R E Black et al., 1982). However, a study of 10 ETEC outbreaks among adults in the USA reported median durations of illness from 4 to 6 days (Dalton et al., 1999).

7.7.2. Duration of rotavirus infection and illness

A mean rotavirus diarrhea duration of approximately 2.5 days was reported by a study (Kapikian et al., 1983) of four experimentally infected adult volunteers, with durations of 1, 2, 3, and 4 days. Accordingly, a uniform distribution from 1 to 4 days was used for viral infection duration in all three models.

7.7.3. Duration of Giardia infection and illness

An outbreak investigation of giardiasis from a contaminated water supply system in Massachusetts yielded a mean duration of Giardia disease of 11.3 days (Kent et al., 1988); a gamma distribution was fit to these data, and used in all three models. Although a higher mean duration of 18.3 days was reported in a feeding study of healthy imprisoned men (Rendtorff, 1954), the value from the outbreak was used in the models because it was from a more diverse population. Cryptosporidium diarrhea in developing countries can be similarly long-lasting; mean durations of diarrhea of 21 days for C. hominis and 13 days for C. parvum were reported in a cohort study of children in a Brazilian shantytown during their first 5 years of life (Bushen et al., 2007).

7.8. Duration of immunity

Full treatment of immunity to diarrheal infections is extremely complicated, and it is beyond the scope of this dissertation. For E.
coli, Campylobacter, and Giardia, immunity protects against disease, but does not necessarily prevent reinfection (R H Gilman et al., 1988; Cravioto et al., 1990; Valentiner-Branth et al., 2003; A. H. Havelaar et al., 2009). For rotavirus, immunity is long-lasting, but is strain-specific; many differing strains exist, and adults can still develop rotaviral infection and disease (Wenman et al., 1979; Kapikian et al., 1983; Ward et al., 1986). Further considering that each pathogen type also represents other similar pathogens (e.g., Campylobacter for bacteria, norovirus for viruses, and Cryptosporidium for protozoa), repeated infections therefore occur for all three pathogen types. Therefore, some information regarding immunity is already included in these models in the form of the morbidity ratios discussed above (page 220). Infection with diarrheagenic E. coli, rotavirus, or Giardia tends to persist for several days, as long as a week, after symptoms resolve (R E Black et al., 1982; Rendtorff, 1954; Kapikian et al., 1983). Therefore the QMRA models assumed that immunity from each pathogen lasted 7 days after infection resolved. However, this assumption was reconsidered in the course of constructing the EITS model, in which each pathogen type had a period of immunity lasting one day, principally to separate episodes of disease. 7.9. Incomplete recall of diarrheal disease The probability of remembering and reporting an episode of diarrheal illness is lower if the illness had resolved longer ago. Pakistani mothers were asked to recall diarrhea in their children 224 aged < 5 years (Zafar et al., 2010); the proportion who were reported ill was high for the previous two days, but dropped sharply for the third day. The amount of diarrhea reported for the third through the sixth day was similar. Incomplete recall was only considered in the first QMRA model (chapter 3); the other models measured diarrheal incidence as if it was reported perfectly. 7.10. 
Fecal excretion of pathogens

7.10.1. Concentrations of pathogens in feces

The numbers of pathogens per gram of feces in infected people have seldom been measured. Measurements of EIEC and ETEC in six adult male volunteers with dysentery or diarrhea yielded estimates from 10^8 to 10^9 CFU/g (H L DuPont et al., 1971). The available studies of Giardia cyst concentrations in stool are somewhat problematic. Although the EITS model uses a mean measurement (5.7×10^5 cysts/g) from 15 Colombian children aged three to seven years (Danciger & M. Lopez, 1975), these children were divided into three groups of five children each: high excretors, low excretors, and mixed excretors. It was unclear which type predominated, or how the children were chosen for the study, although cyst concentrations were similar in formed and diarrheic stools (Danciger & M. Lopez, 1975). A study that quantitated Giardia cysts in the feces of seven ill soldiers (Porter, 1916) presented detailed time series of cyst concentrations; however, all of these soldiers were ill for several months, had been ill for at least 1.5 months before examination of their feces, had received differing chemotherapies, and were probably more severely affected than their peers. Cyst concentrations varied widely from day to day, ranging from 0 to 2×10^7 cysts/g. Although rotavirus particles can reach extremely high concentrations (~10^11/g) in stool when measured by electron microscopy, only a tiny fraction appears to be viable; an average of 2×10^6 FFU/g has been reported from 5 fecal specimens from rotavirus diarrhea patients (Ward et al., 1984).

7.10.2. Amount of feces excreted

To determine the amount of pathogens excreted given a concentration of pathogens in feces, the mass of feces must also be known.
The EITS model used fecal output measurements from individuals consuming high-fiber diets: Nigerian children aged 6-60 months defecated 109 g per day on average (SD 54) (Akinbami et al., 1995), while adult British vegans defecated 225 g per day on average (SD 91) (Davies et al., 1986). Although people living in developing countries might be expected to consume more fiber than people in industrialized countries, this is not necessarily the case; a dietary survey of urban adult Nigerian hospital patients and medical students indicated similarly low daily dietary fiber intake (~8 g) and daily stool mass (~140 g) as people in industrialized countries (Ogunbiyi, 1978). Although the effect of fiber intake on pathogen concentrations in feces is unclear, gastrointestinal pathogens are associated with the intestinal wall, and therefore the amount of pathogens shed might be more related to the internal surface area of the gut that has been colonized, rather than the mass of feces passing through the gut. Diets higher in fiber yield larger and softer stools, and promote healthy microbial communities in the gut, leading to improved nutrient absorption; fiber may also reduce risk of diarrheal disease in some circumstances (Brownawell et al., 2012). Therefore communities where high-fiber diets are common might experience less diarrhea and also excrete fewer pathogens into their environment.

The number of daily defecation events would be expected to affect the amount of pathogens present on hands and in the household environment, because each defecation event represents an episode of hand contamination with feces. In industrialized countries, healthy adults and young children generally have between 1 and 2 bowel movements per day (Weaver, 1988); the Nigerian medical students mentioned above had a mean of 0.89 bowel movements per day (Ogunbiyi, 1978).
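The relationship stated at the start of this section, that the amount of pathogens excreted is the fecal concentration multiplied by the amount of feces, can be sketched as a back-of-envelope calculation. The values are those listed in Table 7.1 for the EITS model; the function and dictionary names are hypothetical, and the assumption that an infected person sheds at the tabulated concentration throughout a full day's fecal output is a simplification for illustration, not a description of the model code.

```python
# Pathogens per gram of feces (Table 7.1): CFU for E. coli, FFU for rotavirus, cysts for Giardia
PATHOGENS_PER_GRAM = {
    "E. coli": 5e8,
    "rotavirus": 2e6,
    "Giardia": 5.7e5,
}

# Daily fecal output in grams for people without diarrhea (Table 7.1)
FECAL_OUTPUT_G_PER_DAY = {
    "child": 109.0,   # aged < 5 years
    "adult": 225.0,   # aged five years or more
}

def daily_pathogens_excreted(pathogen, age_group):
    """Pathogens shed per day = concentration (per g) x fecal mass (g/day)."""
    return PATHOGENS_PER_GRAM[pathogen] * FECAL_OUTPUT_G_PER_DAY[age_group]

if __name__ == "__main__":
    for pathogen in PATHOGENS_PER_GRAM:
        for age_group in FECAL_OUTPUT_G_PER_DAY:
            load = daily_pathogens_excreted(pathogen, age_group)
            print(f"{pathogen:10s} {age_group:6s} {load:.3g} per day")
```

Under these assumptions the daily load spans roughly three orders of magnitude across the three pathogen types, driven almost entirely by the differences in fecal concentration rather than by fecal mass.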
Two studies of young children hospitalized for acute diarrhea in Peru and Bangladesh indicated similar mean daily fecal outputs of 365g and 310g (Lembcke et al., 1989; S. K. Roy et al., 1997), and a study of Tunisian adults with acute diarrhea reported mean daily fecal output of 499g (Hamza et al., 1999). Although the Peruvian study reported 9.5 (SD 5.0) stools during the 24 hours before admission, and the Tunisian study reported 6.3 (SD 2.5) stools the day of admission, people seeking care are likely to have more severe diarrhea. As a simplifying assumption, the EITS model used a value of 3 stools/day for all persons with diarrhea; when multiplied by the fecal output in persons without diarrhea, it roughly matches the fecal output described in the Peruvian, Bangladeshi, and Tunisian studies above. 7.11. Inactivation, removal, or attenuation of pathogens in the environment Inactivation of pathogens is highly variable, depending on the type of pathogen, temperature, humidity, and the microenvironment where the pathogen resides (e.g., water, skin, feces, soil, etc.). Furthermore, some pathogenic bacteria may multiply in contaminated food. An exponential decay rate of 0.6 per day was initially considered as a basis for all pathogens in all EITS model compartments; it was based on daily rates of: 0.64 for E. coli in unfiltered river water that was bottled and stored at 15°C (Flint, 1987); 0.58 for rotavirus in water from a creek that received discharges of domestic sewage, which was then bottled and stored at 20°C (Pancorbo et al., 1987); and 0.55 for Giardia fitted to data from inactivation studies at 15-20°C in distilled water, tap water, and in situ river water (Wickramanayake et al., 1985; deRegnier et al., 1989). However, pathogens may be removed from a system by many means beyond simple decay, such as being transported in water or soil particles, ingestion by animals, sedimentation in water, sequestration in underwater sediments or underground, etc. 
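For a sense of scale, the initially considered decay rate of 0.6 per day can be explored with a short sketch. It assumes continuous first-order decay, N(t) = N0·exp(−rt); the text does not state whether the rate was applied continuously or as a daily fraction, so that choice is an assumption here, and the function names are hypothetical.

```python
import math

# Daily decay rates cited in the text for E. coli, rotavirus, and Giardia in water
LITERATURE_RATES = {"E. coli": 0.64, "rotavirus": 0.58, "Giardia": 0.55}

DECAY_RATE_PER_DAY = 0.6  # value initially considered for all pathogens

def mean_literature_rate():
    """Arithmetic mean of the three cited rates."""
    return sum(LITERATURE_RATES.values()) / len(LITERATURE_RATES)

def surviving_fraction(days, rate=DECAY_RATE_PER_DAY):
    """Fraction of pathogens remaining after `days` under first-order decay."""
    return math.exp(-rate * days)

def days_for_log10_reduction(log10_reduction, rate=DECAY_RATE_PER_DAY):
    """Days needed to reduce the pathogen count by the given number of log10 units."""
    return log10_reduction * math.log(10) / rate

if __name__ == "__main__":
    print("mean cited rate:", round(mean_literature_rate(), 2))
    print("fraction left after 1 day:", surviving_fraction(1))
    print("days per 1-log10 reduction:", days_for_log10_reduction(1))
```

Under this assumption a little over half of the pathogens survive each day, and a tenfold reduction takes just under four days.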
Furthermore, pathogen decay rates can vary greatly depending on temperature, solar radiation, humidity, and other factors. Since the actions of these various attenuation mechanisms are unclear and are likely to vary greatly in different communities, the EITS model assigned varying values to three attenuation calibration parameters without reference to particular values from the literature.

7.12. Pathogen movement from land to surface water

Modeling the movement and inactivation of pathogens within land and surface water is a large and complicated area of research, and is beyond the scope of this dissertation. The EITS model used a few simple parameter values to describe pathogen movement from land to water: 0.1% on dry days, 5% on rainy days, with rain occurring once every 14 days on average. In chapter 5, the calibration parameters CPDl, CPSf, and CPatten can be considered to include variability in scenarios due to hydrogeological considerations.

7.13. Demographic parameters

Infection transmission could be affected by demography and community structure, since children are more susceptible to diarrheal infections than adults, and people share an environment with other residents of their household. A household size of roughly five people is common in many developing countries, although it can range from four to eight (Ayad et al., 1994). Household size across a wide variety of industrialized and developing countries can be adequately represented by a Poisson distribution truncated at zero (Jennings et al., 1999). In many sub-Saharan African countries, on average 18% of each household are children aged less than five years; however, it is lower in other developing regions, e.g., 12-17% among several Latin American countries (Ayad et al., 1994).

7.14. Summary table of all parameter values

Table 7.1 summarizes parameter values used in all three models described in this dissertation. The two QMRA models (chapters 3 and 4) used very similar parameter sets.
The EITS model (chapter 5) also used many of the parameters from the QMRA models, as well as some additional parameters. Parameters that were varied during the estimation steps of the three models are not included here (but see chapters 3, 4, and 5, pages 105, 130, and 183). The variable names used in the model code are given in the table; the QMRA models are designated 'Q', while the EITS model is designated 'E'. Where necessary, the first and second QMRA models (chapters 3 and 4) are designated 'Q1' and 'Q2'. Some variables were stored in vectors; where necessary, the element of the vector is denoted in parentheses, e.g., 'VariableName(1)'.

Table 7.1. Summary table of all parameter values

Description of parameter values, and pages where discussed | Value | Variable name(s) | Reference
Water ingestion, young children (page 214) | 1.178 L/day | Q: drinkKids; E: Wdd(1) | (Akpata, 2004)
Water ingestion, adults (page 214) | 2.3 L/day | E: Wdd(2) | (Fudge et al., 2008)
Size of stored drinking water container in each household | 25 L | E: Ws |
Daily probability of a rainfall event (page 228) | 1/14 | E: rRain |
Proportion of pathogens moving from land to surface water daily (page 228) | | | Kara L. Nelson, personal communication
  Without a rainfall event | 0.001 | E: xRunoff(1) | ..
  With a rainfall event | 0.05 | E: xRunoff(2) | ..
Log10 reduction values (LRVs), intervention group, LifeStraw RCT (pages 87 & 216) | | | (T. Clasen, Naranjo, et al., 2009)
  Escherichia coli | 6.9 | Q1: LRsInt(1) | ..
  Rotavirus | 4.7 | Q1: LRsInt(3) | ..
  Giardia | 3.6 | Q1: LRsInt(2) | ..
LRVs, placebo group, LifeStraw RCT, all 3 pathogen types (page 216) | | Q1: LRsPla |
  Calibration step | 1.05 | | ..
  Estimation step | 0 | | ..
LRV for handwashing, all pathogen types (page 64) | 0.46 | E: lHand | (Luby et al., 2001)
LRV for sanitation, all pathogen types (page 215) | 3 | E: lSan | (Calloway et al., 1971; Rivero-Marcotegui et al., 1998)
Additional LRVs, particularly for HWT: see Tables 2.1 and 2.2, page 64. | | | (Boisson et al., 2010)
Dose response function parameters (pages 98 & 218) | | | (Anon, 2012)
  E. coli (enteroinvasive); beta-Poisson parameters | α = 0.155; N50 = 2.11×10^6 | Q&E: alpha(1); Q&E: KorN50(1) | (H L DuPont et al., 1971)
  Rotavirus; beta-Poisson parameters | α = 0.2531; N50 = 6.171 | Q: alpha(3), E: alpha(2); Q: KorN50(3), E: KorN50(2) | (Ward et al., 1986; Haas et al., 1993)
  Giardia; exponential k parameter | 0.0198 | Q: KorN50(2) | (Rendtorff, 1954; J B Rose et al., 1991)
Baseline incidence (infections/person-year) with the three pathogen types (page 180) | | | (Lanata & W. Mendoza, 2002; M. E. Wilson, 2005)
  Escherichia coli | 1.17 | E: BaseInf(1) | ..
  Rotavirus | 0.347 | E: BaseInf(1) | ..
  Giardia | 0.212 | E: BaseInf(1) | ..
Incubation periods (page 219) | | |
  Escherichia coli (ETEC) | 2 days | E: latent(1) | (Dalton et al., 1999)
  Rotavirus | 3 days | E: latent(2) | (Kapikian et al., 1983)
  Giardia | 14 days | E: latent(3) | (Rendtorff, 1954; A. M. Jokipii & L. Jokipii, 1977)
Morbidity ratios for children (proportion of infected who are symptomatic; page 220) | | |
  Escherichia coli | 0.214 | Q: MorbidityK(1); E: MRk(1) | (Vergara et al., 1996)
  Rotavirus | 0.397 | Q: MorbidityK(3); E: MRk(2) | (Fischer et al., 2002)
  Giardia | 0.590 | Q: MorbidityK(2); E: MRk(3) | (Peréz Cordón et al., 2008)
Morbidity ratios for adults (proportion of infected who are symptomatic; page 221) | | |
  Escherichia coli | 0.214 | E: MRa(1) | (Vergara et al., 1996)
  Rotavirus | 0.222 | E: MRa(2) | (Kapikian et al., 1983)
  Giardia | 0.03 | E: MRa(3) | (Ensink et al., 2006)
Distributions of duration of infection and infectiousness (page 222) | | |
  Escherichia coli (gamma distribution, mean 3 days) | shape = 1.775; scale = 1.690 | Q&E: durEc.m | (Estrada-Garcia et al., 2009)
  Rotavirus (uniform distribution; mean 2.5 days) | Range 1-4 days | Q&E: durRo.m | (Kapikian et al., 1983)
  Giardia (gamma distribution, mean 11 days) | shape = 3.206; scale = 3.431 | Q&E: durGi.m | (Kent et al., 1988)
Period of immunity for all pathogen types (QMRA models; page 224) | 7 days | Q: ImmuneTimes |
Period of immunity for all pathogen types (EITS model; page 224) | 1 day | E: immune |
Chance of remembering diarrhea >2 days in the past (page 224) | 0.54 | Q1: remembrance | (Zafar et al., 2010)
Number of pathogens per gram of feces (page 225) | | |
  Escherichia coli | 5×10^8 | E: Mpgf(1) | (H L DuPont et al., 1971)
  Rotavirus | 2×10^6 | E: Mpgf(2) | (Ward et al., 1984)
  Giardia | 5.7×10^5 | E: Mpgf(3) | (Danciger & M. Lopez, 1975)
Fecal output, children aged < 5 years, high fiber diet (page 226) | 109 g/day | E: fpp(1) | (Akinbami et al., 1995)
Fecal output, people aged five years or more, high fiber diet (page 226) | 225 g/day | E: fpp(2) | (Davies et al., 1986)
Feces entering household environment per defecation event (page 226) | 0.23 g | E: fHands | (Calloway et al., 1971; Rivero-Marcotegui et al., 1998)
Defecation events per day, people without diarrhea (page 226) | 1 | E: rPoopInf |
Defecation events per day, people with diarrhea (page 226) | 3 | E: rPoopIll |
Number of mouth contacts per day, children (page 215) | 330/day | | (USEPA, 2011)
Number of mouth contacts per day, adults (page 215) | 130/day | | ..
Ratio of the above mouth contacts per day, children/adults | 2.54 | E: wHandMouth | ..
No. children <5 years, LifeStraw RCT intervention group | 85 | Q1: nKidsInt | (Boisson et al., 2010)
No. children <5 years, LifeStraw RCT placebo group | 105 | Q1: nKidsPla | ..
Demographic and community information (page 228) | | |
  Number of households in the simulated community | 200 | E: nHH |
  Proportion of the community that was children aged < 5 years | 0.18 | E: pKids | (Ayad et al., 1994)
  Mean persons per household (Poisson distribution) | 5 | E: mPHH | (Ayad et al., 1994; Jennings et al., 1999)
  Mean household network degree (connections per household) | 5.3 | E: meanDeg | (Zelner et al., 2012)
  Daily probability of a visit, per household connection | 2/7 | E: rVisit |
Shape parameter for all gamma distributions of pathogen type concentrations (the two QMRA models only) (a) | 1.85 | Q: Shapes | (Boisson et al., 2010)
Longitudinal prevalence of reported diarrhea for each LifeStraw RCT group | | |
  Intervention (LPIrad) | 0.0749 | Q1: LongPrevs(2) | ..
  Placebo (LPPrad) | 0.0896 | Q1: LongPrevs(1) | ..
Longitudinal prevalence ratio of reported diarrhea (LPRrad) measured by the LifeStraw RCT | 0.872 | Q1: longPrevKidsDesiredRatio | ..
Compliance with device use, first QMRA model: chance of using device on a given day | | |
  Calibration step | 0.65 | |
  Estimation step | 0, 0.65, or 1.00 | | ..
Compliance with device use: If using device on a given day, proportion of water treated | | |
  Calibration step | 2/3 or 1/3 | |

(a) The scale parameters for the gamma distributions of pathogen concentrations in the QMRA models were calculated from the mean concentrations that were obtained during calibration of the model in chapter 3 (Table 3.2, page 96).

8. APPENDIX B: ADDITIONAL INTERVENTIONS FOR MITIGATING DIARRHEA

Although this dissertation gives particular attention to household water treatment, sanitation, and handwashing, there are other important interventions that prevent or mitigate diarrhea. They are briefly reviewed here.

8.1.
Nutritional interventions Diarrhea and malnutrition form a vicious cycle in which diarrhea leads to malnutrition, which reduces resistance to disease, which leads to more diarrhea (Motarjemi et al., 1993). Diarrheal illnesses contribute more strongly to malnutrition than other infections, such as respiratory infections (Motarjemi et al., 1993). Children with diarrhea often refuse food, contributing to malnutrition, although they may still accept breast milk eagerly (de Zoysa et al., 1991). Mothers sometimes believe that food should be withheld from children with diarrhea, which can further aggravate the diarrhea-malnutrition cycle (C. E. Taylor & Greenough, 1989). In addition, young children with diarrhea commonly refuse food, leading to (or exacerbating) malnutrition, but it has been shown in several countries that they do not usually lose their appetite for breastmilk (Huffman & Combest, 1990). 8.1.1. Breastfeeding Breast milk can provide all necessary nutrients for infants aged 6 months or less; even in very hot conditions, breast milk can supply all necessary fluid intake (Huffman & Combest, 1990). In addition, malnourished mothers can still produce sufficient good-quality milk to support exclusive breastfeeding (Huffman & Combest, 1990). Although the World Health Organization recommends exclusive breastfeeding for the first 6 months of life, and continuing to breastfeed until age two and a half years or longer, exclusive breastfeeding is uncommon (about 35% of infants aged 0 to 4 months) (Hill et al., 2004). Breast milk transfers maternal antibodies from mother to child, providing passive immunity. Although breastfeeding infants can 235 still acquire diarrheal infections, they seldom lead to malnutrition while the child is exclusively breastfeeding (Motarjemi et al., 1993). Both exclusive and partial breastfeeding are strongly protective against diarrhea, particularly in infants. 
In a review (Feachem & Koblinsky, 1984) of this topic, comparing any breastfeeding against no breastfeeding, the relative risk for diarrhea was 0.33 in 0-3 month olds, 0.42 for 3-6 month olds, and 0.71 for 6-11 month olds. However, protection from diarrhea does not appear to continue after breastfeeding has stopped (Feachem & Koblinsky, 1984). Exclusive breastfeeding appears to be effective enough to neutralize the effects of poor sanitation on diarrheal risk (VanDerslice et al., 1994; Feachem & Koblinsky, 1984). However, supplementing breastmilk with small amounts of contaminated water can double diarrheal risk, and after adding additional foods or weaning, the situation is even worse (VanDerslice et al., 1994). Dose-response relationships have been observed, with additional non-breastmilk feeds incrementally increasing diarrheal mortality risk, and breastmilk feeds similarly reducing mortality risk (Huffman & Combest, 1990). 8.1.2. Zinc supplementation Zinc deficiency has broad effects, particularly on immunity, but its symptoms are not obvious (Bhutta et al., 1999). Zinc is not stored in the body and frequent intake is therefore required (Bhutta et al., 1999). Treating diarrhea with 20 mg/day (10 mg/day for <6 month olds) oral zinc supplementation for 10-14 days can make disease less severe (USAID et al., 2005). Several trials of zinc supplementation have been reviewed (Bhutta et al., 1999), finding that it was effective in reducing both the incidence of diarrhea by 18% and prevalence by 25% in children under 3 years of age. It was even more effective (41% reduction) on pneumonia incidence. Cereal flours can be fortified with zinc, which would make sustained zinc supplementation 236 attainable without any particular action by the family, provided that common brands of flour are supplemented. In the absence of a fortified staple food, regular supplementation seems unlikely to be sustainable. 
Zinc fortification of wheat flour is mandatory in Indonesia, Jordan, Mexico, and South Africa; the latter two countries also require zinc fortification of maize flour (Kenneth H Brown et al., 2010).

8.1.3. Other nutrients

Vitamin A also appears to be beneficial in reducing mortality and severe diarrhea, but may not impact diarrheal incidence overall (Long et al., 2007). Iron supplementation may actually exacerbate diarrhea in some cases (Long et al., 2007). There has been relatively little study of other micronutrients (Long et al., 2007).

8.2. Treatment of diarrheal illness

There are 5 broad rules for home treatment of diarrhea (USAID et al., 2005):
1. Give more fluids than usual, particularly breastmilk and hygienically prepared oral rehydration solution (ORS; see below)
2. Supplement the child’s diet with zinc for 10-14 days
3. Continue to feed the child, particularly well-cooked, energy-rich, non-sugary foods that are easy to digest
4. Return to the clinic if the child is dehydrated or fails to improve after 3 days
5. Do not give antibiotics unless there is blood in the stool, signifying dysentery.

8.2.1. Oral rehydration salts/solution/therapy (ORS)

Use of ORS drastically reduces dehydration, and consequently death, due to diarrhea. The original WHO-recommended ORS formulation from the 1960s reduced dehydration and death, but did not otherwise impact diarrhea severity or duration (Atia & Buchman, 2009). An improved lower-osmolarity formula began to be promoted in 2002 (Hill et al., 2004). Compared with the original formulation, it has been shown to decrease stool volume in at least 5 trials and to decrease diarrheal duration in at least 2 trials (Atia & Buchman, 2009). It is unclear to what extent this might reduce secondary transmission of diarrhea. Zinc supplementation can also be used in conjunction with ORS.
Although it does not greatly decrease disease duration, it decreases stool output in comparison with ORS alone (Bahl et al., 2002; Bhatnagar et al., 2004). 8.2.2. Access to care Improved access to medical care is probably most useful for preventing mortality from severe or complicated diarrhea. However, it could also impact diarrhea through decreasing the duration of disease via effective treatment. It might also indirectly prevent disease by promoting other behaviors, such as handwashing or good nutrition. 8.3. Vaccination Among the diarrheal pathogens, only rotavirus has a highly effective vaccine; it is 85-95% effective against severe rotavirus gastroenteritis in children (CDC, 2012). However, a cholera vaccine that is about 50% effective is available, and vaccines are available for other pathogens that are transmitted by the fecal-oral route, such as hepatitis A virus, poliovirus, and Salmonella enterica Typhi (which causes typhoid fever) (CDC, 2012; CDC, 2011). Rotavirus vaccine is not generally available in developing countries, although rotavirus vaccination programs are being investigated. The rotavirus vaccine is less effective in children in developing countries, compared with developed countries, but it remains effective enough to have a large public health impact (Cherian et al., 2012). However, vaccination against enterotoxigenic E. coli (ETEC) is considered feasible, and vaccine candidates are being researched (R. I. Walker et al., 2007). 238 9. APPENDIX C: SOURCE CODE FOR THE MODELS 9.1. Overview of the source code for the models All models were written in Octave, which is a freely available open source programming language for Linux, Mac, or Windows (http://www.gnu.org/software/octave/). Octave and MATLAB have nearly identical syntax, and the code runs on either platform without modifications (though it runs faster with MATLAB). 
There is no formal documentation for these models, aside from this dissertation; however, the code is extensively commented. The source code for each of the three models consists of several text files that can be viewed in any text editor: one main file and several functions/subroutines that it calls. They have the extension “.m” and are called 'm-files'. To do a test run of any of the models, copy all m-files to a single directory, start Octave or MATLAB, set the working directory to that directory (if you don't know how, type 'help cd'), and type the name of the appropriate main file (omitting the '.m'). If copies of the source code cannot be found online, the source code can be recreated simply by copying it from this dissertation and pasting it into a text editor. All three models should run without errors on Octave 3.2 or 3.3, as well as MATLAB 7.11-7.13. It is very likely that they will also run on later versions of Octave or MATLAB. All three models are licensed under the GNU General Public License (see also the copyright notice on page 4), with the exception of the file “erdrey.m” used in the EITS model, which is from the CONTEST toolbox (A. Taylor & Higham, 2008) and is reproduced with the permission of the authors (Des Higham, personal communication, 24 August 2012).

9.2. GNU General Public License

The following license applies to all computer code in this dissertation, except where otherwise noted (see 'erdrey.m' in section 9.5). The license has been copied directly, without modification, from http://www.gnu.org/licenses/gpl.txt.

GNU GENERAL PUBLIC LICENSE
Version 3, 29 June 2007

Copyright (C) 2007 Free Software Foundation, Inc. Everyone is permitted to copy and distribute verbatim copies of this license document, but changing it is not allowed.

Preamble

The GNU General Public License is a free, copyleft license for software and other kinds of works.
The licenses for most software and other practical works are designed to take away your freedom to share and change the works. By contrast, the GNU General Public License is intended to guarantee your freedom to share and change all versions of a program--to make sure it remains free software for all its users. We, the Free Software Foundation, use the GNU General Public License for most of our software; it applies also to any other work released this way by its authors. You can apply it to your programs, too. When we speak of free software, we are referring to freedom, not price. Our General Public Licenses are designed to make sure that you have the freedom to distribute copies of free software (and charge for them if you wish), that you receive source code or can get it if you want it, that you can change the software or use pieces of it in new free programs, and that you know you can do these things. To protect your rights, we need to prevent others from denying you these rights or asking you to surrender the rights. Therefore, you have certain responsibilities if you distribute copies of the software, or if you modify it: responsibilities to respect the freedom of others. For example, if you distribute copies of such a program, whether gratis or for a fee, you must pass on to the recipients the same freedoms that you received. You must make sure that they, too, receive or can get the source code. And you must show them these terms so they know their rights. Developers that use the GNU GPL protect your rights with two steps: (1) assert copyright on the software, and (2) offer you this License giving you legal permission to copy, distribute and/or modify it. For the developers' and authors' protection, the GPL clearly explains that there is no warranty for this free software. 
For both users' and authors' sake, the GPL requires that modified versions be marked as changed, so that their problems will not be attributed erroneously to authors of previous versions. Some devices are designed to deny users access to install or run modified versions of the software inside them, although the manufacturer can do so. This is fundamentally incompatible with the aim of protecting users' freedom to change the software. The systematic pattern of such abuse occurs in the area of products for individuals to use, which is precisely where it is most unacceptable. Therefore, we have designed this version of the GPL to prohibit the practice for those products. If such problems arise substantially in other domains, we stand ready to extend this provision to those domains in future versions of the GPL, as needed to protect the freedom of users. Finally, every program is threatened constantly by software patents. States should not allow patents to restrict development and use of software on general-purpose computers, but in those that do, we wish to avoid the special danger that patents applied to a free program could make it effectively proprietary. To prevent this, the GPL assures that patents cannot be used to render the program non-free. The precise terms and conditions for copying, distribution and modification follow. TERMS AND CONDITIONS 0. Definitions. "This License" refers to version 3 of the GNU General Public License. "Copyright" also means copyright-like laws that apply to other kinds of works, such as semiconductor masks. "The Program" refers to any copyrightable work licensed under this License. Each licensee is addressed as "you". "Licensees" and "recipients" may be individuals or organizations. To "modify" a work means to copy from or adapt all or part of the work in a fashion requiring copyright permission, other than the making of an exact copy.
The resulting work is called a "modified version" of the earlier work or a work "based on" the earlier work. A "covered work" means either the unmodified Program or a work based on the Program. To "propagate" a work means to do anything with it that, without permission, would make you directly or secondarily liable for infringement under applicable copyright law, except executing it on a computer or modifying a private copy. Propagation includes copying, distribution (with or without modification), making available to the public, and in some countries other activities as well. To "convey" a work means any kind of propagation that enables other parties to make or receive copies. Mere interaction with a user through a computer network, with no transfer of a copy, is not conveying. An interactive user interface displays "Appropriate Legal Notices" to the extent that it includes a convenient and prominently visible feature that (1) displays an appropriate copyright notice, and (2) tells the user that there is no warranty for the work (except to the extent that warranties are provided), that licensees may convey the work under this License, and how to view a copy of this License. If the interface presents a list of user commands or options, such as a menu, a prominent item in the list meets this criterion. 1. Source Code. The "source code" for a work means the preferred form of the work for making modifications to it. "Object code" means any non-source form of a work. A "Standard Interface" means an interface that either is an official standard defined by a recognized standards body, or, in the case of interfaces specified for a particular programming language, one that is widely used among developers working in that language.
The "System Libraries" of an executable work include anything, other than the work as a whole, that (a) is included in the normal form of packaging a Major Component, but which is not part of that Major Component, and (b) serves only to enable use of the work with that Major Component, or to implement a Standard Interface for which an implementation is available to the public in source code form. A "Major Component", in this context, means a major essential component (kernel, window system, and so on) of the specific operating system (if any) on which the executable work runs, or a compiler used to produce the work, or an object code interpreter used to run it. The "Corresponding Source" for a work in object code form means all the source code needed to generate, install, and (for an executable work) run the object code and to modify the work, including scripts to control those activities. However, it does not include the work's System Libraries, or general-purpose tools or generally available free programs which are used unmodified in performing those activities but which are not part of the work. For example, Corresponding Source includes interface definition files associated with source files for the work, and the source code for shared libraries and dynamically linked subprograms that the work is specifically designed to require, such as by intimate data communication or control flow between those subprograms and other parts of the work. The Corresponding Source need not include anything that users can regenerate automatically from other parts of the Corresponding Source. The Corresponding Source for a work in source code form is that same work. 2. Basic Permissions. All rights granted under this License are granted for the term of copyright on the Program, and are irrevocable provided the stated conditions are met. This License explicitly affirms your unlimited permission to run the unmodified Program.
The output from running a covered work is covered by this License only if the output, given its content, constitutes a covered work. This License acknowledges your rights of fair use or other equivalent, as provided by copyright law. You may make, run and propagate covered works that you do not convey, without conditions so long as your license otherwise remains in force. You may convey covered works to others for the sole purpose of having them make modifications exclusively for you, or provide you with facilities for running those works, provided that you comply with the terms of this License in conveying all material for which you do not control copyright. Those thus making or running the covered works for you must do so exclusively on your behalf, under your direction and control, on terms that prohibit them from making any copies of your copyrighted material outside their relationship with you. Conveying under any other circumstances is permitted solely under the conditions stated below. Sublicensing is not allowed; section 10 makes it unnecessary. 3. Protecting Users' Legal Rights From Anti-Circumvention Law. No covered work shall be deemed part of an effective technological measure under any applicable law fulfilling obligations under article 11 of the WIPO copyright treaty adopted on 20 December 1996, or similar laws prohibiting or restricting circumvention of such measures. When you convey a covered work, you waive any legal power to forbid circumvention of technological measures to the extent such circumvention is effected by exercising rights under this License with respect to the covered work, and you disclaim any intention to limit operation or modification of the work as a means of enforcing, against the work's users, your or third parties' legal rights to forbid circumvention of technological measures. 4. Conveying Verbatim Copies.
You may convey verbatim copies of the Program's source code as you receive it, in any medium, provided that you conspicuously and appropriately publish on each copy an appropriate copyright notice; keep intact all notices stating that this License and any non-permissive terms added in accord with section 7 apply to the code; keep intact all notices of the absence of any warranty; and give all recipients a copy of this License along with the Program. You may charge any price or no price for each copy that you convey, and you may offer support or warranty protection for a fee. 5. Conveying Modified Source Versions. You may convey a work based on the Program, or the modifications to produce it from the Program, in the form of source code under the terms of section 4, provided that you also meet all of these conditions: a) The work must carry prominent notices stating that you modified it, and giving a relevant date. b) The work must carry prominent notices stating that it is released under this License and any conditions added under section 7. This requirement modifies the requirement in section 4 to "keep intact all notices". c) You must license the entire work, as a whole, under this License to anyone who comes into possession of a copy. This License will therefore apply, along with any applicable section 7 additional terms, to the whole of the work, and all its parts, regardless of how they are packaged. This License gives no permission to license the work in any other way, but it does not invalidate such permission if you have separately received it. d) If the work has interactive user interfaces, each must display Appropriate Legal Notices; however, if the Program has interactive interfaces that do not display Appropriate Legal Notices, your work need not make them do so. 
A compilation of a covered work with other separate and independent works, which are not by their nature extensions of the covered work, and which are not combined with it such as to form a larger program, in or on a volume of a storage or distribution medium, is called an "aggregate" if the compilation and its resulting copyright are not used to limit the access or legal rights of the compilation's users beyond what the individual works permit. Inclusion of a covered work in an aggregate does not cause this License to apply to the other parts of the aggregate. 6. Conveying Non-Source Forms. You may convey a covered work in object code form under the terms of sections 4 and 5, provided that you also convey the machine-readable Corresponding Source under the terms of this License, in one of these ways: a) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by the Corresponding Source fixed on a durable physical medium customarily used for software interchange. b) Convey the object code in, or embodied in, a physical product (including a physical distribution medium), accompanied by a written offer, valid for at least three years and valid for as long as you offer spare parts or customer support for that product model, to give anyone who possesses the object code either (1) a copy of the Corresponding Source for all the software in the product that is covered by this License, on a durable physical medium customarily used for software interchange, for a price no more than your reasonable cost of physically performing this conveying of source, or (2) access to copy the Corresponding Source from a network server at no charge. c) Convey individual copies of the object code with a copy of the written offer to provide the Corresponding Source. This alternative is allowed only occasionally and noncommercially, and only if you received the object code with such an offer, in accord with subsection 6b.
d) Convey the object code by offering access from a designated place (gratis or for a charge), and offer equivalent access to the Corresponding Source in the same way through the same place at no further charge. You need not require recipients to copy the Corresponding Source along with the object code. If the place to copy the object code is a network server, the Corresponding Source may be on a different server (operated by you or a third party) that supports equivalent copying facilities, provided you maintain clear directions next to the object code saying where to find the Corresponding Source. Regardless of what server hosts the Corresponding Source, you remain obligated to ensure that it is available for as long as needed to satisfy these requirements. e) Convey the object code using peer-to-peer transmission, provided you inform other peers where the object code and Corresponding Source of the work are being offered to the general public at no charge under subsection 6d. A separable portion of the object code, whose source code is excluded from the Corresponding Source as a System Library, need not be included in conveying the object code work. A "User Product" is either (1) a "consumer product", which means any tangible personal property which is normally used for personal, family, or household purposes, or (2) anything designed or sold for incorporation into a dwelling. In determining whether a product is a consumer product, doubtful cases shall be resolved in favor of coverage. For a particular product received by a particular user, "normally used" refers to a typical or common use of that class of product, regardless of the status of the particular user or of the way in which the particular user actually uses, or expects or is expected to use, the product.
A product is a consumer product regardless of whether the product has substantial commercial, industrial or non-consumer uses, unless such uses represent the only significant mode of use of the product. "Installation Information" for a User Product means any methods, procedures, authorization keys, or other information required to install and execute modified versions of a covered work in that User Product from a modified version of its Corresponding Source. The information must suffice to ensure that the continued functioning of the modified object code is in no case prevented or interfered with solely because modification has been made. If you convey an object code work under this section in, or with, or specifically for use in, a User Product, and the conveying occurs as part of a transaction in which the right of possession and use of the User Product is transferred to the recipient in perpetuity or for a fixed term (regardless of how the transaction is characterized), the Corresponding Source conveyed under this section must be accompanied by the Installation Information. But this requirement does not apply if neither you nor any third party retains the ability to install modified object code on the User Product (for example, the work has been installed in ROM). The requirement to provide Installation Information does not include a requirement to continue to provide support service, warranty, or updates for a work that has been modified or installed by the recipient, or for the User Product in which it has been modified or installed. Access to a network may be denied when the modification itself materially and adversely affects the operation of the network or violates the rules and protocols for communication across the network.
Corresponding Source conveyed, and Installation Information provided, in accord with this section must be in a format that is publicly documented (and with an implementation available to the public in source code form), and must require no special password or key for unpacking, reading or copying. 7. Additional Terms. "Additional permissions" are terms that supplement the terms of this License by making exceptions from one or more of its conditions. Additional permissions that are applicable to the entire Program shall be treated as though they were included in this License, to the extent that they are valid under applicable law. If additional permissions apply only to part of the Program, that part may be used separately under those permissions, but the entire Program remains governed by this License without regard to the additional permissions. When you convey a copy of a covered work, you may at your option remove any additional permissions from that copy, or from any part of it. (Additional permissions may be written to require their own removal in certain cases when you modify the work.) You may place additional permissions on material, added by you to a covered work, for which you have or can give appropriate copyright permission. 
Notwithstanding any other provision of this License, for material you add to a covered work, you may (if authorized by the copyright holders of that material) supplement the terms of this License with terms: a) Disclaiming warranty or limiting liability differently from the terms of sections 15 and 16 of this License; or b) Requiring preservation of specified reasonable legal notices or author attributions in that material or in the Appropriate Legal Notices displayed by works containing it; or c) Prohibiting misrepresentation of the origin of that material, or requiring that modified versions of such material be marked in reasonable ways as different from the original version; or d) Limiting the use for publicity purposes of names of licensors or authors of the material; or e) Declining to grant rights under trademark law for use of some trade names, trademarks, or service marks; or f) Requiring indemnification of licensors and authors of that material by anyone who conveys the material (or modified versions of it) with contractual assumptions of liability to the recipient, for any liability that these contractual assumptions directly impose on those licensors and authors. All other non-permissive additional terms are considered "further restrictions" within the meaning of section 10. If the Program as you received it, or any part of it, contains a notice stating that it is governed by this License along with a term that is a further restriction, you may remove that term. If a license document contains a further restriction but permits relicensing or conveying under this License, you may add to a covered work material governed by the terms of that license document, provided that the further restriction does not survive such relicensing or conveying.
If you add terms to a covered work in accord with this section, you must place, in the relevant source files, a statement of the additional terms that apply to those files, or a notice indicating where to find the applicable terms. Additional terms, permissive or non-permissive, may be stated in the form of a separately written license, or stated as exceptions; the above requirements apply either way. 8. Termination. You may not propagate or modify a covered work except as expressly provided under this License. Any attempt otherwise to propagate or modify it is void, and will automatically terminate your rights under this License (including any patent licenses granted under the third paragraph of section 11). However, if you cease all violation of this License, then your license from a particular copyright holder is reinstated (a) provisionally, unless and until the copyright holder explicitly and finally terminates your license, and (b) permanently, if the copyright holder fails to notify you of the violation by some reasonable means prior to 60 days after the cessation. Moreover, your license from a particular copyright holder is reinstated permanently if the copyright holder notifies you of the violation by some reasonable means, this is the first time you have received notice of violation of this License (for any work) from that copyright holder, and you cure the violation prior to 30 days after your receipt of the notice. Termination of your rights under this section does not terminate the licenses of parties who have received copies or rights from you under this License. If your rights have been terminated and not permanently reinstated, you do not qualify to receive new licenses for the same material under section 10. 9. Acceptance Not Required for Having Copies. You are not required to accept this License in order to receive or run a copy of the Program.
Ancillary propagation of a covered work occurring solely as a consequence of using peer-to-peer transmission to receive a copy likewise does not require acceptance. However, nothing other than this License grants you permission to propagate or modify any covered work. These actions infringe copyright if you do not accept this License. Therefore, by modifying or propagating a covered work, you indicate your acceptance of this License to do so. 10. Automatic Licensing of Downstream Recipients. Each time you convey a covered work, the recipient automatically receives a license from the original licensors, to run, modify and propagate that work, subject to this License. You are not responsible for enforcing compliance by third parties with this License. An "entity transaction" is a transaction transferring control of an organization, or substantially all assets of one, or subdividing an organization, or merging organizations. If propagation of a covered work results from an entity transaction, each party to that transaction who receives a copy of the work also receives whatever licenses to the work the party's predecessor in interest had or could give under the previous paragraph, plus a right to possession of the Corresponding Source of the work from the predecessor in interest, if the predecessor has it or can get it with reasonable efforts. You may not impose any further restrictions on the exercise of the rights granted or affirmed under this License. For example, you may not impose a license fee, royalty, or other charge for exercise of rights granted under this License, and you may not initiate litigation (including a cross-claim or counterclaim in a lawsuit) alleging that any patent claim is infringed by making, using, selling, offering for sale, or importing the Program or any portion of it. 11. Patents. A "contributor" is a copyright holder who authorizes use under this License of the Program or a work on which the Program is based.
The work thus licensed is called the contributor's "contributor version". A contributor's "essential patent claims" are all patent claims owned or controlled by the contributor, whether already acquired or hereafter acquired, that would be infringed by some manner, permitted by this License, of making, using, or selling its contributor version, but do not include claims that would be infringed only as a consequence of further modification of the contributor version. For purposes of this definition, "control" includes the right to grant patent sublicenses in a manner consistent with the requirements of this License. Each contributor grants you a non-exclusive, worldwide, royalty-free patent license under the contributor's essential patent claims, to make, use, sell, offer for sale, import and otherwise run, modify and propagate the contents of its contributor version. In the following three paragraphs, a "patent license" is any express agreement or commitment, however denominated, not to enforce a patent (such as an express permission to practice a patent or covenant not to sue for patent infringement). To "grant" such a patent license to a party means to make such an agreement or commitment not to enforce a patent against the party. If you convey a covered work, knowingly relying on a patent license, and the Corresponding Source of the work is not available for anyone to copy, free of charge and under the terms of this License, through a publicly available network server or other readily accessible means, then you must either (1) cause the Corresponding Source to be so available, or (2) arrange to deprive yourself of the benefit of the patent license for this particular work, or (3) arrange, in a manner consistent with the requirements of this License, to extend the patent license to downstream recipients. 
"Knowingly relying" means you have actual knowledge that, but for the patent license, your conveying the covered work in a country, or your recipient's use of the covered work in a country, would infringe one or more identifiable patents in that country that you have reason to believe are valid. If, pursuant to or in connection with a single transaction or arrangement, you convey, or propagate by procuring conveyance of, a covered work, and grant a patent license to some of the parties receiving the covered work authorizing them to use, propagate, modify or convey a specific copy of the covered work, then the patent license you grant is automatically extended to all recipients of the covered work and works based on it. A patent license is "discriminatory" if it does not include within the scope of its coverage, prohibits the exercise of, or is conditioned on the non-exercise of one or more of the rights that are specifically granted under this License. You may not convey a covered work if you are a party to an arrangement with a third party that is in the business of distributing software, under which you make payment to the third party based on the extent of your activity of conveying the work, and under which the third party grants, to any of the parties who would receive the covered work from you, a discriminatory patent license (a) in connection with copies of the covered work conveyed by you (or copies made from those copies), or (b) primarily for and in connection with specific products or compilations that contain the covered work, unless you entered into that arrangement, or that patent license was granted, prior to 28 March 2007. Nothing in this License shall be construed as excluding or limiting any implied license or other defenses to infringement that may otherwise be available to you under applicable patent law. 12. No Surrender of Others' Freedom.
If conditions are imposed on you (whether by court order, agreement or otherwise) that contradict the conditions of this License, they do not excuse you from the conditions of this License. If you cannot convey a covered work so as to satisfy simultaneously your obligations under this License and any other pertinent obligations, then as a consequence you may not convey it at all. For example, if you agree to terms that obligate you to collect a royalty for further conveying from those to whom you convey the Program, the only way you could satisfy both those terms and this License would be to refrain entirely from conveying the Program. 13. Use with the GNU Affero General Public License. Notwithstanding any other provision of this License, you have permission to link or combine any covered work with a work licensed under version 3 of the GNU Affero General Public License into a single combined work, and to convey the resulting work. The terms of this License will continue to apply to the part which is the covered work, but the special requirements of the GNU Affero General Public License, section 13, concerning interaction through a network will apply to the combination as such. 14. Revised Versions of this License. The Free Software Foundation may publish revised and/or new versions of the GNU General Public License from time to time. Such new versions will be similar in spirit to the present version, but may differ in detail to address new problems or concerns. Each version is given a distinguishing version number. If the Program specifies that a certain numbered version of the GNU General Public License "or any later version" applies to it, you have the option of following the terms and conditions either of that numbered version or of any later version published by the Free Software Foundation. If the Program does not specify a version number of the GNU General Public License, you may choose any version ever published by the Free Software Foundation.
If the Program specifies that a proxy can decide which future versions of the GNU General Public License can be used, that proxy's public statement of acceptance of a version permanently authorizes you to choose that version for the Program. Later license versions may give you additional or different permissions. However, no additional obligations are imposed on any author or copyright holder as a result of your choosing to follow a later version. 15. Disclaimer of Warranty. THERE IS NO WARRANTY FOR THE PROGRAM, TO THE EXTENT PERMITTED BY APPLICABLE LAW. EXCEPT WHEN OTHERWISE STATED IN WRITING THE COPYRIGHT HOLDERS AND/OR OTHER PARTIES PROVIDE THE PROGRAM "AS IS" WITHOUT WARRANTY OF ANY KIND, EITHER EXPRESSED OR IMPLIED, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE ENTIRE RISK AS TO THE QUALITY AND PERFORMANCE OF THE PROGRAM IS WITH YOU. SHOULD THE PROGRAM PROVE DEFECTIVE, YOU ASSUME THE COST OF ALL NECESSARY SERVICING, REPAIR OR CORRECTION. 16. Limitation of Liability. IN NO EVENT UNLESS REQUIRED BY APPLICABLE LAW OR AGREED TO IN WRITING WILL ANY COPYRIGHT HOLDER, OR ANY OTHER PARTY WHO MODIFIES AND/OR CONVEYS THE PROGRAM AS PERMITTED ABOVE, BE LIABLE TO YOU FOR DAMAGES, INCLUDING ANY GENERAL, SPECIAL, INCIDENTAL OR CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OR INABILITY TO USE THE PROGRAM (INCLUDING BUT NOT LIMITED TO LOSS OF DATA OR DATA BEING RENDERED INACCURATE OR LOSSES SUSTAINED BY YOU OR THIRD PARTIES OR A FAILURE OF THE PROGRAM TO OPERATE WITH ANY OTHER PROGRAMS), EVEN IF SUCH HOLDER OR OTHER PARTY HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES. 17. Interpretation of Sections 15 and 16.
If the disclaimer of warranty and limitation of liability provided above cannot be given local legal effect according to their terms, reviewing courts shall apply local law that most closely approximates an absolute waiver of all civil liability in connection with the Program, unless a warranty or assumption of liability accompanies a copy of the Program in return for a fee.

END OF TERMS AND CONDITIONS

9.3. The QMRA model simulating the Lifestraw field trial (chapter 3)

The model consists of several text files containing necessary functions and subroutines; the core program is 'QMRAv13_20110414.m'. Simulation options are set by the choice of several values at the top of the files 'QMRAv13_20110414.m' and 'GetTrialParams.m'. These options default to values that generate a single test run of the simulation. The source code is found below. The filename of each of the source code files is found in the copyright information near the top of each file. Each file starts on a new page.

%QMRA for Lifestraw Family RCT in DRC (Boisson 2010), coded by Kyle S. Enger (engerkyl@msu.edu) with suggestions from Joe Eisenberg & Bryan Mayer.
%It was used to produce the manuscript, Linking quantitative microbial risk assessment and epidemiological data: Informing safe drinking water trials in developing countries, by Kyle S. Enger, Kara L. Nelson, Thomas Clasen, Joan B. Rose, and Joseph N. S. Eisenberg, published in Environmental Science and Technology in 2012.
%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (QMRAv13_20110414.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Requires Octave 3.2 or later and the octave-image package.
%Also works rather well on Matlab, running about 4x faster. Requires the statistics toolbox, and possibly others.
%To run, start Octave, change the directory to the location of the program files (using the 'cd' command), and type 'QMRAv13_20110414'.
%This code (QMRAv13_20110414.m) is accompanied by several functions/subroutines, which need to be in the working directory:
% AssignInf.m: Stochastically assigns infections to individuals with fixed durations
% AssignInfRand.m: Stochastically assigns infections to individuals with random durations
% AssignInfIllRand.m: Stochastically assigns infections & illnesses to individuals with random durations
% CalcDiarrhWeeks.m: Determines whether a week with 1+ days of diarrhea is actually reported as a 'diarrhea week'
% DRbP.m: Beta-Poisson dose response model
% DRchoose.m: Executes the appropriate dose response model and determines illness
% DRexp.m: Exponential dose response model
% durEc.m: Randomly pick a duration for E. coli infection
% durGi.m: Randomly pick a duration for Giardia infection
% durRo.m: Randomly pick a duration for rotavirus infection
% Examine1Run.m: Allows inspection of the complete simulated data from a single run of this code.
% GetTrialParams.m: Generates a series of trial parameters instead of determining them stochastically.
% OutQMRAmerge.m: Allows conglomeration of output from multiple model executions (e.g., if parallel processing).
%Kids are defined as those < 5 yrs of age. All others are considered adults.
%====Initial housekeeping====
clear all;
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
if Octave == 0; format compact; format shortg; end;
StartTime = clock; %Recording start time of simulation. Is used to timestamp output file.
%ignore_function_time_stamp('all'); %Hopefully saves time by preventing Octave from checking if functions have changed during the run. However, makes debugging difficult (altered functions aren't updated).
%====Setting simulation options====More simulation options are in GetTrialParams.m====
badPlacebo = 1; %Whether the simulation is run with an imperfect placebo (1) or a perfect placebo (0).
loops = 1; %No. Monte Carlo iterations of QMRA, if parameters are chosen stochastically. If 1, & noStochParams != 1, stores all run info.
noStochParams = 1; %If 1, use GetTrialParams.m to iterate over a series of parameter choices (not choose them stochastically). Resets 'loops'.
testing = 1; %Equals 1 if testing (production of plots of infection prevalence) is desired. Only works if there are <= 25 runs.
compliance100 = 0; %Equals 1 if device is used perfectly, always, by everyone.
dailyVariation = 1; %Equals 1 if pathogen concentrations are allowed to vary by person by day, instead of taking a single fixed value.
randomDurations = 1; %Equals 1 if illness durations are randomized instead of taking a single fixed value. Renders Duration vector mostly moot (still used to choose start time of surveys).
storeStatus = 1; %If 1, store infection & disease status, to determine 'actual' (i.e., reported & unreported) burden of infection.
%=====Housekeeping based on simulation options====
%cd ~/Dropbox/Octave/Lifestraw; %Selecting working directory. Assumes Linux.
%cd QMRAv13; %TODO: Update whenever a new version of this code is produced.
%if randomDurations == 1; IllDurations; end %Reading in necessary functions to randomly choose illness durations.
if noStochParams == 1; GetTrialParams; end %GetTrialParams.m reads in pathogens/L & background measure to be tried.
if testing == 1 && loops > 5 && noStochParams == 0; error('Do not use so many loops with testing code enabled, or you will flood the screen with graphs'); end
%=====Ending housekeeping. Starting parameter values:=====
%Water concentration and disease parameter values
% These 4 vectors have 3 elements corresponding to our 3 pathogens of interest:
% [ETEC/EPEC, Giardia, rotavirus]
PDiseaseK = [0.107, 0.064 , 0.098]; PDiseaseK = PDiseaseK / sum(PDiseaseK); %Proportion of kid diarrhea episodes due to various pathogens, adjusted to sum to 1 (assume unknown episodes are distributed just like known episodes)
PathogensLMax = [2e5, 1.35, 0.18]; %Maximum pathogens/L; minimum is zero for all. 2-fold empirically observed levels that led to LP exceeding the 95% CI for the placebo group, each pathogen taken individually.
pathogensLcv = sqrt(3407044) / 2509.329; %Coeff. of var. calc. from variance & mean of 'cfbef' (Boisson 2010 water qual. data), high outliers (>= 30000 CFU) removed.
MorbidityK = [0.214 , 0.59 , 0.397]; %Proportion of infected 'kids' (<5y) with diarrhea.
Duration = [82.1/24, 18.3, 2.5]; Duration = round(Duration); %Duration of diarrhea (days)
prevDiarrhBaseKidsMax = 0.0972;
LongPrevs = [0.103, 0.0896]; %Vector of raw long. prev. values from Boisson dataset (kids w. placebo, then kids w. intervention).
longPrevRangeMult = 0.4; %Deprecated; multiplier (between 0 & 1) to determine acceptable range for longitudinal prevalence & its ratio.
longPrevKidsDesiredRatio = 0.872; %Ratio from study of long. prev. in intervention kids to placebo kids.
prevDiarrhBaseKids = LongPrevs(2); %Baseline non-waterborne reported prevalence of disease. Can be no greater than that observed in the RCT.
ImmuneTimes = [7 7 7]; %Length of immune period for all pathogens.
%Parameters: population & exposure information. Model by household later.
nKidsInt = 85; nKidsPla = 105; %Boisson 2010.
drinkKids = 1.178; %drinkKidsSD = 0.186; %daily water intake, L/d, kids
pUse = mean([0.685, 0.757, 0.483, 0.670]); %Mean of people using the device the previous day, int. & placebo, at 8 & 14 months.
pTreat = 2/3; %Proportion of water treated, if device is being used. Try 1, 2/3, & 1/3.
if compliance100 == 1; pUse = 1; pTreat = 1; end; %If perfect compliance is desired, override above 2 lines.
%Parameters: device effectiveness information
LRsInt = [6.9, 4.7, 3.6]; %Log reductions [bacteria, viruses, protozoa] by the intervention device, with upper & lower ranges. Note: LRsInt(j) is applied below in the pathogen order [ETEC/EPEC, Giardia, rotavirus].
LRsPla = [1.05, 1.05, 1.05]; %As above, for 'placebo' device
if badPlacebo == 0; LRsPla = [0 0 0]; end; %If a perfect placebo is being modeled, override above line.
%Parameters: dose response, order as above [ETEC/EPEC, Giardia, rotavirus]
KorN50 = [2111912, 0.01982, 6.171]; %Exponential k parameter or beta-Poisson N50 parameter
alpha = [0.1549, NaN, 0.2531]; %Presence/absence of alpha value determines beta-Poisson or exponential dose resp.
%Bias parameters
remembrance = 0.54; %Proportion of diarrhea episodes remembered (and reported) if they ended >2d before being surveyed; assume perfect recall if episode is on day 0, 1, or 2
%Study parameters - relating to how the study was conducted
recallPeriod = 7; %Number of days in the past over which people were asked to remember diarrheal episodes
interval = 30; %Interval between beginnings of recall periods
nRecallPeriods = 12; %Number of recall periods (i.e., number of simulated diarrhea surveys)
daysBurnIn = ceil(max(ImmuneTimes) + max(Duration) + recallPeriod) * 4; %Days required for prevalence to reach equilibrium (simulation starts with nobody infected). Allows ample margin for reaching equilibrium.
%=====Ending parameter values=====
maxTime = daysBurnIn + (nRecallPeriods-1)*interval; %Time over which to run each simulation.
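%=====Worked example of the timing values (editorial sketch, not part of the original program)=====
%The lines below only restate, as arithmetic, what the default parameter values above imply.
%They assume that round() rounds halves away from zero, as it does in Octave and Matlab.
%   Duration   = round([82.1/24, 18.3, 2.5]) = [3 18 3]
%   daysBurnIn = ceil(7 + 18 + 7) * 4        = 128
%   maxTime    = 128 + (12-1)*30             = 458
%A survey is taken whenever t >= daysBurnIn and mod(t-daysBurnIn,interval) == 0,
%so with these values the 12 surveys fall on days 128, 158, ..., 458.
%The yearly summaries computed after the daily loop index rows maxTime-364:maxTime
%(days 94 to 458 here), so maxTime needs to be at least 365 for those ranges to be valid.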
%Creating output structure for storing results from main QMRA loop
OutQMRA = struct('StartTime',StartTime,'Fit',NaN(1,loops),'KILP',NaN(1,loops),'KPLP',NaN(1,loops),'LPR',NaN(1,loops),...
    'EcL',NaN(1,loops),'GiL',NaN(1,loops),'RoL',NaN(1,loops),'PrevBase',NaN(1,loops)); %'Fit' is no longer used.
if storeStatus == 1; %Optionally creating cell array for more detailed output. Not needed for calibration step.
    DailyStatus = cell(2, loops); %Each cell in the array needs to contain a matrix (rows are days, columns are variables).
    DailyStatus(:) = {NaN(maxTime,22)}; %1st row for intervention, 2nd row for placebo.
    %Columns are: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill.
end;
tic %Starts timer
if noStochParams == 1; loops = size(TrialParams); loops = loops(1); end %Resets loops if a series of trial pathogens/L values is being used.
for i = 1:loops; %=====Starting main QMRA loop.===== Loops once for each QMRA run. i indexes each loop.
    %=====Randomly generating parameters for this iteration=====
    switch(noStochParams);
    case 0;
        PathogensLmeans = rand(1,length(MorbidityK)) .* PathogensLMax; %Uniform sampling of the mean value for each pathogen.
        prevDiarrhBaseKids = rand(1,1)*prevDiarrhBaseKidsMax;%For Matlab.
    case 1; %Pulling parameters from previous runs consistent with RCT.
        PathogensLmeans = TrialParams(i,1:3);
        prevDiarrhBaseKids = TrialParams(i,4);
    otherwise
        error('noStochParams must be 0 or 1');
    end
    OutQMRA.EcL(i)=PathogensLmeans(1); OutQMRA.GiL(i)=PathogensLmeans(2); OutQMRA.RoL(i)=PathogensLmeans(3);
    OutQMRA.PrevBase(i) = prevDiarrhBaseKids; %This line & previous store the varying parameter values.
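    %=====Worked note on the gamma parameters (editorial sketch, not part of the original program)=====
    %The Scales and Shapes vectors computed in the next step describe a gamma distribution through its
    %mean (PathogensLmeans) and its coefficient of variation (pathogensLcv).
    %For a gamma distribution with shape a and scale b: mean = a*b and variance = a*b^2,
    %so cv = sqrt(variance)/mean = 1/sqrt(a). Writing sd = cv*mean gives:
    %   scale b = sd^2/mean = cv^2 * mean
    %   shape a = mean/b    = 1/cv^2
    %With pathogensLcv = sqrt(3407044)/2509.329 (about 0.74), the shape is about 1.85 for every pathogen;
    %only the scale changes with the mean.
    %If a mean is set to 0, the scale evaluates to 0/0 = NaN and so does the shape; the isnan(Shapes(j))
    %test inside the daily loop uses this to set that pathogen's dose to zero.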
    %=====End random parameter generation - start setup of values/vectors/matrices used throughout simulation=====
    %Computing daily doses of pathogens ingested in drinking water, using water drunk per day and log reduction values
    %Computing parameter values for gamma distribution of pathogens in water
    Scales = (pathogensLcv * PathogensLmeans).^2 ./ PathogensLmeans;
    Shapes = PathogensLmeans ./ Scales;
    %Assigning infections randomly based on responses, assuming infections with different pathogens are independent.
    % A person can have only 1 infection per pathogen.
    OutPrevs = struct('KidsInt',NaN(1,nRecallPeriods),'KidsPla',NaN(1,nRecallPeriods)); %Creating struct to hold prevalences from surveys.
    %Creating person-pathogen matrices; everybody starts infected for a random amount of time, to reduce periodicity from constant disease duration.
    KidsInt = rand(nKidsInt,length(Duration)) * 2*(max(Duration)+max(ImmuneTimes));
    KidsPla = rand(nKidsPla,length(Duration)) * 2*(max(Duration)+max(ImmuneTimes));
    if storeStatus == 1; %Optionally, making corresponding matrices to store disease info.
        KidsIntD = ones(size(KidsInt)); %NaN means never infected, 0 means uninfected, 1 means infected, 2 means diseased.
        KidsPlaD = ones(size(KidsPla)); %Note that everyone starts off infected with everything, just as with KidsInt & KidsPla above.
    end
    OutputFields = fieldnames(OutPrevs);
    if Octave == 1; fflush(stdout); end; %Forces a write to screen.
    %=======Code for testing purposes only=======
    if testing == 1; %Only runs when testing code. These vectors needed for charting infection prevalence over the entire simulation.
        KidsIntInfPrev = NaN(nKidsInt,1); %Creating a vector to hold infection prevalence info, intervention group.
        KidsPlaInfPrev = NaN(nKidsPla,1); %As above, placebo group.
    end;
    %========End code for testing purposes========
    %Now storing all person-pathogen matrices, but only if exactly 1 loop is requested. This repeats at the end of each day.
    if loops == 1;
        PPmatrices(1).KidsInt = KidsInt; PPmatrices(1).KidsPla = KidsPla;
        PPmatrices(1).KidsIntD = KidsIntD; PPmatrices(1).KidsPlaD = KidsPlaD;
    end;
    %========Begin daily loop, t indexes the days=========
    for t = 1:maxTime;
        KidsInt = KidsInt - 1; %Note: a value of 0 signifies the first day of immunity.
        KidsPla = KidsPla - 1;
        if storeStatus == 1; %Optionally, tracking recovery from infection/disease.
            KidsIntD(find(KidsInt <= 0)) = 0;
            KidsPlaD(find(KidsPla <= 0)) = 0;
        end
        %Computing doses in untreated water, varying for each child, each day.
        Dose.KidsInt = NaN(nKidsInt,length(Duration)); %Matrix of doses per child (intervention). Columns are pathogens.
        Dose.KidsPla = NaN(nKidsPla,length(Duration)); %Matrix of doses per child (placebo). Columns are pathogens.
        RandComp = rand(max(nKidsInt,nKidsPla),2); %Random numbers for determining compliance (i.e., use of device): column 1 for the intervention group, column 2 for the placebo group.
        for j = 1:length(Duration); %Looping over pathogens to determine daily doses for ea. person.
            if isnan(Shapes(j)) == 1; %If mean pathogen conc. is set to 0, pathogen conc. is always 0.
                Dose.KidsInt(:,j) = 0;
                Dose.KidsPla(:,j) = 0;
            else
                if dailyVariation == 1;
                    Dose.KidsInt(:,j) = gamrnd(Shapes(j),Scales(j),[nKidsInt,1]) * drinkKids; %Initial untreated dose
                    Dose.KidsPla(:,j) = gamrnd(Shapes(j),Scales(j),[nKidsPla,1]) * drinkKids;
                else
                    Dose.KidsInt(:,j) = PathogensLmeans(j) * drinkKids; %Dose becomes the mean dose if daily variation is turned off.
                    Dose.KidsPla(:,j) = PathogensLmeans(j) * drinkKids;
                end
            end
            %Next 2 lines: Determining who complies and has a nonzero dose (because log reduction would fail on zero dose).
            Comp.KidsInt = intersect(find(RandComp(1:nKidsInt,1) < pUse), find(Dose.KidsInt(:,j) > 0));
            Comp.KidsPla = intersect(find(RandComp(1:nKidsPla,2) < pUse), find(Dose.KidsPla(:,j) > 0));
            %Now applying LRs, if using device. Includes adjustment for partial treatment of water (pTreat).
            Dose.KidsInt(Comp.KidsInt,j) = 10.^(log10(Dose.KidsInt(Comp.KidsInt,j) * pTreat) - LRsInt(j)) + Dose.KidsInt(Comp.KidsInt,j) * (1 - pTreat);
            Dose.KidsPla(Comp.KidsPla,j) = 10.^(log10(Dose.KidsPla(Comp.KidsPla,j) * pTreat) - LRsPla(j)) + Dose.KidsPla(Comp.KidsPla,j) * (1 - pTreat);
        end
        %Computing responses (diarrheal illness) using custom functions DRexp() and DRbP().
        Responses.KidsInt(:,1) = DRbP(KorN50(1),alpha(1),Dose.KidsInt(:,1)); %Note: response matrices correspond to the person-path. matrices.
        Responses.KidsInt(:,2) = DRexp(KorN50(2),Dose.KidsInt(:,2));
        Responses.KidsInt(:,3) = DRbP(KorN50(3),alpha(3),Dose.KidsInt(:,3));
        Responses.KidsPla(:,1) = DRbP(KorN50(1),alpha(1),Dose.KidsPla(:,1));
        Responses.KidsPla(:,2) = DRexp(KorN50(2),Dose.KidsPla(:,2));
        Responses.KidsPla(:,3) = DRbP(KorN50(3),alpha(3),Dose.KidsPla(:,3));
        switch(randomDurations);
        case 1
            switch(storeStatus);
            case 1
                %DailyStatus columns: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill.
                KidsIntDold = KidsIntD; KidsPlaDold = KidsPlaD;
                [KidsInt,KidsIntD] = AssignInfIllRand(KidsInt,KidsIntD,Responses.KidsInt,ImmuneTimes,MorbidityK);
                [KidsPla,KidsPlaD] = AssignInfIllRand(KidsPla,KidsPlaD,Responses.KidsPla,ImmuneTimes,MorbidityK);
                for s = 1:6; %Store counts of new infections & illnesses.
if s <= 3; %Infections: DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntDold(:,s) == 0), find(KidsIntD(:,s) > 0))); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaDold(:,s) == 0), find(KidsPlaD(:,s) > 0))); else %Illnesses: DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntDold(:,s-3) == 0), find(KidsIntD(:,s-3) == 2))); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaDold(:,s-3) == 0), find(KidsPlaD(:,s-3) == 2))); end end otherwise KidsInt = AssignInfRand(KidsInt,Responses.KidsInt,ImmuneTimes); KidsPla = AssignInfRand(KidsPla,Responses.KidsPla,ImmuneTimes); end otherwise KidsInt = AssignInf(KidsInt,Responses.KidsInt,Duration,ImmuneTimes); KidsPla = AssignInf(KidsPla,Responses.KidsPla,Duration,ImmuneTimes); end if t >= daysBurnIn && mod(t-(daysBurnIn),interval) == 0; %Obtaining results from diarrhea assessment survey. %Determining reported diarrhea-weeks. CalcDiarrhWeeks() uses the Person-Pathogen Matrices, morbidity ratios, and recall of diarrhea episodes to determine if a week was reported as a week with diarrhea. KidsIntRD = CalcDiarrhWeeks(KidsInt,remembrance,recallPeriod,MorbidityK); 264 %Whether diarrhea was reported by each particular person. Note age (last column) is removed from the person-pathogen matrix when inputted to the function. OutPrevs.KidsInt((t-daysBurnIn)/interval+1) = sum(KidsIntRD)/length(KidsIntRD); %Getting prevalence for each diarrhea survey KidsPlaRD = CalcDiarrhWeeks(KidsPla,remembrance,recallPeriod,MorbidityK); %Like above 2 lines, but kid placebo OutPrevs.KidsPla((t-daysBurnIn)/interval+1) = sum(KidsPlaRD)/length(KidsPlaRD); fprintf(1,['Day ',num2str(t),'/',num2str(maxTime),' done--']); if Octave == 1; fflush(stdout); end; %Forces a write to screen. end if testing == 1; %=====Testing code===== KidsIntInfPrev(t) = sum(max(KidsInt') > 0) / nKidsInt;%If max value over all pathogens >0, then there is an infection. KidsPlaInfPrev(t) = sum(max(KidsPla') > 0) / nKidsPla;%Transposing so that max is for ea. 
row instead of ea. column. KidsIntInfEc(t) = sum(KidsInt(:,1) > 0) / nKidsInt; KidsIntInfGi(t) = sum(KidsInt(:,2) > 0) / nKidsInt; KidsIntInfRo(t) = sum(KidsInt(:,3) > 0) / nKidsInt; KidsPlaInfEc(t) = sum(KidsPla(:,1) > 0) / nKidsPla; KidsPlaInfGi(t) = sum(KidsPla(:,2) > 0) / nKidsPla; KidsPlaInfRo(t) = sum(KidsPla(:,3) > 0) / nKidsPla; end %======End testing code====== if storeStatus == 1; %Columns are: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill. for s = 7:22; %All based on 0=uninfected, 1=asymptomatic, 2=ill. if s == 7; %Tallying completely uninfected people DailyStatus{1,i}(t,s) = length(find(sum(KidsIntD') == 0)); DailyStatus{2,i}(t,s) = length(find(sum(KidsPlaD') == 0)); elseif s == 8; %Infected with only Ec DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,1) > 265 0), find(sum([KidsIntD(:,2) KidsIntD(:,3)]') == 0)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,1) > 0), find(sum([KidsPlaD(:,2) KidsPlaD(:,3)]') == 0)')); elseif s == 9; %Infected with only Gi DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,2) > 0), find(sum([KidsIntD(:,1) KidsIntD(:,3)]') == 0)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,2) > 0), find(sum([KidsPlaD(:,1) KidsPlaD(:,3)]') == 0)')); elseif s == 10; %Infected with only Ro DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,3) > 0), find(sum([KidsIntD(:,1) KidsIntD(:,2)]') == 0)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,3) > 0), find(sum([KidsPlaD(:,1) KidsPlaD(:,2)]') == 0)')); elseif s == 11; %Tallying infected with Ec & Gi only DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsIntD(:,1:2)') ~= 0), find(KidsIntD(:,3) == 0))); DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,1:2)') ~= 0), find(KidsPlaD(:,3) == 0))); elseif s == 12; %Tallying infected with Ec & Ro only DailyStatus{1,i}(t,s) = length(intersect(find(prod([KidsIntD(:,1) KidsIntD(:,3)]') ~= 0), 
find(KidsIntD(:,2) == 0))); DailyStatus{2,i}(t,s) = length(intersect(find(prod([KidsPlaD(:,1) KidsPlaD(:,3)]') ~= 0), find(KidsPlaD(:,2) == 0))); elseif s == 13; %Tallying infected with Gi & Ro only DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsIntD(:,2:3)') ~= 0), find(KidsIntD(:,1) == 0))); DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,2:3)') ~= 0), find(KidsPlaD(:,1) == 0))); elseif s == 14; %Tallying infected with Ec & Gi & Ro DailyStatus{1,i}(t,s) = length(find(prod(KidsIntD(:,1:3)') ~= 0)); 266 DailyStatus{2,i}(t,s) = length(find(prod(KidsPlaD(:,1:3)') ~= 0)); elseif s == 15; %Tallying completely non-ill people DailyStatus{1,i}(t,s) = length(find(sum(KidsIntD' .^2) <= 3)); DailyStatus{2,i}(t,s) = length(find(sum(KidsPlaD' .^2) <= 3)); elseif s == 16; %Ill with only Ec DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,1) == 2), find(sum(KidsIntD(:,2:3)' .^2) <= 2)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,1) == 2), find(sum(KidsPlaD(:,2:3)' .^2) <= 2)')); elseif s == 17; %Ill with only Gi DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,2) == 2), find(sum([KidsIntD(:,1) KidsIntD(:,3)]' .^2) <= 2)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,2) == 2), find(sum([KidsPlaD(:,1) KidsPlaD(:,3)]' .^2) <= 2)')); elseif s == 18; %Ill with only Ro DailyStatus{1,i}(t,s) = length(intersect(find(KidsIntD(:,3) == 2), find(sum(KidsIntD(:,1:2)' .^2) <= 2)')); DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,3) == 2), find(sum(KidsPlaD(:,1:2)' .^2) <= 2)')); elseif s == 19; %Tallying ill with Ec & Gi only DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsIntD(:,1:2)') == 4), find(KidsIntD(:,3) <= 1))); DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,1:2)') == 4), find(KidsPlaD(:,3) <= 1))); elseif s == 20; %Tallying ill with Ec & Ro only DailyStatus{1,i}(t,s) = length(intersect(find(prod([KidsIntD(:,1) KidsIntD(:,3)]') == 4), find(KidsIntD(:,2) <= 1))); DailyStatus{2,i}(t,s) = 
length(intersect(find(prod([KidsPlaD(:,1) KidsPlaD(:,3)]') == 4), find(KidsPlaD(:,2) <= 1))); elseif s == 21; %Tallying ill with Gi & Ro only 267 DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsIntD(:,2:3)') == 4), find(KidsIntD(:,1) <= 1))); DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,2:3)') == 4), find(KidsPlaD(:,1) <= 1))); elseif s == 22; %Tallying ill with Ec & Gi & Ro DailyStatus{1,i}(t,s) = length(find(prod(KidsIntD(:,1:3)') == 8)); DailyStatus{2,i}(t,s) = length(find(prod(KidsPlaD(:,1:3)') == 8)); end end end %Now storing the person-pathogen matrices at the end of this day, if exactly 1 loop was requested. if loops == 1; PPmatrices(t+1).KidsInt = KidsInt; PPmatrices(t+1).KidsPla = KidsPla; PPmatrices(t+1).KidsIntD = KidsIntD; PPmatrices(t+1).KidsPlaD = KidsPlaD; end; %sizeof(DailyStatus) %Debug measure - too big for memory? end %disp([' ']) %Adds line feed to separate the progress counters. %=========End daily loop, begin more testing code====== if testing == 1; maxTimePla = find(KidsPlaInfPrev == max(KidsPlaInfPrev)); %Getting time points of max. prevalence of infection (1st, if tie). disp(['1st time point where max. prevalence (placebo) is seen is ',num2str(maxTimePla(1))]) %Printing first point of max. prevalence. figure(i); %Plotting infection prevalence. subplot(2,1,1); plot([1:maxTime],KidsIntInfPrev,'-k'); title 'Daily infec. prev. (red=Ec,green=Gi,blue=Ro,black=any), dots = reported waterborne prev.'; ylabel 'Proportion affected (intervention)'; 268 xlim([0 625]); hold on; plot([1:maxTime],KidsIntInfEc,'-r'); plot([1:maxTime],KidsIntInfGi,'-g'); plot([1:maxTime],KidsIntInfRo,'-b'); h5 = plot([daysBurnIn:interval:maxTime],OutPrevs.KidsInt,'cx'); set(h5,'linewidth',2); legend('{\fontsize{10} Any infection (daily)}','{\fontsize{10} E. 
coli infection (daily)}','{\fontsize{10} Giardia infection (daily)}','{\fontsize{10} Rotavirus infection (daily)}','{\fontsize{10} LP_{Irwd} (prior week)}'); hold off; subplot(2,1,2); plot([1:maxTime],KidsPlaInfPrev,'-k'); xlim([0 625]); xlabel 'Time (simulated days)'; ylabel 'Proportion affected (placebo)'; hold on; plot([1:maxTime],KidsPlaInfEc,'-r'); plot([1:maxTime],KidsPlaInfGi,'-g'); plot([1:maxTime],KidsPlaInfRo,'-b'); h5 = plot([daysBurnIn:interval:maxTime],OutPrevs.KidsPla,'cx'); set(h5,'linewidth',2); legend('{\fontsize{10} Any infection (daily)}','{\fontsize{10} E. coli infection (daily)}','{\fontsize{10} Giardia infection (daily)}','{\fontsize{10} Rotavirus infection (daily)}','{\fontsize{10} LP_{Prwd} (prior week)}'); hold off; end %=========End testing code========= %Adding baseline prevalence. Corrected for doublecounting (some infected people would have been infected anyway by baseline transmission). OutPrevs.KidsInt = OutPrevs.KidsInt + prevDiarrhBaseKids * (1 - OutPrevs.KidsInt); OutPrevs.KidsPla = OutPrevs.KidsPla + prevDiarrhBaseKids * (1 - OutPrevs.KidsPla); i %Printing i as a debug measure OutQMRA.KILP(i) = mean(OutPrevs.KidsInt); OutQMRA.KPLP(i) = mean(OutPrevs.KidsPla); 269 %Output long. prevs. OutQMRA.LPR(i) = OutQMRA.KILP(i) / OutQMRA.KPLP(i); if storeStatus == 1; %Storing incidences for intervention group. OutQMRA.IncInfIntEc(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,1)); %Yearly infection incidence for E. coli. OutQMRA.IncIllIntEc(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,4)); %Yearly illness incidence for E. coli. OutQMRA.IncInfIntGi(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,2)); %Yearly infection incidence for Giardia. OutQMRA.IncIllIntGi(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,5)); %Yearly illness incidence for Giardia. OutQMRA.IncInfIntRo(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,3)); %Yearly infection incidence for rotavirus. 
OutQMRA.IncIllIntRo(i) = sum(DailyStatus{1,i}(maxTime-364:maxTime,6)); %Yearly illness incidence for rotavirus. %As above, for placebo group. OutQMRA.IncInfPlaEc(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,1)); %Yearly infection incidence for E. coli. OutQMRA.IncIllPlaEc(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,4)); %Yearly illness incidence for E. coli. OutQMRA.IncInfPlaGi(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,2)); %Yearly infection incidence for Giardia. OutQMRA.IncIllPlaGi(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,5)); %Yearly illness incidence for Giardia. OutQMRA.IncInfPlaRo(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,3)); %Yearly infection incidence for rotavirus. OutQMRA.IncIllPlaRo(i) = sum(DailyStatus{2,i}(maxTime-364:maxTime,6)); %Yearly illness incidence for rotavirus. %'Actual' longitudinal prevalences (person-days ill or infected, divided by total person-days observed). OutQMRA.LPInfIntEc(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[ 8 11 12 14])))/(nKidsInt*365); %Yearly inf. LP, E. coli. OutQMRA.LPIllIntEc(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[16 19 20 270 22])))/(nKidsInt*365); %Yearly ill LP, E. coli. OutQMRA.LPInfIntGi(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[ 9 11 13 14])))/(nKidsInt*365); %Yearly inf. LP, Giardia. OutQMRA.LPIllIntGi(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[17 19 21 22])))/(nKidsInt*365); %Yearly ill LP, Giardia. OutQMRA.LPInfIntRo(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[10 12 13 14])))/(nKidsInt*365); %Yearly inf. LP, rota. OutQMRA.LPIllIntRo(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,[18 20 21 22])))/(nKidsInt*365); %Yearly ill LP, rota. OutQMRA.LPInfIntMix(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,11:14)))/ (nKidsInt*365); %Yearly inf. LP, mixed. OutQMRA.LPIllIntMix(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,19:22)))/ (nKidsInt*365); %Yearly ill LP, mixed. 
OutQMRA.LPInfIntAny(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,8:14)))/ (nKidsInt*365); %Yearly inf. LP, any. OutQMRA.LPIllIntAny(i) = sum(sum(DailyStatus{1,i}(maxTime-364:maxTime,16:22)))/ (nKidsInt*365); %Yearly ill LP, any. %As above, for placebo group. OutQMRA.LPInfPlaEc(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[ 8 11 12 14])))/(nKidsPla*365); %Yearly inf. LP, E. coli. OutQMRA.LPIllPlaEc(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[16 19 20 22])))/(nKidsPla*365); %Yearly ill LP, E. coli. OutQMRA.LPInfPlaGi(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[ 9 11 13 14])))/(nKidsPla*365); %Yearly inf. LP, Giardia. OutQMRA.LPIllPlaGi(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[17 19 21 22])))/(nKidsPla*365); %Yearly ill LP, Giardia. OutQMRA.LPInfPlaRo(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[10 12 13 14])))/(nKidsPla*365); %Yearly inf. LP, rota. OutQMRA.LPIllPlaRo(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,[18 20 21 22])))/(nKidsPla*365); %Yearly ill LP, rota. OutQMRA.LPInfPlaMix(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,11:14)))/ (nKidsPla*365); %Yearly inf. LP, mixed. OutQMRA.LPIllPlaMix(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,19:22)))/ 271 (nKidsPla*365); %Yearly ill LP, mixed. OutQMRA.LPInfPlaAny(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,8:14)))/ (nKidsPla*365); %Yearly inf. LP, any. OutQMRA.LPIllPlaAny(i) = sum(sum(DailyStatus{2,i}(maxTime-364:maxTime,16:22)))/ (nKidsPla*365); %Yearly ill LP, any. %Longitudinal prevalence ratios. OutQMRA.LPRInfIntEc(i) = OutQMRA.LPInfIntEc(i) / OutQMRA.LPInfPlaEc(i); %E. coli, infection. OutQMRA.LPRIllIntEc(i) = OutQMRA.LPIllIntEc(i) / OutQMRA.LPIllPlaEc(i); %E. coli, illness. OutQMRA.LPRInfIntGi(i) = OutQMRA.LPInfIntGi(i) / OutQMRA.LPInfPlaGi(i); %Giardia, infection. OutQMRA.LPRIllIntGi(i) = OutQMRA.LPIllIntGi(i) / OutQMRA.LPIllPlaGi(i); %Giardia, illness. OutQMRA.LPRInfIntRo(i) = OutQMRA.LPInfIntRo(i) / OutQMRA.LPInfPlaRo(i); %Rota, infection. 
OutQMRA.LPRIllIntRo(i) = OutQMRA.LPIllIntRo(i) / OutQMRA.LPIllPlaRo(i); %Rota, illness. OutQMRA.LPRInfIntAny(i) = OutQMRA.LPInfIntAny(i) / OutQMRA.LPInfPlaAny(i); %Any, infection. OutQMRA.LPRIllIntAny(i) = OutQMRA.LPIllIntAny(i) / OutQMRA.LPIllPlaAny(i); %Any, illness. end if mod(i,floor(loops/10)) == 0; %Progress meter disp(['Loop ',num2str(i),' of ',num2str(loops),' complete.']) toc end end %======Ending main QMRA loop======= disp(['Program finished.']) toc %Outputs runtime. OutQMRA.EndTime = clock; eval(['save Results/OutQMRA',datestr(StartTime,30),'.mat, OutQMRA;']) %Saving output file. 272 if noStochParams == 1 && loops <= 25; %Displaying results if a series of input parameters were tested. out = [OutQMRA.KILP; OutQMRA.KPLP; OutQMRA.LPR; OutQMRA.EcL; OutQMRA.GiL; OutQMRA.RoL; OutQMRA.PrevBase]'; disp('KILP, KPLP, LPR, EcL, GiL, RoL, PrevBase, PrevBase for 0.84 LPR') out(:,8) = (0.84 * out(:,2) - out(:,1)) / 0.16 %Calc. necessary PrevBase to get an LPR of 0.84. min(out) mean(out) max(out) end if Octave == 1; %Audio alert when done. system('aplay -q /usr/lib/openoffice/basis-link/share/gallery/sounds/ok.wav'); else system('start c:\windows\media\tada.wav'); end 273 %{ COPYRIGHT INFORMATION Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu"). This file (AssignInf.m) is part of QMRAv13_20110414. QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see . 
%}
%Just like AssignIll.m, but loops over columns of the person-pathogen matrix instead of rows, & should be faster.
%Assigns an illness duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is infected, negative entries (or 0) mean the person has recovered
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Vector of illness responses (probability of illness given dose), 1 entry per pathogen
% ResponsesNon: Vector of illness responses for people not using any device
% pUse: Probability that a person is not using any device
% durations: Vector of durations of illnesses, 1 entry per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
function PPmatrix = AssignInf(PPmatrix,Responses,Durations,ImmuneTimes);
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
%NewlyInfected = cell(sizePP(2),1); %Initializing cell array to store index values of people who will be newly infected.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix;
%Durations = ones(sizePP) * Durations;
%Immunities = PPmatrix + ImmuneTimes; %Immunities gives days left in (infection + immune period), or a neg. # if susceptible.
for i = 1:sizePP(2);
    Immunities(:,i) = PPmatrix(:,i) + ImmuneTimes(i);
    NewlyInfected = intersect(find(Immunities(:,i) <= 0), find(Randoms(:,i) < Responses(:,i))); %Gets indices of newly infected.
    PPmatrix(NewlyInfected,i) = Durations(i);
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignInfIllRand.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Similar to AssignIll.m, but assigns infection durations randomly.
%Assigns an illness duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is ill, negative entries (or 0) mean the person has recovered
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Matrix of illness responses (probability of illness given dose), 1 row per person and 1 column per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
%It requires the functions durEc(), durGi(), & durRo() in IllDurations.m.
function [PPmatrix,PPmatrixD] = AssignInfIllRand(PPmatrix,PPmatrixD,Responses,ImmuneTimes,MorbidityK);
%rand('state',28)
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
Randoms2 = rand(sizePP); %Random #s for determining disease, as above.
%NewlyInfected = cell(sizePP(2),1); %Initializing cell array to store index values of people who will be newly infected.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix; %Adjusts PPmatrix to account for immunity.
Durations = NaN(sizePP); %Making & populating a matrix of disease durations (same size as PPmatrix) to assign to newly infected.
Durations(:,1) = durEc(sizePP(1));
Durations(:,2) = durGi(sizePP(1));
Durations(:,3) = durRo(sizePP(1));
for i = 1:sizePP(2); %Loop, once for each pathogen.
    NewlyInfected = intersect(find(Immunities(:,i) <= 0), find(Randoms(:,i) < Responses(:,i))); %Gets indices of newly infected. Note: 0 or less is susc.
    PPmatrix(NewlyInfected,i) = Durations(NewlyInfected,i);
    PPmatrixD(NewlyInfected,i) = 1; %Flags newly infected.
    NewlyIll = intersect(NewlyInfected, find(Randoms2(:,i) < MorbidityK(i)));
    PPmatrixD(NewlyIll,i) = 2;
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignInfRand.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Similar to AssignIll.m, but assigns infection durations randomly.
%Assigns an infection duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is infected, negative entries (or 0) mean the person has recovered
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Matrix of illness responses (probability of infection given dose), 1 row per person and 1 column per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
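%Illustrative sketch (not part of the original QMRAv13_20110414 files): how the susceptibility test shared by the
%AssignInf* functions behaves on a small hand-made person-pathogen matrix. All numbers are hypothetical, and the
%random draws are fixed (instead of calling rand) so that the result is reproducible.

```octave
PPmatrix    = [ 3 -2 -10;                  % person 1: infected, recently recovered, long recovered
               -8  0  -1];                 % person 2: long recovered, just recovered, recently recovered
ImmuneTimes = [7 7 7];                     % hypothetical immune period (days), 1 entry per pathogen
Responses   = [0.5 0.5 0.5; 0.5 0.5 0.5];  % hypothetical probability of infection given dose
Randoms     = [0.1 0.1 0.1; 0.9 0.1 0.1];  % fixed stand-ins for rand(size(PPmatrix))

% Days left in (infection + immune period); 0 or less means susceptible.
Immunities  = ones(size(PPmatrix,1),1) * ImmuneTimes + PPmatrix;
Susceptible = Immunities <= 0;

% A person-pathogen pair is newly infected only if susceptible AND the draw falls below the response.
NewlyInfected = Susceptible & (Randoms < Responses);
disp(NewlyInfected)
```

%In this sketch only person 1 / pathogen 3 becomes infected: person 2 / pathogen 1 is susceptible but has an
%unlucky draw, and every other pair is still infected or immune.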
%It requires the functions durEc(), durGi(), & durRo() in IllDurations.m.
function [PPmatrix] = AssignInfRand(PPmatrix,Responses,ImmuneTimes);
%rand('state',28)
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix; %Adjusts PPmatrix to account for immunity.
Durations = NaN(sizePP); %Making & populating a matrix of disease durations (same size as PPmatrix) to assign to newly infected.
Durations(:,1) = durEc(sizePP(1));
Durations(:,2) = durGi(sizePP(1));
Durations(:,3) = durRo(sizePP(1));
NewlyInfected = intersect(find(Immunities <= 0), find(Randoms < Responses)); %Gets indices of newly infected. Note: 0 or less is susc.
PPmatrix(NewlyInfected) = Durations(NewlyInfected);
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (CalcDiarrhWeeks.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Calculates whether a given week is reported as a week with diarrhea under the reporting scheme in the DRC Lifestraw RCT (Boisson 2010).
%It considers reduced recall of past diarrheal episodes after 2d ('remembrance') and possible distinct diarrhea episodes in the previous 7d.
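%Illustrative sketch (not part of the original QMRAv13_20110414 files): the recall rule described above, written in
%vectorized form for 5 hypothetical people. 'mostRecent' stands for the largest entry in each person's row (days
%remaining in the most recent illness; negative values are days since recovery). The values of remembrance and
%timeWindow, and the fixed random draws, are assumptions chosen only for this example.

```octave
remembrance = 0.5;                  % hypothetical probability of recalling an older episode
timeWindow  = 7;                    % recall window in days
mostRecent  = [4; -1; -5; -5; -20]; % ill now; ended 1 d ago; ended 5 d ago (twice); ended 20 d ago
draws       = [0.9; 0.9; 0.2; 0.9; 0.2]; % fixed stand-ins for rand()

% Episodes within the last 2 days are always reported; older ones inside the window only if remembered.
alwaysReported = mostRecent >= -2;
remembered     = (mostRecent >= -timeWindow) & (draws < remembrance);
reported       = alwaysReported | remembered;
disp(reported')
```

%Persons 1 and 2 report illness regardless of the draw, person 3 remembers the older episode, person 4 forgets
%it, and person 5 falls outside the recall window.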
%It operates on a matrix:
% Rows represent people, columns represent pathogens/illnesses, and entries represent # of days remaining in the illness.
%It outputs a vector with 1 entry per person, 1 if illness is reported during the week, 0 if not.
function vec = CalcDiarrhWeeks(InMatrix, remembrance,timeWindow,Morbidity);
sizeInMatrix = size(InMatrix);
Randoms = rand(sizeInMatrix(1),sizeInMatrix(2));
for j = 1:sizeInMatrix(2);
    InMatrix((find(Randoms(:,j) > Morbidity(j))),j) = -9999;
end
for i = 1:sizeInMatrix(1); %Loop over all people. Only the most recent episode (largest entry in a row) is used to assign illness.
    %Randoms = rand(1,columns(InMatrix)); %Random numbers for determining morbidity (these 2 lines moved upward for greater speed)
    %InMatrix(i,find(Randoms > Morbidity)) = -9999; %Apply morbidity ratio: if asymptomatic, infection is set to -9999, and therefore not reported. Note that this modification is not passed out of this function.
    if max(InMatrix(i,:)) >= -2;
        vec(i) = 1; %If ill during day 0, 1, or 2, assume illness is always reported, therefore assign illness.
    elseif (max(InMatrix(i,:) >= -timeWindow) & rand() < remembrance);
        vec(i) = 1; %Otherwise, if ill during days 3-7, randomly determine if episode is remembered. If so, assign illness.
    else
        vec(i) = (0); %Otherwise, no illness is remembered or reported. Assign no illness.
    end
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRbP.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Beta-Poisson dose response model, using N50 (default) or beta as a parameter
%function outvar = DRbP(N50orBeta,alpha,invar,reverse='no',WhichParam='N50') %Ordinarily, invar is dose & outvar is response.
function outvar = DRbP(N50orBeta,alpha,invar,reverse,WhichParam) %Ordinarily, invar is dose & outvar is response.
if nargin == 3;
    reverse = 'no';
    WhichParam = 'N50';
end
switch(reverse)
    case 'no'
        switch(WhichParam)
            case 'N50'
                outvar = 1-(1+(invar/N50orBeta)*(2^(1/alpha)-1)).^-alpha;
            case 'Beta'
                outvar = 1-(1+(invar/N50orBeta)).^-alpha;
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    case 'yes' %If reverse='yes', invar is response & outvar is dose.
        switch(WhichParam)
            case 'N50'
                outvar = N50orBeta * ( ((1-invar).^(-1/alpha) -1) / (2^(1/alpha)-1) );
            case 'Beta'
                outvar = N50orBeta * ((1-invar).^(-1/alpha) -1);
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    otherwise
        error(['reverse must be "no" or "yes"'])
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRchoose.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Returns a vector of response values, determined by dose response models, for several pathogens.
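%Illustrative sketch (not part of the original QMRAv13_20110414 files): a quick check of the N50 form of the
%beta-Poisson model as written in DRbP.m. By construction the response at dose = N50 is 0.5, and the reverse form
%recovers the dose from the response. N50 and alpha are hypothetical values; bpResponse and bpDose are names
%introduced only for this example and restate the two 'N50' formulas as anonymous functions.

```octave
N50   = 1000;  % hypothetical median infectious dose
alpha = 0.2;   % hypothetical beta-Poisson shape parameter

% Forward model: probability of response given dose.
bpResponse = @(dose) 1 - (1 + (dose ./ N50) .* (2^(1/alpha) - 1)).^(-alpha);
% Reverse model: dose that yields a given probability of response.
bpDose     = @(resp) N50 .* (((1 - resp).^(-1/alpha) - 1) ./ (2^(1/alpha) - 1));

respAtN50 = bpResponse(N50);           % should be 0.5 up to rounding error
roundTrip = bpDose(bpResponse(250));   % should return the original dose of 250
printf('Response at N50: %.4f; round-trip dose: %.4f\n', respAtN50, roundTrip);
```

%The same round-trip check can be made against the 'Beta' parameterization by dropping the (2^(1/alpha)-1) factor
%from both expressions.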
%Inputs are vectors with 1 entry per pathogen, all in the same order:
% nonalphas: k parameter (exponential) or N50 parameter (beta-Poisson)
% alphas: alpha parameter (beta-Poisson); NA if exponential model is desired
% Doses: Doses of pathogens received per individual (under default behavior; see 'reverse' below)
% morbidities: Morbidity ratios: proportion of infected who are ill
% reverse: Defaults to 'no', determining proportion ill from dose. If 'yes', determines dose from proportion ill.
%The output (outvec) is a vector containing the proportions of exposed who will fall ill.
%If reverse=='yes', outvec is a vector of doses calculated from invec, the proportions ill.
%This code requires several custom functions/subroutines in the working directory:
% DRexp.m: Exponential dose response model
% DRbP.m: Beta-Poisson dose response model
%function outvec = DRchoose(nonalphas,alphas,invec,morbidities=1,reverse='no') %Octave
function outvec = DRchoose(nonalphas,alphas,Doses,morbidities,reverse)
if nargin == 3;
    morbidities = 1;
    reverse = 'no';
end
if morbidities == 1;
    morbidities = ones(length(nonalphas));
end
switch(reverse)
    case 'no'
        for i=1:length(alphas);
            if (isnan(alphas(i))) %if alpha is NA, run exponential dose response
                if size(Doses)(1) == 1;
                    outvec(i) = DRexp(nonalphas(i),Doses(i)) * morbidities(i);
                else
                    outvec(i) = DRexp(nonalphas(i),Doses(:,i)) * morbidities(i);
                end
            else %run beta-Poisson dose response
                outvec(i) = DRbP(nonalphas(i),alphas(i),Doses(i)) * morbidities(i);
            end
        end
    case 'yes'
        for i=1:length(alphas);
            if (isnan(alphas(i))) %if alpha is NA, run exponential dose response
                outvec(i) = DRexp(nonalphas(i),Doses(i) ./ morbidities(i),'yes'); %DRexp() accepts only 3 arguments.
            else %run beta-Poisson dose response
                outvec(i) = DRbP(nonalphas(i),alphas(i),Doses(i) ./ morbidities(i),'yes','N50');
            end
        end
    otherwise
        error('reverse must be "no" or "yes"')
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRexp.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Exponential dose response model
%function outvar = DRexp(k, invar,reverse='no') %Ordinarily, invar is dose & outvar is response.
function outvar = DRexp(k, invar, reverse) %This works in Matlab.
if nargin < 3;
    reverse = 'no';
end
switch(reverse);
    case 'no';
        outvar = 1-exp(-k * invar);
    case 'yes'; %If reverse='yes', invar is response & outvar is dose.
        outvar = log(1-invar)/-k;
    otherwise
        error(['reverse (last parameter) must be "no" or "yes"'])
end
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durEc.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Functions for calculating vectors of illness durations
function output = durEc(n);
output = round(gamrnd(1.775,1.690,[n,1])); %Shape, then scale
output(find(output == 0)) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durGi.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Functions for calculating vectors of illness durations
function output = durGi(n); %Based on a fit of gamma dist. to limited info from Kent GP 1988.
output = round(gamrnd(3.206,3.431,[n,1])); %Shape, then scale
output(find(output == 0)) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durRo.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Functions for calculating vectors of illness durations
function output = durRo(n); %Based on 4 rotavirus-infected volunteers having durations of 1, 2, 3, and 4 days (Kapikian 1983).
output = ceil(rand([n,1]) * 4);
output(find(output == 0)) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (Examine1Run.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Examining data from a single run of the QMRA LifeStraw model.
%Converting to susceptible (-9), immune (-1), or diseased (9) for each of the 3 pathogens.
%====Setting options====
startTime = 96; %Time point at which to start looking at the data. Note that 1st matrix corresponds to time 0.
recode = 1; %If 1, recode matrix entries to susc./inf./immune.
%====Finished with options, starting processing.====
PPMs = PPmatrices; %Making a copy.
switch(recode);
    case 1;
        disp(['Recoding raw numbers to susc./inf./immune.'])
        for i = startTime:size(PPmatrices)(2); %Starting when the actual simulation starts (after equilibrium is reached).
            PPMs(i).KidsInt(PPMs(i).KidsInt <= -7) = -9; %7 day immune period; 0 counts as the 1st immune day, so at -7 they are susc.
            PPMs(i).KidsInt(PPMs(i).KidsInt > 0) = 9; %Infected if a positive integer.
            PPMs(i).KidsInt(abs(PPMs(i).KidsInt) != 9) = -1; %Immune if neither of the above applies.
            PPMs(i).KidsInt(PPMs(i).KidsInt == 9) = 2; %Infected person-day marked as 2, so as to more easily distinguish.
        end
        for i = startTime:size(PPmatrices)(2); %Same as above 'for' loop, but placebo.
            PPMs(i).KidsPla(PPMs(i).KidsPla <= -7) = -9;
            PPMs(i).KidsPla(PPMs(i).KidsPla > 0) = 9;
            PPMs(i).KidsPla(abs(PPMs(i).KidsPla) != 9) = -1;
            PPMs(i).KidsPla(PPMs(i).KidsPla == 9) = 2;
        end
    otherwise
        disp(['Not recoding to susc./inf./immune, output will display raw numbers.'])
end
KidsIntStatusEc = NA(size(PPMs(1).KidsInt)(1), size(PPmatrices)(2)-(startTime-1));
KidsIntStatusGi = KidsIntStatusEc;
KidsIntStatusRo = KidsIntStatusEc;
KidsPlaStatusEc = NA(size(PPMs(1).KidsPla)(1), size(PPmatrices)(2)-(startTime-1));
KidsPlaStatusGi = KidsPlaStatusEc;
KidsPlaStatusRo = KidsPlaStatusEc;
for i = startTime:size(PPmatrices)(2); %Starting when the actual simulation starts (after equilibrium is reached).
    for j = 1:size(PPMs(1).KidsInt)(1);
        KidsIntStatusEc(j,i-startTime+1) = PPMs(i).KidsInt(j,1);
        KidsIntStatusGi(j,i-startTime+1) = PPMs(i).KidsInt(j,2);
        KidsIntStatusRo(j,i-startTime+1) = PPMs(i).KidsInt(j,3);
    end
    for j = 1:size(PPMs(1).KidsPla)(1);
        KidsPlaStatusEc(j,i-startTime+1) = PPMs(i).KidsPla(j,1);
        KidsPlaStatusGi(j,i-startTime+1) = PPMs(i).KidsPla(j,2);
        KidsPlaStatusRo(j,i-startTime+1) = PPMs(i).KidsPla(j,3);
    end
end
%Now can visually inspect the 6 status matrices that have been output.
%Should be a way to collapse them also (run-length encoding?), but not yet implemented.
function outmatrix = coll(inmatrix,nMaxRuns) %Inefficient but hopefully works.
sizeM = size(inmatrix);
maxk = 1; %Initializing counter to determine the maximum number of runs ever seen during the function call.
for i = 1:sizeM(1); %Loop over all rows
    for j = 2:sizeM(2); %Loop over each entry per row
        if j == 2; %Special procedure for first iteration, since there could be a transition (or not) between the 1st 2 entries.
            k = 1; %Initiating run counter;
            if inmatrix(i,j) != inmatrix(i,j-1);
                if inmatrix(i,j-1) == -9; dur(k) = -1;
                elseif inmatrix(i,j-1) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j-1) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
                k = k + 1; %Increment run counter
                if inmatrix(i,j) == -9; dur(k) = -1;
                elseif inmatrix(i,j) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            else %If type of run does not change in 1st 2 entries
                if inmatrix(i,j) == -9; dur(k) = -2;
                elseif inmatrix(i,j) == -1; dur(k) = 0.002;
                elseif inmatrix(i,j) == 2; dur(k) = 2;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            end
        elseif j == sizeM(2); %Special procedure for last iteration - need to drop it since it is probably incomplete.
            dur(k) = 0;
        else %If j (the column) is anything greater than 2, but not the last column:
            if inmatrix(i,j) != inmatrix(i,j-1); %If there is a transition, reset the run:
                k = k + 1;
                if inmatrix(i,j) == -9; dur(k) = -1;
                elseif inmatrix(i,j) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            else %If there is no transition, extend the run
                if inmatrix(i,j) == -9; dur(k) = dur(k) - 1;
                elseif inmatrix(i,j) == -1; dur(k) = dur(k) + 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = dur(k) + 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            end
        end
    end
    if k > maxk;
        maxk = k %Updates & displays maxk (largest no. runs seen so far). Use as guide for entering nMaxRuns.
    end
    outmatrix(i,:) = padarray(dur,[0, nMaxRuns - length(dur)],0,'post');
end
outmatrix(:,1) = 0; %Sets all 1st runs to 0 (they are likely to be incomplete).
mean(outmatrix(find(outmatrix >= 1))) %Print mean duration of illness
end

function z = GraphColl(outmatrix) %Graphs output of above function.
bins = unique(outmatrix(find(outmatrix < 0)));
if length(bins) == 1;
    binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-1; binsT(3)=binsT(2)+1; bins=binsT;
end
subplot(2,2,1); hist(outmatrix(find(outmatrix < 0)),bins) %Durations of susceptibility
bins = unique(outmatrix(find(outmatrix > 0 & outmatrix < 1)));
if length(bins) == 1;
    binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-0.001; binsT(3)=binsT(2)+0.001; bins=binsT;
end
subplot(2,2,2); hist(outmatrix(find(outmatrix > 0 & outmatrix < 1)),bins) %Durations of immunity
bins = unique(outmatrix(find(outmatrix >= 1)));
if length(bins) == 1;
    binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-1; binsT(3)=binsT(2)+1; bins=binsT;
end
subplot(2,2,3); hist(outmatrix(find(outmatrix >= 1)),bins) %Durations of illness
end
%histc(PlaRo(find(PlaRo < 1 & PlaRo > 0)),[0:0.001:max(PlaRo(find(PlaRo < 1 & PlaRo > 0)))]) %Awful (but functional) way to get counts of possible values for immunity length.

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (GetTrialParams.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%This script generates a series of parameters for input into the main code, rather than stochastically generating them.
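%Illustrative sketch (not part of the original QMRAv13_20110414 files): the shape of the systematic parameter table
%this script builds when doses are doubled on each run. The number of parameter sets and the minimum doses are
%hypothetical round numbers chosen so the table is easy to read; the variable name exampleParams is introduced
%only for this example.

```octave
nSets    = 5;           % hypothetical number of parameter sets
minDoses = [10 1 0.1];  % hypothetical minimum nonzero doses (bacteria, protozoa, viruses)

exampleParams = zeros(nSets, numel(minDoses)); % row 1 stays at zero pathogens
exampleParams(2,:) = minDoses;                 % row 2 is the minimum nonzero dose
for i = 3:nSets
    exampleParams(i,:) = 2 * exampleParams(i-1,:); % each later row doubles the previous one
end
disp(exampleParams)
```

%Each row is one model run, so five sets span a 0 to 8-fold range of the minimum dose for every pathogen.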
reps = 1; %# of times to run each parameter set.
readParams = 0; %To read parameters generated by a previous model run (1) or not (0).
paramSets = 10; %# of parameter sets to be run, if generating them systematically.
%loops = paramSets * reps; %Deliberately overwrites 'loops' in the main code.
switch(readParams);
    case 1;
        TrialParams = dlmread('Results/RunsThatFit.csv',',',1,1); %Access a file returned by ReadOutput.r.
        paramSets = size(TrialParams, 1); %Overwrites 'paramSets' above.
        disp(['Reading parameter values from ',num2str(paramSets),' trials.'])
    case 0;
        MinDoses = [2e4, 0, 0]; %Minimum non-zero dose. A good choice is dose that infects 1% of population (ID1).
        ID1s = [7.5697E3, 5.0708E-1, 1.7280E-2]; %ID1 for ETEC, Giardia, & rota.
        MinDoses = ID1s * .1; %Uncomment if ID1s are desired.
        TrialParams = zeros(paramSets,size(MinDoses, 2));
        for i = 1:length(MinDoses); %Populating all cells except the 1st row with the minimum nonzero dose.
            TrialParams(2:paramSets,i) = MinDoses(i);
        end
        TrialParams(2,:) = MinDoses; %1st run is 0 pathogens; 2nd run is the minimum nonzero dose.
        for i = 3:paramSets; %Uncomment the particular line desired. Comment all to check multiple replicates of the same dose.
            %TrialParams(i,:) = MinDoses * (i-1); %Linearly increases the dose on each model run.
            TrialParams(i,:) = 2 * TrialParams(i-1,:); %Doubles the dose on each model run.
            %TrialParams(i,2) = 10 * TrialParams(i-1,2); %Multiplies the dose by 10 for only 1 pathogen, leaving others constant.
        end
        TrialParams(:,4) = 0; %Sets a single value for non-waterborne diarrhea prevalence.
        %disp('Will cycle through these parameters, 1 run per set.')
        %trialPrevDiarrhBaseKids = 0 %Sets a single value for non-waterborne diarrhea prevalence.
        %TrialParams %Print to screen, so we see that this file was executed & to view the trial params.
    otherwise;
        error('readParams must be 0 or 1');
end
%Replicating the parameter sets.
TP = TrialParams; %Making a copy, for use in loop below.
if reps > 1;
    for i = 1:reps-1;
        TrialParams = [TrialParams; TP]; %Appending copies of the parameter sets.
    end
end
TrialParams = sortrows(TrialParams); %Sorting so that identical parameter values are next to each other.
loops = paramSets; %Overwriting 'loops' variable in main QMRA code.
disp(['Using ',num2str(paramSets),' parameter sets, ',num2str(reps),' times each, totaling ',num2str(loops),' runs.'])
if paramSets <= 25;
    disp(['Parameter sets are as follows:'])
    TP
end

%{
COPYRIGHT INFORMATION
Copyright 2011 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (OutQMRAmerge.m) is part of QMRAv13_20110414.
QMRAv13_20110414 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRAv13_20110414 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRAv13_20110414. If not, see .
%}
%Pulls together QMRA output files from multiple Octave threads. Runs surprisingly fast!
clear all;
%First, enter the desired name of the .CSV:
filename = {'Results/MergedOutQMRA.csv'};
%Now enter as many files as necessary, each one containing the 'workspaces' from a thread.
Files = {'Results/OutQMRA20110315T130458.mat'};
rows = 0; %Initializing variable to count up total number of rows.
for i = 1:length(Files);
    eval(disp(['load ',char(Files(i)),' OutQMRA;'])) %Loads the 'OutQMRA' struct stored in .mat file, overwriting that object if it exists.
    %disp(['File ',num2str(i),' took ',num2str(),' to run ',num2str(length(OutQMRA.CaL)),...
    %' loops (',num2str(),' per loop.'])
    eval(disp(['OutQMRA',num2str(i),'=OutQMRA;'])) %Copies it and adds a numeric suffix to the name.
    rows = rows + length(OutQMRA.EcL);
end
clear OutQMRA; %Removes the initial copy of the last file loaded.
%CSVmatrix = NA(rows,length(fieldnames(OutQMRA1))-2; %Creating the output matrix. Each row is a QMRA iteration.
for i = 1:length(Files);
    %i %For debugging
    eval(disp(['OutQMRA = OutQMRA',num2str(i),';'])) %Taking 'OutQMRAx' and creating a copy called 'OutQMRA' to work from.
    OutQMRA = rmfield(OutQMRA, 'StartTime');
    OutQMRA = rmfield(OutQMRA, 'EndTime');
    CSVmatrix = OutQMRA.Fit'; %Initializing a matrix that will become a .CSV by transposing the first structure field into it.
    for [val,key] = OutQMRA; %This special syntax allows looping over all elements of the structure.
        %key %For debugging
        if strcmp(char(key),'Fit') == 0; %Don't do anything for the 'Fit' element because we took care of that 2 lines before.
            CSVmatrix = [CSVmatrix, val']; %Transpose fields into columns & bind into the matrix.
        end
    end
    eval(disp(['CSVmatrix',num2str(i),' = CSVmatrix;']))
    clear CSVmatrix;
end
CSVmatrix = CSVmatrix1; %Initializing output matrix.
for i = 2:length(Files);
    eval(disp(['CSVmatrix = [CSVmatrix; CSVmatrix',num2str(i),'];']))
end
fn=fieldnames(OutQMRA);
nFields = numel(fn);
%http://stackoverflow.com/questions/5292437/how-to-concat-cellarray-of-strings-in-matlab
fn(1:nFields-1) = strcat(fn(1:nFields-1),{','});
file = fopen(filename,'w+');
fprintf(file,'%s',disp([fn{:}]));
fclose(file);
eval(["dlmwrite('",char([filename]),"',CSVmatrix,'-append');"])
disp(['Done; .mat files have been merged and output to ',char(filename),' in ',char(pwd),'/Results/'])

9.4. The QMRA model investigating compliance and LRVs (chapter 4)

The model (referred to as “QMRA2v5” for short) consists of several text files containing necessary functions and subroutines; the core program is 'Main.m'.
Simulation options are set by the choice of several values at the top of the files 'Main.m' and 'GetTrialParams.m'. These options default to values that generate a single test run of the simulation. Other options are set when calling the function 'Main.m' and are described within that file; for example, the following can be submitted at the Octave (or MATLAB) prompt to do a test calibration run:

Main(0,1,0,[0 0 0],[0 0 0],[1e5 1 .1],1,'test.csv','NA.csv',1,0,1,0)

The source code is found below. The filename of each of the source code files is found in the copyright information at the top of each file. Although QMRA2v5 uses some filenames that are identical to those in QMRAv13_20110414, the content of its files differs.

%Main function for running the model QMRA2v5, used for chapter 4 of Kyle S. Enger's Ph.D. dissertation,
% as well as the manuscript "The joint effects of efficacy and compliance: a study of household water treatment effectiveness against childhood diarrhea".
%Allows multiple runs (e.g., on computing cluster) using different parameters.
%Implements a QMRA model originally based on Boisson 2010 (PLoS One) RCT of Lifestraw Family filtration device in the Dem. Rep. of the Congo.
%It was produced by modifying QMRAv13_20110414.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger.
This file (Main.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see .
%}
%Requires Octave 3.2 or later.
%Also works well on MATLAB; the results in chapter 4 of the accompanying dissertation were all produced with MATLAB.
%If running on MATLAB, requires the statistics toolbox, and possibly others.
%This code (Main.m, the core component of QMRA2v5) is accompanied by several functions/subroutines, which need to be in the working directory:
% AnalyzeInfectionFromSpikes.m: Optional code to determine the proportion of all infections that are caused by contamination spikes
% AssignInf.m: Stochastically assigns infections to individuals with fixed durations
% AssignInfRand.m: Stochastically assigns infections to individuals with random durations
% AssignInfIllRand.m: Stochastically assigns infections & illnesses to individuals with random durations
% CalcDiarrhWeeks.m: Determines whether a week with 1+ days of diarrhea is actually reported as a 'diarrhea week'; not actually used for ch. 4 analysis
% CalibrationLoopFuncCompile.m: Code that calls Main() in order to facilitate parallel processing of many differently parameterized calibration runs
% DRbP.m: Beta-Poisson dose response model
% DRchoose.m: Executes the appropriate dose response model and determines illness
% DRexp.m: Exponential dose response model
% durEc.m: Randomly pick a duration for E. coli infection
% durGi.m: Randomly pick a duration for Giardia infection
% durRo.m: Randomly pick a duration for rotavirus infection
% EstimationLoopFuncCompileV2.m: Code that calls Main() in order to facilitate parallel processing of many differently parameterized estimation runs
% EstimationLoopFuncCompileV2PC.m: As above, but for estimation runs with perfect compliance
% EstimationLoopFuncCompileV2Untreated.m: As above, but for estimation runs with complete noncompliance
% Examine1Run.m: Allows inspection of the complete simulated data from a single run of this code
% GetTrialParams.m: Generates a series of trial parameters instead of determining them stochastically
% OutQMRAmerge.m: Allows conglomeration of output from multiple model executions (e.g., if parallel processing)
%Function arguments to Main():
%pPerfUse: Probability of using the device perfectly
%pNoUse: Probability of not using the device at all
%overallCompliance: Overall proportion of person-time complying. overallCompliance = pPerfUse + pTreat * (1-pPerfUse-pNoUse)
%pTreat: Proportion of water treated if using the device imperfectly. Calculated from pPerfUse, pNoUse, & OverallCompliance.
%LRs: Log10 reductions attributable to the device: vector (bac., protozoa, viruses)
%PathogensLMin: During calibration, the minimum mean concentration of the 3 marker pathogens (bac., protozoa, viruses); usually [0 0 0].
%PathogensLMax: During calibration, the maximum mean concentration of the 3 marker pathogens (bac., protozoa, viruses); [0 0 0] for estimation.
%calibRuns: Number of calibration runs; 0 for estimation runs
%outFilename: Filename for storing output as a .CSV.
%inFilename: File containing parameter values from calibration (pathogen concentrations). Only matters for estimation phase.
%multConc: Multiplier for concentrations obtained from calibration. Used for estimation step to torture-test extreme concentrations.
%nSpikesY: Number of 1-day pathogen spikes (all 3 pathogens spike at once) per year. Set to 0 to turn them off.
%multSpikes: Defines size of spikes. Multiplier above mean baseline level.
%useAllParamSets: If 1, and if estimation runs are being done, use all available parameter sets from calibration (instead of a subsample of parameter sets).
%Note that more options are available to be set in the first few lines of this function;
% using them as function arguments would have been cumbersome.
function [OutQMRAmatrix OutQMRA] = Main(pPerfUse, pNoUse, overallCompliance, LRs, PathogensLMin, PathogensLMax, calibRuns, outFilename, inFilename, multConc, nSpikesY, multSpikes, useAllParamSets);
%TODO: Designate necessary output (overall measures & distributions, plus vals of 1st 3 params).
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
StartTime = clock;
%====Setting simulation options====More simulation options are in GetTrialParams.m====
dailyVariation = 1; %Equals 1 if pathogen concentrations are allowed to vary by person by day, instead of taking a single fixed value.
randomDurations = 1; %Equals 1 if illness durations are randomized instead of taking a single fixed value. Renders Duration vector (below) mostly moot (it is still used to choose start time of surveys).
storeStatus = 1; %If 1, store infection & disease status, to determine 'actual' (i.e., reported & unreported) burden of infection.
StoreAllDailyStatuses = 1; %If 1, store all daily statuses for all runs in a struct.
%=====Housekeeping based on simulation options====
if calibRuns > 0;
  loops = calibRuns; noStochParams = 0;
else
  noStochParams = 1;
  GetTrialParams; %GetTrialParams.m reads in pathogens/L & background measure to be tried. It contains its own options - check before running.
end
if loops <= 25; testing = 1; else testing = 0; end
if overallCompliance == 0 | pNoUse == 1;
  pTreat = 0; pPerfUse = 0; overallCompliance = 0; pNoUse = 1; %Avoids x/0 error below
elseif overallCompliance == 1 | pPerfUse == 1;
  pTreat = 1; pPerfUse = 1; overallCompliance = 1; pNoUse = 0; %Avoids x/0 error below
else
  pTreat = (overallCompliance - pPerfUse) / (1 - pPerfUse - pNoUse); %Calculating pTreat so as to be able to hold overallCompliance constant over mult. runs.
end
if pTreat < 0 - eps | pTreat > 1 + eps;
  error(['pTreat of ',num2str(pTreat),' is impossibly > 1 or < 0: check pNoUse, pPerfUse, & overallCompliance!']); %Message parts are concatenated so that the pTreat value is actually printed.
end
%=====Ending housekeeping. Starting parameter values:=====
%Water concentration and disease parameter values
% Several vectors have 3 elements corresponding to our 3 pathogens of interest:
% [ETEC/EPEC, Giardia, rotavirus], i.e., bacteria, protozoa, viruses
%PathogensLMax = [2e5, 1.35, 0.18]; %Maximum pathogens/L; minimum is zero for all. 2fold empirically observed levels that led to LP exceeding the 95% CI for the placebo group, each pathogen taken individually.
pathogensLcv = sqrt(3407044) / 2509.329; %Coeff. of var. calc. from variance & mean of 'cfbef' (Boisson 2010 water qual. data), high outliers (>= 30000 CFU) removed. For calc. of scaled gamma dists.
MorbidityK = [0.214 , 0.59 , 0.397]; %Proportion of infected 'kids' (<5y) with diarrhea.
Duration = [82.1/24, 18.3, 2.5];
Duration = round(Duration); %Duration of infection (days). Although length & max of this vector are still used, actual infection duration is determined by durEc.m, durGi.m, & durRo.m.
prevDiarrhBaseKidsMax = 0.0972; %Upper limit of non-waterborne reported diarrhea prevalence.
LongPrevs = [0.103, 0.0896]; %Vector of raw long. prev. values from Boisson dataset (kids w. placebo, then kids w. intervention).
prevDiarrhBaseKids = LongPrevs(2); %Baseline non-waterborne reported diarrhea prevalence. No greater than observed in the RCT. Only used if no random variation (overwritten otherwise).
ImmuneTimes = [7 7 7]; %Length of immune period for all pathogens.
%Parameters: population & exposure information. Model by household later.
%nKidsInt = 85; nKidsPla = 105; %Boisson 2010.
nKids = 100; %No longer simulating an intervention trial - simply a set of counterfactuals for comparison.
drinkKids = 1.178; %drinkKidsSD = 0.186; %daily water intake, L/d, kids
%pTreat = 2/3; %Proportion of water treated, if device is being used. Try 1, 2/3, & 1/3.
%if compliance100 == 1; pUse = 1; pTreat = 1; end; %If perfect compliance is desired, override above 2 lines.
%Parameters: device effectiveness information
%LRs = [6.9, 3.6, 4.7]; %Log reductions [bacteria, protozoa, viruses] by the intervention device, with upper & lower ranges
%LRsPla = [1.05, 1.05, 1.05]; %As above, for 'placebo' device
%if badPlacebo == 0; LRsPla = [0 0 0]; end; %If a perfect placebo is being modeled, override above line.
%Parameters: dose response, order as above [ETEC/EPEC, Giardia, rotavirus]
KorN50 = [2111912, 0.01982, 6.171]; %Exponential k parameter or beta-Poisson N50 parameter
alpha = [0.1549, NaN, 0.2531]; %Presence/absence of alpha value determines beta-Poisson or exponential dose resp.
%Bias parameters
remembrance = 0.54; %Proportion of diarrhea episodes remembered (and reported) if they ended >2d before being surveyed; assume perfect recall if episode is on day 0, 1, or 2
%Study parameters - relating to how the study was conducted
recallPeriod = 7; %Number of days in the past over which people were asked to remember diarrheal episodes
interval = 31; %Interval between beginnings of recall periods. Must be 31 to avoid undercounting a year as 360d.
nYears = 1; %For easy adjustment of the length of the simulation. Used later to properly calculate incidence & LP.
nRecallPeriods = 12 * nYears; %Number of recall periods (i.e., number of simulated diarrhea surveys)
daysBurnIn = ceil(max(ImmuneTimes) + max(Duration) + recallPeriod) * 4; %Days required for prevalence to reach equilibrium (simulation starts with nobody infected). Allows ample margin for reaching equilibrium.
%=====Ending parameter values=====
maxTime = daysBurnIn + (nRecallPeriods-1)*interval; %Time over which to run each simulation.
%Creating output structure for storing results from main QMRA loop
OutQMRA = struct('StartTime',StartTime,'Fit',NaN(1,loops),'KILP',NaN(1,loops),'KPLP',NaN(1,loops),'LPR',NaN(1,loops),...
  'EcL',NaN(1,loops),'GiL',NaN(1,loops),'RoL',NaN(1,loops),'PrevBase',NaN(1,loops)); %'Fit' is no longer used.
if storeStatus == 1; %Optionally creating cell array for more detailed output. Not needed for calibration step.
  DailyStatus = cell(1, loops); %Each cell in the array needs to contain a matrix (rows are days, columns are variables).
  DailyStatus(:) = {NaN(maxTime,22)}; %1st row for intervention, 2nd row for placebo.
  %Columns are: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill.
end;
tic %Starts timer
if noStochParams == 1; loops = size(TrialParams,1); end %Resets loops if a series of trial pathogens/L values is being used.
MeanDoses = zeros(loops,size(MorbidityK,2)); %Prepopulating a matrix for storing mean doses (helps in checking whether spikes are working properly).
for i = 1:loops; %=====Starting main QMRA loop.===== Loops once for each QMRA run. i indexes each loop.
  %=====Randomly generating parameters for this iteration=====
  switch(noStochParams);
    case 0;
      PathogensLmeans = rand(1,length(MorbidityK)) .* (PathogensLMax - PathogensLMin) + PathogensLMin; %Uniform sampling of the mean value for each pathogen.
      prevDiarrhBaseKids = rand(1,1)*prevDiarrhBaseKidsMax;
    case 1; %Pulling parameters from previous runs consistent with RCT.
      PathogensLmeans = TrialParams(i,1:3) * multConc; %Allows scaling up of dose values obtained through calibration.
      if size(TrialParams,2) <= 3; prevDiarrhBaseKids = 0; else prevDiarrhBaseKids = TrialParams(i,4); end
    otherwise
      error('noStochParams must be 0 or 1');
  end
  if nSpikesY > 0; %Adjusting the baseline mean downward if spikes are used, so that overall mean concentrations remain the same.
    nSpikes = nSpikesY * nYears;
    simTime = maxTime - daysBurnIn;
    PathogensLmeansBase = PathogensLmeans * simTime/(nSpikes*multSpikes+simTime);
    SpikeTimes = randperm(simTime);
    SpikeTimes = SpikeTimes(1:nSpikes);
    SpikeTimes = sort(SpikeTimes + daysBurnIn); %Preallocating the times for the spikes.
    SpikeTimes(end + 1) = 0; %Needed to avoid an error when spike counter advances past the last spike.
    nextSpike = 1; %Counter for working through the list of spikes.
  else
    PathogensLmeansBase = PathogensLmeans;
  end
  OutQMRA.EcL(i)=PathogensLmeans(1); OutQMRA.GiL(i)=PathogensLmeans(2); OutQMRA.RoL(i)=PathogensLmeans(3);
  OutQMRA.PrevBase(i) = prevDiarrhBaseKids; %This line & previous store the varying parameter values.
  %=====End random parameter generation - start setup of values/vectors/matrices used throughout simulation=====
  %Computing daily doses of pathogens ingested in drinking water, using water drunk per day and log reduction values
  %Computing parameter values for gamma distribution of pathogens in water
  %Scales = (pathogensLcv * PathogensLmeansBase).^2 ./ PathogensLmeans; %This seems the same as next line.
  Scales = pathogensLcv ^2 * PathogensLmeansBase;
  %Shapes = PathogensLmeansBase ./ Scales; %This seems the same as next line.
  Shapes = 1 / pathogensLcv ^ 2;
  Shapes = [Shapes Shapes Shapes]; %Shape parameter is identical for all 3 pathogen types.
  %Assigning infections randomly based on responses, assuming infections with different pathogens are independent.
  % A person can have only 1 infection per pathogen.
  OutPrevs = struct('Kids',NaN(1,nRecallPeriods)); %Creating struct to hold prevalences from surveys.
  %Creating person-pathogen matrices; everybody starts infected for a random amount of time, to reduce periodicity from constant disease duration.
  KidsPPM = rand(nKids,length(Duration)) * 2*(max(Duration)+max(ImmuneTimes));
  %KidsPla = rand(nKidsPla,length(Duration)) * 2*(max(Duration)+max(ImmuneTimes));
  if storeStatus == 1; %Optionally, making corresponding matrices to store disease info.
    KidsPPMD = ones(size(KidsPPM)); %NaN means never infected, 0 means uninfected, 1 means infected, 2 means diseased.
    %KidsPlaD = ones(size(KidsPla)); %Note that everyone starts off infected with everything, just as with KidsInt & KidsPla above.
  end
  OutputFields = fieldnames(OutPrevs);
  if Octave == 1; fflush(stdout); end; %Forces a write to screen so that sim progress can be seen.
  %=======Code for testing purposes only=======
  if testing == 1 && loops <= 15; %Only runs when testing code. These vectors needed for charting infection prevalence over the entire simulation.
    KidsInfPrev = NaN(maxTime,1); %Creating a vector to hold infection prevalence info, intervention group.
    %KidsPlaInfPrev = NaN(nKidsPla,1); %As above, placebo group.
  end;
  %========End code for testing purposes========
  %Now storing all person-pathogen matrices, but only if exactly 1 loop is requested. This repeats at the end of each day.
  if loops == 1;
    PPmatrices(1).KidsPPM = KidsPPM;
    PPmatrices(1).KidsPPMD = KidsPPMD;
  end;
  %========Begin daily loop, t indexes the days=========
  for t = 1:maxTime;
    KidsPPM = KidsPPM - 1; %Note: a value of 0 signifies the first day of immunity.
    %KidsPla = KidsPla - 1;
    if storeStatus == 1; %Optionally, tracking recovery from infection/disease.
      KidsPPMD(KidsPPM <= 0) = 0;
      %KidsPlaD(find(KidsPla <= 0)) = 0;
    end
    %Computing doses in untreated water, varying for each child, each day.
    Dose.Kids = NaN(nKids,length(Duration)); %Matrix of doses per child (intervention). Columns are pathogens.
    %Dose.KidsPla = NaN(nKidsPla,length(Duration)); %Matrix of doses per child (placebo). Columns are pathogens.
    RandComp = rand(nKids,1); %Random numbers for determining compliance (i.e., use of device).
    for j = 1:length(Duration); %Looping over pathogens to determine daily doses for ea. person.
      if isnan(Shapes(j)) == 1; %If mean pathogen conc. is set to 0, pathogen conc. is always 0.
        Dose.Kids(:,j) = 0;
        %Dose.KidsPla(:,j) = 0;
      else
        if dailyVariation == 1;
          Dose.Kids(:,j) = gamrnd(Shapes(j),Scales(j),[nKids,1]) * drinkKids; %Initial untreated dose
          %Dose.KidsPla(:,j) = gamrnd(Shapes(j),Scales(j),[nKidsPla,1]) * drinkKids;
        else
          Dose.Kids(:,j) = PathogensLmeansBase(j) * drinkKids; %Dose becomes the mean dose if daily variation is turned off.
          %Dose.KidsPla(:,j) = PathogensLmeansBase(j) * drinkKids;
        end
      end
      %Next 2 lines: Determining who complies and has a nonzero dose (because log reduction would fail on zero dose).
      Comp.Kids.Perf = find((RandComp(1:nKids,1) < pPerfUse) & (Dose.Kids(:,j) > 0));
      Comp.Kids.Imperf = find((RandComp(1:nKids,1) > (pPerfUse+pNoUse)) & (Dose.Kids(:,j) > 0));
      %Now applying LRs, if using device. Includes adjustment for partial treatment of water (pTreat) for imperfect compliers.
      Dose.Kids(Comp.Kids.Perf,j) = 10.^(log10(Dose.Kids(Comp.Kids.Perf,j)) - LRs(j));
      Dose.Kids(Comp.Kids.Imperf,j) = 10.^(log10(Dose.Kids(Comp.Kids.Imperf,j) * pTreat) - LRs(j)) + Dose.Kids(Comp.Kids.Imperf,j) * (1 - pTreat);
    end
    if nSpikesY > 0 && t == SpikeTimes(nextSpike);
      Dose.Kids = Dose.Kids * multSpikes;
      %disp(['Time ',num2str(t),', means = ',num2str(mean(Dose.Kids))]) %This line for testing only
      nextSpike = nextSpike + 1;
    end
    MeanDoses(t,:) = mean(Dose.Kids);
    %Computing responses (diarrheal illness) using custom functions DRexp() and DRbP().
    Responses.Kids(:,1) = DRbP(KorN50(1),alpha(1),Dose.Kids(:,1)); %Note: response matrices correspond to the person-path. matrices.
    Responses.Kids(:,2) = DRexp(KorN50(2),Dose.Kids(:,2));
    Responses.Kids(:,3) = DRbP(KorN50(3),alpha(3),Dose.Kids(:,3));
    %Responses.KidsPla(:,1) = DRbP(KorN50(1),alpha(1),Dose.KidsPla(:,1));
    %Responses.KidsPla(:,2) = DRexp(KorN50(2),Dose.KidsPla(:,2));
    %Responses.KidsPla(:,3) = DRbP(KorN50(3),alpha(3),Dose.KidsPla(:,3));
    switch(randomDurations);
      case 1
        switch(storeStatus);
          case 1
            %DailyStatus columns: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill.
            KidsPPMDold = KidsPPMD;
            %KidsPlaDold = KidsPlaD;
            [KidsPPM,KidsPPMD] = AssignInfIllRand(KidsPPM,KidsPPMD,Responses.Kids,ImmuneTimes,MorbidityK);
            %[KidsPla,KidsPlaD] = AssignInfIllRand(KidsPla,KidsPlaD,Responses.KidsPla,ImmuneTimes,MorbidityK);
            for s = 1:6; %Store counts of new infections & illnesses.
              if s <= 3; %Infections:
                %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMDold(:,s) == 0), find(KidsPPMD(:,s) > 0)));
                DailyStatus{1,i}(t,s) = sum((KidsPPMDold(:,s) == 0) & (KidsPPMD(:,s) > 0)); %Should run much faster than above.
                %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaDold(:,s) == 0), find(KidsPlaD(:,s) > 0)));
              else %Illnesses:
                %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMDold(:,s-3) == 0), find(KidsPPMD(:,s-3) == 2)));
                DailyStatus{1,i}(t,s) = sum((KidsPPMDold(:,s-3) == 0) & (KidsPPMD(:,s-3) == 2)); %Should run much faster than above.
                %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaDold(:,s-3) == 0), find(KidsPlaD(:,s-3) == 2)));
              end
            end
          otherwise
            KidsPPM = AssignInfRand(KidsPPM,Responses.Kids,ImmuneTimes);
            %KidsPla = AssignInfRand(KidsPla,Responses.KidsPla,ImmuneTimes);
        end
      otherwise
        KidsPPM = AssignInf(KidsPPM,Responses.Kids,Duration,ImmuneTimes);
        %KidsPla = AssignInf(KidsPla,Responses.KidsPla,Duration,ImmuneTimes);
    end
    if t >= daysBurnIn && mod(t-(daysBurnIn),interval) == 0; %Obtaining results from diarrhea assessment survey.
      %Determining reported diarrhea-weeks. CalcDiarrhWeeks() uses the Person-Pathogen Matrices, morbidity ratios, and recall of diarrhea episodes to determine if a week was reported as a week with diarrhea.
      KidsRD = CalcDiarrhWeeks(KidsPPM,remembrance,recallPeriod,MorbidityK); %Whether diarrhea was reported by each particular person. Note age (last column) is removed from the person-pathogen matrix when inputted to the function.
      OutPrevs.Kids((t-daysBurnIn)/interval+1) = sum(KidsRD)/length(KidsRD); %Getting prevalence for each diarrhea survey
      %KidsPlaRD = CalcDiarrhWeeks(KidsPla,remembrance,recallPeriod,MorbidityK); %Like above 2 lines, but kid placebo
      %OutPrevs.KidsPla((t-daysBurnIn)/interval+1) = sum(KidsPlaRD)/length(KidsPlaRD);
      %fprintf(1,['d',num2str(t),'/',num2str(maxTime),'|']); %Progress counter.
      %if Octave == 1; fflush(stdout); end; %Forces a write to screen.
    end
    if testing == 1 && loops <= 15; %=====Testing code=====
      KidsInfPrev(t) = sum(max(KidsPPM') > 0) / nKids;%If max value over all pathogens >0, then there is an infection.
      %KidsPlaInfPrev(t) = sum(max(KidsPla') > 0) / nKidsPla;%Transposing so that max is for ea. row instead of ea. column.
      KidsInfEc(t) = sum(KidsPPM(:,1) > 0) / nKids;
      KidsInfGi(t) = sum(KidsPPM(:,2) > 0) / nKids;
      KidsInfRo(t) = sum(KidsPPM(:,3) > 0) / nKids;
      %KidsPlaInfEc(t) = sum(KidsPla(:,1) > 0) / nKidsPla;
      %KidsPlaInfGi(t) = sum(KidsPla(:,2) > 0) / nKidsPla;
      %KidsPlaInfRo(t) = sum(KidsPla(:,3) > 0) / nKidsPla;
    end %======End testing code======
    if storeStatus == 1; %Storing lots of daily information about the model run.
      %Columns are: 1:3 = new inf. Ec,Gi,&Ro; 4:6 = new cases; 7:14 = num. inf. (0,Ec,Gi,Ro,EcGi,EcRo,GiRo,EcGiRo); 15:22 = as inf., but ill.
      for s = 7:22; %All based on 0=uninfected, 1=asymptomatic, 2=ill.
        if s == 7; %Tallying completely uninfected people
          %DailyStatus{1,i}(t,s) = length(find(sum(KidsPPMD') == 0));
          DailyStatus{1,i}(t,s) = sum(sum(KidsPPMD,2) == 0); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(find(sum(KidsPlaD') == 0));
        elseif s == 8; %Infected with only Ec
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,1) > 0), find(sum([KidsPPMD(:,2) KidsPPMD(:,3)]') == 0)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,1) > 0) & (sum([KidsPPMD(:,2) KidsPPMD(:,3)],2) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,1) > 0), find(sum([KidsPlaD(:,2) KidsPlaD(:,3)]') == 0)'));
        elseif s == 9; %Infected with only Gi
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,2) > 0), find(sum([KidsPPMD(:,1) KidsPPMD(:,3)]') == 0)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,2) > 0) & (sum([KidsPPMD(:,1) KidsPPMD(:,3)],2) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,2) > 0), find(sum([KidsPlaD(:,1) KidsPlaD(:,3)]') == 0)'));
        elseif s == 10; %Infected with only Ro
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,3) > 0), find(sum([KidsPPMD(:,1) KidsPPMD(:,2)]') == 0)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,3) > 0) & (sum([KidsPPMD(:,1) KidsPPMD(:,2)],2) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,3) > 0), find(sum([KidsPlaD(:,1) KidsPlaD(:,2)]') == 0)'));
        elseif s == 11; %Tallying infected with Ec & Gi only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsPPMD(:,1:2)') ~= 0), find(KidsPPMD(:,3) == 0)));
          DailyStatus{1,i}(t,s) = sum((prod(KidsPPMD(:,1:2),2) ~= 0) & (KidsPPMD(:,3) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,1:2)') ~= 0), find(KidsPlaD(:,3) == 0)));
        elseif s == 12; %Tallying infected with Ec & Ro only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod([KidsPPMD(:,1) KidsPPMD(:,3)]') ~= 0), find(KidsPPMD(:,2) == 0)));
          DailyStatus{1,i}(t,s) = sum((prod([KidsPPMD(:,1) KidsPPMD(:,3)],2) ~= 0) & (KidsPPMD(:,2) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod([KidsPlaD(:,1) KidsPlaD(:,3)]') ~= 0), find(KidsPlaD(:,2) == 0)));
        elseif s == 13; %Tallying infected with Gi & Ro only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsPPMD(:,2:3)') ~= 0), find(KidsPPMD(:,1) == 0)));
          DailyStatus{1,i}(t,s) = sum((prod(KidsPPMD(:,2:3),2) ~= 0) & (KidsPPMD(:,1) == 0)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,2:3)') ~= 0), find(KidsPlaD(:,1) == 0)));
        elseif s == 14; %Tallying infected with Ec & Gi & Ro
          %DailyStatus{1,i}(t,s) = length(find(prod(KidsPPMD(:,1:3)') ~= 0));
          DailyStatus{1,i}(t,s) = sum(prod(KidsPPMD(:,1:3),2) ~= 0); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(find(prod(KidsPlaD(:,1:3)') ~= 0));
        elseif s == 15; %Tallying completely non-ill people
          %DailyStatus{1,i}(t,s) = length(find(sum(KidsPPMD' .^2) <= 3));
          DailyStatus{1,i}(t,s) = sum(sum(KidsPPMD' .^2) <= 3); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(find(sum(KidsPlaD' .^2) <= 3));
        elseif s == 16; %Ill with only Ec
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,1) == 2), find(sum(KidsPPMD(:,2:3)' .^2) <= 2)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,1) == 2) & (sum(KidsPPMD(:,2:3)' .^2)' <= 2)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,1) == 2), find(sum(KidsPlaD(:,2:3)' .^2) <= 2)'));
        elseif s == 17; %Ill with only Gi
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,2) == 2), find(sum([KidsPPMD(:,1) KidsPPMD(:,3)]' .^2) <= 2)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,2) == 2) & (sum([KidsPPMD(:,1) KidsPPMD(:,3)]' .^2)' <= 2)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,2) == 2), find(sum([KidsPlaD(:,1) KidsPlaD(:,3)]' .^2) <= 2)'));
        elseif s == 18; %Ill with only Ro
          %DailyStatus{1,i}(t,s) = length(intersect(find(KidsPPMD(:,3) == 2), find(sum(KidsPPMD(:,1:2)' .^2) <= 2)'));
          DailyStatus{1,i}(t,s) = sum((KidsPPMD(:,3) == 2) & (sum(KidsPPMD(:,1:2)' .^2)' <= 2)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(KidsPlaD(:,3) == 2), find(sum(KidsPlaD(:,1:2)' .^2) <= 2)'));
        elseif s == 19; %Tallying ill with Ec & Gi only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsPPMD(:,1:2)') == 4), find(KidsPPMD(:,3) <= 1)));
          DailyStatus{1,i}(t,s) = sum((prod(KidsPPMD(:,1:2),2) == 4) & (KidsPPMD(:,3) <= 1)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,1:2)') == 4), find(KidsPlaD(:,3) <= 1)));
        elseif s == 20; %Tallying ill with Ec & Ro only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod([KidsPPMD(:,1) KidsPPMD(:,3)]') == 4), find(KidsPPMD(:,2) <= 1)));
          DailyStatus{1,i}(t,s) = sum((prod([KidsPPMD(:,1) KidsPPMD(:,3)],2) == 4) & (KidsPPMD(:,2) <= 1)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod([KidsPlaD(:,1) KidsPlaD(:,3)]') == 4), find(KidsPlaD(:,2) <= 1)));
        elseif s == 21; %Tallying ill with Gi & Ro only
          %DailyStatus{1,i}(t,s) = length(intersect(find(prod(KidsPPMD(:,2:3)') == 4), find(KidsPPMD(:,1) <= 1)));
          DailyStatus{1,i}(t,s) = sum((prod(KidsPPMD(:,2:3),2) == 4) & (KidsPPMD(:,1) <= 1)); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(intersect(find(prod(KidsPlaD(:,2:3)') == 4), find(KidsPlaD(:,1) <= 1)));
        elseif s == 22; %Tallying ill with Ec & Gi & Ro
          %DailyStatus{1,i}(t,s) = length(find(prod(KidsPPMD(:,1:3)') == 8));
          DailyStatus{1,i}(t,s) = sum(prod(KidsPPMD(:,1:3),2) == 8); %Should be faster than above.
          %DailyStatus{2,i}(t,s) = length(find(prod(KidsPlaD(:,1:3)') == 8));
        end
      end
    end
    %Now storing the person-pathogen matrices at the end of this day, if exactly 1 loop was requested.
    if loops == 1;
      PPmatrices(t+1).KidsPPM = KidsPPM;
      PPmatrices(t+1).KidsPPMD = KidsPPMD;
    end
    %sizeof(DailyStatus) %Debug measure - too big for memory?
    if nSpikesY > 0;
      DailyStatus{2,i} = SpikeTimes; %Storing the spike times
      DailyStatus{3,i} = daysBurnIn; %Storing the equilibration duration (for debugging)
    end
  end
  %disp([' ']) %Adds line feed to separate the progress counters.
  %=========End daily loop, begin more testing code======
  if testing == 1 && loops <= 15;
    maxPrevTime = find(KidsInfPrev == max(KidsInfPrev)); %Getting time points of max. prevalence of infection (1st, if tie).
    disp(['1st time point where max. prevalence is seen is ',num2str(maxPrevTime(1))]) %Printing first point of max. prevalence.
    figure(i); %Plotting infection prevalence.
    %set(f1, 'Position', [5 5 1024 768]);
    subplot(2,2,1);
    plot([1:maxTime]',KidsInfPrev,'-k');
    title 'Daily infection prevalence; Xs = reported waterborne prevalence';
    ylabel 'Proportion affected';
    xlabel 'Time'; %xlim([0 625]);
    hold on;
    plot([1:maxTime],KidsInfEc,'-r');
    plot([1:maxTime],KidsInfGi,'-g');
    plot([1:maxTime],KidsInfRo,'-b');
    h5 = plot([daysBurnIn:interval:maxTime],OutPrevs.Kids,'cx');
    set(h5,'linewidth',2);
    legend('{\fontsize{10} Any infection}','{\fontsize{10} E. coli infection}','{\fontsize{10} Giardia infection}','{\fontsize{10} Rotavirus infection}','{\fontsize{10} LP_{Irwd} (prior week)}');
    hold off;
    subplot(2,2,2);
    semilogy([1:maxTime],MeanDoses(:,1),'-r');
    title 'Daily mean dose of pathogens in drinking water (untreated)';
    ylabel 'Daily mean dose';
    xlabel 'Time'; %xlim([0 625]);
    hold on;
    semilogy([1:maxTime],MeanDoses(:,2),'-g');
    semilogy([1:maxTime],MeanDoses(:,3),'-b');
    legend('{\fontsize{10} E. coli}','{\fontsize{10} Giardia}','{\fontsize{10} Rotavirus}');
    hold off;
    subplot(2,2,3);
    plot([1:maxTime],sum(DailyStatus{1,i}(:,1:3),2),'-k');
    title 'Daily infection incidence (new infections each day)';
    ylabel 'New infections';
    xlabel 'Time'; %xlim([0 625]); %ylim([0 nKids]);
    hold on;
    plot([1:maxTime],DailyStatus{1,i}(:,1),'-r');
    plot([1:maxTime],DailyStatus{1,i}(:,2),'-g');
    plot([1:maxTime],DailyStatus{1,i}(:,3),'-b');
    legend('{\fontsize{10} Total new infections}','{\fontsize{10} E. coli infections}','{\fontsize{10} Giardia infections}','{\fontsize{10} Rotavirus infections}');
    hold off;
    subplot(2,2,4);
    plot([1:maxTime],sum(DailyStatus{1,i}(:,4:6),2),'-k');
    title 'Daily illness incidence (new illnesses each day)';
    ylabel 'New illnesses';
    xlabel 'Time'; %xlim([0 625]); %ylim([0 nKids]);
    hold on;
    plot([1:maxTime],DailyStatus{1,i}(:,4),'-r');
    plot([1:maxTime],DailyStatus{1,i}(:,5),'-g');
    plot([1:maxTime],DailyStatus{1,i}(:,6),'-b');
    legend('{\fontsize{10} Total new illnesses}','{\fontsize{10} E. coli illnesses}','{\fontsize{10} Giardia illnesses}','{\fontsize{10} Rotavirus illnesses}');
    hold off;
    if nSpikesY > 0;
      disp([num2str(sum(sum(DailyStatus{1,i}(SpikeTimes(1:end-1),1:3)))/sum(sum(DailyStatus{1,i}(daysBurnIn:end,1:3)))),' of infections due to spikes (not including equilibration).']);
    end
  end %=========End testing code=========
  %Adding baseline prevalence. Corrected for doublecounting (some infected people would have been infected anyway by baseline transmission).
  OutPrevs.Kids = OutPrevs.Kids + prevDiarrhBaseKids * (1 - OutPrevs.Kids);
  %OutPrevs.KidsPla = OutPrevs.KidsPla + prevDiarrhBaseKids * (1 - OutPrevs.KidsPla);
  %i %Printing i as a debug measure
  OutQMRA.rLP(i) = mean(OutPrevs.Kids); %Output long. prev. of reported diarrhea.
  %OutQMRA.LPR(i) = OutQMRA.KILP(i) / OutQMRA.KPLP(i);
  if storeStatus == 1; %Storing incidences.
    OutQMRA.IncInfEc(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,1)); %Count of infections with E. coli.
    OutQMRA.IncIllEc(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,4)); %Count of illnesses with E. coli.
    OutQMRA.IncInfGi(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,2)); %Count of infections with Giardia.
    OutQMRA.IncIllGi(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,5)); %Count of illnesses with Giardia.
    OutQMRA.IncInfRo(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,3)); %Count of infections with rotavirus.
    OutQMRA.IncIllRo(i) = sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,6)); %Count of illnesses with rotavirus.
    %'Actual' longitudinal prevalences (person-days ill or infected, divided by total person-days observed).
    OutQMRA.LPInfEc(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [ 8 11 12 14])))/(nKids*nYears*365); %Yearly inf. LP, E. coli.
    OutQMRA.LPIllEc(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [16 19 20 22])))/(nKids*nYears*365); %Yearly ill LP, E. coli.
    OutQMRA.LPInfGi(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [ 9 11 13 14])))/(nKids*nYears*365); %Yearly inf. LP, Giardia.
    OutQMRA.LPIllGi(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [17 19 21 22])))/(nKids*nYears*365); %Yearly ill LP, Giardia.
    OutQMRA.LPInfRo(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [10 12 13 14])))/(nKids*nYears*365); %Yearly inf. LP, rota.
    OutQMRA.LPIllRo(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime, [18 20 21 22])))/(nKids*nYears*365); %Yearly ill LP, rota.
    OutQMRA.LPInfMix(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,11:14)))/(nKids*nYears*365); %Yearly inf. LP, mixed.
    OutQMRA.LPIllMix(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,19:22)))/(nKids*nYears*365); %Yearly ill LP, mixed.
    OutQMRA.LPInfAny(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,8:14)))/(nKids*nYears*365); %Yearly inf. LP, any.
    OutQMRA.LPIllAny(i) = sum(sum(DailyStatus{1,i}(maxTime-(nYears*365)+1:maxTime,16:22)))/(nKids*nYears*365); %Yearly ill LP, any.
    %Mean daily pathogens per liter.
    OutQMRA.PathLMeanEc(i) = PathogensLmeans(1);
    OutQMRA.PathLMeanGi(i) = PathogensLmeans(2);
    OutQMRA.PathLMeanRo(i) = PathogensLmeans(3);
  end
  %fprintf(1,['d',num2str(t),'/',num2str(maxTime),'|']); %Progress counter.
  if i == 1; fprintf(1,['Finished loops: ',num2str(i)]); else; fprintf(1, ['-',num2str(i)]); end;
  if mod(i,floor(loops/10)) == 0; %Progress meter
    disp(['Loop ',num2str(i),'/',num2str(loops),'. Done in ',num2str((toc/i) * (loops-i) / 60 / 60),'h.'])
    toc
  end
  if i == 5; disp([' ']); disp(['=== Will finish ',num2str(loops),' loops in ~ ',num2str(toc*loops/5/60/60),' h ===']); end;
end %======Ending main QMRA loop=======
%Converting to a matrix so as to concatenate output from multiple calls, and subsequent easy output to .CSV for analysis in R.
OutQMRAmatrix = NaN(loops,14);
OutQMRAmatrix(:,1) = 1:1:loops; %ID designation.
OutQMRAmatrix(:,2) = pPerfUse;
OutQMRAmatrix(:,3) = pNoUse;
OutQMRAmatrix(:,4) = pTreat;
OutQMRAmatrix(:,5) = overallCompliance;
OutQMRAmatrix(:,6) = LRs(1);
OutQMRAmatrix(:,7) = LRs(2);
OutQMRAmatrix(:,8) = LRs(3);
OutQMRAmatrix(:,9) = OutQMRA.PathLMeanEc';
OutQMRAmatrix(:,10) = OutQMRA.PathLMeanGi';
OutQMRAmatrix(:,11) = OutQMRA.PathLMeanRo';
OutQMRAmatrix(:,12) = OutQMRA.LPInfAny';
OutQMRAmatrix(:,13) = OutQMRA.LPIllAny';
OutQMRAmatrix(:,14) = OutQMRA.IncIllEc';
OutQMRAmatrix(:,15) = OutQMRA.IncIllGi';
OutQMRAmatrix(:,16) = OutQMRA.IncIllRo';
OutQMRAmatrix(:,17) = sum(OutQMRAmatrix(:,14:16)')'; %Creating the total incidence column, IncIllTot.
OutQMRAmatrix(:,18) = OutQMRAmatrix(:,14) ./ OutQMRAmatrix(:,17); %pIncIllEc, proportion of illness from E. coli.
OutQMRAmatrix(:,19) = OutQMRAmatrix(:,15) ./ OutQMRAmatrix(:,17); %pIncIllGi, proportion of illness from Giardia.
OutQMRAmatrix(:,20) = OutQMRAmatrix(:,16) ./ OutQMRAmatrix(:,17); %pIncIllRo, proportion of illness from rotavirus.
OutQMRAmatrix(:,21) = OutQMRAmatrix(:,17) / (nKids * nYears); %IncIllECY, incidence in episodes per child per year.
%Write the results of a calibration to a file.
dlmwrite(outFilename, OutQMRAmatrix, '-append');
disp([' Results written to ',outFilename,'.']);
disp(['Mean path. concs. (Ec, Gi, Ro):',num2str(mean(MeanDoses(daysBurnIn+1:maxTime,:)))]); %Checking pathogen concentrations.
%eval(['save Results/OutQMRA',datestr(StartTime,30),'.mat, OutQMRA;']) %Saving output file.
if noStochParams == 1 && loops <= 15; %Displaying results if a series of input parameters were tested.
    out = [OutQMRA.rLP; OutQMRA.IncIllEc; OutQMRA.IncIllGi; OutQMRA.IncIllRo];
    disp('rLP, IncIllEc, IncIllGi, IncIllRo, IncIllAll, pIncEc, pIncGi, pIncRo')
    out(5,:) = sum(out(2:4,:));
    out(6,:) = out(2,:) ./ out(5,:);
    out(7,:) = out(3,:) ./ out(5,:);
    out(8,:) = out(4,:) ./ out(5,:);
    %out = out'
    %min(out)
    %mean(out)
    %max(out)
end
if StoreAllDailyStatuses == 1; %If storing everything, overwrite OutQMRA with it, so it can be used within Octave.
    OutQMRA = DailyStatus;
end
end %End function Main().
%Octave script to obtain proportion of infections that are from spikes.
%Must set StoreAllDailyStatuses == 1 in Main.m. This analyzes the OutQMRA cell array that's output by Main.m.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AnalyzeInfectionFromSpikes.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
1;
function Output = pS(mix,inc,sH);
[OutQMRAmatrix OutQMRA] = Main(0,1,0,[0 0 0],[0 0 0],[0 0 0],0,'TestpSpikes.csv',char(['RTF_',mix,'_TrialCalibResults',inc,'5spikesx',num2str(sH),'.csv']),1,5,sH,0);
startingDay = 128; %From line ~75 of Main.m (burn-in).
nRuns = size(OutQMRA,2);
nDays = size(OutQMRA{1,1},1);
output = zeros(nRuns,3); %1st column is all new infections, 2nd is new infections from spike days, 3rd is new infections from non-spike days.
for i = 1:nRuns;
    output(i,1) = sum(sum(OutQMRA{1,i}(startingDay:end,1:3),2)); %Row-wise sum.
    output(i,2) = sum(sum(OutQMRA{1,i}(int32(OutQMRA{2,i}(1:5)),1:3),2));
    output(i,3) = sum(sum(OutQMRA{1,i}(setxor(startingDay:end, int32(OutQMRA{2,i}(1:5))), 1:3),2)); %???
end
output(:,4) = output(:,2) + output(:,3);
output(find(output(:,1) != output(:,4)),:) %Testing whether each row sums properly.
outputSum = sum(output)
pSpikes = outputSum(2) / outputSum(1)
output(:,5) = output(:,2) ./ output(:,1);
Output = output(:,5);
end
Results = struct([]);
c = 1; %Initializing loop counter
inc={'Lo','Med','Hi'};
for i = 1:3;
    for j = [10,1000,100000];
        for k = ['A','B','C'];
            Output = pS(k,inc{i},j);
            eval(char(["Results.",inc{i},num2str(j),k," = Output;"]));
            disp(['Col. ',num2str(c),', ',inc{i},' incidence, spike height ',num2str(j),', mix ',k])
            c = c+1;
        end
    end
end
boxplot(struct2cell(Results)); %Need to convert struct to cell array for easy boxplotting.
title 'Fig. S3. Proportion of infections occurring on spike days';
ylabel 'Proportion of infections';
%axis('tic[y]');
axis([0 28 0 1.1]); %Adjusting axis limits.
text(2.3,1.05,'Low incidence');
text(11.2,1.05,'Medium incidence');
text(20.5,1.05,'High incidence');
ah=get(gcf,'CurrentAxes'); %Getting handle of the axes that were just created by boxplot().
set(ah,'XTick',1:27); %Setting tick locations manually.
set(ah,'XTickLabel','A|B|C|A|B|C|A|B|C|A|B|C|A|B|C|A|B|C|A|B|C|A|B|C|A|B|C') %Labelling ticks manually with pathogen mixture.
hold on;
plot([9.5, 9.5], [0, 1.1],'-k'); %Now placing lines to separate boxplots by incidence and spike height.
plot([18.5, 18.5], [0, 1.1],'-k');
plot([3.5, 3.5], [0, 1],'-k');
plot([6.5, 6.5], [0, 1],'-k');
plot([12.5, 12.5], [0, 1],'-k');
plot([15.5, 15.5], [0, 1],'-k');
plot([21.5, 21.5], [0, 1],'-k');
plot([24.5, 24.5], [0, 1],'-k');
ah2 = axes('Position',[0 0 1 1],'Visible','off'); %Creating a 2nd set of invisible axes the size of the plot window.
axis([0 1 0 1]); %Setting the limits of the above invisible axes.
text(0.005,0.08,'Pathogen mix:'); %Manually labeling the x axis using the invisible axes.
text(0.005,0.04,'Spike height: 10 10^3 10^5 10 10^3 10^5 10 10^3 10^5');
print -dps -mono pInfSpike.ps; %Does not work; gives bizarrely colored output with many different graphics formats. Simply grabbed a screenshot of the figure window instead.
%Resulting proportion of infections occurring on spike days.
%Lo10A = 0.101
%Lo10B = 0.0997
%Lo10C = 0.108
%Lo1000A = 0.581
%Lo1000B = 0.580
%Lo1000C = 0.693
%Lo100000A = 0.833
%Lo100000B = 0.843
%Lo100000C = 0.894
%Med10A = 0.0722
%Med10B = 0.0686
%Med10C = 0.0848
%Med1000A = 0.339
%Med1000B = 0.309
%Med1000C = 0.432
%Med100000A = 0.455
%Med100000B = 0.411
%Med100000C = 0.561
%Hi10A = 0.0490
%Hi10B = 0.0460
%Hi10C = 0.0603
%Hi1000A = 0.188
%Hi1000B = 0.167
%Hi1000C = 0.232
%Hi100000A = 0.320
%Hi100000B = 0.275
%Hi100000C = 0.318
%Just like AssignIll.m, but loops over columns of the person-pathogen matrix instead of rows, & should be faster.
%Assigns an illness duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is infected, negative entries (or 0) mean the person has recovered
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignInf.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Vector of illness responses (probability of illness given dose), 1 entry per pathogen
% ResponsesNon: Vector of illness responses for people not using any device
% pUse: Probability that a person is not using any device
% durations: Vector of durations of illnesses, 1 entry per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
function PPmatrix = AssignInf(PPmatrix,Responses,Durations,ImmuneTimes);
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
%NewlyInfected = cell(sizePP(2),1); %Initializing cell array to store index values of people who will be newly infected.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix;
%Durations = ones(sizePP) * Durations;
%Immunities = PPmatrix + ImmuneTimes;
%Immunities gives days left in (infection + immune period), or a neg. # if susceptible.
for i = 1:sizePP(2);
    Immunities(:,i) = PPmatrix(:,i) + ImmuneTimes(i);
    NewlyInfected = intersect(find(Immunities(:,i) <= 0), find(Randoms(:,i) < Responses(:,i))); %Gets indices of newly infected.
    PPmatrix(NewlyInfected,i) = Durations(i);
end
end
%Similar to AssignIll.m, but assigns infection durations randomly.
%Assigns an illness duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is ill, negative entries (or 0) mean the person has recovered
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignInfIllRand.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Matrix of illness responses (probability of illness given dose), 1 row per person and 1 column per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
%It requires the functions durEc(), durGi(), & durRo() in IllDurations.m.
function [PPmatrix,PPmatrixD] = AssignInfIllRand(PPmatrix,PPmatrixD,Responses,ImmuneTimes,MorbidityK);
%rand('state',28)
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
Randoms2 = rand(sizePP); %Random #s for determining disease, as above.
%NewlyInfected = cell(sizePP(2),1); %Initializing cell array to store index values of people who will be newly infected.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix; %Adjusts PPmatrix to account for immunity.
Durations = NaN(sizePP); %Making & populating a matrix of disease durations (same size as PPmatrix) to assign to newly infected.
Durations(:,1) = durEc(sizePP(1));
Durations(:,2) = durGi(sizePP(1));
Durations(:,3) = durRo(sizePP(1));
for i = 1:sizePP(2); %Loop, once for each pathogen.
    %NewlyInfected = intersect(find(Immunities(:,i) <= 0), find(Randoms(:,i) < Responses(:,i))); %Gets indices of newly infected. Note: 0 or less is susc.
    NewlyInfected = find((Immunities(:,i) <= 0 & Randoms(:,i) < Responses(:,i)) == 1); %Should be much faster than above line.
    PPmatrix(NewlyInfected,i) = Durations(NewlyInfected,i);
    PPmatrixD(NewlyInfected,i) = 1; %Flags newly infected.
    %NewlyIll = intersect(NewlyInfected, find(Randoms2(:,i) < MorbidityK(i)));
    %PPmatrixD(NewlyIll,i) = 2;
    %----This|=====================================================|is
    %taken from line 25 above - should run faster than prev. 2 lines.
    PPmatrixD((Immunities(:,i) <= 0 & Randoms(:,i) < Responses(:,i)) & (Randoms2(:,i) < MorbidityK(i)),i) = 2;
end
end
%Similar to AssignIll.m, but assigns infection durations randomly.
%Assigns an infection duration to entries of a matrix, where rows are people and columns are pathogens
%Positive entries mean the person is infected, negative entries (or 0) mean the person has recovered
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignInfRand.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
%Inputs:
% PPmatrix: Matrix with 1 row per person and 1 column per pathogen (Person-Pathogen matrix)
% Responses: Matrix of illness responses (probability of infection given dose), 1 row per person and 1 column per pathogen
% ImmuneTimes: Vector of durations of immunity, 1 entry per pathogen
%The output (matrixOut) is matrixIn with new illness durations assigned to some of its entries.
%It requires the functions durEc(), durGi(), & durRo() in IllDurations.m.
function [PPmatrix] = AssignInfRand(PPmatrix,Responses,ImmuneTimes);
%rand('state',28)
sizePP = size(PPmatrix);
Randoms = rand(sizePP); %Random #s for determining infection. One number per person per pathogen.
Immunities = ones(sizePP(1),1) * ImmuneTimes + PPmatrix; %Adjusts PPmatrix to account for immunity.
Durations = NaN(sizePP); %Making & populating a matrix of disease durations (same size as PPmatrix) to assign to newly infected.
Durations(:,1) = durEc(sizePP(1));
Durations(:,2) = durGi(sizePP(1));
Durations(:,3) = durRo(sizePP(1));
NewlyInfected = intersect(find(Immunities <= 0), find(Randoms < Responses)); %Gets indices of newly infected. Note: 0 or less is susc.
PPmatrix(NewlyInfected) = Durations(NewlyInfected);
end
%Calculates whether a given week is reported as a week with diarrhea under the reporting scheme in the DRC Lifestraw RCT (Boisson 2010).
%It considers reduced recall of past diarrheal episodes after 2d ('remembrance') and possible distinct diarrhea episodes in the previous 7d.
%It operates on a matrix:
% Rows represent people, columns represent pathogens/illnesses, and entries represent # of days remaining in the illness.
%It outputs a vector with 1 entry per person, 1 if illness is reported during the week, 0 if not.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (CalcDiarrhWeeks.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function vec = CalcDiarrhWeeks(InMatrix, remembrance,timeWindow,Morbidity);
sizeInMatrix = size(InMatrix);
Randoms = rand(sizeInMatrix(1),sizeInMatrix(2));
for j = 1:sizeInMatrix(2);
    InMatrix((find(Randoms(:,j) > Morbidity(j))),j) = -9999;
end
for i = 1:sizeInMatrix(1); %Loop over all people. Only the most recent episode (largest entry in a row) is used to assign illness.
    %Randoms = rand(1,columns(InMatrix)); %Random numbers for determining morbidity (these 2 lines moved upward for greater speed)
    %InMatrix(i,find(Randoms > Morbidity)) = -9999; %Apply morbidity ratio: if asymptomatic, infection is set to -9999, and therefore not reported. Note that this modification is not passed out of this function.
    if max(InMatrix(i,:)) >= -2;
        vec(i) = 1; %If ill during day 0, 1, or 2, assume illness is always reported, therefore assign illness.
    elseif (max(InMatrix(i,:) >= -timeWindow) & rand() < remembrance);
        vec(i) = 1; %Otherwise, if ill during days 3-7, randomly determine if episode is remembered. If so, assign illness.
    else
        vec(i) = (0); %Otherwise, no illness is remembered or reported. Assign no illness.
    end
end
end
%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Submit as several jobs to parallelize a calibration run (maybe not worth bothering with job array).
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (CalibrationLoopFuncCompile.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function [OutM OutS] = CalibrationLoopFuncCompile(indexText,inc,calibRunsText,nSpikesText,multSpikesText,maxEcText,maxGiText,maxRoText); %This helps with debugging, since arguments to compiled code can only be text.
index = str2num(indexText);
RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index.
calibRuns = str2num(calibRunsText);
nSpikes = str2num(nSpikesText);
multSpikes = str2num(multSpikesText);
PathogensLMax = [str2num(maxEcText) str2num(maxGiText) str2num(maxRoText)];
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Recommended, but does not seem to be necessary.
%===
%===Parameter entry===
%switch inc; %This switch no longer needed since we're passing max pathogen concentrations into this function.
% case 'Lo'; PathogensLMax = [6e3, 0.3, 0.025]; %From QMRA2v2/TrialCalibRuns.m.
% case 'Med'; PathogensLMax = [2e4, 0.8, 0.05];
% case 'Hi'; PathogensLMax = [1.5e5, 1.6, 0.08];
% otherwise; error('inc needs to be Lo, Med, or Hi');
%end
infile = 'nonapplicable.csv';
outfile = ['Results/TrialCalibResults',inc,nSpikesText,'spikesx',multSpikesText,'-',indexText,'.csv'];
%===End parameter entry===
%disp(['##### Running ',num2str(size(U,2)),'*',num2str(size(T,2)),'*',num2str(size(L,2)),'+1=',num2str(combos), ' parameter combinations on ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####'])
[OutM OutS] = Main(0, 1, 0, [0 0 0], [0 0 0], PathogensLMax, calibRuns, outfile, infile, 1, nSpikes, multSpikes);
disp(['##### DONE #####'])
end %End function.
%Beta-Poisson dose response model, using N50 (default) or beta as a parameter
%function outvar = DRbP(N50orBeta,alpha,invar,reverse='no',WhichParam='N50') %Ordinarily, invar is dose & outvar is response.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRbP.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function outvar = DRbP(N50orBeta,alpha,invar,reverse,WhichParam) %Ordinarily, invar is dose & outvar is response.
if nargin < 4; reverse = 'no'; end %Default: dose in, response out.
if nargin < 5; WhichParam = 'N50'; end %Default parameterization; also covers calls with exactly 4 arguments.
switch(reverse)
    case 'no'
        switch(WhichParam)
            case 'N50'
                outvar = 1-(1+(invar/N50orBeta)*(2^(1/alpha)-1)).^-alpha;
            case 'Beta'
                outvar = 1-(1+(invar/N50orBeta)).^-alpha;
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    case 'yes' %If reverse='yes', invar is response & outvar is dose.
        switch(WhichParam)
            case 'N50'
                outvar = N50orBeta * ( ((1-invar).^(-1/alpha) -1) / (2^(1/alpha)-1) );
            case 'Beta'
                outvar = N50orBeta * ((1-invar).^(-1/alpha) -1);
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    otherwise
        error(['reverse must be "no" or "yes"'])
end
end
%Returns a vector of response values, determined by dose response models, for several pathogens.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRchoose.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
%Inputs are vectors with 1 entry per pathogen, all in the same order:
% nonalphas: k parameter (exponential) or N50 parameter (beta-Poisson)
% alphas: alpha parameter (beta-Poisson); NA if exponential model is desired
% Doses: Doses of pathogens received per individual (under default behavior; see 'reverse' below)
% morbidities: Morbidity ratios: proportion of infected who are ill
% reverse: Defaults to 'no', determining proportion ill from dose. If 'yes', determines dose from proportion ill.
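%Illustrative sketch (an addition for checking purposes, not part of QMRA2v5).
%It repeats the algebra of the 'N50' branches of DRbP.m as anonymous functions and
%checks that the reverse form undoes the forward form. The parameter values and all
%names prefixed with 'ex' or 'bp' are assumptions made up for this example.

```octave
%Hypothetical parameter values, chosen only for illustration (not from the dissertation).
exN50 = 1000;   %Assumed median infectious dose.
exAlpha = 0.25; %Assumed beta-Poisson shape parameter.

%Forward model (dose -> response), same algebra as the 'N50' branch of DRbP().
bpForward = @(dose) 1 - (1 + (dose ./ exN50) .* (2^(1/exAlpha) - 1)).^(-exAlpha);

%Inverse model (response -> dose), same algebra as the reverse 'N50' branch of DRbP().
bpInverse = @(resp) exN50 .* (((1 - resp).^(-1/exAlpha) - 1) ./ (2^(1/exAlpha) - 1));

exDoses = [1 10 100 1000 10000];
exResponses = bpForward(exDoses);     %Probability of response at each dose.
exRecovered = bpInverse(exResponses); %Should reproduce exDoses up to rounding error.

disp([exDoses; exResponses; exRecovered]); %Rows: dose, response, recovered dose.
```

%By construction of the N50 form, a dose equal to N50 gives a response of 0.5, which
%is a quick sanity check on any chosen pair of N50 and alpha values.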
%The output (outvec) is a vector containing the proportions of exposed who will fall ill.
%If reverse=='yes', outvec is a vector of doses calculated from invec, the proportions ill.
%This code requires several custom functions/subroutines in the working directory:
% DRexp.m: Exponential dose response model
% DRbP.m: Beta-Poisson dose response model
%function outvec = DRchoose(nonalphas,alphas,invec,morbidities=1,reverse='no') %Octave
function outvec = DRchoose(nonalphas,alphas,Doses,morbidities,reverse)
if nargin < 4; morbidities = 1; end %Default: everyone infected becomes ill.
if nargin < 5; reverse = 'no'; end %Default: dose in, response out.
if morbidities == 1; morbidities = ones(1,length(nonalphas)); end %One morbidity ratio per pathogen.
switch(reverse)
    case 'no'
        for i=1:length(alphas);
            if (isnan(alphas(i))) %if alpha is NA, run exponential dose response
                if size(Doses,1) == 1; %size(Doses,1) works in both Matlab and Octave.
                    outvec(i) = DRexp(nonalphas(i),Doses(i)) * morbidities(i);
                else
                    outvec(i) = DRexp(nonalphas(i),Doses(:,i)) * morbidities(i);
                end
            else %run beta-Poisson dose response
                outvec(i) = DRbP(nonalphas(i),alphas(i),Doses(i)) * morbidities(i);
            end
        end
    case 'yes'
        for i=1:length(alphas);
            if (isnan(alphas(i))) %if alpha is NA, run exponential dose response
                outvec(i) = DRexp(nonalphas(i),Doses(i) ./ morbidities(i),'yes'); %DRexp() accepts only 3 arguments.
            else %run beta-Poisson dose response
                outvec(i) = DRbP(nonalphas(i),alphas(i),Doses(i) ./ morbidities(i),'yes','N50');
            end
        end
    otherwise
        error('reverse must be "no" or "yes"')
end
end
%Exponential dose response model
%function outvar = DRexp(k, invar,reverse='no') %Ordinarily, invar is dose & outvar is response.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRexp.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function outvar = DRexp(k, invar, reverse) %This works in Matlab.
if nargin < 3; reverse = 'no'; end
switch(reverse);
    case 'no';
        outvar = 1-exp(-k * invar);
    case 'yes'; %If reverse='yes', invar is response & outvar is dose.
        outvar = log(1-invar)/-k;
    otherwise
        error(['reverse (last parameter) must be "no" or "yes"'])
end
end
%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durEc.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function output = durEc(n); %Based on ...?
output = round(gamrnd(1.775,1.690,[n,1])); %Shape, then scale
output(output == 0) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end
%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durGi.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function output = durGi(n); %Based on a fit of gamma dist. to limited info from Kent GP 1988.
output = round(gamrnd(3.206,3.431,[n,1])); %Shape, then scale
output(output == 0) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end
%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durRo.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function output = durRo(n); %Based on 4 rotavirus-infected volunteers having durations of 1, 2, 3, and 4 days (Kapikian 1983).
output = ceil(rand([n,1]) * 4);
output(output == 0) = 0.1; %Sets zero durations to 0.1 day instead. Will still function as 1 day.
end
%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
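%Illustrative sketch (an addition, not part of QMRA2v5) relating to durRo() listed earlier.
%It reuses the expression ceil(rand([n,1]) * 4) and tabulates the sampled durations, which
%should fall on 1, 2, 3, and 4 days with roughly equal frequency. The sample size and the
%names prefixed with 'ex' are assumptions made up for this example.

```octave
exN = 10000; %Assumed sample size for the illustration.
exDur = ceil(rand([exN,1]) * 4); %Same expression as in durRo().

%Proportion of samples at each duration (1 to 4 days).
exProps = arrayfun(@(d) mean(exDur == d), 1:4);

disp(exProps); %Each entry should be close to 0.25; exact values vary between runs.
```

%Since rand() is documented to draw from the open interval (0,1), the zero-duration
%guard in durRo() is not expected to be triggered; it does matter in durEc() and
%durGi(), where round() can produce zeros.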
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EstimationLoopFuncCompileV2.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5.
If not, see .
%}
function [OutM OutS] = EstimationLoopFuncCompileV2(indexText,inc,mix,overallComplianceText,multConcText,nSpikesText,multSpikesText);
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
index = str2num(indexText);
if Octave == 0;
    RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default).
end
oC = str2num(overallComplianceText);
multConc = str2num(multConcText);
nSpikes = str2num(nSpikesText);
multSpikes = str2num(multSpikesText);
%baselines = str2num(baselinesText);
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Does not seem to be necessary.
%===Parameter entry=== Note that 0 should not be included in L.
%P = [0 .1 .2]; %Vector of desired values for proportions of children never using the device.
%N = [0 .1 .2]; %Vector of desired values for proportions of children perfectly using the device.
L = [1 2 3 4 5]; %Vector of log reduction values desired (all marker pathogens get the same LRV).
%Testing the code using the vectors below.
%U=[.9 1]
%T=[.9 1]
%L=[1 2]
%Constructing a matrix with all possible combos of P, N, & L
%[p n l] = ndgrid(P,N,L);
%Combos = [p(:) n(:) l(:)];
%TODO: Build functionality to check N, P, overallCompliance, and pTreat to ensure they make sense before running.
%Building appropriate combinations of perfect compliers (P; 1st column) and noncompliers (N; 2nd column).
Combos = [oC 0]; %Combo with max possible perfect compliers & max possible noncompliers (same as [0 1-oC]; [oC 1-oC] is 0/0).
Combos(2,:) = [oC/2 (1-oC)/2]; %Combo with intermediate perfect/nonperfect compliers.
Combos(3,:) = [0 0]; %Combo with no perfect/nonperfect compliers (i.e., pTreat == oC).
CombosT = (oC - Combos(:,1)) ./ (1 - Combos(:,1) - Combos(:,2)) %TODO: doublecheck to make sure this is right.
if sum(CombosT > 1 + eps) | sum(CombosT < 0 - eps);
    error('pTreat out of range (>1 or <0); check parameter combos.');
end
Combos(:,3) = L(1); %This & subsequent 'for' loop copy the above for each possible LRV.
OutCombos = Combos;
for i = 2:size(L,2);
    NextCombos = Combos;
    NextCombos(:,3) = L(i);
    OutCombos = [OutCombos; NextCombos];
end
Baselines = zeros(1,3); %Creating baseline row. No log reduction & no perfect compliance.
Baselines(:,2) = 1; %Modifies above, so that 100% never use device.
Combos = [Baselines; OutCombos]; %Baseline as 1st row.
Combos' %Output results, transposed.
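%Illustrative sketch (an addition, not part of QMRA2v5) for the CombosT formula above.
%It assumes that overall compliance is the fraction of perfect compliers plus the
%fraction of remaining (imperfect) users times their probability of treating water,
%pTreat; that reading is inferred from the formula and is not stated in the source.
%The compliance value and all names prefixed with 'ex' are made up for this example.

```octave
exOC = 0.8; %Hypothetical overall compliance.

%Columns: proportion of perfect compliers, proportion of noncompliers (same three rows as Combos).
exCombos = [exOC 0; exOC/2 (1-exOC)/2; 0 0];

%Same expression as CombosT: treatment probability among the remaining imperfect users.
exPTreat = (exOC - exCombos(:,1)) ./ (1 - exCombos(:,1) - exCombos(:,2));

%Recombining under the stated assumption should return exOC in every row.
exOverall = exCombos(:,1) + (1 - exCombos(:,1) - exCombos(:,2)) .* exPTreat;

disp([exCombos exPTreat exOverall]); %Columns: perfect, never, pTreat, recombined compliance.
```

%Under this assumption the three rows differ only in how the same overall compliance
%is distributed across people, which is the contrast the estimation runs explore.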
combos = size(Combos,1)
infile = ['RTF_',mix,'_TrialCalibResults',inc,nSpikesText,'spikesx',multSpikesText,'.csv'];
outfile = ['Results/EstResults',inc,mix,nSpikesText,'spikesx',multSpikesText,'x',multConcText,'oC', num2str(oC*100),'-',indexText,'.csv'];
%===End parameter entry===
tempData = csvread(infile,1,1);
disp(['##### Running 3 * ',num2str(size(L,2)),'+ 1 = ',num2str(combos),' parameter combinations on ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####'])
if index == 1; %If running the baseline parameters:
    [OutM OutS] = Main(Combos(index,1), Combos(index,2), oC, [Combos(index,3) Combos(index,3) Combos(index,3)], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 1);
else %If running the parameters from calibration:
    [OutM OutS] = Main(Combos(index,1), Combos(index,2), oC, [Combos(index,3) Combos(index,3) Combos(index,3)], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 0);
end
disp(['##### DONE #####'])
end %End function.
%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EstimationLoopFuncCompileV2PC.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
See the GNU General Public License for more details. You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see . %} function[OutM OutS] = EstimationLoopFuncCompileV2Untreated(indexText,inc,mix,overallComplianceText,multConcText ,nSpikesText,multSpikesText); Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0). 358 index = str2num(indexText); if Octave == 0; RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default). end oC = str2num(overallComplianceText); multConc = str2num(multConcText); nSpikes = str2num(nSpikesText); multSpikes = str2num(multSpikesText); %baselines = str2num(baselinesText); %===Required lines for HPCC setenv MKL_DYNAMIC FALSE %maxNumCompThreads(1); %Throws an error. Does not seem to be necessary. %===Parameter entry=== Note that 0 should not be included in L. L = [1 2 3 4 5]; %Vector of log reduction values desired (all marker pathogens get the same LRV). infile = ['RTF_',mix,'_TrialCalibResults',inc,nSpikesText,'spikesx',multSpikesText,'.csv']; outfile = ['Results/EstResults',inc,mix,nSpikesText,'spikesx',multSpikesText,'x',multConcText,'oC', num2str(oC*100),'-',indexText,'.csv']; %===End parameter entry=== tempData = csvread(infile,1,1); disp(['##### Running ',num2str(size(L,2)),' parameter combinations on a subsample of ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####']) [OutM OutS] = Main(1, 0, oC, [L(index) L(index) L(index)], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 0); 359 disp(['##### DONE #####']) end %End function. 360 %Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use. 
%Be sure to check that GetTrialParams.m is configured properly before running this script. %Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations. %{ COPYRIGHT INFORMATION Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu"). This file (EstimationLoopFuncCompileV2Untreated.m) is part of QMRA2v5. QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see . %} function[OutM OutS] = EstimationLoopFuncCompileV2Untreated(indexText,inc,mix,overallComplianceText,multConcText ,nSpikesText,multSpikesText); Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0). 361 index = str2num(indexText); if Octave == 0; RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default). end oC = str2num(overallComplianceText); multConc = str2num(multConcText); nSpikes = str2num(nSpikesText); multSpikes = str2num(multSpikesText); %baselines = str2num(baselinesText); %===Required lines for HPCC setenv MKL_DYNAMIC FALSE %maxNumCompThreads(1); %Throws an error. Does not seem to be necessary. %===Parameter entry=== Note that 0 should not be included in L. %L = [1 2 3 4 5]; %Vector of log reduction values desired (all marker pathogens get the same LRV). 
infile = ['RTF_',mix,'_TrialCalibResults',inc,nSpikesText,'spikesx',multSpikesText,'.csv']; outfile = ['Results/EstResults',inc,mix,nSpikesText,'spikesx',multSpikesText,'x',multConcText,'oC', num2str(oC*100),'-1-',indexText,'.csv']; %===End parameter entry=== tempData = csvread(infile,1,1); nChunks = 20; %# of equal-sized chunks to break the parameter sets into. Add 1 more for the remainder. nSets = size(tempData,1); chunk = floor(nSets/nChunks); remainder = mod(nSets,nChunks); 362 ChunkStarts = linspace(1,nChunks*chunk+1,nChunks+1); ChunkStarts(end+1) = nSets+1; DataChunk = tempData(ChunkStarts(index):ChunkStarts(index+1)-1,:); z = zeros(size(DataChunk,1),1); DataChunk = [z DataChunk]; DataChunk = [0,0,0,0;DataChunk]; %Needs surplus 1st line because Main() will strip the 1st line off (the original R output has column names). infile = ['TempBaselineChunk',indexText,'.csv']; csvwrite(infile,DataChunk); disp(['##### Running ',num2str(nSets),' parameter combinations in ',num2str(nChunks+1),' chunks, should have requested at least that many members in the job array. #####']) [OutM OutS] = Main(0, 1, 1, [0 0 0], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 0); disp(['##### DONE #####']) end %End function. 363 %Examining data from a single run of the QMRA lifestraw model. %Converting to susceptible (-9), immune (-1), or diseased (9) for each of the 3 pathogens. %{ COPYRIGHT INFORMATION Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu"). This file (Examine1Run.m) is part of QMRA2v5. QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. 
See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see .
%}
%====Setting options====
startTime = 96; %Time point at which to start looking at the data. Note that 1st matrix corresponds to time 0.
recode = 1; %If 1, recode matrix entries to susc./inf./immune.
%====Finished with options, starting processing.====
PPMs = PPmatrices; %Making a copy.
switch(recode);
    case 1;
        disp(['Recoding raw numbers to susc./inf./immune.'])
        for i = startTime:size(PPmatrices)(2); %Starting when the actual simulation starts (after equilibrium is reached).
            PPMs(i).KidsInt(PPMs(i).KidsInt <= -7) = -9; %7 day immune period; 0 counts as the 1st immune day, so at -7 they are susc.
            PPMs(i).KidsInt(PPMs(i).KidsInt > 0) = 9; %Infected if a positive integer.
            PPMs(i).KidsInt(abs(PPMs(i).KidsInt) != 9) = -1; %Immune if neither of the above applies.
            PPMs(i).KidsInt(PPMs(i).KidsInt == 9) = 2; %Infected person-day marked as 2, so as to more easily distinguish.
        end
        for i = startTime:size(PPmatrices)(2); %Same as above 'for' loop, but placebo.
            PPMs(i).KidsPla(PPMs(i).KidsPla <= -7) = -9;
            PPMs(i).KidsPla(PPMs(i).KidsPla > 0) = 9;
            PPMs(i).KidsPla(abs(PPMs(i).KidsPla) != 9) = -1;
            PPMs(i).KidsPla(PPMs(i).KidsPla == 9) = 2;
        end
    otherwise
        disp(['Not recoding to susc./inf./immune, output will display raw numbers.'])
end
KidsIntStatusEc = NA(size(PPMs(1).KidsInt)(1), size(PPmatrices)(2)-(startTime-1));
KidsIntStatusGi = KidsIntStatusEc;
KidsIntStatusRo = KidsIntStatusEc;
KidsPlaStatusEc = NA(size(PPMs(1).KidsPla)(1), size(PPmatrices)(2)-(startTime-1));
KidsPlaStatusGi = KidsPlaStatusEc;
KidsPlaStatusRo = KidsPlaStatusEc;
for i = startTime:size(PPmatrices)(2); %Starting when the actual simulation starts (after equilibrium is reached).
    for j = 1:size(PPMs(1).KidsInt)(1);
        KidsIntStatusEc(j,i-startTime+1) = PPMs(i).KidsInt(j,1);
        KidsIntStatusGi(j,i-startTime+1) = PPMs(i).KidsInt(j,2);
        KidsIntStatusRo(j,i-startTime+1) = PPMs(i).KidsInt(j,3);
    end
    for j = 1:size(PPMs(1).KidsPla)(1);
        KidsPlaStatusEc(j,i-startTime+1) = PPMs(i).KidsPla(j,1);
        KidsPlaStatusGi(j,i-startTime+1) = PPMs(i).KidsPla(j,2);
        KidsPlaStatusRo(j,i-startTime+1) = PPMs(i).KidsPla(j,3);
    end
end
%Now can visually inspect the 6 status matrices that have been output.
%Should be a way to collapse them also (run-length encoding?), but not yet implemented.
function outmatrix = coll(inmatrix,nMaxRuns) %Inefficient but hopefully works.
sizeM = size(inmatrix);
maxk = 1; %Initializing counter to determine the maximum number of runs ever seen during the function call.
for i = 1:sizeM(1); %Loop over all rows
    dur = []; %Reset the run durations for each row, so that values from a longer previous row do not persist.
    for j = 2:sizeM(2); %Loop over each entry per row
        if j == 2; %Special procedure for first iteration, since there could be a transition (or not) between the 1st 2 entries.
            k = 1; %Initiating run counter;
            if inmatrix(i,j) != inmatrix(i,j-1);
                if inmatrix(i,j-1) == -9; dur(k) = -1;
                elseif inmatrix(i,j-1) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j-1) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
                k = k + 1; %Increment run counter
                if inmatrix(i,j) == -9; dur(k) = -1;
                elseif inmatrix(i,j) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            else %If type of run does not change in 1st 2 entries
                if inmatrix(i,j) == -9; dur(k) = -2;
                elseif inmatrix(i,j) == -1; dur(k) = 0.002;
                elseif inmatrix(i,j) == 2; dur(k) = 2;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            end
        elseif j == sizeM(2); %Special procedure for last iteration - need to drop it since it is probably incomplete.
            dur(k) = 0;
        else %If j (the column) is anything greater than 2, but not the last column:
            if inmatrix(i,j) != inmatrix(i,j-1); %If there is a transition, reset the run:
                k = k + 1;
                if inmatrix(i,j) == -9; dur(k) = -1;
                elseif inmatrix(i,j) == -1; dur(k) = 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            else %If there is no transition, extend the run
                if inmatrix(i,j) == -9; dur(k) = dur(k) - 1;
                elseif inmatrix(i,j) == -1; dur(k) = dur(k) + 0.001;
                elseif inmatrix(i,j) == 2; dur(k) = dur(k) + 1;
                else error('Unexpected value in matrix (not -9, -1, or 2).')
                end
            end
        end
    end
    if k > maxk;
        maxk = k %Updates & displays maxk (largest no. runs seen so far). Use as guide for entering nMaxRuns.
    end
    outmatrix(i,:) = padarray(dur,[0, nMaxRuns - length(dur)],0,'post');
end
outmatrix(:,1) = 0; %Sets all 1st runs to 0 (they are likely to be incomplete).
mean(outmatrix(find(outmatrix >= 1))) %Print mean duration of illness
end

function z = GraphColl(outmatrix) %Graphs output of above function.
bins = unique(outmatrix(find(outmatrix < 0)));
if length(bins) == 1; binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-1; binsT(3)=binsT(2)+1; bins=binsT; end
subplot(2,2,1); hist(outmatrix(find(outmatrix < 0)),bins) %Durations of susceptibility
bins = unique(outmatrix(find(outmatrix > 0 & outmatrix < 1)));
if length(bins) == 1; binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-0.001; binsT(3)=binsT(2)+0.001; bins=binsT; end
subplot(2,2,2); hist(outmatrix(find(outmatrix > 0 & outmatrix < 1)),bins) %Durations of immunity
bins = unique(outmatrix(find(outmatrix >= 1)));
if length(bins) == 1; binsT=NaN(3,1); binsT(2)=bins; binsT(1)=binsT(2)-1; binsT(3)=binsT(2)+1; bins=binsT; end
subplot(2,2,3); hist(outmatrix(find(outmatrix >= 1)),bins) %Durations of illness
end
%histc(PlaRo(find(PlaRo < 1 & PlaRo > 0)),[0:0.001:max(PlaRo(find(PlaRo < 1 & PlaRo > 0)))]) %Awful (but functional) way to get counts of possible values for immunity length.
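The collapsing performed by coll() is a form of run-length encoding. A minimal vectorized sketch for a single row of status codes follows, assuming the same codes used above (-9 susceptible, -1 immune, 2 infected). The variable names (status, runEnds, runLengths, runValues, meanIllness) are illustrative only and are not part of QMRA2v5. Unlike coll(), this sketch keeps the first and last runs rather than zeroing them as possibly incomplete.

```octave
% Illustrative sketch only; not part of QMRA2v5.
status = [-9 -9 2 2 2 -1 -1 -9];      % Example row of daily status codes (hypothetical data).
changes = find(diff(status) ~= 0);    % Index of the last day of each run, except the final run.
runEnds = [changes, numel(status)];   % Index of the last day of every run.
runLengths = diff([0, runEnds]);      % Length of each run, in days.
runValues = status(runEnds);          % Status code of each run.
meanIllness = mean(runLengths(runValues == 2));  % Mean duration of the infected runs.
```

Keeping the run values and run lengths in two separate vectors avoids encoding the status type in the sign and magnitude of a single number, at the cost of returning two outputs per row.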
368 %This script generates a series of parameters for input into the main code, rather than stochastically generating them. %{ COPYRIGHT INFORMATION Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu"). This file (GetTrialParams.m) is part of QMRA2v5. QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see . %} reps = 1; %# of times to run each parameter set. readParams = 1; %To read parameters generated by a previous model run (1) or not (0). paramSets = 12; %# of parameter sets to be run, if generating them systematically. %loops = paramSets * reps; %Deliberately overwrites 'loops' in the main code. switch(readParams); case 1; TrialParamsAll = dlmread(inFilename,',',1,1); %Access a file returned by 369 ReadOutput.r. inFilename is an input to Main.m. paramSets = size(TrialParamsAll, 1); %Overwrites 'paramSets' above. disp(['Reading parameter values from ',num2str(paramSets),' trials in ',inFilename,', will winnow down to 150 if greater.']) if useAllParamSets == 1; TrialParams = TrialParamsAll; elseif paramSets > 150; selectedParamSets = randperm(paramSets); selectedParamSets = selectedParamSets(1:150); TrialParams = TrialParamsAll(selectedParamSets,:); paramSets = size(TrialParams,1); else TrialParams = TrialParamsAll; end case 0; MinDoses = [0, 0, 0.05]; %Minimum non-zero dose. A good choice is dose that infects 1% of population (ID1). ID1s = [7.5697E3, 5.0708E-1, 1.7280E-2]; %ID1 for ETEC, Giardia, & rota. 
        %MinDoses = ID1s * .1; %Uncomment if ID1s are desired.
        TrialParams = zeros(paramSets,size(MinDoses, 2));
        for i = 1:length(MinDoses); %Populating all cells except the 1st row with the minimum nonzero dose.
            TrialParams(2:paramSets,i) = MinDoses(i);
        end
        TrialParams(2,:) = MinDoses; %1st run is 0 pathogens; 2nd run is the minimum nonzero dose.
        for i = 3:paramSets; %Uncomment the particular line desired. Comment all to check multiple replicates of the same dose.
            %TrialParams(i,:) = MinDoses * (i-1); %Linearly increases the dose on each model run.
            TrialParams(i,:) = 2 * TrialParams(i-1,:); %Doubles the dose on each model run.
            %TrialParams(i,2) = 10 * TrialParams(i-1,2); %Multiplies the dose by 10 for only 1 pathogen, leaving others constant.
        end
        TrialParams(:,4) = 0; %Sets a single value for non-waterborne diarrhea prevalence.
        %disp('Will cycle through these parameters, 1 run per set.')
        %trialPrevDiarrhBaseKids = 0 %Sets a single value for non-waterborne diarrhea prevalence.
        %TrialParams %Print to screen, so we see that this file was executed & to view the trial params.
    otherwise;
        error('readParams must be 0 or 1');
end
%Replicating the parameter sets.
TP = TrialParams; %Making a copy, for use in loop below.
if reps > 1;
    for i = 1:reps-1;
        TrialParams = [TrialParams; TP]; %Appending copies of the parameter sets.
    end
end
TrialParams = sortrows(TrialParams); %Sorting so that identical parameter values are next to each other.
loops = size(TrialParams,1); %Overwriting 'loops' variable in Main().
disp(['Using ',num2str(paramSets),' parameter sets, ',num2str(reps),' times each, totaling ',num2str(loops),' runs.'])
if paramSets <= 25;
    disp(['Parameter sets are as follows:'])
    TP
end

%Pulls together QMRA output files from multiple Octave threads. Runs surprisingly fast!
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (OutQMRAmerge.m) is part of QMRA2v5.
QMRA2v5 is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version. QMRA2v5 is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License (gpl.txt) along with QMRA2v5. If not, see . %} clear all; %First, enter the desired name of the .CSV: filename = {'Results/MergedOutQMRA.csv'}; %Now enter as many files as necessary, each one containing the 'workspaces' from a thread. Files = {'Results/OutQMRA20110315T130458.mat'}; rows = 0; %Initializing variable to count up total number of rows. for i = 1:length(Files); eval(disp(['load ',char(Files(i)),' OutQMRA;'])) %Loads the 'OutQMRA' struct stored in .mat file, overwriting that object if it exists. 372 %disp(['File ',num2str(i),' took ',num2str(),' to run ',num2str(length(OutQMRA.CaL)),... %' loops (',num2str(),' per loop.']) eval(disp(['OutQMRA',num2str(i),'=OutQMRA;'])) %Copies it and adds a numeric suffix to the name. rows = rows + length(OutQMRA.EcL); end clear OutQMRA; %Removes the initial copy of the last file loaded. %CSVmatrix = NA(rows,length(fieldnames(OutQMRA1))-2; %Creating the output matrix. Each row is a QMRA iteration. for i = 1:length(Files); %i %For debugging eval(disp(['OutQMRA = OutQMRA',num2str(i),';'])) %Taking 'OutQMRAx' and creating a copy called 'OutQMRA' to work from. OutQMRA = rmfield(OutQMRA, 'StartTime'); OutQMRA = rmfield(OutQMRA, 'EndTime'); CSVmatrix = OutQMRA.Fit'; %Initializing a matrix that will become a .CSV by transposing the first structure field into it. for [val,key] = OutQMRA; %This special syntax allows looping over all elements of the structure. 
        %key %For debugging
        if strcmp(char(key),'Fit') == 0; %Don't do anything for the 'Fit' element because we took care of that 2 lines before.
            CSVmatrix = [CSVmatrix, val']; %Transpose fields into columns & bind into the matrix.
        end
    end
    eval(disp(['CSVmatrix',num2str(i),' = CSVmatrix;']))
    clear CSVmatrix;
end
CSVmatrix = CSVmatrix1; %Initializing output matrix.
for i = 2:length(Files);
    eval(disp(['CSVmatrix = [CSVmatrix; CSVmatrix',num2str(i),'];']))
end
fn=fieldnames(OutQMRA);
nFields = numel(fn);
%http://stackoverflow.com/questions/5292437/how-to-concat-cellarray-of-strings-in-matlab
fn(1:nFields-1) = strcat(fn(1:nFields-1),{','});
file = fopen(char(filename),'w+'); %filename is stored as a cell, so convert it to a string first.
fprintf(file,'%s',disp([fn{:}]));
fclose(file);
eval(["dlmwrite('",char([filename]),"',CSVmatrix,'-append');"])
disp(['Done; .mat files have been merged and output to ',char(filename),' in ',char(pwd),'/Results/'])

9.5. The EITS model (chapter 5)

The transmission model (referred to as “EITSd” for short) consists of several text files containing the necessary functions and subroutines; the core program is 'EITSd.m'. Simulation options are set by several values at the top of 'EITSd.m'; these default to values that generate a single test run of the simulation. Other options are set when calling the function EITSd() and are described within that file. For example, the following can be entered at the Octave (or MATLAB) prompt to do one test calibration run:

EITSd('C',1,'test.csv');

The source code is below. The filename of each source code file is given in the copyright information at the top of that file. Although EITSd uses some filenames that are identical to those in QMRAv13_20110414 or QMRA2v5, the content of its files differs. EITSd also requires folders named 'Graphics' and 'Results' in the working directory in order to store output.
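The folder requirement can be met with a few lines at the Octave (or MATLAB) prompt. The following is a minimal sketch, assuming the EITSd source files are already in the current working directory. The variable name folders is illustrative and not part of EITSd, and the call to EITSd() is left commented out so that the sketch can be run on its own.

```octave
% Illustrative sketch only; not part of EITSd.
folders = {'Graphics', 'Results'};     % Output folders that EITSd expects in the working directory.
for i = 1:numel(folders)
    if exist(folders{i}, 'dir') ~= 7   % exist() returns 7 when the name is an existing folder.
        mkdir(folders{i});
    end
end
% With EITSd.m and its accompanying files in the working directory, one test calibration run is then:
% EITSd('C',1,'test.csv');
```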
All files in the source code that follows are part of EITSd and subject to the GNU General Public License Version 3, except for erdrey.m, which is part of the CONTEST toolbox at http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest (A. Taylor & Higham, 2009); erdrey.m is included at the end of the source code for completeness, by permission of the authors (Des Higham, personal communication, 24 Aug. 2012).

%Environmental transmission model of diarrheal infection transmission (main file), Kyle S. Enger, July 2012
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EITSd.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see .
%}
%EITSd runs on Octave 3.2 or later.
%Also works well (and runs faster) on MATLAB; the results in chapter 5 of the accompanying dissertation were all produced with MATLAB.
%If running on MATLAB, requires the statistics toolbox, and possibly others.
%This code (EITSd.m, the core component of EITSd) is accompanied by several functions/subroutines, which are part of EITSd and need to be in the working directory:
% ApplyLRVs.m: Applies log10 reduction values to stocks of pathogens
% AssignCompliance.m: Given compliance parameters for a community, assigns specific compliance characteristics to specific households
% DoseResponse.m: Applies dose response functions to individuals
% CalLoopFuncCompile.m: Code that calls EITSd() in order to facilitate parallel processing of many differently parameterized calibration runs
% DRbP.m: Beta-Poisson dose response model
% DRexp.m: Exponential dose response model
% durEc.m: Randomly pick a duration for E. coli infection
% durGi.m: Randomly pick a duration for Giardia infection
% durRo.m: Randomly pick a duration for rotavirus infection
% EstLoopFuncCompileBase.m: Code that calls EITSd() in order to facilitate parallel processing of many differently parameterized baseline (no HWT) estimation runs
% EstLoopFuncCompileHWT.m: As above, but for estimation runs using HWT with imperfect compliance
% EstLoopFuncCompileHWTPC.m: As above, but for estimation runs using HWT with perfect compliance
% Inact.m: Inactivates pathogens in all compartments
% PlotMicrobes.m: Generates line charts of flows of microbes during a single model run
% PlotPeople.m: Generates line charts of people and their infection statuses during a single model run
% Pooping.m: Determines results of defecation events by infected people
%Also requires in the working directory: erdrey.m (written by Alan Taylor and Des Higham, U. of Strathclyde,
% freely available at http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest,
% and reproduced and included with EITSd by permission of the authors, though it is not a part of EITSd).
%Use just the 1st 3 arguments to run the model with default values for all parameters.
%CalibOrEst: Whether the model is in calibration ('C') or estimation ('E') mode.
%nRuns: Number of runs (calibration), or number of runs per parameter set (estimation).
%outFilename: Filename for output file containing results.
%inFilename: Used for estimation only. Filename root for .CSV containing parameter values from calibration step.
%CF variables: Calibration factors; 1st value is low end of range, 2nd value is high end of range.
%CFdecayr: 2x3 matrix: top row is low end of range of decay constant modifiers; bottom row is high end of range. Columns are bac.-vir.-prot.
%nHHr: Value to replace default for number of households (nHH).
%E variables: LRVs and compliance figures for various intervention scenarios in the estimation step. Estimation necessitates input of garbage for the CF ranges above.
%dbstop(161); %Setting breakpoint for debugging. Type 'dbquit' at the debug prompt to quit back to Octave prompt.
function [People LogHHP Log nMw nMl OutMatrix] = EITSd(CalibOrEst, nRuns, outFilename, inFilename, CFSfr, CFHDhr, CFVr, CFDlr, CFdecayr, nHHr, ElSan, EcSan, ElHWT, EcHWT, ElHand, EcHand);
if size(ver('Octave'),1)==0; format compact; format longe; end %Fixes annoying default display characteristics in MATLAB.
if nargin == 3; disp(['Running with default parameter values only.']); end
if CalibOrEst == 'C';
    loops = nRuns;
    OutMatrix = NaN(loops,87); %Starting output matrix, summarizing results from many model runs, one row per run.
elseif CalibOrEst == 'E'; %Pull in parameter values from calibration runs that fit the criteria.
    TrialParams = dlmread(inFilename,',',1,1); %Access a file returned by ReadOutput.r. inFilename is an input to EITSd.m.
    paramSets = size(TrialParams, 1);
    if paramSets > 100 & sum([ElSan ElHWT ElHand]) > 0; %If >100 parameter sets and an intervention is being applied, randomly sample 100 sets without replacement, to use in estimation step. TODO: Conflicts with sampling of 150 parameter sets in calibration R code.
selectedParamSets = randperm(paramSets); selectedParamSets = selectedParamSets(1:100); 378 TrialParams = TrialParams(selectedParamSets,:); paramSets = size(TrialParams,1); end %Replicating the parameter sets. TP = TrialParams; %Making a copy, for use in loop below. if nRuns > 1; for i = 1:nRuns-1; TrialParams = [TrialParams; TP]; %Appending copies of the parameter sets. end end TrialParams = sortrows(TrialParams); %Sorting on basis of ID number, so that runs with identical CF values are next to each other. loops = size(TrialParams,1); OutMatrix = NaN(loops,88); %Starting output matrix, summarizing results from many model runs, one row per run. 88th column is the ID of the set of CFs input. disp(['Using ',num2str(paramSets),' parameter sets, ',num2str(nRuns),' times each, totaling ',num2str(loops),' runs.']) elseif CalibOrEst == 'D'; %Running with default parameters only. loops = nRuns; end OutMatrix(:,1) = 1:loops; %Serially numbering rows. Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0). p.ProgramVersion=7; p.StartTime=datestr(now,31); %===Parameter values (struct of parameters created simultaneously). Start with defaults for parameters that vary in calibration step. %loops = nRuns; %Number of distinct runs with a given set of parameters. CFSf = 1E-8; p.CFSf=CFSf; %Calibration factor: microbe transfer from surface water to stored drinking water CFHDh = 0.01; p.CFHDh=CFHDh; %Calibration factor: microbe transfer out of household environment to stored water or adults (kids are higher; see wHandMouth). 379 CFV = 0.066; p.CFV=CFV; %Calibration factor: microbe transfer between households when a visit occurs (bidirectional). Rota, skin-skin, Ansari 1988. CFDl = 1E-9; p.CFDl=CFDl; %Calibration factor: proportion of land pathogens ingested per adult (kids are higher; see wHandMouth). 
CFdecay = [.6 .6 .6]; %Calibration factor: multiplier applied to inactivation rates in water to convert them to inactivation rates out of water. %Microbe-specific parameters [bacteria, viruses, protozoans]:=== Mpgf = [5E8 2E6 572000]; p.Mpgf=Mpgf; %Microbes per gram of feces (DuPont 1971, Ward 1984, Danciger 1974). %Mpgf = [0 0 0]; p.Mpgf=Mpgf; %For code verification (checking the development of baseline infections). KorN50 = [2111912, 6.171, 0.01982]; p.KorN50=KorN50; %Exponential k parameter or betaPoisson N50 parameter alpha = [0.1549, 0.2531, NaN]; p.alpha=alpha; %Presence/absence of alpha value determines beta-Poisson or exponential dose resp. MRk = [0.214, 0.36, 0.59]; p.MRk=MRk; %Morbidity ratios for kids. MRa = [0.214, 0.222, 0.03]; p.MRa=MRa; %Morbidity ratios for adults. Can't find a good adult MR for E. coli, so reusing the MR for kids. latent = [round(1.75), round(3.2), round(13.5)]; p.latent=latent; %Latent (incubation) period. infill = [round(3.4), round(2.5), round(18.3)]; p.infill=infill; %Duration of infectivity (assumed the same as duration of disease). NOT USED except for determining burn-in; see durEc(), durRo(), & durGi(). immune = [1, 1, 1]; p.immune=immune; %Immune period (assumed). %gM = [0.6 0.6 0.6]; p.gM=gM; %Daily inactivation rate of microbes in surface water (assume similar for soil, stored water, & hands) - see Inactivation.xls & EITSparams.ods. CFdecay used to modify these for application to land & household environment compartments. %xHtoW = [0.26 0.26 0.26]; p.xHtoW=xHtoW; %Transfer from hands to water, per contact. Half of reduction for E. coli from Pickering 2011. %p.xHtoP=xHtoP = [0.34 0.34 0.34]; %Transfer from hands to person's mouth, per contact. Rusin 2002 (assuming protozoa same as bac & viruses). Since so high & hand-mouth contacts so frequent, assume all are ingested each day. 380 %xferVisit = 0.066; p.xferVisit=xferVisit; %Proportion of hand pathogens that are transferred (2-way) during a visit. 
Rota, skin-skin, Ansari 1988. lSan = [3 3 3]; p.lSan=lSan; %LRVs attributable to sanitation. Based on amount of feces on toilet paper (which often isn't flushed). lHWT = [6 4 3]; p.lHWT=lHWT; %LRVs attributable to household water treatment. lHand = [0.46, 0.46, 0.46]; p.lHand=lHand; %LRVs attributable to handwashing; Luby 2001, TTC. Slightly higher than Pickering 2011. %Community characteristics (people & households): %nHH = 1000; p.nHH=nHH; %Approximate number of households in the simulated community nHH = 200; p.nHH=nHH; %Approximate number of households in the simulated community mPHH = 5; p.mPHH=mPHH; %Mean people per household (truncated Poisson) pKids = 0.18; p.pKids=pKids; %Proportion of people in the community who are <5y. meanDeg = 5.3; p.meanDeg=meanDeg; %Mean network degree for households in 18 villages, 'passing time' network, Joe's Ecuador sites (Zelner 2012). %Water intake and defecation output parameters: Ws = 25; p.Ws=Ws; %Size of household water storage container, in liters. Wflow = 0; p.Wflow=Wflow; %Unused. Simple in/outflow of reservoir, daily rate, starting with 0 (maybe 25 ft^3/sec, USGS creek measurements). landArea = 1; p.landArea=landArea; %Unused. Community area, km^2. Presuming people only poop within their community. Allows calc. of persons per square km. Wdd = [1.178, 2.3]; p.Wdd=Wdd; %L H2O drunk daily. Akpata 2004, Nigerian children; Fudge 2008, Kenyan runners (agrees w. USEPA 2011, p. 100). fpp = [109.3, 225]; p.fpp=fpp; %Grams of feces excreted daily; Nigerian children and adult British vegans (both with high-fiber diets). fHands = 0.23; p.fHands=fHands; %Grams of feces on fingers after defecation event. Based on daily feces on toilet paper. See EITSparams.ods. pPoopH2O = 0; p.pPoopH2O=pPoopH2O; %Unused. Probability that a defecation event goes straight into the surface water (i.e., not on land). %Compliance parameters. 1st element is overall compliance, 2nd is compliance type: 1=alpha, 2=beta, 3=gamma. 
381 cSan = [0, 1]; p.cSan=cSan; %Compliance with sanitation. Defaults to no compliance; will be varied. cHWT = [0, 1]; p.cHWT=cHWT; %As above, but HWT. cHand = [0, 1]; p.cHand=cHand; %As above, but handwashing. %Now, basic daily rates for some key events. All rates are per day. %rH2ODrink = 6; p.rH2ODrink=rH2ODrink; %Daily drinks from stored water per person. rPoopInf = 1; p.rPoopInf=rPoopInf; %Defecation events, per non-ill person rPoopIll = 3; p.rPoopIll=rPoopIll; %Defecation events, per ill person rVisit = 2/7; p.rVisit=rVisit; %Visits/contact/day (each contact is a pair of households - see adjacency list generated below, based on time spent in past week [Zelner 2012]). rRain = 1/14; p.rRain=rRain; %Roughly fortnightly rainstorm xRunoff = [0.001, 0.05]; p.xRunoff=xRunoff; %Daily transfer of pathogens from land to surface water without rain, and with rain. wHandMouth = 330 / 130; p.wHandMouth=wHandMouth; %Hand-mouth contacts/day for kids divided by hand-mouth contacts/day for adults (USEPA 2011). Weights kids as being more unhygienic. BaseInf = [1.17 0.347 0.212]; %Baseline infections per person-year. Chosen to give 0.5 diarrheal episodes per child-year (total; half bac., 25% vir. & prot.) when distributed randomly among the population (CalcBaselineInfectionIncidence.ods). %===End parameter values; now simulation options=== tMax = 365; p.tMax=tMax; %Number of simulation days desired; daysBurnIn is then added to it. daysBurnIn= ceil(max(immune) + max(latent) + max(infill)) * 3; p.daysBurnIn=daysBurnIn; tMax = tMax + daysBurnIn; pSafeStorage = 0; p.pSafeStorage=pSafeStorage; %Currently can only be 1 or 0. Proportion of the community that has safe (household water) storage. Toggles presence of safe storage for all HWT users in the community. 
nMl = [0 0 0]; p.nMl=nMl; %Initial numbers of microbes on the land
nMw = [0 0 0]; %Initial number of microbes in the reservoir
if loops > 3; storeDetails = 0; else storeDetails = 1; end; %If too many loops, don't bother storing detailed output.
SummaryResults = NaN(loops,8); %Store summary data from all runs. TODO: fully implement?
disp(sprintf('Running %s loops.',num2str(loops)))

for l = 1:loops; %Allows multiple model runs. Loops once per run.
disp(sprintf('STARTING RUN %s',num2str(l)))
t=0; %Initialize time counter
LogIncInf = NaN(tMax,9); %Stores daily infection incidence (count of new infections) for the 3 pathogens: 1:3, kids; 4:6, adults; 7:9, all.
LogIncIll = NaN(tMax,9); %Stores daily illness incidence (count of new illnesses) for the 3 pathogens: 1:3, kids; 4:6, adults; 7:9, all.
Log = NaN(tMax,33); %Logs daily status summary for all people.
LogK = NaN(tMax,18); %Logs daily status summary for all kids.
LogA = NaN(tMax,18); %Logs daily status summary for all adults.
if storeDetails == 1;
    LogHHP = cell(tMax,2); %Stores HHs and People matrices daily.
    LogFlux = NaN(tMax,3,11); %Cube to store fluxes of microbes. z is 1:11 (1-surface water to stored water at resupply; 2-overall visit transfer; 3-not used, but formerly land-to-hand-to stored water at drinking; 4-rainfall; 5&6-pooping into surface H2O & land; 7-inactivation; 8&9-kids' dose, water & hands; 10&11-adults' dose, water & hands).
    tRain = [(1:tMax)',NaN(tMax,1)]; %Logs times of rain events.
end
extinct=0; equilibrium=0; %Initialize extinction & equilibrium flags.

%===Generating the community. Yields a matrix for tracking households and a matrix (or struct, 1 item per hh?) for tracking persons.===
%HHs has 16 columns: counts of persons, adults, and kids (1:3); counts of microbes on hands (4:6), stored water (7:9), food (10:12), compliance with sanitation, HWT, and handwashing (13:15), and amount of water currently stored in the household (16).
containerOK = 0;
while containerOK == 0; %The 'while' loop ensures that no HH has more people than its stored water container can supply. See the warning near the end.
    if nargin >= 9; nHH = nHHr; end %Overriding default number of households
    HHs = poissrnd(mPHH,ceil(nHH * (1 + poisspdf(0,nHH))),1); %Generates extra households, since some will have 0 persons & will be thrown away.
    HHs = sort(HHs(find(HHs > 0)));
    nHH = size(HHs,1); %Actual number of households will probably be slightly lower than the inputted number.
    nPeople = sum(HHs);
    nKids = round(nPeople * pKids);
    nAdults = nPeople - nKids;
    HHs(:,2) = 1; %2nd column is the count of adults. Every household has at least 1 adult. TODO: This will break if nAdults < nHH, but this is extremely unlikely.
    aA = nAdults - nHH; %Adults that still need to be assigned to a household.
    while aA > 0;
        HHs(:,3) = HHs(:,1) - HHs(:,2); %3rd column tracks the number of empty person-slots (potential children) remaining in each household.
        candidates = find(HHs(:,3) > 0);
        picked = candidates(ceil(unifrnd(eps,size(candidates,1),1))); %This is like Matlab's randsample(), which Octave lacks.
        HHs(picked,2) = HHs(picked,2) + 1;
        aA = aA - 1;
    end
    HHs(:,3) = HHs(:,1) - HHs(:,2); %Now, 3rd column becomes the number of kids in each household.
    Wneeded = sum(HHs(:,2:3) .* repmat([Wdd(2),Wdd(1)],nHH,1),2); %Now errorchecking to make sure no HH exhausts their stored H2O.
    containerOK = 1;
    if sum(Wneeded > Ws) ~= 0;
        containerOK = 0;
        warning(sprintf('At least 1 household will drink > %sL daily. Retrying...',num2str(Ws)))
    end
end
HHs = horzcat(HHs,zeros(nHH,13)); %Adding 9 fields for the 3 marker pathogens on hands (4:6), water (7:9), and food (10:12) in each household.
%Also adding 3 fields to track usage (an aspect of compliance) of sanitation, HWT, and handwashing (13:15), and 1 field to track stored water (16).
HHs(:,13) = AssignCompliance(HHs(:,13),cSan); %This household uses sanitation X of the time. If X > 0, it owns a latrine.
HHs(:,14) = AssignCompliance(HHs(:,14),cHWT); %This household treats their water X of the time. If X > 0, it has a HWT method.
HHs(:,15) = AssignCompliance(HHs(:,15),cHand); %This household washes hands X of the time. If X > 0, it has enough soap/water.
HHs(:,17) = pSafeStorage * (HHs(:,14) ~= 0); %All households using HWT (partially or perfectly) have the same safe storage status.
%HHs(:,17) = binornd(1,pSafeStorage,nHH,1); %Randomly assigning whether the household has safe storage. TODO: binary, SS or not, village-wide?

%Now generating matrices for tracking each adult and each child.
Adults = zeros(nAdults,8); %Adults, with disease status (1:3) & status counter (4:6) for the 3 marker pathogens. 7th column designates household.
%Disease status: -1, susceptible; 0, immune; 1, exposed; 2, infected; 3, diseased
iA = 1; %Counter for rows in adults.
for i = 1:nHH; %Placing household indices in the 7th column of the matrix, so that people can be tied to certain households.
    pop = HHs(i,2);
    Adults(iA:iA+pop-1,7) = i;
    Adults(iA:iA+pop-1,8) = HHs(i,1); %Adds household size to each adult's record.
    iA = iA + pop;
end
Kids = zeros(nPeople(1) - nAdults, 8); %As above, for kids.
iA = 1;
kidHHindexes = find(HHs(:,3) > 0); %Gets households that have kids.
for j = 1:size(kidHHindexes,1); %For all households with kids...
    i = kidHHindexes(j);
    pop = HHs(i,3);
    Kids(iA:iA+pop-1,7) = i;
    Kids(iA:iA+pop-1,8) = HHs(i,1); %Adds household size to each kid's record.
    iA = iA + pop;
end
LogicalKids = logical([ones(nKids,1); zeros(nAdults,1)]); %Vector with 1 for each kid and 0 for each adult. Used later for distinguishing adults from kids.
People = vertcat(Kids,Adults); %Combining into a single matrix, one row per person, kids on top and adults on the bottom.
People(:,1:3) = 1;
People(:,4:6) = ceil(rand(nPeople,3) * max(latent)); %Start with everyone exposed with everything, with a random latent period; everyone will therefore develop infection or disease.
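%{
%EDITORIAL SKETCH (not part of the original model): the community setup above draws household sizes from a
%Poisson distribution and discards empty households, i.e. it samples a zero-truncated Poisson. This sketch pads
%the number of draws by 1/(1 - P(0)), with P(0) evaluated at the mean household size; that is an assumption about
%the intent, since the listing pads with poisspdf(0,nHH), which is essentially 0 for nHH = 200. All demo* names
%are hypothetical. poissrnd() and poisspdf() need the Statistics Toolbox (MATLAB) or the statistics package (Octave).
demoMean = 5; demoTarget = 200; %Same values as mPHH and nHH.
demoP0 = poisspdf(0, demoMean); %P(empty household) = exp(-5), about 0.0067.
demoDraws = poissrnd(demoMean, ceil(demoTarget / (1 - demoP0)), 1); %Draw extra households to offset the discards.
demoSizes = demoDraws(demoDraws > 0); %Discard empty households (zero truncation).
demoTruncMean = demoMean / (1 - demoP0); %Mean of a zero-truncated Poisson; slightly above demoMean.
disp([numel(demoSizes), mean(demoSizes), demoTruncMean]) %Household count, sample mean size, theoretical mean size.
%}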
%Now choosing particular values for parameters if calibrating.
switch(CalibOrEst);
    case 'C'; %TODO: Test.
        if nargin == 3;
            1; %Do nothing, therefore use default parameter values.
        else
            checkCFHDh = max(HHs(:,2) + HHs(:,3) * wHandMouth); %The most a HH could lose from visits is 1-(1-CFV)^max(connx/HH).
            if 10 ^ CFHDhr(2) * checkCFHDh > 1;
                warning('Upper end of CFHDh range for calibration would result in more pathogens moving out of the household environment than exist. Adjusting.')
                CFHDhr(2) = log10(1/checkCFHDh);
                if CFHDhr(1) > CFHDhr(2); CFHDhr(1) = CFHDhr(2); end
            end
            if 10^CFVr(2) > 1 - 10^CFHDhr(2); %TODO: This doesn't seem to work. Currently solving by using ~0.9 for CFVr(2) instead of 1.
                warning('Upper end of CFV range for calibration would result in more pathogens moving out of the household environment than exist. Adjusting.')
                CFVr(2) = 1 - CFHDhr(2);
                if CFVr(1) > CFVr(2); CFVr(1) = CFVr(2); end
            end
            CFSf = 10^(rand(1) * (CFSfr(2) - CFSfr(1)) + CFSfr(1)); %Uniform sampling of the surface water to stored drinking water calibration factor (log10 scale).
            CFHDh = 10^(rand(1) * (CFHDhr(2) - CFHDhr(1)) + CFHDhr(1)); %Uniform sampling of the from-household-env. calibration factor (log10 scale).
            CFV = 10^(rand(1) * (CFVr(2) - CFVr(1)) + CFVr(1)); %Uniform sampling of the visits calibration factor (log10 scale).
            CFDl = 10^(rand(1) * (CFDlr(2) - CFDlr(1)) + CFDlr(1)); %Uniform sampling of the land-to-person dose calibration factor (log10 scale).
            CFdecay = 10 .^ (rand(1,3) .* (CFdecayr(2,:) - CFdecayr(1,:)) + CFdecayr(1,:)); %Uniform sampling of decay modifiers converting base decay rates to actual decay rates (log10 scale).
            disp(sprintf('Chosen calib. factors: CFSf=%s, CFHDh=%s, CFV=%s, CFDl=%s, CFdecayEc=%s, CFdecayRo=%s, CFdecayGi=%s',num2str(CFSf),num2str(CFHDh),num2str(CFV),num2str(CFDl),num2str(CFdecay(1)),num2str(CFdecay(2)),num2str(CFdecay(3))))
        end
    case 'E'; %Using parameters from previous runs consistent with RCT.
%TODO: Update code below (copied from Main.m).
        CFsetID = TrialParams(l,1);
        CFSf = TrialParams(l,2);
        CFHDh = TrialParams(l,3);
        CFV = TrialParams(l,4);
        CFDl = TrialParams(l,5);
        CFdecay = TrialParams(l,6:8);
        lSan=ElSan; cSan=EcSan; lHWT=ElHWT; cHWT=EcHWT; lHand=ElHand; cHand=EcHand; %Reading in chosen LRV and compliance parameters for a particular intervention scenario.
    otherwise
        error('CalibOrEst must be C (calibration) or E (estimation).');
end %end switch block

%Now generating dosing weights for ingestion of pathogens on hands (depends on HH composition). Kids ingest more pathogens than adults. NO LONGER NEED wtDoses since pathogens in household environment can now persist from day to day (though they still exponentially decay).
%wtDoses = HHs(:,2:3) .* repmat([rHandMouth(2),rHandMouth(1)],nHH,1); %Reversing wtHandMouth to match cols. 2&3 in HHs.
%wtDoses(:,3) = sum(wtDoses,2);
%wtDoses(:,1:2) = wtDoses(:,1:2) ./ repmat(wtDoses(:,3),1,2) ./ HHs(:,2:3); %Gives proportion of microbes ingested from hands by that household's set of adults and set of children.
%wtDoses(find(isnan(wtDoses))) = 0; %Sets NaN (from dividing by 0) to 0.
chosenOnes = NaN(nPeople,5); %This matrix is used in the defecation step each day.
EffM = zeros(2,9); %This vector stores the number of 'effective' microbes over the course of the simulation (those that contributed to a new infection, kids [row 1] & adults [row 2]).

%Connections between households: See Zelner 2012 (in press, AJPH).
nEdges = round(nHH*meanDeg/2);
if nEdges > nHH * (nHH-1) / 2;
    disp('Community is too small to generate network of desired degree. Creating fully connected network.')
    nEdges = nHH * (nHH-1) / 2;
end
Am = erdrey(nHH,nEdges); %As above, but random graph (Erdos-Renyi). 2nd argument is the number of edges. Yields a sparse matrix.
[Ax Ay Av] = find(Am); %Generating adjacency list (1 row per connection).
Al = horzcat(Ax,Ay,Av); %Note that all connections in the adjacency list are 2-way, therefore listed twice.
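%{
%EDITORIAL SKETCH (not part of the original model): erdrey() is not shown in this listing, so the lines below are
%only an assumed stand-in for what a fixed-edge-count Erdos-Renyi generator does: pick the requested number of
%distinct unordered household pairs uniformly at random and store them in a symmetric sparse adjacency matrix.
%All demo* names are hypothetical. To run, paste the lines below into a fresh Octave/MATLAB session.
demoN = 200; demoDeg = 5.3; %Same values as nHH and meanDeg.
demoM = round(demoN * demoDeg / 2); %Each edge adds 1 to the degree of 2 households, hence the division by 2.
[demoI, demoJ] = find(triu(ones(demoN), 1)); %All unordered pairs with demoI < demoJ.
demoPick = randperm(numel(demoI)); demoPick = demoPick(1:demoM); %Choose demoM distinct pairs.
demoA = sparse([demoI(demoPick); demoJ(demoPick)], [demoJ(demoPick); demoI(demoPick)], 1, demoN, demoN); %Symmetric adjacency matrix.
[demoX, demoY] = find(demoA); %Adjacency list; each connection appears twice, once per direction.
disp([size(demoX,1), full(sum(demoA(:))) / demoN]) %Number of list rows (2*demoM) and mean degree (2*demoM/demoN).
%}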
nA = size(Al,1); %Number of possible visits. Each connection has 2 possible ways to visit (A visits B, or B visits A).

%Now generating all the random numbers needed during the simulation.
nRandCols = tMax+1; %Random number table (and event log) is pregenerated based on this. One column per day.
RPoopPlace = rand(nPeople,nRandCols); %Random table for defecation location, each person, each day. Unused if pPoopH2O==0 (the default).
%RCompHW = rand(nPeople,nRandCols); %Random table for handwashing compliance, by household. Compliance is the same for all household members, but is assessed individually.
%RCompHWT = rand(nHH,nRandCols); %Random table for household water treatment compliance.
%RCompSan = rand(nPeople,nRandCols); %Random table for sanitation compliance, by household. Compliance is the same for all household members, but is assessed individually.
RVisit = rand(nA,nRandCols); %Random table for determining which visits happen each day.
RRain = rand(1,nRandCols); %Random vector for rainfall events.
RDRtime = rand(1,nRandCols); %Random vector for timing of dose response events within the day.
RDRpeople = rand(nPeople,3,nRandCols); %Random cube for determining daily outcome of dose response for each person & each pathogen. z coordinate is the day.
RMR = rand(nPeople,3,nRandCols); %Random cube for determining daily outcome of morbidity ratios for each person & each pathogen. z coordinate is the day.
Rbaseline = rand(nPeople,3,nRandCols); %Random cube for determining baseline exposures, which later turn into baseline infections. z coordinate is the day.

%Output sanity check on community size and composition.
disp(sprintf('Simulated community has %s people/km^2, %s households, %s people (%s>=5y, %s<5y). Daily water demand is %s L. Smallest HH is %s, biggest is %s, mean %s people/HH. Min connections per HH is %s, max is %s, mean is %s. Running for %s days, preceded by %s days of burn-in, total %s days.', ...
    num2str(nPeople/landArea),num2str(size(HHs,1)),num2str(size(People,1)),num2str(size(Adults,1)),num2str(size(Kids,1)), ...
    num2str(nHH * Ws),num2str(min(HHs(:,1))),num2str(max(HHs(:,1))),num2str(mean(HHs(:,1))), ...
    num2str(full(min(sum(Am)))),num2str(full(max(sum(Am)))),num2str(full(sum(sum(Am)))/nHH), ...
    num2str(tMax-daysBurnIn),num2str(daysBurnIn),num2str(tMax)));
%x = disp(['Simulated community has ',num2str(nPeople/landArea),' people/km^2, ',num2str(size(HHs,1)),' households, ',num2str(size(People,1)),' people (',num2str(size(Adults,1)),'>=5y, ',num2str(size(Kids,1)),'<5y). Smallest HH is ',num2str(min(HHs(:,1))),', biggest is ',num2str(max(HHs(:,1))),', mean ',num2str(mean(HHs(:,1))),' people/HH. Min connections per HH is ',num2str(full(min(sum(Am)))),', max is ',num2str(full(max(sum(Am)))),', mean is ',num2str(full(sum(sum(Am)))/nHH),'. Running for ',num2str(tMax-daysBurnIn),' days, preceded by ',num2str(daysBurnIn),' days of burn-in, total ',num2str(tMax),' days.']);
%disp(x)

%==== DAILY LOOP STARTS====
tic
while t < tMax && extinct == 0 && equilibrium == 0 %Keep iterating until max time is reached or a microbe goes extinct. The 'equilibrium' flag is not currently used.
%while t < tMax %For code verification.
t=t+1; %Advance to the next day
phase = 1; %Marker for debugging
initialHHs = HHs; initialPeople = People; %Storing copies of these matrices at the beginning of each day, for debugging.
%if Octave == 1;
% printf('-%s',num2str(t)); %Printing every day (for debugging)
%else
fprintf('%s',num2str(mod(t,10))); %Printing last digit of every day (for debugging)
%end
Sus = People(:,1:3)==-1; %Updating counts of people in various states, for each pathogen.
nSus = sum(Sus);
nSusK = sum(Sus(1:nKids,:));
nSusA = sum(Sus(nKids+1:nPeople,:));
Imm = People(:,1:3)==0;
nImm = sum(Imm);
nImmK = sum(Imm(1:nKids,:));
nImmA = sum(Imm(nKids+1:nPeople,:));
Exp = People(:,1:3)==1;
nExp = sum(Exp);
nExpK = sum(Exp(1:nKids,:));
nExpA = sum(Exp(nKids+1:nPeople,:));
Infec = People(:,1:3)==2; %Would rather use 'Inf' than 'Infec' here, but it means +infinity to Octave/Matlab.
nInf = sum(Infec);
nInfK = sum(Infec(1:nKids,:));
nInfA = sum(Infec(nKids+1:nPeople,:));
Ill = People(:,1:3)==3;
nIll = sum(Ill);
nIllK = sum(Ill(1:nKids,:));
nIllA = sum(Ill(nKids+1:nPeople,:));
if sum(sum([Sus;Imm;Exp;Infec;Ill]) == [nPeople nPeople nPeople]) ~= 3;
    error('Sum of all possible states ~= total number of people, for at least 1 microbe.');
end
%Above are 3-element vectors. Now need # of people infected with anything, but ill with nothing (nInfNotIll), and # of people ill with anything (nIllAny).
test = sum(abs(People(:,1:3)) .^ 3, 2); %This variable distinguishes 'any infected & non-ill' from 'any ill'.
InfNotIll = test < 27 & test >= 8;
nInfNotIll = sum(InfNotIll);
nInfNotIllK = sum(InfNotIll(1:nKids));
nInfNotIllA = sum(InfNotIll(nKids+1:nPeople));
IllAny = sum(People(:,1:3)==3, 2) > 0;
nIllAny = sum(IllAny);
nIllAnyK = sum(IllAny(1:nKids));
nIllAnyA = sum(IllAny(nKids+1:nPeople));

%Check for extinction of any microbes.
phase = 2;
%The next loop is not needed since extinctions have been avoided through assigning baseline infections.
%{
if t > daysBurnIn + 1; %This avoids an error (negative incidence) if extinction happens during initial equilibration period.
    if sum(People(:,1) <= eps) == nPeople & sum([sum(HHs(:,[4 7 10])),nMw(1),nMl(1)]) <= 0.01;
        extinct = 1; disp(['Sim stopped at time ',num2str(t),' due to extinction of bacteria']) %Terminates the event loop after this day is done.
    elseif sum(People(:,2) <= eps) == nPeople & sum([sum(HHs(:,[5 8 11])),nMw(2),nMl(2)]) <= 0.01;
        extinct = 2; disp(['Sim stopped at time ',num2str(t),' due to extinction of viruses']) %Terminates the event loop after this day is done.
    elseif sum(People(:,3) <= eps) == nPeople & sum([sum(HHs(:,[6 9 12])),nMw(3),nMl(3)]) <= 0.01;
        extinct = 3; disp(['Sim stopped at time ',num2str(t),' due to extinction of protozoa']) %Terminates the event loop after this day is done.
    end
end
%}
%if sum(sum(People(:,1:3) <= eps) == nPeople) >= 1 & (sum([sum(HHs(:,[4 7 10]),nMw(1),nMl(1)])) <= 0.01 | sum([sum(HHs(:,[5 8 11]),nMw(2),nMl(2)])) <= 0.01 | sum([sum(HHs(:,[6 9 12]),nMw(3),nMl(3)])) <= 0.01) );
% extinct = 1; disp(['Sim stopped early due to extinction of at least one pathogen near time ',num2str(t)]) %Terminates the event loop after this day is done.
%end
%{
if mod(t,50) == 0 && t >= 100; %Test whether system has equilibrated, based on any infection (without illness). TODO: Never trips. Probably unnecessary anyway.
    typeOne = 0.1; %Test whether last 50 days & previous 50 days are drawn from the same distribution, with this value of alpha.
    eqPass = 0; %Number of times the test has failed to reject hypothesis of same distribution.
    if Octave==1; %Octave and Matlab have different function names for the Wilcoxon rank-sum test.
        [pval,ztest] = u_test(Log(t-99:t-50,17),Log(t-49:t,17));
    else %if running Matlab:
        [pval,reject] = ranksum(Log(t-99:t-50,17),Log(t-49:t,17),'alpha',typeOne);
    end
    if pval < typeOne; eqPass = 0; else eqPass = eqPass + 1; end
    if eqPass==2; %If the test fails to reject twice in a row, set the equilibrium flag, terminating the event loop:
        equilibrium = 1; disp(['Sim stopped early; equilibrium apparently reached near time ',num2str(t)])
    end
end
%}

%Defecation (the only source of microbes). First, partitioning shedding people into 4 categories. TODO: Fully convert to daily fecal output? Maybe change fpp, rPoopIll, and rPoopInf.
phase = 3;
chosenOnes(:,1) = IllAny & LogicalKids; %Ill kids.
chosenOnes(:,2) = InfNotIll & LogicalKids; %Asymptomatic kids.
chosenOnes(:,3) = IllAny & ~LogicalKids; %Ill adults.
chosenOnes(:,4) = InfNotIll & ~LogicalKids; %Asymptomatic adults.
chosenOnes(:,5) = ~IllAny & ~InfNotIll; %Uninfected people.
if any(sum(chosenOnes,2) ~= 1); %Every person must fall into exactly 1 category.
    error('Multiple allocation of some people to different categories.')
end
nMwOld = nMw; nMlOld = nMl;
[HHs, nMw, nMl] = Pooping(RPoopPlace(:,t), People, HHs, Mpgf,fpp(1)*rPoopIll,fHands*rPoopIll,nMw,nMl,pPoopH2O,lHand,lSan,chosenOnes(:,1)); %Ill kids. Note more poop on hands due to repeated defecation (rPoopIll).
[HHs, nMw, nMl] = Pooping(RPoopPlace(:,t), People, HHs, Mpgf,fpp(1)*rPoopInf,fHands,nMw,nMl,pPoopH2O,lHand,lSan,chosenOnes(:,2)); %Asymptomatic kids.
[HHs, nMw, nMl] = Pooping(RPoopPlace(:,t), People, HHs, Mpgf,fpp(2)*rPoopIll,fHands*rPoopIll,nMw,nMl,pPoopH2O,lHand,lSan,chosenOnes(:,3)); %Ill adults. Note more poop on hands due to repeated defecation (rPoopIll).
[HHs, nMw, nMl] = Pooping(RPoopPlace(:,t), People, HHs, Mpgf,fpp(2)*rPoopInf,fHands,nMw,nMl,pPoopH2O,lHand,lSan,chosenOnes(:,4)); %Asymptomatic adults.
if storeDetails == 1; LogFlux(t,:,5) = nMw - nMwOld; LogFlux(t,:,6) = nMl - nMlOld; end
postPoopHHs = HHs; %Making a copy for debugging.

%Next transfer: land to water via rain etc.
phase = 4;
if RRain(t) < rRain;
    Runoff = nMl * xRunoff(2);
    tRain(t,2) = 1; %Flagging this day as a rainy one.
else
    Runoff = nMl * xRunoff(1);
end
nMl = nMl - Runoff; %Removing pathogens from land.
nMw = nMw + Runoff; %Adding pathogens to water.
if storeDetails == 1; LogFlux(t,1:3,4) = Runoff; end

%Now quantifying all possible transfers; then apply them simultaneously (so as to make them independent).
%Inter-household visits: quantifying microbe exchange.
phase = 5;
chosenVisits = Al(RVisit(:,t) < rVisit,:); %Choosing the visits between households that actually happen today (conceptualized as 1 person meeting 1 other person). Throws odd "Undefined function 'visit' for input arguments of type 'char'" error in MATLAB compiled code, but not in interpreted code.
%The following complexity ('for' loop) is necessary to handle successive visits on the same day by the same household.
HU = unique([chosenVisits(:,1); chosenVisits(:,2)]); %List of households involved in at least 1 visit
nHHv = size(HU,1); %Number of households involved in at least 1 visit
MH = [HHs(HU,[1 4 5 6]), HU]; MHorig=MH; %Microbes on hands for all households involved in at least 1 visit. Also includes # people in each household (column 1) & the index of the household in the HHs matrix (column 5).
for i = 1:size(chosenVisits,1); %Looping once over each visit.
    %i
    MHrow1 = find(MH(:,5)==chosenVisits(i,1)); %Microbes on hands for the 1st HH involved in visit i.
    MHrow2 = find(MH(:,5)==chosenVisits(i,2)); %Microbes on hands for the 2nd HH involved in visit i.
    Mfrom1 = MH(MHrow1,2:4) ./ MH(MHrow1,1) .* CFV;
    Mfrom2 = MH(MHrow2,2:4) ./ MH(MHrow2,1) .* CFV;
    MH(MHrow1,2:4) = MH(MHrow1,2:4) - Mfrom1 + Mfrom2;
    MH(MHrow2,2:4) = MH(MHrow2,2:4) - Mfrom2 + Mfrom1;
end
for i = 2:4;
    if abs(sum(MHorig(:,i)) - sum(MH(:,i))) > 1E-3; %Sums will not be exactly equal due to machine imprecision (TODO: is something else causing this?).
        warning(sprintf('Imbalance in visit transfer, pathogen %s, difference = %s',num2str(i-1),num2str( sum(MHorig(:,i))-sum(MH(:,i)) ) ))
    end
    if abs(sum(MHorig(:,i)) - sum(MH(:,i))) > 1E-1; error('Visit transfer imbalance too severe! Halting.'); end
end
NetVisitSwaps1 = MHorig(:,2:4) - MH(:,2:4); %Converting to a matrix the size of the Hands columns of the HHs matrix.
NetVisitSwaps = sparse(repmat(HU',1,3), [ones(1,nHHv), ones(1,nHHv)*2, ones(1,nHHv)*3], NetVisitSwaps1(:), nHH, 3); %TODO: Doublecheck.
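%{
%EDITORIAL SKETCH (not part of the original model): a two-household example of the visit exchange computed in the
%loop above. Each household hands over its per-person microbe load times the visit factor, so whatever one side
%loses the other gains and the community total is conserved. The numbers and all demo* names are hypothetical.
%To run, paste the lines below into a fresh Octave/MATLAB session.
demoPeople = [4; 6]; %People in households 1 and 2.
demoHands = [1000; 0]; %Microbes in each household environment before the visit.
demoCFV = 0.1; %Hypothetical visit calibration factor (CFV in the listing).
demoFrom = demoHands ./ demoPeople .* demoCFV; %Microbes leaving each household: about [25; 0].
demoAfter = demoHands - demoFrom + flipud(demoFrom); %Each household loses its own share and gains the other's: about [975; 25].
disp([demoAfter', sum(demoAfter)]) %The total stays at 1000, up to rounding error.
%}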
if storeDetails == 1; LogFlux(t,1:3,2) = sum(abs(NetVisitSwaps)); end
HHs(:,4:6) = HHs(:,4:6) - NetVisitSwaps; %Applying net microbe exchange from visits.
%TODO: Check here that the above transfers are positive? Currently there is a similar check within Inact().

%===Resupplying stored water, handling rainfall, shedding, and swapping pathogens.===
phase = 6;
nMl = nMl + sum(HHs(:,7:9)); %Dumping out remaining water; pathogens go to the land. This should be a very small flow.
storedWater = repmat(nMw * CFSf, nHH, 1);
nMw = nMw - sum(storedWater); %Microbes in stored water are removed from surface water.
HHs(:,7:9) = storedWater; %Household water is resupplied, at source water conc. of microbes.
if storeDetails == 1; LogFlux(t,1:3,1) = sum(storedWater); end;
HHs(:,7:9) = ApplyLRVs(HHs(:,7:9), lHWT, HHs(:,14)); %Determining compliance and applying LRVs from HWT.

%Hand-to-stored-water transfer: applying microbe movement. Contamination of stored water by hands (from drinking, or other decanting of water). Note: not simultaneous with actual water intake (though perhaps it should be; see DR below).
phase = 7;
MfromHtoW = HHs(:,4:6) .* CFHDh .* repmat(~HHs(:,17),1,3); %Calculating microbes transferred from hands to water within each HH. Do not need to consider # of people/HH at this step (more people automatically mean more hand contamination since they all defecate). Safe storage negates transfer (last term).
HHs(:,4:6) = HHs(:,4:6) - MfromHtoW; %Removing microbes from hands within each household.
HHs(:,7:9) = HHs(:,7:9) + MfromHtoW; %Adding microbes to water within each household.
if storeDetails == 1;
    LogFlux(t,1:3,1) = LogFlux(t,1:3,1) + sum(MfromHtoW); %Adding drinker hand contamination to water-gatherer hand contamination.
%{
    for i = 1:3; %Loop over pathogens
        if sum(People(:,i) ~= -1) == 0 & sum(HHs(:,i+3)) > sum(initialHHs(:,i+3)); %If all are susceptible and overall hand pathogens somehow increase:
            warning('Unexplained increase in hand compartments')
            warnHHs = HHs; warnPeople = People;
            equilibrium = 1; %Terminates simulation while allowing charts to happen.
        end
    end
%}
end

%===Inactivating pathogens in all compartments, over the first part of a day.
phase = 8;
PreSink = sum([HHs(:,4:6); HHs(:,7:9); nMw; nMl]);
[HHs, nMw, nMl] = Inact(RDRtime(t), HHs, nMw, nMl, CFdecay, Wflow, t);
if storeDetails == 1; LogFlux(t,:,7) = PreSink - sum([HHs(:,4:6); HHs(:,7:9); nMw; nMl]); end

%Daily bookkeeping here, just before status shifts & dose response - storing summary of simulation state and checking for problems.
phase = 9;
Log(t,1:24) = [t nSus nImm nExp nInf nIll nInfNotIll nIllAny nMw nMl]; %Storing aggregated status of all people.
Log(t,25:27) = sum(HHs(:,4:6)); %Total microbes on hands in all households.
Log(t,28:30) = sum(HHs(:,7:9)); %Total microbes in stored water in all households.
LogK(t,1:18) = [t nSusK nImmK nExpK nInfK nIllK nInfNotIllK nIllAnyK]; %Storing aggregated status, kids only.
LogA(t,1:18) = [t nSusA nImmA nExpA nInfA nIllA nInfNotIllA nIllAnyA]; %Storing aggregated status, adults only.
if storeDetails == 1; LogHHP{t,1} = HHs; LogHHP{t,2} = People; end

%For each person, increment all counters and assign durations if a status shifts. TODO: Sum 'Shifters' and log these state transitions daily, to output incidence.
phase = 10;
People(:,4:6) = People(:,4:6) - 1;
Shifters = People(:,4:6) == 0 & People(:,1:3) == 1; %People whose latent period has just expired...
PeopleStates = People(:,1:3); PeopleStates(Shifters) = 2; People(:,1:3) = PeopleStates; %...become infected & infectious...
People(:,4:6) = People(:,4:6) + Shifters .* [durEc(nPeople), durRo(nPeople), durGi(nPeople)]; %...and are assigned durations of infection...
LogIncInf(t,1:3) = sum(Shifters(1:nKids,:)); %...and tallies of these are stored...
LogIncInf(t,4:6) = sum(Shifters(nKids+1:nPeople,:));
LogIncInf(t,7:9) = sum(Shifters);
NewlyDiseased = Shifters & RMR(:,:,t) < [repmat(MRk,nKids,1); repmat(MRa,nAdults,1)]; %...but some of the newly infected are also randomly diseased...
PeopleStates = People(:,1:3); PeopleStates(NewlyDiseased) = 3; People(:,1:3) = PeopleStates; %...and receive 'diseased' status...
LogIncIll(t,1:3) = sum(NewlyDiseased(1:nKids,:)); %...and tallies of these are stored.
LogIncIll(t,4:6) = sum(NewlyDiseased(nKids+1:nPeople,:));
LogIncIll(t,7:9) = sum(NewlyDiseased);
Shifters = People(:,4:6) == 0 & (People(:,1:3) == 2 | People(:,1:3) == 3); %People whose period of infection/disease has just expired...
PeopleStates = People(:,1:3); PeopleStates(Shifters) = 0; People(:,1:3) = PeopleStates; %...become immune...
People(:,4:6) = People(:,4:6) + Shifters .* repmat(immune,nPeople,1); %...and are assigned a duration of immunity.
Shifters = People(:,4:6) == 0 & People(:,1:3) == 0; %People whose immunity has just expired...
PeopleStates = People(:,1:3); PeopleStates(Shifters) = -1; People(:,1:3) = PeopleStates; %...become susceptible...
PeopleCounters = People(:,4:6); PeopleCounters(Shifters) = 0; People(:,4:6) = PeopleCounters; %...and their time counters are set to 0.

%===Assessing dose response (susceptibles become exposed).===
phase = 11;
Doses = zeros(nPeople,9); %Dose matrix, 1 row/person. Columns 1:3, water; columns 4:6, land; columns 7:9, household environment.
Doses(1:nKids,1:3) = HHs(People(1:nKids,7),7:9) ./ Ws .* Wdd(1); %Dose from water, kids.
Doses(nKids+1:nPeople,1:3) = HHs(People(nKids+1:nPeople,7),7:9) ./ Ws .* Wdd(2); %Dose from water, adults.
dHHk = HHs(:,7:9) ./ Ws .* Wdd(1) .* repmat(HHs(:,3),1,3); %Determining kid doses from stored water for each HH.
dHHa = HHs(:,7:9) ./ Ws .* Wdd(2) .* repmat(HHs(:,2),1,3); %Determining adult doses from stored water for each HH.
HHs(:,7:9) = HHs(:,7:9) - dHHk - dHHa; %Actually removing the doses.
%Doses(1:nKids,4:6) = repmat(rHandMouth(1) * pLandHand * nMl, nKids,1);
Doses(1:nKids,4:6) = repmat(nMl .* CFDl .* wHandMouth, nKids,1);
Doses(nKids+1:nPeople,4:6) = repmat(nMl .* CFDl, nAdults,1);
nMl = nMl - sum(Doses(:,4:6)); %Removing land doses ingested by people from the land.
%Doses(1:nKids,7:9) = HHs(People(1:nKids,7),4:6) .* repmat(wtDoses(People(1:nKids,7),1),1,3); %Dose from hands, kids.
Doses(1:nKids,7:9) = HHs(People(1:nKids,7),4:6) .* CFHDh .* wHandMouth; %Dose from household environment, kids.
%Doses(nKids+1:nPeople,7:9) = HHs(People(nKids+1:nPeople,7),4:6) .* repmat(wtDoses(People(nKids+1:nPeople,7),2),1,3); %Dose from hands, adults.
Doses(nKids+1:nPeople,7:9) = HHs(People(nKids+1:nPeople,7),4:6) .* CFHDh; %Dose from household env., adults.
DosesHHenv = [accumarray(People(:,7),Doses(:,7)),accumarray(People(:,7),Doses(:,8)),accumarray(People(:,7),Doses(:,9))]; %Converting dose matrix from 1 row per person to 1 row per household. TODO: Doublecheck.
HHs(:,4:6) = HHs(:,4:6) - DosesHHenv; %Removing pathogens ingested from household environment.
CDoseH2O = sum(Doses(1:nKids,1:3)); if storeDetails == 1; LogFlux(t,:,8) = CDoseH2O; end
CDoseL = sum(Doses(1:nKids,4:6)); if storeDetails == 1; LogFlux(t,:,9) = CDoseL; end
CDoseH = sum(Doses(1:nKids,7:9)); if storeDetails == 1; LogFlux(t,:,9) = CDoseH; end
pDoseH2OK = CDoseH2O ./ (CDoseH2O + CDoseL + CDoseH); %Proportion of total community-wide doses from water (kids). TODO: Store for later inspection/analysis.
CDoseH2O = sum(Doses(nKids+1:nPeople,1:3)); if storeDetails == 1; LogFlux(t,:,10) = CDoseH2O; end
CDoseL = sum(Doses(nKids+1:nPeople,4:6)); if storeDetails == 1; LogFlux(t,:,11) = CDoseL; end
CDoseH = sum(Doses(nKids+1:nPeople,7:9)); if storeDetails == 1; LogFlux(t,:,11) = CDoseH; end
pDoseH2OA = CDoseH2O ./ (CDoseH2O + CDoseL + CDoseH); %Proportion of total community-wide doses from water (adults).
%TODO: Store for later inspection/analysis.
%DosesT = Doses(:,1:3) + Doses(:,4:6) + Doses(:,7:9); %Summing doses from water & hands for each person.
PeopleOld = People; %Copying People in order to assess later how many pos DR events are about to occur.
[People EffMnew] = DoseResponse(RDRpeople(:,:,t), People, nKids, Doses, KorN50, alpha, latent); %Determining dose response for each person. Includes latent period assignment (TODO: randomize?).
if t >= daysBurnIn; EffM = EffM + EffMnew; end; %Adding today's effective microbes to the tally.

%Assigning baseline exposures, which will develop into infections after the latent period expires.
Shifters = Rbaseline(:,:,t) < repmat(BaseInf/365,nPeople,1) & People(:,1:3) == -1; %People randomly chosen to get a baseline infection...
PeopleStates = People(:,1:3); PeopleStates(Shifters) = 1; People(:,1:3) = PeopleStates; %...are exposed...
PeopleCounters = People(:,4:6); PeopleCounters(Shifters) = 0; People(:,4:6) = PeopleCounters; %...their appropriate counter(s) (which are neg.) get set to 0...
People(:,4:6) = People(:,4:6) + Shifters .* repmat(latent,nPeople,1); %...and are assigned a latent period.
Log(t,31:33) = sum(Shifters);

%===Done with dose response and baseline exposures - now inactivating over the remainder of the day.
phase = 12;
PreSink = sum([HHs(:,4:6); HHs(:,7:9); nMw; nMl]);
[HHs, nMw, nMl] = Inact(1-RDRtime(t), HHs, nMw, nMl, CFdecay, Wflow, t);
if storeDetails == 1; LogFlux(t,:,7) = LogFlux(t,:,7) + PreSink - sum([HHs(:,4:6); HHs(:,7:9); nMw; nMl]); end

%===Day's activities are finished. Now for some bookkeeping.===
phase = 13;
if t == round(tMax/100);
    fprintf('\n1%% done, %s seconds elapsed, %s days so far, %s sec. per day\n',num2str(toc),num2str(t),num2str(toc/t));
elseif t == round(tMax/20);
    fprintf('\n5%% done, %s seconds elapsed, %s days so far, %s sec. per day\n',num2str(toc),num2str(t),num2str(toc/t));
elseif t == round(tMax/4);
    fprintf('\n25%% done, %s seconds elapsed, %s days so far, %s sec. per day\n',num2str(toc),num2str(t),num2str(toc/t));
elseif t == round(tMax/2);
    fprintf('\n50%% done, %s seconds elapsed, %s days so far, %s sec. per day\n',num2str(toc),num2str(t),num2str(toc/t));
elseif t == round(3*tMax/4);
    fprintf('\n75%% done, %s seconds elapsed, %s days so far, %s sec. per day\n',num2str(toc),num2str(t),num2str(toc/t));
end
if Octave == 1; fflush(stdout); end %Forces lines above to write to console.
end %Ends daily loop

%disp(x) %Redisplaying the 'sanity check' from the end of the community setup phase.
disp(sprintf('\nRun complete, %s sec., %s sec./day, day %s, now outputting & charting',num2str(toc),num2str(toc/tMax),num2str(t)));
tObsY = (t - daysBurnIn)/365; %Time observed, in years.
IncInfK = sum(LogIncInf(daysBurnIn:t,1:3)) ./ (nKids * tObsY);
IncInfA = sum(LogIncInf(daysBurnIn:t,4:6)) ./ (nAdults * tObsY);
IncInfP = sum(LogIncInf(daysBurnIn:t,7:9)) ./ (nPeople * tObsY);
IncIllK = sum(LogIncIll(daysBurnIn:t,1:3)) ./ (nKids * tObsY);
IncIllA = sum(LogIncIll(daysBurnIn:t,4:6)) ./ (nAdults * tObsY);
IncIllP = sum(LogIncIll(daysBurnIn:t,7:9)) ./ (nPeople * tObsY);
disp(sprintf('Child illness incidence %s, adult %s, overall %s, episodes/person-year, bac-vir-prot.',num2str(IncIllK),num2str(IncIllA),num2str(IncIllP) ));

%Storing output from each run in OutMatrix, one run per row. 1st column is the row number. Start with calibration variables.
OutMatrix(l,2) = CFSf;
OutMatrix(l,3) = CFHDh;
OutMatrix(l,4) = CFV;
OutMatrix(l,5) = CFDl;
OutMatrix(l,6:8) = CFdecay;
OutMatrix(l,9:11) = Mpgf;
OutMatrix(l,12:17) = [cSan cHWT cHand]; %Compliance.
OutMatrix(l,18) = pSafeStorage; %Whether HWT compliers are using safe storage.
OutMatrix(l,19:27) = [lSan lHWT lHand]; %Log reduction values.
OutMatrix(l,28:39) = [IncInfK sum(IncInfK) IncInfA sum(IncInfA) IncInfP sum(IncInfP)]; %Incidence of infection, bac.-vir.-prot.-total, kids, adults and all people.
OutMatrix(l,40:51) = [IncIllK sum(IncIllK) IncIllA sum(IncIllA) IncIllP sum(IncIllP)]; %Incidence of illness, bac.-vir.-prot.-total, kids, adults and all people.
OutMatrix(l,52:60) = EffM(1,:); %Microbes contributing to actual new infections in kids, by route.
OutMatrix(l,61:69) = EffM(2,:); %Microbes contributing to actual new infections in adults, by route.
OutMatrix(l,70) = fHands; %Number of grams of feces on hands per defecation event.
OutMatrix(l,71) = extinct; %Whether 1+ microbes went extinct.
OutMatrix(l,72) = tObsY; %Amount of time observed.
OutMatrix(l,73:75) = [nHH nKids nAdults]; %Number of households, kids, and adults.
OutMatrix(l,76:78) = mean(Log(daysBurnIn:tMax,19:21)); %Mean numbers of microbes in surface water (excluding burn-in).
OutMatrix(l,79:81) = mean(Log(daysBurnIn:tMax,22:24)); %Mean numbers of microbes on land (excluding burn-in).
OutMatrix(l,82:84) = mean(Log(daysBurnIn:tMax,28:30)); %Mean numbers of microbes in stored drinking water (excluding burn-in).
OutMatrix(l,85:87) = mean(Log(daysBurnIn:tMax,25:27)); %Mean numbers of microbes in household environment (excluding burn-in).
if CalibOrEst == 'E';
    OutMatrix(l,88) = CFsetID; %Storing the ID for the particular calibration factor set used.
end
%StoreParams(p,'OutputLog'); %p is a struct holding all the parameter values.
if storeDetails == 1 & CalibOrEst == 'C'; %Plotting a set of graphs for each run, if only a few runs are being done.
%Plotting people (kids):
F2=figure('Position',[100 100 1024 768]); %Putting graph at (100,100) on screen with 1024x768 resolution.
x=.05; y=.72; x1=.93; y1=.23; %Values for positioning of 1st subplot (modify y for placing 2nd & subsequent subplots).
subplot(3,1,1) %Plotting bacteria
PlotPeople(LogK, tRain, x, y, x1, y1, 'bacteria', 'children <5y', daysBurnIn, nKids);
subplot(3,1,2) %Plotting viruses
PlotPeople(LogK, tRain, x, y, x1, y1, 'viruses', 'children <5y', daysBurnIn, nKids);
subplot(3,1,3) %Plotting protozoa
PlotPeople(LogK, tRain, x, y, x1, y1, 'protozoa', 'children <5y', daysBurnIn, nKids);
eval(['print Graphics/Kids',strrep(strrep(char(p.StartTime),':','-'),' ','-'),'.png']) %Saving chart with datetime in filename w/o spaces or colons
%Plotting people (adults):
F3=figure('Position',[100 100 1024 768]); %Putting graph at (100,100) on screen with 1024x768 resolution.
x=.05; y=.72; x1=.93; y1=.23; %Values for positioning of 1st subplot (modify y for placing 2nd & subsequent subplots).
subplot(3,1,1) %Plotting bacteria
PlotPeople(LogA, tRain, x, y, x1, y1, 'bacteria', 'people >=5y', daysBurnIn, nAdults);
subplot(3,1,2) %Plotting viruses
PlotPeople(LogA, tRain, x, y, x1, y1, 'viruses', 'people >=5y', daysBurnIn, nAdults);
subplot(3,1,3) %Plotting protozoa
PlotPeople(LogA, tRain, x, y, x1, y1, 'protozoa', 'people >=5y', daysBurnIn, nAdults);
eval(['print Graphics/Adults',strrep(strrep(char(p.StartTime),':','-'),' ','-'),'.png']) %Saving chart with datetime in filename w/o spaces or colons
%Plotting people (everybody):
F4=figure('Position',[100 100 1024 768]); %Putting graph at (100,100) on screen with 1024x768 resolution.
x=.05; y=.72; x1=.93; y1=.23; %Values for positioning of 1st subplot (modify y for placing 2nd & subsequent subplots).
subplot(3,1,1) %Plotting bacteria
PlotPeople(Log, tRain, x, y, x1, y1, 'bacteria', 'people', daysBurnIn, nPeople);
subplot(3,1,2) %Plotting viruses
PlotPeople(Log, tRain, x, y, x1, y1, 'viruses', 'people', daysBurnIn, nPeople);
subplot(3,1,3) %Plotting protozoa
PlotPeople(Log, tRain, x, y, x1, y1, 'protozoa', 'people', daysBurnIn, nPeople);
eval(['print Graphics/People',strrep(strrep(char(p.StartTime),':','-'),' ','-'),'.png']) %Saving chart with datetime in filename w/o spaces or colons
%Plotting pathogens:
F5=figure('Position',[100 100 1024 768]); %Putting graph at (100,100) on screen with 1024x768 resolution.
x=.05; y=.55; x1=.93; y1=.42; %Values for positioning of 1st subplot (modify y for placing 2nd & subsequent subplots).
subplot(2,1,1) %Plotting pathogens in community.
%set(gca,'FontSize',18); %Increases font size on axis tick labels (gca is 'get current axes').
set(gca,'Position',[x y-0*.5 x1 y1]);
if Octave==1; %Octave and MATLAB handle line properties a bit differently in graphs.
    semilogy(Log(:,1),Log(:,19)+.1,'-r','linewidth',3,Log(:,1),Log(:,20)+.1,'b','linewidth',3,Log(:,1),Log(:,21)+.1,'-g','linewidth',3); %All 3 of these are thick lines
else
    semilogy(Log(:,1),Log(:,19)+.1,'-r',Log(:,1),Log(:,20)+.1,'b',Log(:,1),Log(:,21)+.1,'-g','linewidth',3); %All 3 of these are thick lines
end
hold on;
semilogy(Log(:,1),Log(:,22)+.1,'-r',Log(:,1),Log(:,23)+.1,'b',Log(:,1),Log(:,24)+.1,'-g');
plot(tRain(:,1),tRain(:,2)+80,' x');
title('Number of microbes in different community compartments')
ylabel('# microbes','fontsize',20) %Axis & tick labels overlap on screen, but output better to .PNG.
legend('Bac., surf. H2O','Viruses, surf. H2O','Prot., surf. H2O','Bac., soil','Viruses, soil','Prot., soil','Time of rain events','Location','SouthEast')
hold off;
subplot(2,1,2) %Plotting pathogens in households.
set(gca,'Position',[x y-1*.5 x1 y1]);
if Octave==1;
    semilogy(Log(:,1),Log(:,28)+.1,'-r','linewidth',3,Log(:,1),Log(:,29)+.1,'b','linewidth',3,Log(:,1),Log(:,30)+.1,'-g','linewidth',3); %All 3 of these are thick lines (stored water)
else
    semilogy(Log(:,1),Log(:,28)+.1,'-r',Log(:,1),Log(:,29)+.1,'b',Log(:,1),Log(:,30)+.1,'-g','linewidth',3); %All 3 of these are thick lines (stored water)
end
hold on;
semilogy(Log(:,1),Log(:,25)+.1,'-r',Log(:,1),Log(:,26)+.1,'b',Log(:,1),Log(:,27)+.1,'-g'); %Supposed to graph total pathogens on all households' hands.
plot(tRain(:,1),tRain(:,2)+80,' x');
title('Number of microbes in different household compartments')
xlabel('Time (days)');
ylabel('# microbes','fontsize',20) %Axis & tick labels overlap on screen, but output better to .PNG.
legend('Bac., stored H2O','Viruses, stored H2O','Prot., stored H2O','Bac., hands','Viruses, hands','Prot., hands','Time of rain events','Location','SouthEast')
hold off;
eval(['print Graphics/Microbes',strrep(strrep(char(p.StartTime),':','-'),' ','-'),'.png']) %Saving chart with datetime in filename w/o spaces or colons.
%Plotting fluxes of microbes.
F5=figure('Position',[100 100 1024 768]); %Putting graph at (100,100) on screen with 1024x768 resolution.
x=.05; y=.72; x1=.93; y1=.23; %Values for positioning of 1st subplot (modify y for placing 2nd & subsequent subplots).
subplot(3,1,1) %Plotting bacteria
PlotMicrobes(LogFlux, tMax, tRain, x, y, x1, y1, 'bacteria', daysBurnIn);
subplot(3,1,2) %Plotting viruses
PlotMicrobes(LogFlux, tMax, tRain, x, y, x1, y1, 'viruses', daysBurnIn);
subplot(3,1,3) %Plotting protozoa
PlotMicrobes(LogFlux, tMax, tRain, x, y, x1, y1, 'protozoa', daysBurnIn);
eval(['print Graphics/Fluxes',strrep(strrep(char(p.StartTime),':','-'),' ','-'),'.png']) %Saving chart with datetime in filename w/o spaces or colons.
disp(['Charts done.'])
end
disp(sprintf('Whole run took %s seconds.',num2str(toc)))
end %Ends main loop (once per run).
%outFilename = ['Results/',outFilename];
dlmwrite(outFilename, OutMatrix, '-append');
disp([' Results written to ',outFilename,'.']);
end %Ends function.

%This function applies log reduction values (LRVs) to stocks of pathogens.
%It simply reduces the input number of microbes (nMinput) according to the appropriate LRVs, and outputs the result.
%This should usually be used with row vectors, 3 values each (bacteria, viruses, and protozoa, in that order). However, it accommodates multiple rows.
%nMinput: Vector/matrix containing number of microbes that an LRV might be applied to.
%LRVs: Log reduction values to be applied. 1 column per microbe.
%Compliance: Proportion of all microbes to which the LRVs are being applied.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (ApplyLRVs.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function [nMoutput] = ApplyLRVs(nMinput, LRVs, Compliance);
nRows = size(nMinput,1); %Gets number of households/people that it's acting on.
Compliance = repmat(Compliance,1,size(LRVs,2)); %Replicating Compliance vector so it matches the size of nMinput.
LRVs = repmat(LRVs,nRows,1);
nMoutput = nMinput .* (1-Compliance) + nMinput .* Compliance .* 10 .^ -LRVs;
end

%Function to generate compliance behaviors for individual households given a particular compliance scheme.
%HHcol: the column in the household matrix that is being populated.
%cv: the compliance vector, i.e., cSan, cHWT, or cHand, 1st value being overall compliance, 2nd being compliance type.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (AssignCompliance.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
%TODO: correlate the household-level compliances assigned to the 3 main interventions?
function [output] = AssignCompliance(HHcol,cv);
ct = cv(2);
switch ct;
    case 1; %compliance type alpha: everyone perfectly complies or doesn't comply at all
        output = rand(size(HHcol,1),1) < cv(1); %Randomly assign binary value to compliance for each household.
    case 2; %compliance type beta: some perfectly comply, some partially comply, some don't comply at all
        output = rand(size(HHcol,1),1);
        output(find(output < cv(1)/2)) = 1; %Assigning perfect compliers a value of 1
        output(find(output > 1 - ((1-cv(1)) / 2) & output < 1)) = 0; %Assigning noncompliers a value of 0
        output(find(output > 0 & output < 1)) = cv(1); %Assigning the remaining partial compliers the overall compliance value
    case 3; %compliance type gamma: everyone partially complies
        output = cv(1);
    otherwise
        error('Compliance type != 1, 2, or 3, therefore invalid');
end
end

%Loop for obtaining parameter values for estimation runs while modifying reservoir size, pLandHand, etc. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Submit as several jobs to parallelize a calibration run (maybe not worth bothering with job array).
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (CalLoopFuncCompile.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function[OutM OutS] = CalibrationLoopFuncCompile(indexText,inc,calibRunsText,jobname);
%This helps with debugging, since arguments to compiled code can only be text.
disp(class(indexText))
index = str2num(indexText);
RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index.
calibRuns = str2num(calibRunsText);
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Recommended, but does not seem to be necessary.
%===
%===Parameter entry===
infile = 'nonapplicable.csv';
outfile = ['Results/',jobname,'.csv'];
%===End parameter entry===
%disp(['##### Running ',num2str(size(U,2)),'*',num2str(size(T,2)),'*',num2str(size(L,2)),'+1=',num2str(combos), ' parameter combinations on ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####'])
if inc(1:2) == 'Hi';
    EITSd('C', calibRuns, outfile, infile, [-11 -2.4], [-10 -1.5], [-8 -.1], [-15 -4], [-1 0 0;2 3 3], 200); %1st try; including the max. possible levels for all ranges of transfer params.
    disp(['##### DONE #####'])
end %End function.

%Determine response (infection) given a particular dose, whatever the route.
%These arguments (R, dose, KorN50, alpha) are vectors or matrices, 1 column per pathogen.
%Uses 3 random numbers per person to determine response.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DoseResponse.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function [People EffM] = DoseResponse(RDRpeople, People, nKids, Doses, KorN50, alpha, latent);
nP = size(People,1); %Number of people (rows).
nAdults = nP - nKids;
nM = size(KorN50,2); %Number of types of microbe (columns).
%marker=repmat(['B','V','P'],nP,1); %Marker that is used to record the type(s) of microbe that resulted in infection.
DosesT = Doses(:,1:3) + Doses(:,4:6) + Doses(:,7:9); %Summing doses from water & hands for each person.
Shifters = logical(zeros(nP,nM)); %Creating matrix of true/false for storing & processing outcomes.
DosesT = DosesT .* (People(:,1:3) == -1); %If person is not susceptible, set dose to 0.
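%--- Illustrative note (an added example, not part of the original EITSd listing) ---
%The loop that follows converts each person's total dose into an infection probability and
%compares it with a pre-drawn uniform random number (RDRpeople). A minimal sketch of that idea
%for one person and one microbe, with assumed parameter values chosen only for illustration
%(pInf and infected are hypothetical names):
%    k = 0.01; dose = 100;        %Assumed exponential parameter and dose.
%    pInf = 1 - exp(-k * dose);   %Same formula as DRexp; here 1 - exp(-1), about 0.632.
%    infected = rand() < pInf;    %Bernoulli draw: true with probability pInf.
%For the beta-Poisson form parameterized by N50 (as in DRbP), a dose equal to N50 gives a
%response of exactly 0.5 for any alpha, which is a convenient check on the parameterization.
%--- End of illustrative note ---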
for i = 1:nM %Loop over the 3 microbes
    if isnan(alpha(i))==1
        response = DRexp(KorN50(i),DosesT(:,i));
    else
        response = DRbP(KorN50(i),alpha(i),DosesT(:,i));
    end
    Shifters(:,i) = RDRpeople(:,i) < response; %Determines which people will become infected.
end
%tags = char(horzcat(ones(nP,1)*88, (marker .* outcome))); %Tags for pos. dose response events w. appropriate microbe(s) (e.g., 'FoodEatXBV').
%event = horzcat(event,tags); %Apply the tags to the event name.
PeopleStates = People(:,1:3); PeopleStates(Shifters) = 1; People(:,1:3) = PeopleStates; %...People who get a positive response become exposed...
PeopleCounters = People(:,4:6); PeopleCounters(Shifters) = 0; People(:,4:6) = PeopleCounters; %...their appropriate counter(s) (which are negative) get set to 0...
People(:,4:6) = People(:,4:6) + Shifters .* repmat(latent,nP,1); %...and are assigned a latent period.
DosesK = Doses(1:nKids,:);
DosesA = Doses((nKids+1):nP,:);
ShiftersK = Shifters(1:nKids,:);
ShiftersA = Shifters((nKids+1):nP,:);
EffMWK = sum(DosesK(:,1:3) .* ShiftersK); %Tallying microbes that contributed to a new infection.
EffMLK = sum(DosesK(:,4:6) .* ShiftersK);
EffMHK = sum(DosesK(:,7:9) .* ShiftersK);
EffMWA = sum(DosesA(:,1:3) .* ShiftersA); %Tallying microbes that contributed to a new infection.
EffMLA = sum(DosesA(:,4:6) .* ShiftersA);
EffMHA = sum(DosesA(:,7:9) .* ShiftersA);
EffM = [EffMWK EffMLK EffMHK; EffMWA EffMLA EffMHA];
%People(:,1:nM) = People(:,1:nM) + 2 * outcome; %Assigns exposure (status=1) to those people (who are susceptible, thus have status=-1).
%People(:,nM+1:nM+nM) = People(:,nM+1:nM+nM) .* !outcome; %Sets counter to 0 for those people who have a new infection incubating.
%People(:,nM+1:nM+nM) = People(:,nM+1:nM+nM) + repmat(latent,nP,1) .* outcome; %Assigns the length of the latent period.
%TODO: Assign randomly chosen latent period instead.
end

%Beta-Poisson dose response model, using N50 (default) or beta as a parameter
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRbP.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function outvar = DRbP(N50orBeta,alpha,invar,reverse,WhichParam) %Ordinarily, invar is dose & outvar is response.
if nargin == 3; reverse = 'no'; WhichParam = 'N50'; end
switch(reverse)
    case 'no'
        switch(WhichParam)
            case 'N50'
                outvar = 1-(1+(invar/N50orBeta)*(2^(1/alpha)-1)).^-alpha;
            case 'Beta'
                outvar = 1-(1+(invar/N50orBeta)).^-alpha;
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    case 'yes' %If reverse='yes', invar is response & outvar is dose.
        switch(WhichParam)
            case 'N50'
                outvar = N50orBeta * ( ((1-invar).^(-1/alpha) -1) / (2^(1/alpha)-1) );
            case 'Beta'
                outvar = N50orBeta * ((1-invar).^(-1/alpha) -1);
            otherwise
                error(['WhichParam must be "N50" or "Beta"'])
        end
    otherwise
        error(['reverse must be "no" or "yes"'])
end
end

%Exponential dose response model
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (DRexp.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function outvar = DRexp(k, invar, reverse) %This works in Matlab.
if nargin < 3; reverse = 'no'; end
switch(reverse);
    case 'no';
        outvar = 1-exp(-k * invar);
    case 'yes'; %If reverse='yes', invar is response & outvar is dose.
        outvar = log(1-invar)/-k;
    otherwise
        error(['reverse (last parameter) must be "no" or "yes"'])
end
end

%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durEc.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function output = durEc(n);
output = round(gamrnd(1.775,1.690,[n,1])); %Shape, then scale. From Estrada-Garcia 2009.
output(output == 0) = 1; %Sets zero durations to 1 day instead.
end

%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durGi.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function output = durGi(n); %Based on a fit of gamma dist. to limited info from Kent GP 1988.
output = round(gamrnd(3.206,3.431,[n,1])); %Shape, then scale
output(output == 0) = 1; %Sets zero durations to 1 day instead.
end

%Functions for calculating vectors of illness durations
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (durRo.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function output = durRo(n); %Based on 4 rotavirus-infected volunteers having durations of 1, 2, 3, and 4 days (Kapikian 1983).
output = ceil(rand([n,1]) * 4);
%output(output == 0) = 1; %Sets zero durations to 1 day instead.
end

%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations.
%mix: Proportion of childhood disease that is waterborne (A, B, or C).
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EstLoopFuncCompileBase.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function[OutM OutS] = EstLoopFuncCompileBase(indexText,jobname);
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
index = str2num(indexText);
if Octave == 0;
    RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default).
end
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Does not seem to be necessary.
%===End parameter entry===
disp(['##### Running 8 parameter combinations, 2 each from runs that fit for mixes A, B, C, and Z, no interventions, should have requested at least 8 members in the job array. #####'])
if index == 1 | index == 2; %If running the baseline parameters:
    EITSd('E',5,['Results/EstBaseA',indexText,'.csv'],'RTF_A_CalHi10b.csv',1,2,3,4,5,200,[0 0 0],[0 1],[0 0 0],[0 1],[0 0 0],[0 1]);
elseif index == 3 | index == 4;
    EITSd('E',5,['Results/EstBaseB',indexText,'.csv'],'RTF_B_CalHi10b.csv',1,2,3,4,5,200,[0 0 0],[0 1],[0 0 0],[0 1],[0 0 0],[0 1]);
elseif index == 5 | index == 6;
    EITSd('E',5,['Results/EstBaseC',indexText,'.csv'],'RTF_C_CalHi10b.csv',1,2,3,4,5,200,[0 0 0],[0 1],[0 0 0],[0 1],[0 0 0],[0 1]);
elseif index == 7 | index == 8;
    EITSd('E',5,['Results/EstBaseZ',indexText,'.csv'],'RTF_Z_CalHi10b.csv',1,2,3,4,5,200,[0 0 0],[0 1],[0 0 0],[0 1],[0 0 0],[0 1]);
else
    error('Too many jobs!')
end
disp(['##### DONE #####'])
end %End function.

%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations.
%mix: Proportion of childhood disease that is waterborne (A, B, or C).
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EstLoopFuncCompileHWT.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function[OutM OutS] = EstLoopFuncCompileHWT(indexText,mix,overallComplianceText,jobname);
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
index = str2num(indexText);
if Octave == 0;
    RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default).
end
oC = str2num(overallComplianceText);
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Does not seem to be necessary.
%===Parameter entry=== Note that 0 should not be included in L.
%P = [0 .1 .2]; %Vector of desired values for proportions of children never using the device.
%N = [0 .1 .2]; %Vector of desired values for proportions of children perfectly using the device.
L = [1 2 3 4 5]; %Vector of log reduction values desired (all marker pathogens get the same LRV).
%Testing the code using the vectors below.
%U=[.9 1]
%T=[.9 1]
%L=[1 2]
%Constructing a matrix with all possible combos of P, N, & L
%[p n l] = ndgrid(P,N,L);
%Combos = [p(:) n(:) l(:)];
%TODO: Build functionality to check N, P, overallCompliance, and pTreat to ensure they make sense before running.
%Building appropriate combinations of perfect compliers (P; 1st column) and noncompliers (N; 2nd column).
Combos = zeros(3,3); %Combo with max possible perfect compliers & max possible noncompliers (same as [0 1-oC]; [oC 1-oC] is 0/0).
Combos(:,1) = oC;
Combos(:,2) = [1 2 3];
Combos(:,3) = L(1); %This & subsequent 'for' loop copy the above for each possible LRV.
OutCombos = Combos;
for i = 2:size(L,2);
    NextCombos = Combos;
    NextCombos(:,3) = L(i);
    OutCombos = [OutCombos; NextCombos];
end
%Baselines = zeros(1,3); %Creating baseline row. No log reduction & no perfect compliance.
%Baselines(:,2) = 1; %Modifies above, so that 100% never use device.
Combos = OutCombos;
Combos' %Output results, transposed.
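%--- Illustrative note (an added example, not part of the original EITSd listing) ---
%The loop above stacks one copy of the 3-row block per LRV, giving 3 * numel(L) = 15 rows of
%[overall compliance, compliance type, LRV]. A sketch of an equivalent construction with ndgrid,
%offered only as an alternative under the same assumptions (scalar oC, compliance types 1:3);
%ct, lrv, and AltCombos are hypothetical names that do not appear in EITSd:
%    [ct, lrv] = ndgrid(1:3, L);    %ct varies fastest down the rows, matching the order above.
%    AltCombos = [repmat(oC, numel(ct), 1), ct(:), lrv(:)];
%With oC = 0.9 (an assumed value), the first four rows would be
%    0.9 1 1; 0.9 2 1; 0.9 3 1; 0.9 1 2
%--- End of illustrative note ---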
combos = size(Combos,1)
infile = ['RTF_',mix,'_CalHi10b.csv'];
outfile = ['Results/',jobname,'.csv'];
%===End parameter entry===
tempData = csvread(infile,1,1);
disp(['##### Running 3 * ',num2str(size(L,2)),' = ',num2str(combos),' parameter combinations on ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####'])
%if index == 1; %If running the baseline parameters:
% [OutM OutS] = Main(Combos(index,1), Combos(index,2), oC, [Combos(index,3) Combos(index,3) Combos(index,3)], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 1);
%else %If running the parameters from calibration:
EITSd('E',10,outfile,infile,1,2,3,4,5,200,[0 0 0],[0 1],Combos(index,3)*[1 1 1], [Combos(index,1) Combos(index,2)],[0 0 0],[0 1]);
%end
disp(['##### DONE #####'])
end %End function.

%Loop for obtaining estimation runs while modifying pUse, pTreat, and LRVs. Facilitates computing cluster use.
%Be sure to check that GetTrialParams.m is configured properly before running this script.
%Differs from EstimationLoopFuncCompile in that only a small number of parameter combinations are chosen, rather than all possible combinations.
%mix: Proportion of childhood disease that is waterborne (A, B, or C).
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (EstLoopFuncCompileHWTPC.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd.
If not, see <http://www.gnu.org/licenses/>.
%}
function[OutM OutS] = EstLoopFuncCompileHWTPC(indexText,jobname);
Octave = size(ver('Octave'),1); %Indicator of whether the code is running under Octave (1) or Matlab (0).
index = str2num(indexText);
if Octave == 0;
    RandStream.setDefaultStream(RandStream.create('mt19937ar','seed',sum([clock index*10]))); %Sets random stream based on clock & job index. Doesn't work with Octave (Octave bases the seed on the clock by default).
end
%oC = str2num(overallComplianceText);
%===Required lines for HPCC
setenv MKL_DYNAMIC FALSE
%maxNumCompThreads(1); %Throws an error. Does not seem to be necessary.
%===Parameter entry=== Note that 0 should not be included in L.
%P = [0 .1 .2]; %Vector of desired values for proportions of children never using the device.
%N = [0 .1 .2]; %Vector of desired values for proportions of children perfectly using the device.
L = [1 2 3 4 5]; %Vector of log reduction values desired (all marker pathogens get the same LRV).
%Testing the code using the vectors below.
%U=[.9 1]
%T=[.9 1]
%L=[1 2]
%Constructing a matrix with all possible combos of P, N, & L
%[p n l] = ndgrid(P,N,L);
%Combos = [p(:) n(:) l(:)];
%TODO: Build functionality to check N, P, overallCompliance, and pTreat to ensure they make sense before running.
%Building appropriate combinations of perfect compliers (P; 1st column) and noncompliers (N; 2nd column).
Combos = zeros(4,3); %Combo with max possible perfect compliers & max possible noncompliers (same as [0 1-oC]; [oC 1-oC] is 0/0).
Combos(:,1) = 1;
Combos(:,2) = [1 2 3 4];
Combos(:,3) = L(1); %This & subsequent 'for' loop copy the above for each possible LRV.
OutCombos = Combos;
for i = 2:size(L,2);
    NextCombos = Combos;
    NextCombos(:,3) = L(i);
    OutCombos = [OutCombos; NextCombos];
end
%Baselines = zeros(1,3); %Creating baseline row. No log reduction & no perfect compliance.
%Baselines(:,2) = 1; %Modifies above, so that 100% never use device.
Combos = OutCombos;
Combos' %Output results, transposed.
combos = size(Combos,1)
mixOptions='ABCZ'; %Translating the digits 1-4 into the letters A, B, C, and Z.
mix=mixOptions(Combos(index,2));
infile = ['RTF_',mix,'_CalHi10b.csv'];
outfile = ['Results/',jobname,mix,'.csv'];
%===End parameter entry===
tempData = csvread(infile,1,1);
disp(['##### Running 4 * ',num2str(size(L,2)),' = ',num2str(combos),' parameter combinations on ',num2str(size(tempData,1)),' parameter sets from calibration, should have requested at least that many members in the job array. #####'])
%if index == 1; %If running the baseline parameters:
% [OutM OutS] = Main(Combos(index,1), Combos(index,2), oC, [Combos(index,3) Combos(index,3) Combos(index,3)], [0 0 0], [0 0 0], 0, outfile, infile, multConc, nSpikes, multSpikes, 1);
%else %If running the parameters from calibration:
EITSd('E',10,outfile,infile,1,2,3,4,5,200,[0 0 0],[0 1],Combos(index,3)*[1 1 1],[1 1],[0 0 0],[0 1]);
%end
disp(['##### DONE #####'])
end %End function.

%Inactivation (attenuation) of microbes in all compartments. Also includes some error checking.
%Note that CFdecay is now applied to decay in all compartments (in EITS06 and later versions).
%Formerly (in EITS05), CFdecay was only applied to land & household environments, not to surface water or stored drinking water decay.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (Inact.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see <http://www.gnu.org/licenses/>.
%}
function [HHs, nMw, nMl] = Inact(RDRtime, HHs, nMw, nMl, CFdecay, Wflow, t);
nHH = size(HHs,1);
%if iscomplex(HHs) == 1; %Works in Octave, but not MATLAB.
if sum(sum(imag(HHs))) ~= 0; %Works in both Octave and MATLAB.
    error('Complex number in HHs!'); %This was a problem in earlier versions of the code.
end
if sum(sign([nMw nMl HHs(nHH*3+1:nHH*16)]) == -1) > 0; %Check for any problematic negative values.
    negs = find(HHs < 0);
    [rs,cs] = find(HHs < 0);
    warning('\n%s negative values in HHs at time %s, totaling %s, rows %s, cols %s, vals %s.',num2str(size(negs,1)),num2str(t),num2str(sum(HHs(negs))),num2str(rs),num2str(cs),num2str(HHs(negs))) %When negative values occur, they tend to be in single-person households with repeated inter-household visits.
    if sum(HHs(negs)) > -1E-100;
        warning('Very tiny negative values (> -1E-100), setting them to 0.')
        HHs(negs) = 0; %Kludge: effectively adds some pathogens to the system to counteract negative values; however, seldom needed.
    else
        error('Larger negative values than usual. Stopping.')
    end
    if sum(sign([nMw nMl]) == -1) > 0;
        disp(nMw); disp(nMl);
        error('Surface water or land has gone negative! Setting to 0.')
        nMw(nMw < 0) = 0;
        nMl(nMl < 0) = 0;
        disp(nMw); disp(nMl);
    end
end
%nMw = nMw .* exp((-gM-Wflow) * RDRtime); %Surface water (this line formerly used in EITS05).
nMw = nMw .* exp(-CFdecay .* RDRtime); %Surface water.
nMl = nMl .* exp(-CFdecay .* RDRtime); %Land. Next is household inactivation.
for i = 1:3; %Microbe. Corresponds to the appropriate entry in gM (inactivation rate).
    for j = [0 3]; %Compartment. 0 signifies hands (cols. 4:6 in HHs), 3 signifies stored water (cols. 7:9 in HHs).
        if j == 0;
            HHs(:,i+3+j) = HHs(:,i+3+j) .* exp(-CFdecay(i) * RDRtime);
        elseif j == 3;
            %HHs(:,i+3+j) = HHs(:,i+3+j) .* exp(-gM(i) * RDRtime); %This line formerly used in EITS05.
            HHs(:,i+3+j) = HHs(:,i+3+j) .* exp(-CFdecay(i) * RDRtime);
        end
    end
end
end

%Function for generating line charts of microbe transfers.
%Need to start the figure, set up subplots, and determine subplot positioning before calling this function.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (PlotMicrobes.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see .
%}
%LogFlux = NaN(tMax,3,8); %Cube to store fluxes of microbes. z is 1:11 (1-surface water to stored water at resupply; 2-net visit transfer; 3-not used, but formerly land-to-hand-to stored water at drinking; 4-rainfall; 5&6-pooping into surface H2O (not done) & land; 7-inactivation; 8&9-kids' dose, water & hands; 10&11-adults' dose, water & hands).
function PlotMicrobes(Log, tMax, tRain, x, y, x1, y1, microbe, daysBurnIn);
if microbe(1:3) == 'bac';
    add = 0;
elseif microbe(1:3) == 'vir';
    add = 1;
elseif microbe(1:3) == 'pro';
    add = 2;
else
    error('microbe must equal ''bacteria'', ''viruses'', or ''protozoa''.');
end
Log = Log + 0.1; %Avoids plotting zeros.
Y = add + 1; %Y coordinate within LogFlux data cube (denoting microbe type).
set(gca,'Position',[x y-add*.33 x1 y1]); %1:2; x&y of bottom L corner. 2:3; x&y of top R corner, minus 1:2.
FluxIn = sum(Log(:,Y,5:6),3);
FluxOut = sum(Log(:,Y,7:11),3);
semilogy(1:tMax,Log(:,Y,1),'-b',1:tMax,Log(:,Y,2),'-c','linewidth',3);
hold on;
semilogy(1:tMax,Log(:,Y,4),'ob','MarkerSize',7)
semilogy(1:tMax,Log(:,Y,6),'-m','LineWidth',6)
semilogy(1:tMax,Log(:,Y,7),'-y','LineWidth',3)
semilogy(1:tMax,Log(:,Y,8),'+r','MarkerSize',3)
semilogy(1:tMax,Log(:,Y,9),'-r','LineWidth',3)
semilogy(1:tMax,Log(:,Y,10),'+y','MarkerSize',3);
semilogy(1:tMax,Log(:,Y,11),'-y',1:tMax,FluxIn,'+k',1:tMax,FluxOut,'-k');
plot(tRain(:,1),tRain(:,2) * 0.5,' x');
title(['Daily transfers of ',microbe],'fontsize',20);
ylabel(['# ',microbe],'fontsize',20) %Axis & tick labels overlap on screen, but output better to .PNG.
legend('Water resupply','Visits','Land-water (runoff)','Poop onto land','Inactivation','Kid dose, water','Kid dose, hands','Adult dose, water','Adult dose, hands','Net pos flux','Net neg flux','Time of rain events','location','southeast')
plot([daysBurnIn daysBurnIn],[0.5 max(max(max(Log)))], '-k'); %Sticking a vertical line on the graph to denote when burn-in ends.
axis([0 tMax*1.3 0.5 max(max(FluxOut))*1.5]);
hold off;
end

%Function for generating line charts of infection status.
%Need to start the figure, set up subplots, and determine subplot positioning before calling this function.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (PlotPeople.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd.
If not, see .
%}
function PlotPeople(Log, tRain, x, y, x1, y1, microbe, host, daysBurnIn, nPeople);
if microbe(1:3) == 'bac';
    add = 0;
elseif microbe(1:3) == 'vir';
    add = 1;
elseif microbe(1:3) == 'pro';
    add = 2;
else
    error('microbe must equal ''bacteria'', ''viruses'', or ''protozoa''.');
end
tMax = size(tRain,1);
set(gca,'Position',[x y-add*.33 x1 y1]); %1:2; x&y of bottom L corner. 2:3; x&y of top R corner, minus 1:2.
plot(Log(:,1),Log(:,14+add),'-r',Log(:,1),Log(:,18),'-k','LineWidth',2);
hold on;
plot(Log(:,1),Log(:,2+add),'g-',Log(:,1),Log(:,5+add),'-b',Log(:,1),Log(:,8+add),'m',Log(:,1),Log(:,11+add),'-r',Log(:,1),Log(:,17),'-k');
%plot(Log(:,1),Log(:,2+add),'g-',Log(:,1),Log(:,5+add),'-b',Log(:,1),Log(:,8+add),'m',Log(:,1),Log(:,11+add),'-r',Log(:,1),Log(:,14+add),'r','LineWidth',2,Log(:,1),Log(:,17),'-k',Log(:,1),Log(:,18),'-k','LineWidth',2); %Works in Octave.
hold on;
plot(tRain(:,1),tRain(:,2)-1,' x');
title(['Daily infection status, ',microbe],'fontsize',20);
ylabel(['# ',host],'fontsize',20) %Axis & tick labels overlap on screen, but output better to .PNG.
legend('Ill','Any illness','Susceptible','Immune','Exposed','Infected','Any infection (no illness)','Rain events','Location','East')
%legend('Susceptible','Immune','Exposed','Infected','Ill','Any infection (no illness)','Any illness','Rain events','Location','East') %For commented-out plot() call above.
plot([daysBurnIn daysBurnIn],[0 nPeople], '-k');
axis([0 tMax*1.3 0 nPeople+1]);
hold off;
end

%Describes results of defecation events (microbes onto hands, land, and surface water) by a single person.
%{
COPYRIGHT INFORMATION
Copyright 2012 Kyle S. Enger (username "engerkyl" at the domain "msu.edu").
This file (Pooping.m) is part of EITSd.
EITSd is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
EITSd is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License (gpl.txt) along with EITSd. If not, see .
%}
%RPoopPlace: 1 column from the random number matrix determining whether the person poops into the water or on land.
%People: Rows from the People matrix, each row corresponding to a particular person.
%HHs: The HHs matrix, corresponding to all households.
%Mpgf: Microbes per gram of feces (3-element vector, 1 element per microbe type).
%fpp: Grams of feces excreted per defecation event (assuming 1 event per day). Single value; run function once for each desired value.
%fHands: Grams of feces on fingers after defecation event.
%nMw: Number of microbes in the reservoir (3-element vector, 1 element per microbe type).
%nMl: As above, but number of microbes on land.
%pPoopH2O: Probability that the person defecates directly into surface water (instead of on land).
%RCompHW: 1 column from the random number matrix determining whether the person handwashes.
%RCompSan: 1 column from the random number matrix determining whether the person uses sanitation.
%lHand: LRVs from handwashing.
%lSan: LRVs from sanitation.
%chosenOnes: Logical vector of rows in People that are being operated on.
function [HHs, nMw, nMl] = Pooping(RPoopPlace, People, HHs, Mpgf, fpp, fHands, nMw, nMl, pPoopH2O, lHand, lSan, chosenOnes);
nPeople = size(People,1);
if size(chosenOnes,1) ~= nPeople;
    error('People and chosenOnes do not match up!');
end
MicrobeLoad = repmat(Mpgf,nPeople,1) .* repmat(chosenOnes,1,3) .* (People(:,1:3)==2 | People(:,1:3)==3) .* repmat(fpp,nPeople,3); %Only microbes infecting the person are shed.
MicrobeLoadHands = MicrobeLoad .* (fHands/fpp);
MicrobeLoadEnv = MicrobeLoad - MicrobeLoadHands;
PeopleCompSanHW = HHs(People(:,7),[13 15]); %Expands household-level compliance to each person.
MicrobeLoadHands = ApplyLRVs(MicrobeLoadHands, lHand, PeopleCompSanHW(:,2)); %Applying handwashing LRVs to handwashing compliers' hands.
for i = 1:max(People(:,7)); %Looping over households, to apply new hand contamination to household 'hands' stock. Would be nice to vectorize, but not sure how.
    %PeopleInHHi = find(People(:,7)==i);
    MicrobesAdded = sum(MicrobeLoadHands(find(People(:,7)==i),:),1); %Forces sum of each column (even if there's only 1 row).
    %MicrobesAdded = MicrobeLoadHands(find(People(:,7)==i),:);
    HHs(i,4:6) = HHs(i,4:6) + MicrobesAdded;
end
MicrobeLoadEnv = ApplyLRVs(MicrobeLoadEnv, lSan, PeopleCompSanHW(:,1)); %Applying sanitation LRVs to sanitation compliers' feces.
PoopInH2O = RPoopPlace < pPoopH2O;
nMl = nMl + sum(MicrobeLoadEnv(~PoopInH2O,:),1); %Apply remaining microbes after sanitation to the land.
nMw = nMw + sum(MicrobeLoadEnv(PoopInH2O,:),1); %As above, for the water.
end

function A = erdrey(n,m)
%ERDREY Generate adjacency matrix for a G(n,m) type random graph.
%
% Input n: dimension of matrix (number of nodes in graph).
% m: 2*m is the number of 1's in matrix (number of edges in graph).
% Defaults to the smallest integer larger than n*log(n)/2.
%
% Output A: n by n symmetric matrix with the attribute sparse.
%
%
% Description: An undirected graph is chosen uniformly at random from
% the set of all symmetric graphs with n nodes and m
% edges.
%
% Reference: P. Erdos, A. Renyi,
% On Random Graphs,
% Publ. Math. Debrecen, 6 1959, pp. 290-297.
%
% Example: A = erdrey(100,10);


%Lines 22-26 added by Kyle S. Enger to ensure proper attribution. In all other respects, this copy of erdrey.m is identical to its original source.
%This file is part of CONTEST, a publicly available MATLAB toolbox: http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest
%See also Taylor A and Higham DJ (2009) CONTEST: A Controllable Test Matrix Toolbox for MATLAB. ACM Transactions on Mathematical Software. 35 (4).
%This copy of erdrey.m is reproduced in Kyle S. Enger's Ph.D. dissertation by permission of the authors (Des Higham, personal communication, 24 Aug. 2012).
%The file erdrey.m is not part of EITSd but is used by EITSd, and is included here for completeness.
if nargin == 1
    m = ceil(n*log(n)/2);
end
nonzeros = ceil(0.5*n*(n-1)*rand(m,1));
v = zeros(n,1);
for count = 1:n
    v(count) = count*(count-1)/2;
end
I = zeros(m,1);
J = zeros(m,1);
S = ones(m,1);
for count = 1:m
    i = min(find(v >= nonzeros(count)));
    j = nonzeros(count) - (i-1)*(i-2)/2;
    I(count) = i;
    J(count) = j;
end
A = sign(sparse([I;J],[J;I],[S;S],n,n));
while nnz(A) ~= 2*m
    difference = m-nnz(A)/2;
    Inew = zeros(difference,1);
    Jnew = zeros(difference,1);
    for count = 1:difference
        index = ceil(0.5*n*(n-1)*rand);
        Inew(count) = min(find(v>=index));
        Jnew(count) = index - (Inew(count)-1)*(Inew(count)-2)/2;
    end
    I = cat(1,I,Inew);
    J = cat(1,J,Jnew);
    S = ones(length(I),1);
    A = sign(sparse([I;J],[J;I],[S;S],n,n));
end

10. APPENDIX D: GLOSSARY
Allocation concealment: Conducting a study in such a way that assignment to a particular group is truly random, and not influenced by either the subject or the investigator. In theory, adequate allocation concealment should always be feasible.
Blinding: Conducting a study in such a way that the subject (single-blind) or the subject and the observer (double-blind) does not know which experimental group the subject belongs to.
BSF: Biosand filter, a HWT method consisting of a sand filter in which the outlet pipe starts at the bottom of the filter and exits the filter above the top layer of sand, thus allowing the sand to remain wet at all times.
CAMRA: Center for Advancing Microbial Risk Assessment. A multi-university center, it hosts the QMRAwiki (http://wiki.camra.msu.edu).
CDC: Centers for Disease Control and Prevention (United States government agency).
CFU: Colony-forming unit. A bacterial suspension can be quantitated by spreading a sample of it on a plate of growth medium. After incubation, the number of colonies is counted. This estimates the number of colony-forming units in the original suspension. CFUs are often assumed to represent single bacterial cells, although this assumption may not be accurate if the bacteria adhere to one another.
DALY: Disability-adjusted life-year. Allows comparison of morbidity among differing diseases. Diseases are assigned 'DALY weights' based on the duration of disease, the level of disability caused by the disease, and the chance of death (the ultimate disability). The concept was developed and expanded by WHO.
Dose response equation (or curve, or model): Relationship of the mean number of pathogens ingested to the probability of a response, such as infection or disease.
Dysentery: An enteric disease characterized by intestinal inflammation and bloody stools.
EAWAG: Swiss Federal Institute for Environmental Science and Technology (German acronym).
E. coli: Escherichia coli bacteria. Usually a commensal inhabitant of the mammalian gut, but some strains can cause disease.
EITS model: Environmental infection transmission system model. Resembles SIR & other similar models, but directly models pathogens in the environment, in addition to states of infection by hosts.
Endemic: Describes a disease that is constantly present in a community. Hyperendemic denotes constant presence at high levels. Endemic disease levels may fluctuate over time, but rapid increases in disease are described as epidemic.
Epidemic: Temporarily increased levels of a disease in a community.
FFU: Focus-forming unit. Analogous to CFU or PFU, but applies to viruses in cell culture.
Distinct sites that are disrupted on a lawn of cells are considered to have arisen from a single FFU (and perhaps a single virion, if they do not adhere to one another).
Fomite: A physical object that can become contaminated and thus transfer pathogens between hosts.
Helminths: Worms.
HWT: Household water treatment. Synonym of POU.
Incidence, or incidence rate: Number of new cases of a disease in a population divided by time (e.g., cases of diarrhea per child per year). This is a measure of risk.
Incubation period: Time from exposure to a pathogen until the first symptoms develop. Usually longer than the prepatent period.
ID: Incidence difference. Analogous to incidence ratio (IR), but ID is obtained by subtracting the incidence in the intervention group from the incidence in the control (non-intervention) group. Also known as attributable risk.
IR: Incidence ratio, a comparison of two incidence rates. Generally the incidence in the intervention group is divided by the incidence in the control (non-intervention) group. It is a type of relative risk, representing the proportion of the incidence remaining in the population after an intervention has been applied.
LFF: LifeStraw® Family Filter, a HWT gravity-fed filtration device capable of removing viruses. It is produced and distributed by the Vestergaard Frandsen corporation.
LRV: Log10 reduction value; the number of factors of 10 by which the number of viable microorganisms has been reduced, e.g., an LRV of 2 means that 99% of microorganisms have been inactivated, an LRV of 3 means that 99.9% of microorganisms have been inactivated, etc.
MATLAB: A software package (MATrix LABoratory) that is useful for programming computer simulations, among other things.
Matlab: A region of Bangladesh in which many studies of diarrhea have been conducted.
Morbidity ratio: Proportion of infected hosts that develop symptoms. Sometimes called the illness-to-infection ratio.
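The LRV, incidence, IR, and ID entries above can be illustrated with a short Octave/MATLAB calculation. This is only a sketch: the case counts, person-time, and variable names are hypothetical and are not taken from the dissertation's field trial or models.

```matlab
% Illustrative sketch only: every number below is hypothetical.

% LRV: the fraction of microorganisms surviving treatment is 10^(-LRV).
lrv = [1 2 3];
percentInactivated = 100 * (1 - 10 .^ (-lrv));   % 90, 99, and 99.9 percent

% Incidence: new cases divided by person-time (here, child-years).
casesControl = 300;       childYearsControl = 100;
casesIntervention = 180;  childYearsIntervention = 100;
incControl = casesControl / childYearsControl;                 % 3.0 cases per child-year
incIntervention = casesIntervention / childYearsIntervention;  % 1.8 cases per child-year

% IR: incidence with the intervention divided by incidence without it.
IR = incIntervention / incControl;   % 0.6, i.e., 60% of the incidence remains

% ID: control incidence minus intervention incidence.
ID = incControl - incIntervention;   % 1.2 cases per child-year

fprintf('IR = %.2f, ID = %.2f cases per child-year\n', IR, ID);
```

Under these assumed numbers, 1 - IR = 0.4, so the hypothetical intervention would prevent 40% of the incidence.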
NGO: ‘Non-governmental organization’, generally a private nonprofit organization conducting human development work.
NTU: Nephelometric turbidity units, a measure of cloudiness of water. 0 is perfectly clear. The USEPA requires that municipally treated water have < 0.3 NTU in ≥ 95% of monthly samples. Water that is so cloudy that it is opaque might have an NTU of several hundred.
Odds: The number of times an outcome occurs divided by the number of times the outcome does not occur. When the odds of an outcome in two different groups are expressed as a ratio, this is termed an ‘odds ratio’ or OR. ORs are analogous to relative risk and approximate relative risk when the disease is uncommon (in that circumstance, number of nonoccurrences ≈ entire population), though ORs are often used in contexts where a proper risk measure is unavailable.
ORS or ORT: Oral rehydration solution = oral rehydration salts = oral rehydration therapy. Giving a formulation of sugar and electrolytes in clean water by mouth to an ill person in order to replace fluids, electrolytes, and energy lost to diarrhea.
Outbreak: A small or localized epidemic.
Persistent diarrhea: Diarrhea lasting longer than 14 days.
Person-time: Number of persons observed multiplied by the average amount of time during which each person was observed. Analogous to ‘work-hours’.
PFU: Plaque-forming unit. Analogous to CFU, but applies to bacteriophages (viruses of bacteria) quantified on a lawn of bacteria. Round cleared areas in the lawn of bacteria are assumed to have arisen from a single bacteriophage.
POU: Point-of-use water treatment. Synonym of HWT.
Prepatent period: Time from exposure to a pathogen until the pathogen can be detected in the host. Usually shorter than the incubation period.
Prevalence, longitudinal: Amount of person-time spent ill divided by the total amount of person-time observed. This is a measure of risk.
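The statement in the Odds entry that an odds ratio approximates relative risk only when the disease is uncommon, and the person-time arithmetic behind longitudinal prevalence, can be checked numerically. The Octave/MATLAB sketch below uses hypothetical counts and variable names chosen purely for illustration.

```matlab
% Illustrative sketch only: every count below is hypothetical.

% Uncommon disease: 10 of 1000 people ill in group 1, 5 of 1000 in group 2.
ill = [10 5];  well = [990 995];
risk = ill ./ (ill + well);     % occurrences divided by all outcomes
odds = ill ./ well;             % occurrences divided by non-occurrences
rrRare = risk(1) / risk(2);     % relative risk = 2
orRare = odds(1) / odds(2);     % odds ratio, about 2.01 (close to the relative risk)

% Common disease: 500 of 1000 ill in group 1, 250 of 1000 in group 2.
ill = [500 250];  well = [500 750];
rrCommon = (ill(1) / (ill(1) + well(1))) / (ill(2) / (ill(2) + well(2)));  % relative risk = 2
orCommon = (ill(1) / well(1)) / (ill(2) / well(2));                        % odds ratio = 3

% Longitudinal prevalence: person-time ill divided by person-time observed.
childDaysIll = 40;  childDaysObserved = 2000;
longPrev = childDaysIll / childDaysObserved;   % 0.02

fprintf('Rare: RR = %.2f, OR = %.2f; common: RR = %.2f, OR = %.2f\n', rrRare, orRare, rrCommon, orCommon);
```

In the rare-disease case the two measures nearly coincide, while in the common-disease case the odds ratio overstates the relative risk.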
Prevalence, point: Proportion of a population ill with a disease at a single point in time.
Preventable fraction: Proportion of disease that is prevented by a public health intervention (such as household water treatment or handwashing) in a particular population.
Rate ratio: See relative risk. Rates describe occurrence per unit time. Simple proportions are often incorrectly called rates.
Relative risk: The ratio of two risk measures, used to illustrate the magnitude of an effect. By convention, relative risks under 1 denote a protective effect, while relative risks over 1 indicate an adverse effect.
Risk: The likelihood (loosely defined) of a particular adverse outcome occurring. The number of times a particular outcome occurs divided by the total number of outcomes.
Safe storage: A common attribute of HWT methods, incorporating a storage vessel for water which has a narrow neck and (usually) a spigot, to prevent hands or other objects from (re)contaminating the water within.
SIR model: A model of infection transmission in which hosts can occupy 3 states in this order: susceptible, infectious, and removed (meaning immune or dead). Commonly modeled by a simple system of differential equations describing the rates by which hosts transfer between states, though other methods may be used. Often elaborated to include other states of infection or other orderings of states, e.g., SIS (susceptible-infectious-susceptible), SEIS (susceptible-exposed-infectious-susceptible), SEIR (susceptible-exposed-infectious-removed), etc.
SODIS: Solar disinfection, a HWT method in which contaminated water is placed in clear plastic bottles, which are then placed in the sun. Microorganisms are inactivated by a combination of UV irradiation and heating.
Sustainability: The ability of an intervention to be accepted by a community and perpetually used without any input from outside the community.
Also pertains to management of resources in such a way that the resource is not depleted, e.g., using water from an aquifer at a rate no greater than its rate of recharge.
TTC: Thermotolerant coliforms. E. coli is an organism in this group. Commonly used as an indicator of fecal contamination of water.
UN: The United Nations.
UNICEF: The United Nations Children's Fund (originally United Nations International Children's Emergency Fund). An international development charity and agency that particularly focuses on children and mothers.
USEPA: Environmental Protection Agency (United States government agency).
UV: Ultraviolet light, which is sometimes used to inactivate pathogens in water.
Weaning: Cessation of breastfeeding and introduction of additional foods, a process that can span months or years. Initiation of weaning (i.e., cessation of exclusive breastfeeding) is associated with sharply increased diarrhea risk.
WHO: World Health Organization.
Z-score: Number of standard deviations away from the mean. In nutrition, z-scores refer to standard distributions defining weight-for-height, height-for-age, and weight-for-age for a well-nourished reference population. Z-scores of 2 or 3 are commonly used as cutoffs for adverse nutrition outcomes.

11. REFERENCES
Abad FX, Pintó RM and Bosch A (1994) Survival of enteric viruses on environmental fomites. Applied and Environmental Microbiology. 60 (10), 3704–3710.
Abba K, Sinfield R, Hart CA and Garner P (2009) Pathogens associated with persistent diarrhoea in children in low and middle income countries: systematic review. BMC Infectious Diseases. 9, 88.
Adams M and Nicolaides L (1997) Review of the sensitivity of different foodborne pathogens to fermentation. Food Control. 8 (5-6), 227–239.
Akinbami FO, Erinoso O and Akinwolere OA (1995) Defaecation pattern and intestinal transit in Nigerian children. African Journal of Medicine and Medical Sciences. 24 (4), 337–341.
Akpata ES (2004) Fluoride ingestion from drinking water by Nigerian children aged below 10 years. Community Dental Health. 21 (1), 25–31.
American Water Works Association (1999) Waterborne Pathogens: Manual of Water Supply Practices. Denver, CO: American Water Works Association
Anderson RM and May RM (1991) Infectious Diseases of Humans: Dynamics and Control. Oxford: Oxford University Press
Anon (2012) QMRAwiki. [Online] Available at: http://camrawiki.anr.msu.edu (accessed 26/07/12).
Ansari SA, Sattar SA, Springthorpe VS, Wells GA and Tostowaryk W (1989) In vivo protocol for testing efficacy of hand-washing agents against viruses and bacteria: experiments with rotavirus and Escherichia coli. Applied and Environmental Microbiology. 55 (12), 3113–3118.
Arnold B, Arana B, Mäusezahl D, Hubbard Alan and Colford JM (2009) Evaluation of a preexisting, 3-year household water treatment and handwashing intervention in rural Guatemala. International Journal of Epidemiology. 38 (6), 1651–1661.
Arnold BF and Colford JM (2007) Treating water with chlorine at point-of-use to improve water quality and reduce child diarrhea in developing countries: a systematic review and meta-analysis. The American Journal of Tropical Medicine and Hygiene. 76 (2), 354–364.
Arnold BF, Khush RS, Ramaswamy P, London AG, Rajkumar P, Ramaprabha P, Durairaj N, Hubbard AE, Balakrishnan K and Colford JM Jr (2010) Causal inference methods to study nonrandomized, preexisting development interventions. Proceedings of the National Academy of Sciences of the United States of America. 107 (52), 22605–22610.
Aronson JK (2007) Compliance, concordance, adherence. British Journal of Clinical Pharmacology. 63 (4), 383–384.
Atia AN and Buchman AL (2009) Oral rehydration solutions in non-cholera diarrhea: a review. The American Journal of Gastroenterology. Available at: http://www.ncbi.nlm.nih.gov/pubmed/19550407 (accessed 04/09/09).
Atmar RL, Opekun AR, Gilger MA, Estes Mary K, Crawford SE, Neill FH and Graham DY (2008) Norwalk virus shedding after experimental human infection. Emerging Infectious Diseases. 14 (10), 1553–1557.
Awumbila M and Momsen JH (1995) Gender and the environment. Women’s time use as a measure of environmental change. Global Environmental Change: Human and Policy Dimensions. 5 (4), 337–346.
Ayad M, Piani AL, Barrère B, Ekouevi K and Otto J (1994) Demographic characteristics of households. Calverton, Maryland, USA: Macro International Inc. Available at: http://www.measuredhs.com/publications/publication-cs14-comparative-reports.cfm.
Bahl R, Bhandari N, Saksena M, Strand T, Kumar GT, Bhan MK and Sommerfelt H (2002) Efficacy of zinc-fortified oral rehydration solution in 6- to 35-month-old children with acute diarrhea. The Journal of Pediatrics. 141 (5), 677–682.
Banda K, Sarkar R, Gopal S, Govindarajan J, Harijan BB, Jeyakumar MB, Mitta P, Sadanala ME, Selwyn T, Suresh CR, Thomas VA, Devadason P, Kumar R, Selvapandian D, Kang Gagandeep and Balraj Vinohar (2007) Water handling, sanitation and defecation practices in rural southern India: a knowledge, attitudes and practices study. Transactions of the Royal Society of Tropical Medicine and Hygiene. 101 (11), 1124–1130.
Barreto ML, Genser B, Strina Agostino, Teixeira MG, Assis AMO, Rego RF, Teles CA, Prado Matildes S, Matos SMA, Alcantara-Neves NM and Cairncross S (2010) Impact of a City-Wide Sanitation Programme in Northeast Brazil on Intestinal Parasites Infection in Young Children. Environmental Health Perspectives. Available at: http://ehp03.niehs.nih.gov/article/fetchArticle.action?articleURI=info:doi/10.1289/ehp.1002058 (accessed 18/08/10).
Barreto ML, Genser B, Strina A, Teixeira M, Assis A, Rego R, Teles C, Prado M, Matos S, Santos D, dos Santos L and Cairncross S (2007) Effect of city-wide sanitation programme on reduction in rate of childhood diarrhoea in northeast Brazil: assessment by two cohort studies. Lancet.
370 (9599), 1622–1628.
Bates SJ, Trostle J, Cevallos WT, Hubbard Alan and Eisenberg JNS (2007) Relating diarrheal disease to social networks and the geographic configuration of communities in rural Ecuador. American Journal of Epidemiology. 166 (9), 1088–1095.
Batterman S, Eisenberg JNS, Hardin R, Kruk ME, Lemos MC, Michalak AM, Mukherjee B, Renne E, Stein H, Watkins C and Wilson ML (2009) Sustainable control of water-related infectious diseases: a review and proposal for interdisciplinary health-based systems research. Environmental Health Perspectives. 117 (7), 1023–1032.
Bern C, Martines J, de Zoysa I and Glass R I (1992) The magnitude of the global problem of diarrhoeal disease: a ten-year update. Bulletin of the World Health Organization. 70 (6), 705–714.
Bhatnagar S, Bahl R, Sharma PK, Kumar GT, Saxena SK and Bhan MK (2004) Zinc with oral rehydration therapy reduces stool output and duration of diarrhea in hospitalized children: a randomized controlled trial. Journal of Pediatric Gastroenterology and Nutrition. 38 (1), 34–40.
Bhutta ZA, Black R E, Brown K H, Gardner JM, Gore S, Hidayat A, Khatun F, Martorell R, Ninh NX, Penny ME, Rosado J L, Roy SK, Ruel M, Sazawal S and Shankar A (1999) Prevention of diarrhea and pneumonia by zinc supplementation in children in developing countries: pooled analysis of randomized controlled trials. Zinc Investigators’ Collaborative Group. The Journal of Pediatrics. 135 (6), 689–697.
Bhutta ZA, Nelson EA, Lee WS, Tarr PI, Zablah R, Phua KB, Lindley K, Bass D and Phillips A (2008) Recent advances and evidence gaps in persistent diarrhea. Journal of Pediatric Gastroenterology and Nutrition. 47 (2), 260–5.
Bischoff WE, Reynolds TM, Sessler CN, Edmond MB and Wenzel RP (2000) Handwashing compliance by health care workers: The impact of introducing an accessible, alcohol-based hand antiseptic. Archives of Internal Medicine. 160 (7), 1017–1021.
Bishop RF (1996) Natural history of human rotavirus infection. Archives of Virology.
Supplementum. 12, 119–128.
Black R E (1993) Persistent diarrhea in children of developing countries. The Pediatric Infectious Disease Journal. 12 (9), 751–61; discussion 762–4.
Black R E, Levine MM, Clements ML, Cisneros L and Daya V (1982) Treatment of experimentally induced enterotoxigenic Escherichia coli diarrhea with trimethoprim, trimethoprim-sulfamethoxazole, or placebo. Reviews of Infectious Diseases. 4 (2), 540–545.
Black R E, Levine MM, Clements ML, Hughes TP and Blaser MJ (1988) Experimental Campylobacter jejuni infection in humans. The Journal of Infectious Diseases. 157 (3), 472–479.
Black R E, Merson MH, Rowe B, Taylor PR, Abdul Alim AR, Gross RJ and Sack DA (1981) Enterotoxigenic Escherichia coli diarrhoea: acquired immunity and transmission in an endemic area. Bulletin of the World Health Organization. 59 (2), 263–268.
Blaser MJ, Smith PD, Ravdin JI, Greenberg Harry B and Guerrant RL eds. (2002) Infections of the Gastrointestinal Tract (2nd edition). Philadelphia: Lippincott Williams & Wilkins
Boehm AB (2007) Enterococci concentrations in diverse coastal environments exhibit extreme variability. Environmental Science & Technology. 41 (24), 8227–8232.
Boisson S, Kiyombo M, Sthreshley L, Tumba S, Makambo J and Clasen T (2010) Field assessment of a novel household-based water filtration device: a randomised, placebo-controlled trial in the Democratic Republic of Congo. PLoS ONE. 5 (9), e12613.
Boschi-Pinto C, Velebit L and Shibuya K (2008) Estimating child mortality due to diarrhoea in developing countries. Bulletin of the World Health Organization. 86 (9), 710–717.
Briscoe J (1984) Intervention studies and the definition of dominant transmission routes. American Journal of Epidemiology. 120 (3), 449–455.
Brownawell AM, Caers W, Gibson GR, Kendall CWC, Lewis KD, Ringel Y and Slavin JL (2012) Prebiotics and the health benefits of fiber: current regulatory status, future research, and goals. The Journal of Nutrition. 142 (5), 962–974.
Brown Joe and Clasen T (2012) High adherence is necessary to realize health gains from water quality interventions. PLoS ONE. 7 (5), e36735.
Brown Joe, Sobsey MD and Proum S (2007) Use of Ceramic Water Filters in Cambodia. UNICEF, New York and WHO, Geneva
Brown Kenneth H, Hambidge KM and Ranum P (2010) Zinc fortification of cereal flours: current recommendations and research needs. Food and Nutrition Bulletin. 31 (1 Suppl), S62–74.
Bushen OY, Kohli A, Pinkerton RC, Dupnik K, Newman RD, Sears CL, Fayer R, Lima AAM and Guerrant RL (2007) Heavy cryptosporidial infections in children in northeast Brazil: comparison of Cryptosporidium hominis and Cryptosporidium parvum. Transactions of the Royal Society of Tropical Medicine and Hygiene. 101 (4), 378–384.
Buswell CM, Herlihy YM, Lawrence LM, McGuiggan JT, Marsh PD, Keevil CW and Leach SA (1998) Extended survival and persistence of Campylobacter spp. in water and aquatic biofilms and their detection by immunofluorescent-antibody and -rRNA staining. Applied and Environmental Microbiology. 64 (2), 733–741.
Cacciò SM, Thompson RCA, McLauchlin J and Smith HV (2005) Unravelling Cryptosporidium and Giardia epidemiology. Trends in Parasitology. 21 (9), 430–437.
Calloway DH, Odell AC and Margen S (1971) Sweat and miscellaneous nitrogen losses in human balance studies. The Journal of Nutrition. 101 (6), 775–786.
Carter MJ (2005) Enterically infecting viruses: pathogenicity, transmission and significance for food and waterborne infection. Journal of Applied Microbiology. 98 (6), 1354–1380.
CDC (2011) CDC Health Information for International Travel 2012: The Yellow Book (1st edition). G. W. Brunette ed. Oxford University Press, USA
CDC (2012) Epidemiology and Prevention of Vaccine-Preventable Diseases (12th edition). Washington, DC: Public Health Foundation Available at: http://www.cdc.gov/vaccines/pubs/pinkbook/pink-chapters.htm.
Centre for Affordable Water and Sanitation Technology (2008) Chlorine disinfection.
Available at: http://www.cawst.org/index.php?id=132#Chlorine.
Chacín-Bonilla L, Barrios F and Sanchez Y (2008) Environmental risk factors for Cryptosporidium infection in an island from Western Venezuela. Memórias do Instituto Oswaldo Cruz. 103 (1), 45–49.
Chappell Cynthia L, Okhuysen Pablo C, Langer-Curry R, Widmer G, Akiyoshi DE, Tanriverdi S and Tzipori S (2006) Cryptosporidium hominis: experimental challenge of healthy adults. The American Journal of Tropical Medicine and Hygiene. 75 (5), 851–857.
Checkley W, Gilman R H, Epstein LD, Suarez M, Diaz JF, Cabrera L, Black R E and Sterling CR (1997) Asymptomatic and symptomatic cryptosporidiosis: their acute effect on weight gain in Peruvian children. American Journal of Epidemiology. 145 (2), 156–163.
Chen LC, Scrimshaw NS and United Nations University eds. (1983) Diarrhea and Malnutrition: Interactions, Mechanisms, and Interventions. New York: Plenum Press
Cherian T, Wang S and Mantel C (2012) Rotavirus vaccines in developing countries: the potential impact, implementation challenges, and remaining questions. Vaccine. 30 Suppl 1, A3–6.
Chiller TM, Mendoza CE, Lopez MB, Alvarez M, Hoekstra Robert M, Keswick BH and Luby SP (2006) Reducing diarrhoea in Guatemalan children: randomized controlled trial of flocculant-disinfectant for drinking-water. Bulletin of the World Health Organization. 84 (1), 28–35.
Clasen T (2009) Scaling up household water treatment among low-income populations. Geneva, Switzerland: World Health Organization Available at: http://www.who.int/household_water/research/household_water_treatment/en/index.html.
Clasen T, Bartram J, Colford JM, Luby SP, Quick R and Sobsey MD (2009) Comment on ‘Household water treatment in poor populations: is there enough evidence for scaling up now?’ Environmental Science & Technology. 43 (14), 5542–5544; author reply 5545–5546.
Clasen TF, Brown Joseph and Collin SM (2006) Preventing diarrhoea with household ceramic water filters: assessment of a pilot project in Bolivia. International Journal of Environmental Health Research. 16 (3), 231–9.
Clasen T, Naranjo J, Frauchiger D and Gerba CP (2009) Laboratory assessment of a gravity-fed ultrafiltration water treatment device designed for household use in low-income settings. The American Journal of Tropical Medicine and Hygiene. 80 (5), 819–823.
Clasen T, Roberts IG, Rabie T, Schmidt W-P and Cairncross S (2009) Interventions to improve water quality for preventing diarrhoea. Cochrane Database of Systematic Reviews. (1).
Clasen T, Schmidt W-P, Rabie Tamer, Roberts I and Cairncross S (2007) Interventions to improve water quality for preventing diarrhoea: systematic review and meta-analysis. British Medical Journal. 334 (7597), 782.
Cook SM, Glass R I, LeBaron CW and Ho MS (1990) Global seasonality of rotavirus infections. Bulletin of the World Health Organization. 68 (2), 171–177.
Cooper BS, Pitman RJ, Edmunds WJ and Gay NJ (2006) Delaying the international spread of pandemic influenza. PLoS Medicine. 3 (6), e212.
Coulliette AD, Peterson LA, Mosberg JAW and Rose Joan B (2010) Evaluation of a new disinfection approach: efficacy of chlorine and bromine halogenated contact disinfection for reduction of viruses and microcystin toxin. The American Journal of Tropical Medicine and Hygiene. 82 (2), 279–288.
Cravioto A, Reyes RE, Trujillo F, Uribe F, Navarro A, De La Roca JM, Hernández JM, Pérez G and Vázquez V (1990) Risk of diarrhea during the first year of life associated with initial and subsequent colonization by specific enteropathogens. American Journal of Epidemiology. 131 (5), 886–904.
Crump J, Otieno P, Slutsker L, Keswick BH, Rosen D, Hoekstra R, Vulule J and Luby SP (2005) Household based treatment of drinking water with flocculant-disinfectant for preventing diarrhoea in areas with turbid source water in rural western Kenya: cluster randomised controlled trial. British Medical Journal. 331 (7515), 478–481.
Curtis VA and Cairncross S (2003) Effect of washing hands with soap on diarrhoea risk in the community: a systematic review. The Lancet Infectious Diseases. 3 (5), 275–81.
Curtis VA, Cairncross S and Yonli R (2000) Domestic hygiene and diarrhoea - pinpointing the problem. Tropical Medicine & International Health. 5 (1), 22–32.
Curtis VA, Danquah LO and Aunger RV (2009) Planned, motivated and habitual hygiene behaviour: an eleven country review. Health Education Research. 24 (4), 655–673.
Dalton CB, Mintz ED, Wells JG, Bopp CA and Tauxe RV (1999) Outbreaks of enterotoxigenic Escherichia coli infection in American adults: a clinical and epidemiologic profile. Epidemiology and Infection. 123 (1), 9–16.
Danciger M and Lopez M (1975) Numbers of Giardia in the feces of infected children. The American Journal of Tropical Medicine and Hygiene. 24 (2), 237–242.
Davies GJ, Crowder M, Reid B and Dickerson JW (1986) Bowel function measurements of individuals with different eating patterns. Gut. 27 (2), 164–169.
Dechesne M, Soyeux E, Loret J, Westrell T, Senstrom T, Gornik V, Koch C, Exner M, Stanger M, Agutter P, Lake R, Roser D, Ashbolt N, Dullemont Y, Hijnen W and Medema GJ (2006) Pathogens in source water Available at: http://www.microrisk.com/publish/cat_index_11.shtml.
deRegnier DP, Cole L, Schupp DG and Erlandsen SL (1989) Viability of Giardia cysts suspended in lake, river, and tap water. Applied and Environmental Microbiology. 55 (5), 1223–1229.
Doocy S and Burnham G (2006) Point-of-use water treatment and diarrhoea reduction in the emergency context: an effectiveness trial in Liberia. Tropical Medicine & International Health.
11 (10), 1542–1552.
Duflo E, Glennerster R and Kremer M (2007) Using randomization in development economics research: a toolkit. London, UK: Centre for Economic Policy Research
Dunk D (2007) A new look at bromine: a potential sleeping giant? Water Conditioning & Purification Magazine. 49 (10). Available at: http://www.wcponline.com/NewsView.cfm?ID=3644.
DuPont H L, Chappell C L, Sterling CR, Okhuysen P C, Rose J B and Jakubowski W (1995) The infectivity of Cryptosporidium parvum in healthy volunteers. The New England Journal of Medicine. 332 (13), 855–859.
DuPont H L, Formal SB, Hornick RB, Snyder MJ, Libonati JP, Sheahan DG, LaBrec EH and Kalas JP (1971) Pathogenesis of Escherichia coli diarrhea. The New England Journal of Medicine. 285 (1), 1–9.
EAWAG (2004) SODIS technical notes 1 through 17. Available at: http://www.sodis.ch/Text2002/T-Research.htm.
Eisenberg JNS, Hubbard Alan, Wade TJ, Sylvester MD, LeChevallier MW, Levy DA and Colford JM (2006) Inferences drawn from a risk assessment compared directly with a randomized trial of a home drinking water intervention. Environmental Health Perspectives. 114 (8), 1199–1204.
Eisenberg JNS, Lei X, Hubbard AH, Brookhart M and Colford JM (2005) The role of disease transmission and conferred immunity in outbreaks: Analysis of the 1993 Cryptosporidium outbreak in Milwaukee, Wisconsin. American Journal of Epidemiology. 161 (1), 62–72.
Eisenberg JNS, Moore K, Soller JA, Eisenberg D and Colford JM (2008) Microbial risk assessment framework for exposure to amended sludge projects. Environmental Health Perspectives. 116 (6), 727–733.
Eisenberg JNS, Scott JC and Porco T (2007) Integrating disease control strategies: balancing water sanitation and hygiene interventions to reduce diarrheal disease burden. American Journal of Public Health. 97 (5), 846–852.
Ejemot R, Ehiri J, Meremikwu M and Critchley J (2008) Hand washing for preventing diarrhoea. Cochrane Database of Systematic Reviews. (1).
Available at: http://onlinelibrary.wiley.com/doi/10.1002/14651858.CD004265.pub2/abstract (accessed 26/01/09).
Elliott M, Stauber C, Koksal F, DiGiano F and Sobsey MD (2008) Reductions of E. coli, echovirus type 12 and bacteriophages in an intermittently operated household-scale slow sand filter. Water Research. 42 (10-11), 2662–2670.
Enger KS, Nelson KL, Clasen T, Rose Joan B and Eisenberg JNS (2012) Linking quantitative microbial risk assessment and epidemiological data: informing safe drinking water trials in developing countries. Environmental Science & Technology. 46 (9), 5160–5167.
Ensink JHJ, van der Hoek W and Amerasinghe FP (2006) Giardia duodenalis infection and wastewater irrigation in Pakistan. Transactions of the Royal Society of Tropical Medicine and Hygiene. 100 (6), 538–542.
Erickson MC and Ortega YR (2006) Inactivation of protozoan parasites in food, water, and environmental systems. Journal of Food Protection. 69 (11), 2786–2808.
Esrey SA (1996) Water, waste, and well-being: a multicountry study. American Journal of Epidemiology. 143 (6), 608–623.
Esrey SA, Habicht JP and Casella G (1992) The complementary effect of latrines and increased water usage on the growth of infants in rural Lesotho. American Journal of Epidemiology. 135 (6), 659–666.
Esrey SA, Potash JB, Roberts L and Shiff C (1991) Effects of improved water supply and sanitation on ascariasis, diarrhoea, dracunculiasis, hookworm infection, schistosomiasis, and trachoma. Bulletin of the World Health Organization. 69 (5), 609–21.
Estrada-Garcia T, Lopez-Saucedo C, Thompson-Bonilla R, Abonce M, Lopez-Hernandez D, Santos JI, Rosado Jorge L, DuPont Herbert L and Long KZ (2009) Association of diarrheagenic Escherichia coli pathotypes with infection and diarrhea among Mexican children and association of atypical enteropathogenic E. coli with acute diarrhea. Journal of Clinical Microbiology. 47 (1), 93–98.
Feachem RG and Koblinsky MA (1984) Interventions for the control of diarrhoeal diseases among young children: promotion of breast-feeding. Bulletin of the World Health Organization. 62 (2), 271–291.
Fewtrell L and Colford JM (2005) Water, sanitation and hygiene in developing countries: interventions and diarrhoea - a review. Water Science and Technology. 52 (8), 133–142.
Fewtrell L, Kaufmann R, Kay D, Enanoria W, Haller L and Colford JM (2005) Water, sanitation, and hygiene interventions to reduce diarrhoea in less developed countries: a systematic review and meta-analysis. Lancet Infectious Diseases. 5 (1), 42–52.
Fischer TK, Valentiner-Branth P, Steinsland H, Perch M, Santos G, Aaby P, Mølbak K and Sommerfelt H (2002) Protective immunity after natural rotavirus infection: a community cohort study of newborn children in Guinea-Bissau, west Africa. The Journal of Infectious Diseases. 186 (5), 593–597.
Fischer Walker CL, Sack D and Black Robert E (2010) Etiology of diarrhea in older children, adolescents and adults: a systematic review. PLoS Neglected Tropical Diseases. 4 (8), e768.
Fisman DN (2007) Seasonality of infectious diseases. Annual Review of Public Health. 28, 127–143.
Flint KP (1987) The long-term survival of Escherichia coli in river water. The Journal of Applied Bacteriology. 63 (3), 261–270.
Fudge BW, Easton C, Kingsmore D, Kiplamai FK, Onywera VO, Westerterp KR, Kayser B, Noakes TD and Pitsiladis YP (2008) Elite Kenyan endurance runners are hydrated day-to-day with ad libitum fluid intake. Medicine and Science in Sports and Exercise. 40 (6), 1171–1179.
Genser B, Strina Agostino, Teles CA, Prado Matildes S and Barreto ML (2006) Risk factors for childhood diarrhea incidence: dynamic analysis of a longitudinal study. Epidemiology. 17 (6), 658–667.
Gerba CP (2001) Application of quantitative risk assessment for formulating hygiene policy in the domestic setting. The Journal of Infection. 43 (1), 92–98.
Ghana VAST Study Team (1993) Vitamin A supplementation in northern Ghana: effects on clinic attendances, hospital admissions, and child mortality. Ghana VAST Study Team. Lancet. 342 (8862), 7–12.
Gilman R H, Marquis GS, Miranda E, Vestegui M and Martinez H (1988) Rapid reinfection by Giardia lamblia after treatment in a hyperendemic Third World community. Lancet. 1 (8581), 343–345.
Gorter A, Sandiford P, Pauw J, Morales P, Perez R and Alberts H (1998) Hygiene behaviour in rural Nicaragua in relation to diarrhoea. International Journal of Epidemiology. 27 (6), 1090–1100.
Grassly NC and Fraser C (2006) Seasonal infectious disease epidemiology. Proceedings of the Royal Society B: Biological Sciences. 273 (1600), 2541–2550.
Greenberg Harry B and Estes Mary K (2009) Rotaviruses: from pathogenesis to vaccination. Gastroenterology. 136 (6), 1939–1951.
Guerrant RL, Kosek M, Lima AAM, Lorntz B and Guyatt HL (2002) Updating the DALYs for diarrhoeal disease. Trends in Parasitology. 18 (5), 191–193.
Guerrant RL, Oriá RB, Moore SR, Oriá MOB and Lima AAM (2008) Malnutrition as an enteric infectious disease with long-term effects on child development. Nutrition Reviews. 66 (9), 487–505.
Guerrant RL, Schorling JB, McAuliffe JF and de Souza MA (1992) Diarrhea as a cause and an effect of malnutrition: diarrhea prevents catch-up growth and malnutrition increases diarrhea frequency and duration. The American Journal of Tropical Medicine and Hygiene. 47 (1 Pt 2), 28–35.
Gundry S, Wright J and Conroy R (2004) A systematic review of the health outcomes related to household water quality in developing countries. Journal of Water and Health. 2 (1), 1–13.
Haas CN, Rose J B, Gerba CP and Regli S (1993) Risk assessment of virus in drinking water. Risk Analysis. 13 (5), 545–552.
Haas CN, Rose Joan B and Gerba CP (1999) Quantitative Microbial Risk Assessment. New York, NY: John Wiley & Sons, Inc.
Haefner JW (2005) Modeling Biological Systems: Principles and Applications (2nd edition).
New York: Springer
Hall A, Hewitt G, Tuffrey V and de Silva N (2008) A review and meta-analysis of the impact of intestinal worms on child growth and nutrition. Maternal & Child Nutrition. 4 Suppl 1, 118–236.
Halloran ME, Haber M, Longini I M and Struchiner CJ (1991) Direct and indirect effects in vaccine efficacy and effectiveness. American Journal of Epidemiology. 133 (4), 323–331.
Halloran ME, Longini Ira M, Cowart DM and Nizam A (2002) Community interventions and the epidemic prevention potential. Vaccine. 20 (27-28), 3254–3262.
Hamza H, Ben Khalifa H, Baumer P, Berard H and Lecomte JM (1999) Racecadotril versus placebo in the treatment of acute diarrhoea in adults. Alimentary Pharmacology & Therapeutics. 13 Suppl 6, 15–19.
Han A and Moe K (1990) Household fecal contamination and diarrhea risk. Journal of Tropical Medicine and Hygiene. 93 (5), 333–336.
Harro C, Sack D, Bourgeois AL, Walker R, DeNearing B, Feller A, Chakraborty S, Buchwaldt C and Darsley MJ (2011) A combination vaccine consisting of three live attenuated enterotoxigenic Escherichia coli strains expressing a range of colonization factors and heat-labile toxin subunit B is well tolerated and immunogenic in a placebo-controlled double-blind phase I trial in healthy adults. Clinical and Vaccine Immunology. 18 (12), 2118–2127.
Havelaar AH, van Pelt W, Ang CW, Wagenaar JA, van Putten JPM, Gross U and Newell DG (2009) Immunity to Campylobacter: its role in risk assessment and epidemiology. Critical Reviews in Microbiology. 35 (1), 1–22.
Hennekens CH and Buring JE (1987) Epidemiology in Medicine (1st edition). S. L. Mayrent ed. Boston: Little, Brown
Henry FJ and Rahim Z (1990) Transmission of diarrhoea in two crowded areas with different sanitary facilities in Dhaka, Bangladesh. The Journal of Tropical Medicine and Hygiene. 93 (2), 121–126.
Heymann DL ed. (2004) Control of Communicable Diseases Manual (18th edition).
Washington, DC: American Public Health Association
Hill Z, Kirkwood B and Edmond K (2004) Family and Community Practices That Promote Child Survival, Growth and Development: A Review of the Evidence. Geneva, Switzerland Available at: http://www.who.int/child-adolescent-health.
Hrdy DB (1987) Epidemiology of rotaviral infection in adults. Reviews of Infectious Diseases. 9 (3), 461–469.
Huffman SL and Combest C (1990) Role of breast-feeding in the prevention and treatment of diarrhoea. Journal of Diarrhoeal Diseases Research. 8 (3), 68–81.
Hunter PR (2009) Household water treatment in developing countries: comparing different intervention types using meta-regression. Environmental Science & Technology. 43 (23), 8991–8997.
Hunter PR and Thompson RCA (2005) The zoonotic transmission of Giardia and Cryptosporidium. International Journal for Parasitology. 35 (11-12), 1181–1190.
Hunter PR, Zmirou-Navier D and Hartemann P (2009) Estimating the impact on health of poor reliability of drinking water interventions in developing countries. The Science of the Total Environment. 407 (8), 2621–2624.
Huq A, Xu B, Chowdhury MA, Islam MS, Montilla R and Colwell RR (1996) A simple filtration method to remove plankton-associated Vibrio cholerae in raw water supplies in developing countries. Applied and Environmental Microbiology. 62 (7), 2508–2512.
Hutton G and Haller Laurence (2004) Evaluation of the Costs and Benefits of Water and Sanitation Improvements at the Global Level. Geneva: World Health Organization Available at: http://www.who.int/water_sanitation_health/wsh0404/en/index.html (accessed 08/04/09).
Jamison DT, Breman JG, Measham AR, Alleyne G, Claeson M, Evans DB, Jha P, Mills A and Musgrove P eds. (2006) Disease Control Priorities in Developing Countries (2nd edition). New York: Oxford University Press Available at: http://www.dcp2.org/pubs/DCP.
Jennings V, Lloyd-Smith B and Ironmonger D (1999) Household size and the Poisson distribution. Journal of Population Research.
16 (1), 65–84.
Jokipii AM and Jokipii L (1977) Prepatency of giardiasis. Lancet. 1 (8021), 1095–1097.
Kaper James B, Nataro James P and Mobley HL (2004) Pathogenic Escherichia coli. Nature Reviews: Microbiology. 2 (2), 123–140.
Kapikian AZ, Wyatt RG, Levine MM, Black R E, Greenberg H B, Flores J, Kalica AR, Hoshino Y and Chanock RM (1983) Studies in volunteers with human rotaviruses. Developments in Biological Standardization. 53, 209–218.
Keeling MJ and Eames KTD (2005) Networks and epidemic models. Journal of the Royal Society Interface. 2 (4), 295–307.
Keeling MJ and Rohani P (2008) Modeling infectious diseases in humans and animals. Princeton: Princeton University Press Available at: http://www.modelinginfectiousdiseases.org/.
Kent GP, Greenspan JR, Herndon JL, Mofenson LM, Harris JA, Eng TR and Waskin HA (1988) Epidemic giardiasis caused by a contaminated public water supply. American Journal of Public Health. 78 (2), 139–143.
Kim JJ, Kuntz KM, Stout NK, Mahmud S, Villa LL, Franco EL and Goldie SJ (2007) Multiparameter calibration of a natural history model of cervical cancer. American Journal of Epidemiology. 166 (2), 137–150.
Kirchhoff LV, McClelland KE, Do Carmo Pinho M, Araujo JG, De Sousa MA and Guerrant RL (1985) Feasibility and efficacy of in-home water chlorination in rural North-eastern Brazil. The Journal of Hygiene. 94 (2), 173–180.
Kirkpatrick BD, Haque R, Duggal P, Mondal D, Larsson C, Peterson K, Akter J, Lockhart L, Khan S and Petri WA (2008) Association between Cryptosporidium infection and human leukocyte antigen class I and class II alleles. The Journal of Infectious Diseases. 197 (3), 474–478.
Koopman JS (1978) Diarrhea and school toilet hygiene in Cali, Colombia. American Journal of Epidemiology. 107 (5), 412–20.
Koopman JS (2004) Modeling infection transmission. Annual Review of Public Health. 25, 303–326.
Kosek M, Bern Caryn and Guerrant RL (2003) The global burden of diarrhoeal disease, as estimated from studies published between 1992 and 2000. Bulletin of the World Health Organization. 81 (3), 197–204.
Lanata CF and Mendoza W (2002) Improving diarrhoea estimates. [Online] Available at: http://www.who.int/child_adolescent_health/documents/diarrhoea_estimates/en/index.html (accessed 06/07/10).
Lantagne D and Gallo W (2008) Safe Water for the Community: A Guide for Establishing a Community-Based Safe Water System Program (1st edition). Centers for Disease Control and Prevention Available at: http://www.cdc.gov/safewater/resources.html.
Last JM ed. (1995) A Dictionary of Epidemiology (3rd edition). New York: Oxford University Press
Lazzerini M and Ronfani L (2008) Oral zinc for treating diarrhoea in children. Cochrane Database of Systematic Reviews. (3), CD005436.
Leclerc H, Schwartzbrod L and Dei-Cas E (2002) Microbial agents associated with waterborne diseases. Critical Reviews in Microbiology. 28 (4), 371–409.
Lembcke J, Gastañaduy AS and Brown K H (1989) Prediction of total daily fecal excretion during acute childhood diarrhea. Journal of Pediatric Gastroenterology and Nutrition. 9 (4), 467–472.
Levine MM, Caplan ES, Waterman D, Cash RA, Hornick RB and Snyder MJ (1977) Diarrhea caused by Escherichia coli that produce only heat-stable enterotoxin. Infection and Immunity. 17 (1), 78–82.
Levine MM, Rennels MB, Cisneros L, Hughes TP, Nalin DR and Young CR (1980) Lack of person-to-person transmission of enterotoxigenic Escherichia coli despite close contact. American Journal of Epidemiology. 111 (3), 347–355.
Levy K, Hubbard AE and Eisenberg JNS (2009) Seasonality of rotavirus disease in the tropics: a systematic review and meta-analysis. International Journal of Epidemiology. 38 (6), 1487–1496.
Levy K, Hubbard AE, Nelson KL and Eisenberg JNS (2009) Drivers of water quality variability in northern coastal Ecuador. Environmental Science & Technology.
43 (6), 1788–1797.
Lim ML and Wallace MR (2004) Infectious diarrhea in history. Infectious Disease Clinics of North America. 18 (2), 261–274.
Lindesmith L, Moe C, Marionneau S, Ruvoen N, Jiang X, Lindblad L, Stewart P, LePendu J and Baric R (2003) Human susceptibility and resistance to Norwalk virus infection. Nature Medicine. 9 (5), 548–553.
Li S, Eisenberg JNS, Spicknall IH and Koopman JS (2009) Dynamics and control of infections transmitted from person to person through the environment. American Journal of Epidemiology. 170 (2), 257–265.
Long KZ, Rosado Jorge L and Fawzi W (2007) The comparative impact of iron, the B-complex vitamins, vitamins C and E, and selenium on diarrheal pathogen outcomes relative to the impact produced by vitamin A and zinc. Nutrition Reviews. 65 (5), 218–232.
Lowbury EJ, Lilly HA and Bull JP (1964) Disinfection of hands: removal of transient organisms. British Medical Journal. 2 (5403), 230–233.
Luby SP, Agboatwalla M, Raza A, Sobel J, Mintz ED, Baier K, Hoekstra R M, Rahbar MH, Hassan R, Qureshi SM and Gangarosa EJ (2001) Microbiologic effectiveness of hand washing with soap in an urban squatter settlement, Karachi, Pakistan. Epidemiology and Infection. 127 (2), 237–244.
Luby SP, Agboatwalla Mubina, Bowen A, Kenah E, Sharker Y and Hoekstra Robert M (2009) Difficulties in maintaining improved handwashing behavior, Karachi, Pakistan. The American Journal of Tropical Medicine and Hygiene. 81 (1), 140–145.
Luby SP, Agboatwalla Mubina, Feikin DR, Painter J, Billhimer W, Altaf A and Hoekstra Robert M (2005) Effect of handwashing on child health: a randomised controlled trial. Lancet. 366 (9481), 225–233.
Luby SP, Agboatwalla Mubina, Painter J, Altaf A, Billhimer W, Keswick Bruce and Hoekstra Robert M (2006) Combining drinking water treatment and hand washing for diarrhoea prevention, a cluster randomised controlled trial. Tropical Medicine & International Health. 11 (4), 479–89.
Lunn PG (2000) The impact of infection and nutrition on gut function and growth in childhood. The Proceedings of the Nutrition Society. 59 (1), 147–154.
Makutsa P, Nzaku K, Ogutu P, Barasa P, Ombeki S, Mwaki A and Quick R E (2001) Challenges in implementing a point-of-use water quality intervention in rural Kenya. American Journal of Public Health. 91 (10), 1571–3.
Manz DH (2007) BioSand water filter technology: household concrete design. [Online] Available at: http://manzwaterinfo.ca/index.htm.
Manz DH (2009) BSF Guidance Manual 3: Basic Operation of the Concrete BioSand Water Filter. [Online] Available at: http://manzwaterinfo.ca/index.htm.
Mata LJ (1978) The Children of Santa María Cauqué: A Prospective Field Study of Health and Growth. Cambridge, Mass: MIT Press
Mattison K (2011) Norovirus as a foodborne disease hazard. Advances in Food and Nutrition Research. 62, 1–39.
Mäusezahl D, Christen A, Pacheco GD, Tellez FA, Iriarte M, Zapata ME, Cevallos M, Hattendorf J, Cattaneo MD, Arnold B, Smith TA and Colford JM (2009) Solar drinking water disinfection (SODIS) to reduce childhood diarrhoea in rural Bolivia: a cluster-randomized, controlled trial. PLoS Medicine. 6 (8), e1000125.
McCarney R, Warner J, Iliffe S, van Haselen R, Griffin M and Fisher P (2007) The Hawthorne effect: a randomised, controlled trial. BMC Medical Research Methodology. 7, 30.
McDade T and Worthman C (1998) The weanling’s dilemma reconsidered: A biocultural analysis of breastfeeding ecology. Journal of Developmental and Behavioral Pediatrics. 19 (4), 286–299.
McLennan SD, Peterson LA and Rose Joan B (2009) Comparison of point-of-use technologies for emergency disinfection of sewage-contaminated drinking water. Applied and Environmental Microbiology. 75 (22), 7283–7286.
McMahon S, Caruso BA, Obure A, Okumu F and Rheingans RD (2011) Anal cleansing practices and faecal contamination: a preliminary investigation of behaviours and conditions in schools in rural Nyanza Province, Kenya.
Tropical Medicine & International Health. 16 (12), 1536–1540.
Mehnert DU and Stewien KE (1993) Detection and distribution of rotavirus in raw sewage and creeks in São Paulo, Brazil. Applied and Environmental Microbiology. 59 (1), 140–143.
Mensah P and Tomkins A (2003) Household-level technologies to improve the availability and preparation of adequate and safe complementary foods. Food and Nutrition Bulletin. 24 (1), 104–125.
Messner MJ, Chappell C L and Okhuysen P C (2001) Risk assessment for Cryptosporidium: a hierarchical Bayesian analysis of human dose response data. Water Research. 35 (16), 3934–3940.
Miliotis MD and Bier J eds. (2003) International Handbook of Foodborne Pathogens. New York: M. Dekker
Miller SM, Fugate EJ, Craver VO, Smith JA and Zimmerman JB (2008) Toward understanding the efficacy and mechanism of Opuntia spp. as a natural coagulant for potential application in water treatment. Environmental Science & Technology. 42 (12), 4274–4279.
Montgomery MA, Desai MM and Elimelech M (2010) Assessment of latrine use and quality and association with risk of trachoma in rural Tanzania. Transactions of the Royal Society of Tropical Medicine and Hygiene. 104 (4), 283–289.
Morris SS, Cousens SN, Kirkwood BR, Arthur P and Ross DA (1996) Is prevalence of diarrhea a better predictor of subsequent mortality and weight gain than diarrhea incidence? American Journal of Epidemiology. 144 (6), 582–588.
Mossong J, Hens N, Jit M, Beutels P, Auranen K, Mikolajczyk R, Massari M, Salmaso S, Tomba GS, Wallinga J, Heijne J, Sadkowska-Todys M, Rosinska M and Edmunds WJ (2008) Social contacts and mixing patterns relevant to the spread of infectious diseases. PLoS Medicine. 5 (3), e74.
Motarjemi Y, Käferstein F, Moy G and Quevedo F (1993) Contaminated weaning food: a major risk factor for diarrhoea and associated malnutrition. Bulletin of the World Health Organization. 71 (1), 79–92.
Mukherjee AK, Chowdhury P, Bhattacharya MK, Ghosh M, Rajendran K and Ganguly S (2009) Hospital-based surveillance of enteric parasites in Kolkata. BMC Research Notes. 2, 110.
Murray CJL and Lopez AD eds. (1996) The Global Burden of Disease: a Comprehensive Assessment of Mortality and Disability from Diseases, Injuries, and Risk Factors in 1990 and Projected to 2020. Cambridge, Mass.: Published by the Harvard School of Public Health on behalf of the World Health Organization and the World Bank
Nataro J P and Kaper J B (1998) Diarrheagenic Escherichia coli. Clinical Microbiology Reviews. 11 (1), 142–201.
Neto RC, dos Santos LU, Sato MIZ and Franco RMB (2010) Cryptosporidium spp. and Giardia spp. in surface water supply of Campinas, southeast Brazil. Water Science and Technology. 62 (1), 217–222.
Oberhelman RA, Gilman Robert H, Sheen P, Cordova J, Zimic M, Cabrera Lilia, Meza R and Perez J (2006) An intervention-control study of corralling of free-ranging chickens to control Campylobacter infections among children in a Peruvian periurban shantytown. American Journal of Tropical Medicine and Hygiene. 74 (6), 1054–1059.
Ogunbiyi TA (1978) Whole-gut transit rates and wet stool weight in an urban Nigerian population. World Journal of Surgery. 2 (3), 387–393.
Ogunjimi B, Hens N, Goeyvaerts N, Aerts M, Van Damme P and Beutels P (2009) Using empirical social contact data to model person to person infectious disease transmission: An illustration for varicella. Mathematical Biosciences. 218 (2), 80–87.
Olson M, Goh J, Phillips M, Guselle N and McAllister T (1999) Giardia cyst and Cryptosporidium oocyst survival in water, soil, and cattle feces. Journal of Environmental Quality. 28 (6), 1991–1996.
Onyango AO, Kenya EU, Mbithi JJN and Ng’ayo MO (2009) Pathogenic Escherichia coli and food handlers in luxury hotels in Nairobi, Kenya. Travel Medicine and Infectious Disease. 7 (6), 359–366.
Oreskes N, Shrader-Frechette K and Belitz K (1994) Verification, validation, and confirmation of numerical models in the earth sciences. Science. 263 (5147), 641–646.
Oundo JO, Kariuki SM, Boga HI, Muli FW and Iijima Y (2008) High incidence of enteroaggregative Escherichia coli among food handlers in three areas of Kenya: a possible transmission route of travelers’ diarrhea. Journal of Travel Medicine. 15 (1), 31–38.
Pancorbo OC, Evanshen BG, Campbell WF, Lambert S, Curtis SK and Woolley TW (1987) Infectivity and antigenicity reduction rates of human rotavirus strain Wa in fresh waters. Applied and Environmental Microbiology. 53 (8), 1803–1811.
Parashar UD, Hummelman EG, Bresee JS, Miller MA and Glass Roger I (2003) Global illness and deaths caused by rotavirus disease in children. Emerging Infectious Diseases. 9 (5), 565–572.
Parkin RT (2008) Foundations and frameworks for human microbial risk assessment. Washington, DC: United States Environmental Protection Agency (USEPA) Available at: www.epa.gov/raf/files/epa_mra_fw_comparison_report_0609.pdf.
Patel MM, Widdowson M-A, Glass Roger I, Akazawa K, Vinjé J and Parashar UD (2008) Systematic literature review of role of noroviruses in sporadic gastroenteritis. Emerging Infectious Diseases. 14 (8), 1224–1231.
Paul BD (1955) Health, Culture, and Community; Case Studies of Public Reactions to Health Programs. New York: Russell Sage Foundation
Peréz Cordón G, Cordova Paz Soldan O, Vargas Vásquez F, Velasco Soto JR, Sempere Bordes L, Sánchez Moreno M and Rosales MJ (2008) Prevalence of enteroparasites and genotyping of Giardia lamblia in Peruvian children. Parasitology Research. 103 (2), 459–465.
Pichichero ME, Losonsky GA, Rennels MB, Disney FA, Green JL, Francis AB and Marsocci SM (1990) Effect of dose and a comparison of measures of vaccine take for oral rhesus rotavirus vaccine. The Maryland Clinical Studies Group. The Pediatric Infectious Disease Journal. 9 (5), 339–344.
Pickering AJ, Davis J, Walters SP, Horak HM, Keymer DP, Mushi D, Strickfaden R, Chynoweth JS, Liu J, Blum A, Rogers K and Boehm AB (2010) Hands, water, and health: fecal contamination in Tanzanian communities with improved, non-networked water supplies. Environmental Science & Technology. 44 (9), 3267–3272.
Pickering AJ, Julian TR, Mamuya S, Boehm AB and Davis J (2011) Bacterial hand contamination among Tanzanian mothers varies temporally and following household activities. Tropical Medicine & International Health. 16 (2), 233–239.
Pond K, Rueedi J and Pedley S (2004) Pathogens in drinking water sources. Guildford, Surrey, United Kingdom: Robens Centre for Public and Environmental Health, University of Surrey Available at: http://www.microrisk.com/publish/cat_index_11.shtml.
Porter A (1916) An enumerative study of the cysts of Giardia (Lamblia) intestinalis in human dysenteric faeces. Lancet. 1, 1166–1169.
Prado M S, Cairncross S, Strina A, Barreto ML, Oliveira-Assis AM and Rego S (2005) Asymptomatic giardiasis and growth in young children; a longitudinal study in Salvador, Brazil. Parasitology. 131 (Pt 1), 51–56.
Qadri F, Svennerholm A-M, Faruque ASG and Sack RB (2005) Enterotoxigenic Escherichia coli in developing countries: epidemiology, microbiology, clinical features, treatment, and prevention. Clinical Microbiology Reviews. 18 (3), 465–483.
Ramani S and Kang Gagandeep (2009) Viruses causing childhood diarrhoea in the developing world. Current Opinion in Infectious Diseases. 22 (5), 477–482.
Razzolini MTP, da Silva Santos TF and Bastos VK (2010) Detection of Giardia and Cryptosporidium cysts/oocysts in watersheds and drinking water sources in Brazil urban areas. Journal of Water and Health. 8 (2), 399–404.
Rendtorff RC (1954) The experimental transmission of human intestinal protozoan parasites. II. Giardia lamblia cysts given in capsules. American Journal of Hygiene. 59 (2), 209–220.
Rivero-Marcotegui A, Olivera-Olmedo JE, Valverde-Visus FS, Palacios-Sarrasqueta M, GrijalbaUche A and García-Merlo S (1998) Water, fat, nitrogen, and sugar content in feces: reference intervals in children. Clinical Chemistry. 44 (7), 1540–1544. Roberts MG and Heesterbeek JAP (2003) A new method for estimating the effort required to control an infectious disease. Proceedings of the Royal Society B: Biological Sciences. 270 (1522), 1359–1364. Rodriguez WJ, Kim HW, Brandt CD, Schwartz RH, Gardner MK, Jeffries B, Parrott RH, Kaslow RA, Smith JI and Kapikian AZ (1987) Longitudinal study of rotavirus infection and gastroenteritis in families served by a pediatric medical practice: clinical and epidemiologic observations. The Pediatric Infectious Disease Journal. 6 (2), 170–176. Rogers EM (2003) Diffusion of Innovations (5th edition). New York: Free Press Root GP (2001) Sanitation, community environments, and childhood diarrhoea in rural Zimbabwe. Journal of Health, Population, and Nutrition. 19 (2), 73–82. Rose A, Roy S, Abraham V, Holmgren G, George K, Balraj V, Abraham S, Muliyil J, Joseph A and Kang G (2006) Solar disinfection of water for diarrhoeal prevention in southern India. Archives of Disease in Childhood. 91 (2), 139–141. Rose J B, Haas CN and Regli S (1991) Risk assessment and control of waterborne giardiasis. American Journal of Public Health. 81 (6), 709–713. Rothman KJ (1986) Modern Epidemiology (1st edition). Boston: Little, Brown Roy SK, Tomkins AM, Akramuzzaman SM, Behrens RH, Haider R, Mahalanabis D and Fuchs G (1997) Randomised controlled trial of zinc supplementation in malnourished Bangladeshi children with acute diarrhoea. Archives of Disease in Childhood. 77 (3), 196–200. Schmidt W-P and Cairncross S (2009) Household water treatment in poor populations: is there enough evidence for scaling up now? Environmental Science & Technology. 43 (4), 986– 992. 
Schmidt W-P, Genser B and Chalabi Z (2009) A simulation model for diarrhoea and other common recurrent infections: a tool for exploring epidemiological methods. Epidemiology and Infection. 137 (5), 644–653. Shediac-Rizkallah MC and Bone LR (1998) Planning for the sustainability of community-based health programs: conceptual frameworks and future directions for research, practice and policy. Health Education Research. 13 (1), 87–108. Singh G, Vajpayee P, Ram S and Shanker R (2010) Environmental reservoirs for enterotoxigenic 470 Escherichia coli in south Asian Gangetic riverine system. Environmental Science & Technology. 44 (16), 6475–6480. Sjögren E, Ruiz-Palacios G and Kaijser B (1989) Campylobacter jejuni isolations from Mexican and Swedish patients, with repeated symptomatic and/or asymptomatic diarrhoea episodes. Epidemiology and Infection. 102 (1), 47–57. Sobsey MD (2002) Managing water in the home: accelerated health gains from improved water supply. [Online] Available at: http://www.who.int/water_sanitation_health/dwq/wsh0207/en/print.html (accessed 04/09/09). Sobsey MD and Brown Joe (2011) Evaluating household water treatment options: health-based targets and microbiological performance specifications. Geneva, Switzerland: World Health Organization Available at: http://www.who.int/water_sanitation_health/publications/2011/household_water/en/index .html (accessed 11/07/11). Sobsey MD, Stauber C, Casanova L, Brown JM and Elliott M (2008) Point of use household drinking water filtration: A practical, effective solution for providing sustained access to safe drinking water in the developing world. Environmental Science & Technology. 42 (12), 4261–4267. Solo-Gabriele HM, LeRoy Ager A, Fitzgerald Lindo J, Dubón JM, Neumeister SM, Baum MK and Palmer CJ (1998) Occurrence of Cryptosporidium oocysts and Giardia cysts in water supplies of San Pedro Sula, Honduras. Pan American Journal of Public Health. 4 (6), 398–400. 
Spangenberg ER, Greenwald AG and Sprott DE (2008) Will you read this article’s abstract? Theories of the question–behavior effect. Journal of Consumer Psychology. 18 (2), 102– 106. Sphere Project (2011) Humanitarian Charter and Minimum Standards in Humanitarian Response (3rd edition). Rugby, United Kingdom: Practical Action Publishing Available at: www.sphereproject.org (accessed 11/06/12). Stockman LJ, Fischer TK, Deming M, Ngwira B, Bowie C, Cunliffe N, Bresee J and Quick Robert E (2007) Point-of-use water treatment and use among mothers in Malawi. Emerging Infectious Diseases. 13 (7), 1077–80. Sutra S, Srisontrisuk S, Panpurk W, Sutra P, Chirawatkul A, Snongchart N and Kusowon P (1990) The pattern of diarrhea in children in Khon Kaen, northeastern Thailand: I. The incidence and seasonal variation of diarrhea. The Southeast Asian Journal of Tropical Medicine and Public Health. 21 (4), 586–593. Taylor A and Higham DJ (2008) CONTEST: a controllable test matrix toolbox for MATLAB. University of Strathclyde Available at: http://www.mathstat.strath.ac.uk/research/groups/numerical_analysis/contest/toolbox. 471 Taylor A and Higham DJ (2009) CONTEST: A Controllable Test Matrix Toolbox for MATLAB. ACM Transactions on Mathematical Software. 35 (4). Taylor CE and Greenough WB 3rd (1989) Control of diarrheal diseases. Annual Review of Public Health. 10, 221–244. Teunis Peter F M, Moe CL, Liu P, Miller SE, Lindesmith L, Baric RS, Le Pendu J and Calderon RL (2008) Norwalk virus: how infectious is it? Journal of Medical Virology. 80 (8), 1468–1476. Teunis P F M, van der Heijden OG, van der Giessen JWB and Havelaar A (1996) The doseresponse relation in human volunteers for gastro-intestinal pathogens. Bilthoven, The Netherlands Available at: http://rivm.openrepository.com/rivm/handle/10029/9966 (accessed 23/07/10). Thompson KM, Duintjer Tebbens RJ and Pallansch MA (2006) Evaluation of response scenarios to potential polio outbreaks using mathematical models. 
Risk Analysis: An Official Publication of the Society for Risk Analysis. 26 (6), 1541–56. Tonglet R, Mahangaiko Lembo E, Zihindula PM, Wodon A, Dramaix M and Hennart P (1999) How useful are anthropometric, clinical and dietary measurements of nutritional status as predictors of morbidity of young children in central Africa? Tropical Medicine & International Health. 4 (2), 120–130. Toranzos GA, Gerba CP and Hanssen H (1988) Enteric viruses and coliphages in Latin America. Toxicity Assessment: An International Journal. 3 (5), 491–510. Tuite AR, Fisman DN, Kwong JC and Greer AL (2010) Optimal pandemic influenza vaccine allocation strategies for the Canadian population. PloS One. 5 (5), e10520. University of Maryland (2012) Global Enterics Multi-Center Study (GEMS). [Online] Available at: http://medschool.umaryland.edu/GEMS/ (accessed 05/04/12). USAID (2012) Demographic and Health Surveys. [Online] Available at: http://www.measuredhs.com/ (accessed 21/05/12). USAID, UNICEF and World Health Organization (2005) Diarrhoea treatment guidelines including new recommendations for the use of ORS and zinc supplementation for clinicbased healthcare workers. USEPA (2011) Exposure Factors Handbook: 2011 Edition. Washington, DC: Exposure Assessment Group, Office of Health and Environmental Assessment, U.S. Environmental Protection Agency Available at: http://cfpub.epa.gov/ncea/risk/recordisplay.cfm? deid=236252. USEPA (1987) Guide standard and protocol for testing microbiological water purifiers. Available at: http://www.biovir.com/Images/pdf061.pdf. 472 USEPA (2012) National Primary Drinking Water Regulations. [Online] Available at: http://water.epa.gov/drink/contaminants/index.cfm (accessed 18/05/12). Valentiner-Branth P, Steinsland H, Fischer TK, Perch M, Scheutz F, Dias F, Aaby P, Mølbak K and Sommerfelt H (2003) Cohort study of Guinean children: incidence, pathogenicity, conferred protection, and attributable risk for enteropathogens during the first 2 years of life. 
Journal of Clinical Microbiology. 41 (9), 4238–4245. VanDerslice J and Briscoe J (1995) Environmental interventions in developing countries: interactions and their implications. American Journal of Epidemiology. 141 (2), 135–44. VanDerslice J, Popkin B and Briscoe J (1994) Drinking-water quality, sanitation, and breastfeeding: their interactive effects on infant health. Bulletin of the World Health Organization. 72 (4), 589–601. Velázquez FR, Matson DO, Calva JJ, Guerrero L, Morrow AL, Carter-Campbell S, Glass R I, Estes M K, Pickering LK and Ruiz-Palacios GM (1996) Rotavirus infections in infants as protection against subsequent infections. The New England Journal of Medicine. 335 (14), 1022–1028. Vergara M, Quiroga M, Grenon S, Pegels E, Oviedo P, Deschutter J, Rivas M, Binsztein N and Claramount R (1996) Prospective study of enteropathogens in two communities of Misiones, Argentina. Revista do Instituto de Medicina Tropical de São Paulo. 38 (5), 337–347. Vesikari T, Ruuska T, Bogaerts H, Delem A and André F (1985) Dose-response study of RIT 4237 oral rotavirus vaccine in breast-fed and formula-fed infants. Pediatric Infectious Disease. 4 (6), 622–625. Vollet JJ, Ericsson CD, Gibson G, Pickering LK, DuPont H L, Kohl S and Conklin RH (1979) Human rotavirus in an adult population with travelers’ diarrhea and its relationship to the location of food consumption. Journal of Medical Virology. 4 (2), 81–87. Waddington H, Snilstveit B, White H and Fewtrell L (2009) Water, sanitation, and hygiene interventions to combat childhood diarrhoea in developing countries. [Online] Available at: http://www.3ieimpact.org/page.php?pg=synthetic (accessed 15/07/11). Walker RI, Steele D and Aguado T (2007) Analysis of strategies to successfully vaccinate infants in developing countries against enterotoxigenic E. coli (ETEC) disease. Vaccine. 25 (14), 2545–2566. 
Wallinga J, Teunis P and Kretzschmar M (2006) Using Data on Social Contacts to Estimate Agespecific Transmission Parameters for Respiratory-spread Infectious Agents. American Journal of Epidemiology. 164 (10), 936–944. Ward RL, Bernstein DI, Young EC, Sherwood JR, Knowlton DR and Schiff GM (1986) Human rotavirus studies in volunteers: determination of infectious dose and serological response to infection. The Journal of Infectious Diseases. 154 (5), 871–880. 473 Ward RL, Knowlton DR and Pierce MJ (1984) Efficiency of human rotavirus propagation in cell culture. Journal of Clinical Microbiology. 19 (6), 748–753. Weaver LT (1988) Bowel habit from birth to old age. Journal of Pediatric Gastroenterology and Nutrition. 7 (5), 637–640. Wenman WM, Hinde D, Feltham S and Gurwith M (1979) Rotavirus infection in adults. Results of a prospective family study. The New England Journal of Medicine. 301 (6), 303–306. Wennerås C and Erling V (2004) Prevalence of enterotoxigenic Escherichia coli-associated diarrhoea and carrier state in the developing world. Journal of Health, Population, and Nutrition. 22 (4), 370–382. Wickramanayake G, Rubin A and Sproul O (1985) Effects of ozone and storage temperature on giardia cysts. Journal American Water Works Association. 77 (8), 74–77. Wilson ME (2005) Diarrhea in nontravelers: risk and etiology. Clinical Infectious Diseases. 41 Suppl 8, S541–546. Wood L, Egger M, Gluud LL, Schulz KF, Jüni P, Altman DG, Gluud C, Martin RM, Wood AJG and Sterne JAC (2008) Empirical evidence of bias in treatment effect estimates in controlled trials with different interventions and outcomes: meta-epidemiological study. British Medical Journal. 336 (7644), 601–605. World Health Organization Disease and injury regional estimates. [Online] Available at: http://www.who.int/healthinfo/global_burden_disease/estimates_regional/en/index.html (accessed 30/03/12). World Health Organization (2012) Global Burden of Disease (GBD). 
[Online] Available at: http://www.who.int/healthinfo/global_burden_disease/en/ (accessed 15/02/12). World Health Organization (2008) The global burden of disease: 2004 update. Geneva, Switzerland: WHO Press Available at: http://www.who.int/topics/global_burden_of_disease/en/ (accessed 08/10/12). World Health Organization (2002) World Health Report 2002: Reducing risks, promoting healthy life. Geneva, Switzerland: World Health Organization World Health Organization and UNICEF (2010) Progress on sanitation and drinking-water: 2010 update. Geneva, Switzerland: World Health Organization Available at: http://www.who.int/water_sanitation_health/publications/9789241563956/en/index.html (accessed 15/06/10). Xiao L (2010) Molecular epidemiology of cryptosporidiosis: an update. Experimental Parasitology. 124 (1), 80–89. Yeager BAC, Huttly SR, Bartolini R, Rojas M and Lanata CF (1999) Defecation practices of young children in a Peruvian shanty town. Social Science & Medicine. 49 (4), 531–541. 474 Young CR, Ziprin RL, Hume ME and Stanker LH (1999) Dose response and organ invasion of day-of-hatch Leghorn chicks by different isolates of Campylobacter jejuni. Avian Diseases. 43 (4), 763–767. Zafar SN, Luby SP and Mendoza C (2010) Recall errors in a weekly survey of diarrhoea in Guatemala: determining the optimal length of recall. Epidemiology and Infection. 138 (2), 264–269. Zelner JL, Trostle J, Goldstick JE, Cevallos W, House JS and Eisenberg JNS (2012) Social connectedness and disease transmission: social organization, cohesion, village context, and infection risk in rural Ecuador. American Journal of Public Health. de Zoysa I and Feachem RG (1985) Interventions for the control of diarrhoeal diseases among young children: rotavirus and cholera immunization. Bulletin of the World Health Organization. 63 (3), 569–583. de Zoysa I, Rea M and Martines Jose (1991) Why promote breastfeeding in diarrhoeal control programmes? Health Policy and Planning. 6 (4), 371–379. 
Zwane AP, Zinman J, Van Dusen E, Pariente W, Null C, Miguel E, Kremer M, Karlan DS, Hornbeck R, Giné X, Duflo E, Devoto F, Crepon B and Banerjee A (2011) Being surveyed can change later behavior and related parameter estimates. Proceedings of the National Academy of Sciences of the United States of America. 108 (5), 1821–1826. 475