Y'HFSlS v y???“ \ , ‘\ ‘\.1\ iii 28 NNW?“ \c ‘5“2i‘iiixeii’ \l “\.\\\\\ “i A \' \\i\\‘\\\\\\\ \i‘i '\\\\\ \ 3 1069‘ / / \\\ il‘ _. i__ . M \ '2 LIBRARY 1 W ‘22, l Michigan Stat: E} University 4;, w This is to certify that the thesis entitled LAKE MODEL UNCERTAINTY ANALYSIS - a lake Ontario case study using Monte Carlo Simulation presented by V. David Lee has been accepted toWards fulfillment of the requirements for Master of Science degreein Resource Development Major professor Date January 5, 1981 0-7639 25.), l ‘4 \H L 2’ old I I {In of!!! 1 V“ I 1/ - 4 OVERDUE FINES: 25¢ per day per item RETURNING LIBRARY MATERIALS: Place in book return to remove charge from circulation records LAKE MODEL UNCERTAINTY ANALYSIS -a Lake Ontario case study using Monte Carlo simulation- By V. David Lee A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE Department of Resource Development 1980 ABSTRACT LAKE MODEL UNCERTAINTY ANALYSIS - A LAKE ONTARIO CASE STUDY USING MONTE CARLO SIMULATION - By V. David Lee A growing concern over the reliability of modeling and model pre- dictions has highlighted the need for a measure of the value of the information contained in the model result. Uncertainty analysis meets this need by allowing quantification of model prediction uncertainty. Monte Carlo simulation accomplishes this by incorporating all known sources of error as stochastic inputs which reflect the natural vari- ability and estimation uncertainty for each parameter. A discussion of general modeling concepts, an introduction to Monte Carlo simulation, and an overview of distribution selection considerations serve as an introduction to the Lake Ontario case study. A detailed development of the model expression and the simula- tion structure is presented, and model sensitivities as well as several planning scenarios are experimentally tested. Following a discussion, conclusions and recommendations are made and analyzed in terms of the meaningfulness of the results and the utility of the model as a planning tool. ACKNOWLEDGEMENTS This research was supported through funding granted by the National Oceanic and Atmospheric Administration (grant #03-78-bOl-l09) and by the Michigan State University Agricultural Experiment Station. I would like to acknowledge the invaluable contributions made by Janine Niemer and Sue Watt for preparing this document in its final form, and by Paul Schneider for his superb graphics work. Special thanks are also extended to my friends and colleagues Michael Beaulac, Ralph Ancil, and Robert Montgomery for their constructive criticism, editorial advice, and the intellectually stimulating environment they helped to provide. I expressly thank my committee members Dr. Kenneth Reckhow, Dr. Daniel Chappelle, and Dr. Stanley Zarnoch for their comments and criticisms. These gentlemen have not lost sight of the most noble goals of higher education, and as a result, I have not only gained facts and skills but a valuable learning experience in the process. Dr. Reckhow is especially deserving of very special thanks. As my major professor and friend, he has provided me with an example of professionalism and high academic standards that will remain with me always. Through his guidance, personal concern, and seemingly endless patience, I have matured both as a professional and as a person. I owe him a great deal. ii Deeply felt appreciation is also due to my close friend, Cynthia Hauger. It has been her understanding, support, and love that have sustained me for these many months, and I thank her with all my heart. Finally, I wish to thank my mother and father. Their support and encouragement have provided the opportunity for me to realize and overcome my own shortcomings without being penalized for my own inability to recognize good advice when I first hear it. I love them dearly, and it is to them that I dedicate this thesis. TABLE OF CONTENTS Page INTRODUCTION .......................... 1 CHAPTER ONE: MODELS ...................... 4 Model Types ........................ 4 Mathematical Models .................... 5 Considerations Central to Model Use ............ 6 Model Costs ........................ 7 Complexity, Accuracy, and Uncertainty ........... 9 Summary Comments ..................... ll CHAPTER TWO: MONTE CARLO SIMULATION .............. T3 The Monte Carlo Procedure ................. l3 Distribution Selection .................. l4 Bases for Distribution Selection ............. 16 Qualitative Basis .................. l6 Quantitative Basis .................. l7 Quantitative Techniques .................. l8 Sensitivity Analysis ................. 19 Graphical Analysis .................. 19 Parameter Estimation ................. 20 Simple Parametric Distributions ......... 21 Complex Parametric Distributions ........ 22 Goodness-of-Fit Testing ............... 23 CHAPTER THREE: THE SIMULATION MODEL APPLICATION ........ 24 The Model Expression ................... 24 System Representation ................... 28 Distributions and Random Number Generation ...... 28 Phosphorus Loading .................. 29 Diffuse Source Flux ............... 30 Atmospheric Flux ................ 32 Point Source Flux ................ 34 Lake Erie Flux ................. 35 Settling Velocity .................. 35 Areal Water Loading ................. 37 Additional Model Terms ................ 38 Simulation Flow and Structure ............... 39 iv CHAPTER FOUR: RESULTS ..................... 40 Model Results ....................... 4O Flux ......................... 4O Waterload ....................... 4l Model Sensitivities .................... 42 Number of Runs .................... 42 Distribution Selection ................ 43 Random Seed ...................... 43 Planning Scenarios .................... '. 44 Point Source Flux ................... 44 Erie Concentration .................. 46 Land Use Fractions .................. 47 Variability ........................ 48 CHAPTER FIVE: ANALYSIS AND DISCUSSION ............. 5l FINAL COMMENTS ......................... 6O BIBLIOGRAPHY .......................... 62 APPENDICES ........................... 66 A Model Parameter Data Distributions ........... 66 B. Model Results - Graphical Summary ........... 79 C. Model Results - Tabular Summary ............ 99 D. 1000 Iteration Periodicity ............... lll E Flow Diagram and Computer Listing ........... ll4 C.4a C.5a LIST OF TABLES Page Land Use Projections for the Great Lakes Basin ..... 33 lOO Iteration Base Run - Model Terms .......... 100 Too Iteration Base Run — Flux Terms .......... 101 lOO Iteration Base Run - Water Load Terms ....... 102 Model Sensitivities - Number of Runs, Distribution Selection ...................... 104 Model Sensitivities - Random Seed ........... 105 Planning Scenarios - Point Source, Lake Erie Concentration .................... 106 Planning Scenarios - Land Use Fractions ........ 107 Single Year Prediction Distributions - Base Run, All Normal ........................ I09 Prediction Variability Over Time ............ 110 vi Figure J>J> (II->00 03 .TO .11 .l2 #OON .5a .5b LIST OF FIGURES Page Runoff Concentrations From Urban Watersheds - Data . . . 67 Runoff Concentrations From Agricultural Watersheds . . . 68 Runoff Concentrations From Forested Watersheds ..... 69 Atmospheric Flux to Urbanized Land Uses ........ 7O Atmospheric Flux to Agricultural Land Uses ....... 7l Atmospheric Flux to Forested Land Uses ......... 72 Atmospheric Water Loading to Lake Surface ....... 73 Water Load From Lake Ontario Basin Tributaries ..... 74 Water Load From Lake Erie at Niagara-Welland ...... 75 Phosphorus Concentration From Lake Erie at Niagara- Welland ....................... 76 Apparent Settling Velocity ............... 77 Hydraulic Detention Time ................ 78 100 Iteration Base Run - Flux Sources ......... BO lOO Iteration Base Run - Water Load Sources ...... Bl lOO Iteration Base Run - Concentration Predictions . . . 82 500 Iteration Run - Concentration Predictions ..... 83 500 Iteration Run, Increased Random Seed - Concentra- tion Predictions ................... 84 1000 Iteration Run - Concentration Predictions ..... 85 lOOO Iteration Run, Random Seed = 135432.00 - Concentration Predictions .............. 86 1000 Iteration Run, Increased Markov Constant - Concen- tration Predictions ................. 87 vii Figure B.l3a E.2 lOOO Iteration Run, Decreased Settling Velocity - Concentration Predictions ............. lOOO Iteration Run, Random Seed = 987225.00 - Concentration Predictions ............. Base Run, l/2 Point Source Flux Variability - Concen- tration Predictions ................ Base Run, Upgrade All Point Sources - Concentration Predictions .................... Comparison of Point Source Predictions - Flux Predic- tions ....................... Base Run, Eastern Erie Concentrations - Concentration Predictions .................... Comparison of Shifts to Single Land Use Types - Concentration Predictions ............. Four Single-Year Prediction Distributions, Base Run Four Single-Year Prediction Distributions, All Normal Distributions ................ Base Run, All Normal Distributions - Concentration Predictions .................... All Normal Distributions, Random Seed = 987225.00 - Concentration Predictions ............. Flow Diagram of Simulation . . . . . . ..... . . . . Diagramatic Representation of Prediction Altorithm . . viii 96 97 98 115 .116 INTRODUCTION Concern for the environment follows a pendulum-like path; some- times foremost in our thoughts, often superseded by seemingly more immediate concerns. Since the time of the industrial revolution, degradation of the environment has been a prominent side effect of the rapid increases in technology and leisure time. The associated demands placed on natural resource systems, combined with the rapid expansion of technological expertise, has resulted in an acute awareness of our ability to alter the environment. Not until most recently, however, has this awareness begun to alter the way in which the world and its resources are perceived. The past 20 years have highlighted the dangers of pollution, and the hazards of exploiting the earth's re- sources. Demands for effective action to protect the environment have been voiced in increasing numbers. This change in collective con- sciousness has been remarkably swift and far-reaching, and represents perhaps the most important of the recent social, political, and econo- mic changes which bears directly and critically on the future of our planet. Our planet is both fragile and finite. Decisions made today will dictate the differences between a polluted, unproductive world, and a world that can sustain future generations. However, making those de- cisions is never the clear cut, straight forward procedure it could be ideally. This multifaceted process must give a high priority to assessing potential impacts of different resource use strategies. The extent to which negative impacts are minimized in favor of positive impacts, however, relies heavily on the judgment of the decision- maker. Traditionally, this responsibility has been met with very little quantitative information. While qualitative information is essential to any resource utilization decision, decisions solely based on this type of information are risky at best. The trend toward in- creasing accountability and justification requirements imposed on decision-makers has led toward a growing reliance on explicit quanti- tative analysis of the essential cause and effect relationships. This type of analysis requires considerable amounts of specific information; information that is often not readily available. Major decisions re- garding resource management are made daily as part of numerous planning programs, and modeling has become a popular means for meeting the in- creasing expectations which planners encounter in their continual efforts to justify expenditures and fully achieve all expected results. The major concern that has prompted the following work is the potential use and, most importantly, misuse of models in the resource planning context. As one of the most powerful planning tools available, modeling needs to be critically evaluated to effect their greatest positive impact. In particular, the emphasis of this investigation is focused on the topic of uncertainty analysis as an aid to decision- making. From a cost-effectiveness standpoint, incorporating uncer- tainty analysis into the modeling process presents one of the few excellent opportunities to gain a great deal of meaningful information for very little additional cost. In order to establish a common ground for all readers, as well as a point of departure for the example application, some basic information on models and modeling is included. This is followed by a discussion of the Monte Carlo procedure with special attention given to distribu- tion selection, the heart of Monte Carlo simulation. The remainder of this analysis consists of the application of this stochastic process to the Lake Ontario system, and a discussion of simulation development and model outputs. CHAPTER I MODELS Models are simplifications of reality. For this reason, any particular system can be modeled in a multitude of ways because there are many ways to simplify the same reality. Different model types can be classified by their structural characteristics. Model Types Physical models are intended to closely resemble the subject in appearance. Characteristically, scaling techniques are used to yield products such as a globe or photograph. Analog models are character- ized by the use of graphical or schematic representation. Often transformed equivalents are employed for the development of typical products such as block diagrams and flowcharts. Finally, there are symbolic models which are characterized by the use of symbols, mathematical and logical, to represent the system components and interactions. Of these general types, only the symbolic has been extensively applied in the resource management contest. Of course, physical and analog models can be useful in specific instances. The physical model is useful in studying the physical character- istics of certain systems. This modeling process, however, is often expensive, highly system specific, and thus quite limited in usefulness. Analog models are often very useful for general planning. There is however, no facility for describing other than qualitative relation- ships. For these reasons, the mathematical model now bears the lion's share of the modeling performed for resource planning and management, and has revolutionized information systems. Mathematical Models This model type is actually a set of model types, as diverse in make up as the set of all models as a whole. Basically, these models can be classified as either empirical or theoretical. Empirical models are developed from statistical analysis of the available data, while the theoretical models attempt to describe the system components and their interactions through the use of equations derived to match the mechanistic operation of the system. In addition, mathematical models may be specialized to perform optimization or simulation duties. Optimization involves attempting to discover the best conditions, as defined by an objective function and constraints for a given situation. Simulation seeks only to create circumstances which emulate the system processes. Three additional classifications must be taken into account to characterize the entire spectrum of mathematical models. First, the time dependency question must be addressed. Static models do not allow for conditions in which the values of the variables change with time. These models tend to be simpler, however, and require less compu- tational effort than their dynamic counterparts, which are capable of accounting for the effects of transient phenomena. Second, cross- sectional and longitudinal dimensionality can vary according to need. Real world resource systems are three dimensional in nature. However, two dimensions, or often one, may be appropriate when the processes desired can be adequately described. Temporal dimensionality is dictated primarily by a model's time dependency; dynamic models being capable of continuous longitudinal forecasting. Third, and last, to be considered is the degree to which real world variability is built into the model. Deterministic models are based on physical laws and empirical formulas, and are frequently regarded as expected-mean-value models. Stochastic (or probabilistic) models take into account the randomness or variability inherent in system phenomena as well as the errors associated with quantitatively characterizing the system. Although more realistic in their representation of physical processes, excessive data requirements often limit their opportunity for use. Considerations Central to Model Use Mathematical models can be used in a variety of ways. Whether or not they should be employed in any given situation is a separate and important question. A direct answer which applies to all situations across the board does not exist. The decision to model must be made on a case by case basis according to the specific requirements of each case. The availability of an existing model which is suitable or the need for developing a new model to meet specifications must be considered. Time constraints, data availability, and data collection requirements additionally all exert a strong influence on decisions to model, and may differ substantially from case to case. A basic understanding of the advantages and disadvantages of model building provides a basis on which to make this decision. Chappelle (1972) elaborates the major strengths of mOdel building. He notes that: l) the most successful predicting systems to date employ model building, 2) since the model represents the designer's view of the real world, all of the inherent simplifying assumptions are explicitly recognized, 3) models force recognition of shortcomings at an early stage, 4) once represented symbolically, the system can be related Inore easily in the literature, and can be more easily manipulated, £5) mathematical models are often the least expensive means to accom- plish prediction, and 6) modern theoretical statistics can be used as ii powerful tool in the analysis and manipulation of the model. Many (dangers also exist however. Inappropriate simplification, unnecessary «:omplexity, and model building for model building's sake all reduce 'the probability that the model can be used to provide meaningful results. Foremost in the minds of model users is the ability of modeling to give quantitative answers to complex planning problems through the elucidation of fundamental cause-effect relationships. Depending on the solution technique used, these quantitative answers can be derived quickly and inexpensively. Computer assisted computation is one example. ,Model Costs As mentioned above, however, computation is only a portion of the time and cost necessary to complete a successful modeling exercise. lWodel selection and development (or modification), verification, degree (3f complexity and uncertainty are all important cost considerations. In addition, the availability of all necessary data, in the appropriate format to meet all these requirements must be taken into account. A wide variety of models are currently available, and as new models are developed, the probability of existence of a relevant model, which needs little modification, increases. Despite the fact that there are many models in existence and more being developed, this does not necessarily represent a duplication of efforts (Reckhow et. al., l980b). Models are needed for a wide range of problem types, addressing issues at different levels of complexity and precision. Additionally, in very few instances, at this point in time, can a model be expected to efficiently and effectively meet the need for specific information without some modification, and therefore some cost. A major consideration in the use of any model is the type, amount and accuracy of the data needed to carry out experiments with specified reliability. Data are needed for calibration and verifica- tion of the model, as well as for experimental applications. Calibra- tion and verification are distinctly different procedures. Calibration is performed using one or more data sets for model inputs and outputs to adjust and tune the model. Verification involves testing the calibrated model using independent data. Preferably, conditions for verification should differ from those used for calibration. Favorable comparisons between model predictions and corresponding field data serve as verification. Collection of field data, or assembly of historical data, is a costly and time consuming procedure. Once verification is accomplished, however, repeated experiments may be carried out quickly and with a small additional cost. Therefore, numerous alternative scenarios can be avaluated quickly and at low cost. The cost of experimental data, however, remains a major expense in terms of time as well as money. Complexity, Accuracy, and Uncertainty The cost of information is most acutely felt in the consideration of model complexity. The more complex and detailed a model is, the more data it requires to describe the system and the state of the system. The model developer must carefully evaluate the need for complexity in relation to the cost of the data necessary for meaning- ful applications. In addition to this data cost, computational costs are directly related to model complexity, and thus increase model operation costs. The introduction of additional complexity does not necessarily increase the accuracy of the simulation. In fact accuracy as well as precision may be adversely affected. Although an increase in accuracy often implies an increase in complexity, the reverse is not true, and a common fallacy is to mistake complexity for accuracy. The accuracy and precision (variability) of modeling results depend directly on these same attributes of the model, parameters, variables, and input data. One of the common pitfalls in using mathe- matical models is to attribute much greater weight to the model than is warranted by the accuracy and precision of the results. Such over-reliance on the simulation results may be counter productive and lead to misinterpretation of the data. Consideration of accuracy and precision, then, is a very real and vitally important issue. Model users are often placed in situations of great personal and professional risk by models and model results. It is no surprise, 10 therefore that concern over reliability has begun to be voiced. A measure of the value of the information contained in the model result is needed, and uncertainty analysis meets this need. Quantification of uncertainty is relatively new to lake modeling research. From early work by Cornell (1972) and Berthouex (1975), many other studies have followed, which address uncertainty analysis in lake phosphorus modeling (Reckhow, 1978b; Lehman, 1978; Lettenmaier and Richey, 1979; Chapra and Reckhow, 1979; Reckhow, 1979b; Reckhow and Chapra, 1979; and Simpson and Reckhow, 1979). More recently, compara- tive applications of different uncertainty analysis techniques, such as first-order, Extended Kalman Filter, and Monte Carlo simulation have been presented (Scavia, 1980; Scavia et al., 1980). Traditionally, the term uncertainty has been used to describe variability in situations where too little information is available to quantify that variability. Risk, on the other hand, is used to characterize variability in situations where enough information exists to quantify variability, often in the form of probabilities. In the present context, some liberty is taken with these definitions. Uncertainty will be estimated quantitatively to enable some qualitative comparative risk reduction. A model's inherent simplification of reality results from the in- ability to census all information completely through time and space. As a result, all model predictions reflect a certain degree of im- precision or uncertainty. Through the use of procedures such as Monte Carlo simulation, quantification of model prediction uncertainty is possible. This in turn provides yet one more piece of information l] which can be used to weigh the value of model outputs. A final consideration is the tradeoff between reliability and time. A decision based on modeling which yields results with relatively broad error bounds but that is ready in time to be used in appropriate planning activities may be preferable to a decision based on a more accurate yet complex model which is not available for planning applications until a later time. Summarngomments Mathematical models are powerful tools. By following a few simple guidelines, models can complement the qualitative information used in decision making and thereby provide valuable input to directing the planning process. The following guidelines represent a partial listing of general statements meant to serve as reminders but not as strict rules: 1. Critically define the problem and determine both what questions are to be answered and what infor- mation is needed. 2. Use the simplest method that can provide the answers to the questions. 3. Fit the model to the problem, not the problem to the model. 4. Weigh carefully increased time and cost involved in increasing accuracy, and use the simplest model that will yield the desired results. 5. Do not confuse complexity with accuracy. 12 6. Explicitly note the underlying assumptions of the model and avoid attributing more significance to the results than is actually there. A model may be thought of as really nothing more than an hypothesis or set of hypotheses which are logically combined into an integrated whole (Chappelle, 1972). The goodness of a model prediction depends on how well these guidelines are used in formulating the appropriate hypothesis and connecting elements. Most important of all, it must be recognized that mathematical models, when used properly, can expand the range of alternatives for consideration, and assist in providing information in an organized form. Models are not the panacea for resource information systems, however. When used within their limitations, models are at best tools to assist in the difficult task of evaluating alternatives. They are not a substitute for experience and good judgment, rather they are a means for permitting these qualities to be used more effectively. CHAPTER II MONTE CARLO SIMULATION Monte Carlo simulation has broad application potential as a re- search tool. Any system which can be characterized by parameters that exhibit some degree of variability is a candidate for Monte Carlo analysis. By allowing explicit treatment of parameter variability and error, this technique is an ideal tool for combining estimates of error from all sources in an examination of model prediction error. The Monte Carlo Procedure Monte Carlo simulation presumes construction of a mathematical model, which describes the stochastic behavior of the variables in the process under study. By characterizing each of these variables as a probability density function (an explicit representation of the variability and error) and not as a single value, all known components of the uncertainty associated with the model prediction are internalized. Repeatedly selecting a random value from the distribution representing each term, and using each of these randomly selected values in the model to calculate a predicted value of the dependent variable, results in a distribution of dependent variable predictions which reflects the combined uncertainties. This resulting dependent vari- able distribution allows evaluation of the potential impacts on the dependent variable as the result of perturbations to the system, just 13 14 as a single value would. However, the frequency distribution also quantifies the value of the prediction by indicating the degree of prediction precision (reliability of the information), represented by the distributional shape and spread. In addition, this information can be used to evaluate alternative models by comparing the pre- cision of the resulting distributions. Distribution Selection Before beginning this procedure, selection from among a myriad of possible probability density functions for each parameter is necessary. McGrathand Irving (19T3)provide an excellent review of this subject. It is vitally important to know how the particular process, which a variable represents, relates to the entire model in order to select an appropriate probability distribution for any given random variable. Each of the following points must be carefully considered: 1. the underlying theory of each process or event 2. data representing the variability of the process 3. sensitivity of the process to probable values of the variable 4. programming considerations When the variable under consideration is just one of many which affects the overall problem or system, the simulation is often not very sensitive to the choice of the distribution. For example, in summing a series of random variables, none of which dominate the sum, the total will tend to have a normal distribution, irrespective of the nature of the individual distributions. On the other hand, when only 15 a few variables dominate the process, or the process is greatly in- fluenced by rare occurances, the selection of probability distribu- tions becomes critical to effective simulation. A balance must be struck between theoretical justification and empirical evidence in the selection of the appropriate form of prob- ability distribution. Typically, some form of parametric distribution can be justified. Available data can then be used to estimate its parameters. If no empirical data are available, theory and intuition must suffice for selection. Carrying out sensitivity analyses using several different distributions is another means of selecting the appropriate distribution in the absence of empirical data. If abundant empirical data are available, the histogram or more elaborate parametric models can be used. The final choice of a particular distribution must also depend on the relative ease of implementation. Computer storage requirements, length of computation, and programming difficulty are also key considerations. Generating a random variable from a simple parametric distribution is a relatively simple procedure. Histograms, too, are fairly easily incorporated into Monte Carlo simulations. For more complex distribu- tions, however, simple procedures for generating random numbers are not available, and other more lengthy computational algorithms must be used. In these cases, a compromise must be made between the time and cost of complex computation and simulation rigor. Consideration of the simulation's sensitivity to individual probability distribution assumptions should be foremost in this compromise. 16 Bases for Distribution Selection Underlying the distribution selection process are two concepts: the extent to which the qualitative nature of the process is known, and the amount of quantitative data that are available. It is possible in some cases, for instance, to characterize a certain process as normal based solely on a firm understanding of particular characteris- tics and behavior. Of course, little or nothing may be known of the process. In these cases, data must be relied on for supporting the selection of one distribution over another. The amount of data, how- ever, may range from extensive (readily characterizing the variable), to none. Each case is unique in its particular combination of quali- tative and quantitative information, and therefore should be judged individually. Qualitative Basis Efforts to establish a qualitative basis for distribution selec- tion are generally based on the following: 1. similarity to a known process 2. underlying theory 3. certain other qualitative aspects A process may be similar enough to one whose behavior is well known to justify use of the known characterization for the process under study. McGrath and Irving (1973) add that even though the specific situations may not be particularly similar, an assumption of similar process may be reasonable. That is to say, that even though the particular events seem to bear no resemblance, they may share a l7 behavioral similarity which permits the assumption that the two events may be characterized by representatives from the same family of distributions. Many types of processes which are modeled stochastically have been characterized by examining the underlying theory of the process. The failure of electrical components have been widely assumed to follow exponential or Weibull distributions (Weibull, 1951). The deviation of shots from a bulls-eye is supported as having a Maxwell distribution in three dimensions and a Rayleigh distribution to two dimensions (Kendall and Stuart, 1958). The exponential distribution reflects reliability and queueing phenomena, as well as characterizing random arrival times (Goodman, 1979). There are cases, however, for which little is known of the theory of the process, and the process bears no discernable relation to any process whose behavior can be described. There are certain other qualitative aspects which may serve as clues for the identification of an applicable distribution in these cases. This is particularly true if something is known of the behavior of the process or if some data is available. Although probably not sufficient for positive identification, consideration of whether the variable is discrete or continuous, symmetric, bounded, or can be otherwise characterized, can be useful in making a reasonable selection of a distribution. Quantitative Basis The amount of data available is perhaps the most important con- sideration in the selection of probability distributions. Very 18 often, not having or not being able to collect the data necessary to describe a particular variable is the primary constraint. The col- lection process may be too lengthy or costly, or in some cases im- possible. When sufficient data are available, however, an empirical approach can be used. In essence, this involves using the data to derive a characteristic distribution which represents the variable's behavior. When insufficient data are available, and acquisition is difficult or impossible, justification of the additional resources expended in further data gathering may also be difficult. This is especially true if a reasonable distribution can be selected using the limited data. However, if a workable distribution cannot be identified, additional data may be essential for selecting a valid distribution. This is of particularly great concern if the results of the simulation are highly sensitive to that variable. Quantitative Techniques The basis for selecting a specific stochastic model, then, de- pends on the amount of qualitative and quantitative information avail- able. Of course, this information may vary from nothing to over- abundance. In this latter case, almost certain characterization of the process is possible based on sound theory and empirical observation. Development of the underlying theory traditionally involves chains of inductive and deductive reasoning. Although generation of these logical pathways often involves substantial quantitative information, the secondary use of this information is primarily qualitative. Developing empirical evidence, on the other hand, may require 19 a number of quantitative methods. Among these, sensitivity analysis, graphical analysis, parameter estimation, and goodness-of—fit testing are most common. Sensitivity Analysis Sensitivity analysis is performed to determine the extent to which one particular variable or assumption impacts the outcome of the analysis. This technique can be useful in determining if the behavior of the random component must be accurately described, especially when very little characterizing information is available. By varying the values or assumptions of the variables in question, significant differences may be revealed using standard statistical tests. This should never involve lengthy or tedious labor. Most of the variables' behavior must be characterized in the first place, if the simulation is to be relied on. If many of the variables are not accurately described, the simulation lacks validity. Graphical Analysis Frequency histograms are one means for identifying appropriate distribution models under the proper circumstances. Foremost, the modeler may benefit from the relative simplicity of histograms. They plot the frequency with which each value or class of values occurs in the sample data. In addition to being used directly in the simula- tion as the stochastic model of a particular process, these frequency plots provide a visual model of the distribution shape and thus can be useful in selecting an appropriate distribution. For many any“ ‘wa 20 applications, the use of a histogram is acceptable if abundant data are incorporated. Care must be exercised, however, in removing ob- vious errors while retaining all elements of the data set which reflect the existence of low probability events. This graphical tech- nique is most applicable in cases of limited theoretical information and at least some quantitative support. Constructed using all the data collected for a given variable, the histogram presents only the empirical information available. No assumptions of probable behavior are made. Also, by using all the data, errors in the observation set are perpetuated, and may contribute to the inaccuracy of the simula— tion results. Of special significance is the case of very limited data. Under this circumstance, accuracy is especially heavily in- fluenced by sampling irregularities and from a lack of low prob- ability events being represented. When this is the case, other qualitative considerations (ranging from underlying theory to intuition) may be used to develop better stochastic representations of the variable. Parameter Estimation The concept of parameter estimation for the development of a parametric distribution is another common approach to variable char- acterization. A parametric distribution is defined as either a functional or analytical representation of a probability distribution which depends on one or more parameters (Hastings and Peacock, 1975). 21 Several conveniences and advantages support the use of a parametric distribution function rather than an histogram, despite the additional effort required for parameter estimation (McGrath and Irving, 1973). Among other advantages, use of a parametric distribution: 1. allows reasonable models to be developed in cases of no data or very limited data, 2. allows incorporation of additional information inherent in the shape of the distribution (e.g., continuity, bounds, and symmetry) if there is theoretical justification, 3. allows meaningful extrapolation into the tail(s) of the distribution, and into other regions where no data are available, 4. provides a reproducible means of representing the data since manual fits to the same data will vary, and 5. provides important summary information about the variable in the form of estimated parameters of the fitted distribution. In addition, if the random variable may be characterized by a "simple" parametric distribution, random number generation is facilitated. Simple Parametric Distributions Simple parametric distributions are conceived as being those Which are commonly encountered, relatively easy to recognize, and (Have some theoretical basis for their functional form and application. ADAM avail flEXl 22 Also, somewhat arbitrarily, only those which are characterized by one or two parameters are considered simple, In general, these distributions may often be derived from theoretical assumptions or empirically-based graphical evidence, and are justified for selection by either knowledge of the underlying theory or by preliminary graphi- cal information. Perhaps the most used and useful of the simple probability func- tions is the normal distribution. Based on the central limit theorem, this function of course assumes that, if none of the independent elements of a sum dominates, the variable representing the sum will tend toward normality. Complex Parametric Distributions In contrast to the simple parametric distributions, the complex distribution families lack well-defined physical interpretations and are difficult to express in simple functional form. Rather, they are abstractions which provide greater flexibility than the simple distributions, and better allow projection of events which would appear in the distribution tails. The Weibull, Johnson, and Pearson distribution families commonly represent the complex distributions. Functionally complex, all these distributions require three to five parameters for specific characterization. They are especially applicable when a simple distribution cannot be justified, and when the results depend on rare events, for which insufficient data are available to accurately define the tail regions. These families are 'flexible enough to include a number of shapes. This flexibility 23 allows a reasonable fit to any set of observations. Most simple parametric distributions are in fact special cases of these complex parametric distributions families. Goodness-of—Fit Testing Following the preliminary distribution selection process, goodness-of-fit testing should be conducted to substantiate the choice. It is especially important to validate these choices if it is determined that the Monte Carlo result will be sensitive to the distribution selection. Sokal and Rohlf (1969) present an excellent, detailed review of goodness-of—fit testing. The Chi-Square test and the Kolmogorov-Smirnov test are most popular due to their wide range of application. These tests evaluate whether or not a group of data Supports the assumption that the hypothesized random variable has come from the assumed probability distribution. The underlying assumptions here are particularly important. Since the statistical inferences based on these tests rely on asymptotic properties, substantial data are required to obtain valid interpretations. The more data there are, the better the chances are of rejecting an inadequate distribution. Pel"haps more importantly, the converse is also true; the less data there are, the better the chances are of failing to reject an inade— quate distribution. In general, the Kolmogorov-Smirnov test is more sensitive than the Chi-Square test, and does not require arbitrary gm”Ding. In addition to these widely applicable tests, several distri- bUtTon specific tests are available such as the W—test for a normal distribution and the WE-test for an exponential distribution. CHAPTER III THE SIMULATION MODEL APPLICATION The Model Expression The model expression used in this simulation is an empirical, black box, lake phosphorus model. Such models are especially appli- cable in situations which allow a high degree of aggregation, both temporal and spatial (Reckhow, 1979a). The black box label results from the basic assumption of all these models that the system (the lake in this instance) may be treated as a black box. Simulation of internal lake mechanics is not attempted. Only processes which occur at the system boundaries are modeled; material input and output, and interface interaction. Being less complex mathematically than mechanistic simulations, empirical black box models are appealing to planners because they are often more compatible with the level of mathematical training and the availability of financial resources. Additionally, the relatively straightforward application of uncertainty analysis techniques to these models has encouraged analysis of prediction precision by model users who would otherwise work with deterministic methods only (Reckhow et al., l980b). Reckhow (1979a) reviews those basic model forms which empirically address lake phosphorus concentration. The first is a steady state form derived from the definition of Rp, the fraction of influent 24 25 phosphorus retained in the lake. This term is defined as follows: M - QPo Rp = T (3-1) where, on an annual basis, M = annual mass rate of phosphorus influx to the lake (103kg/yr) Q = annual volume rate of water outflow from lake (106m3/yr) p = average outflow phosphorus concentration (mg/I)- With M = LA and Q/A = 2/1, equation 3-1 becomes: R = L - (2/1) P p L LT =‘ET 7 Po L1. 2 where: L = annual areal phosphorus loading (g/mz-yr) z = lake mean depth (m.) T = hydraulic detention time (yr) A = lake surface (bottom) area (kmz) %1-= average influent phosphorus concentration (mg/l) P = annual in-lake phosphorus concentration (mg/l) 26 If it is assumed that in-lake and lake outflow concentrations are equal, LT P0 = P = E—-(l - Rp) (3-2) The remaining model forms are based on the following phosphorus mass balance (Vollenweider, 1969): dP _ _ _ VaE-- M - oPV OP (3 3) where: V = lake volume (106m3) o = sedimentation coefficient (yr-1). In essence, this states that the change in phosphorus mass in the lake (VdP) per unit time (dt) is equal to the mass input of phosphorus (M) minus the mass output of phosphorus via the outflow (QP) minus the net mass of phosphorus deposited to the sediments. This sediment "sink" term (oPV) assumes the rate of phosphorus deposition to the sediments to be proportional to the total mass of phosphorus in the lake (PV). Also, as in the first instance, the lake and outflow con- centrations are assumed to be equal. When g%-= O, the steady state solution is _ L P ‘ 2(1/1 + 0) (3‘4) The time-dependent solution is as follows: _ L -(l/ + 0) At -(l/t + 0) At Pt - 62—;rz7g-[l - e T ]-+ Pt_]e (3-5) where: At = time step (yrs.) 27 Alternatively, the phosphorus mass balance may be expressed as: dP _ Va? - M - vs PA - QP (3-6) where: vS = apparent settling velocity (m/yr) This differs from equation 3-3 in the sediment sink term by employing an areal sink, which expresses the rate of deposition to the sediments as a function of the bottom (surface) area. The resulting steady- state expression is as follows: p = (3-7) where: 2/1 = qS = areal water loading (m/yr) Therefore, P = ———————— (3-8) with the time dependent solution 1 v + 1-1 P; _____L__q_ [1 _ e”(i/T + vS/2)At] + P. e-(i/T + vS/a) At (3_9) S S The major difference between these last two model forms is the manner in which the settling velocity is expressed. The first (3-4), is based on an assumption of depth-dependent settling velocity (oz), while the other model (3-7, 3-8) assumes a constant settling velocity (Vs)' It is the time-dependent solution of this last model form (3-9) which is used as the experimental base for this application of Monte 28 Carlo simulation to lake model uncertainty analysis. System Representation As mentioned above, empirical black box lake models do not attempt to represent internal lake mechanics. Rather, they focus on material movement into and out from the system, and activity at the system interfaces. It is not surprising therefore, to find that this model addresses in-lake phosphorus concentration primarily through modeling the interrelationship among phosphorus flux via the lake interfaces (L), settling velocity (V5) and water loading (qs). Distributions and Random Number Generation Fifteen parameters are characterized using random variables. Six are represented as log normal variates: l. runoff concentration-urban 2. runoff concentration—forest 3. runoff concentration-agriculture 4. atmospheric flux-urban 5. atmospheric flux-forest 6. atmospheric flux-agriculture. The remaining nine are represented as normal variates: 7. point flux-primary treatment 8. point flux-secondary treatment 9. point flux-phosphorus removal 10. water load—atmospheric 11. water load-Ontario basin tributaries 29 12. water load-Niagra River/Welland Canal 13. concentration-Niagra River/Welland Canal 14. settling velocity 15. hydraulic detention time. All random number generation is accomplished through the use of Inter- national Mathematical and Statistical Library (IMSL) subroutines; sub- routine GGNML for normal random variates, and GGNLG for log-normal random variates (IMSL, 1979). Goodness-of-fit testing, as described earlier, is not performed to verify the selection of these parametric probability density functions. Severely limiting data restrictions, push the selection criteria balance toward the qualitative. Consequently, the generated data does not always fit the empirical histogram exactly. In this instance of limited empirical information, knowledge of the underlying theory, as well as intuition have been employed to supplement the existing empirical observations. That is to say, the parameter distributions generated by the simulation are meant to approximate, as reasonably as possible, the actual behavior of the parameter and not to exactly match the date. Phosphorus Loading This simulation (see Appendix E) places greatest emphasis on characterizing the loading term (L). This loading term is represented as the sum of the following individual terms: la) agricultural runoff concentration * tributary flow rate 1b) urban runoff concentration * tributary flow rate 30 1c) forest runoff concentration * tributary flow rate la + lb + 1c = diffuse source flux (DIFFLX) 2a) atmospheric flux to agricultural land use 2b) atmospheric flux to urban land use 2c) atmospheric flux to forested land use 2a + 2b + 2c = atmospheric flux (ATMFLX) 3a) point source flux attributable to primary wastewater treat- ment 3b) point source flux attributable to secondary wastewater treat- ment 3c) point source flux attributable to phosphorus removal waste- water treatment 3a + 3b + 3c = point source flux (PNTFLX) 4) concentration from Lake Erie * flow from Lake Erie = flux from Lake Erie (EREFLX) Therefore, L = (DIFFLX + ATMFLX + PNTFLX + EREFLX)/LSA where LSA = lake surface area. Diffuse Source Flux Diffuse source flux is obtained be selecting random values from the runoff concentration distribution for urban, forested and agricul- tural land uses. Each of these random concentration values are then multiplied by a randomly selected value from the distribution of Lake Ontario tributary flows. The runoff concentrations are represented by a log normal variate for each of the three land use types. The 31 distributions of these random variates (Figures A.l, A.2, A.3) are characterized by means and standard deviations derived from relevant data aggregated from the literature (Beaulac, 1980). For further details regarding the selection of these data, see Reckhow et a1. (1981f). Tributary flows, defined as all those tributaries in the Lake Ontario watershed other than the Niagara-Welland complex, are charac- terized by a normal random variate defining a distribution (Figure A.8) with mean and standard deviation calculated from flow data of several tributaries (Chapra, 1979). The cross-correlations among tributary flows (Reckhow et al., 19819) are sufficiently high to allow considerations of all tributary flow (other than Niagara- Welland Canal flow) by a single distribution. It is not unreasonable to believe that these tributary flows are correlated with the corres- ponding atmospheric water loading values (i.e., the more rain that' falls, the higher the tributary flows are expected). Future considera- tion should be given to incorporating this relationship into the model. One procedure in particular (Fiering, 1967) seems appropriate for use in this application. Preliminary analysis with limited data, however, does not indicate a strong correlation (r = .60). The size of each land use fraction over the 40 year experimental period (1980-2020) is expected to change relative to the others, and therefore alters the total flux contribution from each land use fraction during any given year. The simulation mechanics account for this change by increasing or decreasing the relative size of each land use fraction according to projected land use information 32 (see Table 3-1). The projected difference for each land use fraction (annual loss or gain) for the first 20 year period (1980-2000) is divided by 20 to yield an average annual change, which, when pro- gressively summed over the 20-year interval, accounts for the total change in the land use fractions. The same is done for the second 20-year period (2000-2020) using the projected loss or gain for each land use during this period. Atmospheric Flux Atmospheric flux is obtained by summing random values from three log normal atmospheric flux distributions. These distributions are defined by the mean and standard deviation of data characterizing this flux source for each of the three land use categories (Figures A.4, A.5, A.6). The data used (Reckhow et al., 1980a) reflect total bulk loads. This includes both solution phase and dry fallout com- ponents. It should be noted at this point that the dry fallout com- ponent may be a substantial and significant portion of total atmos- pheric flux (Reckhow et al., 1980a). All values in Figure A.4, A.5, and A.6 are measurements from terrestrial stations, however, and as such, probably incorporate much higher levels of dry fallout than is representative of mid-lake stations. Some appropriate reduction in these measurements, therefore, may be in order. Until further infor- mation is available, however, this lack of detailed information plus the broad inherent variability make any reduction at this point speculation. The relative weighting of atmospheric flux for each land use 33 Table 3.1: Land Use Projections for the Great Lakes Basin (103ha.) (IJC, 1977a) URBAN AGRICULTURE FOREST 1980 3416.4 16446.9 34459.4 % Total 6.29% 30.28% 63.43% A% +10.8% -1l.5% +1.3% 2000 3933.6 15818.4 34325.1 % Total 7.274% 29.252% 63.474% A% +15.l4% -3.82% -.OO4% 2020 4212.8 15497.8 33821.5 % Total 7.87% 28.95% 63.18% A% +7.l% -2.03% -l.47% Net A% 1980-2020 +23.31% -5.77% -l.85% 34 category is determined using an average land use fraction, obtained by determining the average relative size of each land use fraction over the entire 40-year experimental period. This average fraction size is used in each year's calculation of relative land use contribu- tions, unlike the weighting for atmospheric water loading values for each land use which changes on an annual basis. Point Source Flux Point source flux is a major contributor to the total phosphorus load to Lake Ontario. In order to facilitate examination of several different policy scenarios involving changes in waste water treatment strategies, and therefore point source loads, this flux source is subdivided into primary treatment, secondary treatment, and phosphorus removal. Each of these subcategories is represented by a normal distribution defined by a mean determined from survey data (De Pinto et al., 1980). The variability of these point flux contributions is estimated by standard deviations about the mean concentrations for each treatment type (Reckhow, 1978a). Two points which are deserving of future consideration are: l) the effect of population shifts on point source loads, and 2) the effect of the Finger Lakes and other watershed phosphorus traps (e.g., wetlands) on flux rates from point tributary sources. Population shifts will determine to a great extent the relative fractions serviced by primary, secondary, and phosphorus removal treatments. Additionally, point tributary loads upstream from the Finger Lakes will be affected to a certain degree by these lakes. Phosphorus flux, especially in particulate form, will be attenuated to 35 some extent by these water bodies, but exactly to what extent is not known. Lake Erie Flux Flux from Lake Erie is the final component of the phosphorus loading term (L). It is calculated by multiplying a selected random value from a normal concentration distribution by a lag-one auto- correlated flow value. The concentration distribution (Figure A.10) is defined by a mean and standard deviation from annual concentration data (Chapra, 1979). The flow data is represented by an autocorrela— tion flow model (Reckhow et al., 1981h). To incorporate this feature, a lag—one Markov model is employed to longitudinally correlate the flows. With flow initialized at the historical mean of 1.85 x 10" m3/yr (Chapra, 1979), this model combines 80% of the previous year's flow, 20% of the historical mean, plus a random component based on the standard deviation and correlation coefficient of the historical data (Figure A.9) to calculate the lag-one flow value. Some controversy exists over which concentration data most accurately reflect the flux values when multiplied by the lag-one flows. Since flows are measured in the Niagara-Welland Complex, Niagara-Welland concentrations can easily be justified. Chapra (1980), however, notes that severe shoreline erosion contributes an inordinate amount of particle—bound phosphorus to the inflow, and suggests use of eastern Erie concentration data as more representative of loadings that determine Lake Ontario concentrations. The particle- bound phosphorus is included in the total-P concentration measurements 36 taken in the Niagara-Welland area, and yet is thought not to contribute greatly to the in-lake concentration due to rapid settling near the inflow. Estimates of mid-lake availability vary, and indicate a need for examining this question further (see Reckhow et al., 1981i for a discussion of phosphorus fractions and availability). In an effort to resolve this controversy, runs using both Niagara-Welland values (R = .022, s = .002) and eastern Erie values (i = .017, s = .004) are analyzed. Settling Velocity The second major term in the model expression is vs, the apparent settling velocity. This term is obtained by selecting random values from a normal distribution (Figure A.ll), defined by a mean and standard deviation calculated from Lake Ontario data (Chapra, 1979). This term is additionally important in that it serves as the repository for model standard error. This error term is incorporated in the standard deviation of the settling velocity term, and is approximately equal to 2.7 g/m3. It is calculated by propagating model error for P through the steady state model, fit from Lake Ontario data (see Equation 3-8). When re-expressed as error in vs, this estimate for model error in the Monte Carlo simulation tests the degree to which a constant vS is appropriate for modeling Lake Ontario. In essence then, it may be thought of as measuring the lack of model fit for a constant settling velocity model. 37 Areal Water Loading The last major term in the model expression is the areal water loading term (qs), and qS = Q/A, where Q is the sum of the following terms: 1. Ontario tributary flows 2. Flow from Lake Erie 3. Atmospheric water loading Annual water load from all Ontario basin tributaries other than the Niagara River and the Welland Canal is calculated as described above (see Phosphorus Loading: Diffuse Source Flux). In summary, all tributary flows are characterized by a single normal distribution since cross-correlations are high (Reckhow et al., 19819). In addition, total water load is considered to be similar per unit area of land surface, regardless of the land use (Reckhow et al., l981e). The water load from Lake Erie is calculated as described above (see Phosphorus Loading: Lake Erie Flux) from Lake Ontario data (Chapra, 1979). Summarizing, the water load is represented as a lag-one auto-correlated parameter value. The Markov mddel combines 80% of the previous year's flow, 20% of the historical mean, plus a random component based on the standard deviation coefficient to derive the new flow (Reckhow et al., 1981h). Atmospheric water loading from precipitation is calculated from historical rainfall data (World Weather Records), obtained from the meteorologic records for the Ontario basin. Random values are generated from a normal distribution (Figure A.7) defined by the mean and standard deviation which characterize these data. 38 Additional Model Terms To fill out the model expression, lake mean depth (2), hydraulic detention time (t), lag-one phosphorus concentration (Pi-l)’ and time step (At) are needed. Hydraulic detention time is represented by a normal random vari- able selected from a distribution (Figure A.12) defined by a mean and standard deviation calculated from lake volume and outflow data provided by Chapra (1979). Twenty years of outflow data were each divided by the respective year's lake volume to arrive at a distribu- tion of hydraulic detention times. For the entire exercise, At remains one year. This of course implies annual data only, and points to a shortcoming of the model. By modeling annual loading and flow processes only, many planning objectives (especially those which hold seasonal occurrences as key events) cannot be addressed. However, taking into account the primary purpose of this study, to examine the application of uncertainty analysis on a lake model using Monte Carlo simulation, and the severe data restrictions which would be encountered in an attempt to simulate seasonal processes as well as the fact that much useful planning infor- mation can be obtained from a study of average annual conditions, this approach seems valid. The lag-one phosphorus concentration is selected from a distribu- tion of phosphorus concentration prediction values from the previous year (Pi-1)' This concentration value is randomly generated from a normal distribution defined by the mean and standard deviation of P1._1 predictions and then used to calculate the present year's 39 prediction (Pi)' It is initialized (PO) at .020 mg/l. Lake mean depth is set deterministically at 89 meters (Snodgrass, 1974), and used for all runs. Simulation Flow and Structure All the values are substituted into the model expression (Equation 3-9) for calculation of a Lake Ontario phosphorus concentra- tion prediction. The first prediction represents the concentration estimate for the year 1981. Following a number of 1981 runs to achieve a prediction distribution for that year's phosphorus concentra- tion, the land use fractions are incremented as described above and multiple iterations are again performed to achieve a prediction distribution for the year 1982. This is repeated 40 times to include a prediction distribution for the year 2020. The flow of the computer simulation is somewhat different from that described above in order to streamline certain calculations (see Appendix E). CHAPTER IV RESULTS Model Results The initial test of the model is a 40-year, lOO iteration - per- year run. The phosphorus concentration prediction mean for the 40-year experimental period is 18.2 micrograms per litre (pg/1), with a coefficient of variation for those concentration predictions of 2.75 percent. This measure of variability actually represents the year-to- year variation in the annual phosphorus concentration predictions. The 14.3 percent mean annual coefficient of variation and the 2.6 ug/l mean annual standard deviation serve to estimate within-year varia- bility, which reflects combined input errors and variabilities. Figure B.3 demonstrates the slight decreasing trend with the slope of the regression line equal to -0.0194. The annual predictions for the apparent settling velocity (R = 16.0367 m/yr), areal water load (R = 12.0348 m/yr), and the phosphorus loading (R = .5108 g/mz-yr) terms are summarized in Table C.l. Flux Figure 3.1 graphically presents the breakdown of phosphorus flux sources. Diffuse source flux comprises 32.5 percent of the total phosphorus flux to the lake, with a 40-year mean of 3.161 x 109 g/yr. The coefficient of variation for the concentration mean over this same 40 41 period is 8.8 percent, making it the most variable of all the deline- ated flux sources, and the major influence on overall variation for total flux. Point source flux is 17.5 percent of the total flux, with a 40-year mean of 1.703 x 109 g/yr. The variability of this source is 70 percent lower than that for the diffuse sources with a coefficient of variation of 2.5 percent. Phosphorus flux from the atmosphere is the smallest component of total flux at only 8.0 percent with a mean of 7.762 x 108 g/yr. Variability over the experimental period is slightly higher at 3.7 percent than that for point sources. The major source of phosphorus flux to Lake Ontario is from Lake Erie. 42 percent of all the phosphorus input to Lake Ontario comes by way of the Niagara River and Welland Canal (calcu- 1ated using Niagara-Welland concentrations). The model estimates a 40-year mean to be 4.064 x 109 g/yr with very little variability (coefficient of variation = 1.1 percent). See Table C.2 for a more complete statistical summary of the model's characterization of phosphorus flux to the lake. Water Load Water loading is broken down into atmospheric, Ontario basin tributary, and Lake Erie source components (Figure 8.2). Atmospheric water loading is the smallest component. The 1.659 x 1010 m3/yr com- prises only 7.0 percent of the total water load of 2.287 x 10H m3/yr. The coefficient of variation is 2.0 percent. The variability of the load from Ontario basin tributaries is approximately the same at 2.1 10 m3 percent, with a 40-year mean of 2.721 x 10 /yr, or about 12 perce Lake water water to be 0.3 p 51.1003 sensi‘ t0 thi Nunbex result Shad9c devlat run.) artlfa AQQFEg “ain't Either variat- 42 percent of the total. The remaining 81 percent of the water load to n m3/yr, this Lake Ontario comes from Lake Erie. At 1.849 x 10 water loading source is the most significant influence on the overall water load to the system. Additionally, one would expect this load to be consistently a significant influence as indicated by the small 0.3 percent coefficient of variation. Table C.3 presents a tabular summary of the water loading terms. Model Sensitivities Tables C.4 and C.4a summarize the results of the three different sensitivity experiments. The first involves the model's sensitivity to the number of annual iterations performed. Number of Runs Figures B.3, 4, and 5 are the graphical representations of the results for lOO-, 500-, and 1000-iteration runs respectively. The shaded portion reflects variability of plus or minus one standard deviation. (See Appendix D for a discussion of the 1000-iteration run.) There appears to be little sensitivity to this factor other than artifacts of the increasing number of iterations per year. On an aggregate basis, all experiments reflect predictions of comparable magnitude. In addition, no conclusive differences are evidenced by either measure of mean annual prediction variability (coefficient of variation, CV and standard deviation, s). 43 Distribution Selection Comparing results for the distribution selection experiments (Table C.4, Figure 3.13) reveals a higher prediction for the run using all normal distributions to characterize the model parameters. The average annual variability, however, is much lower than that for the 100 iteration base run ( V = .084 as opposed to .144). Random Seed Sensitivity to a change in the random seed appears to a limited extent in two of the three cases tested. Table C.4a presents a com- parison of these cases. In the case of 500 iterations per year, the general appearance of the time series plot is noticeably altered by changing the random seed from 123457.00 (double precision) to 987543.00 (Figures B.4 and B.4a respectively). The summary statistics of each of these runs, however, do not differ appreciably. The in- creased random seed results in a mean annual concentration prediction that is approximately 2 percent higher (19.1 ug/l vs. 18.7 ug/l), and a mean annual variability that is just slightly lower (14.3 percent vs. 14.4 percent). In both instances, the concentration appears to change very little over the 40-year experimental period. In the case of all normal distributions characterizing the para- meters, the increase in the magnitude of the random seed from 123457. 00 (Figure B.l3) to 987225.00 (Figure B.l3a) again results in a dis- tinct change in the overall appearance of the time series plot. The larger random seed results in an average annual concentration predic- tion just less than 9 percent greater than that for the small random 44 seed (21.1 ug/l vs. 19.4 ug/l). The average annual variability is comparable for the two runs with the larger random seed resulting in a somewhat smaller value. In addition, the tendency of the concentra- tion prediction to increase or decrease over time is altered by the change in random seed; in this instance changing from a negative slope when the smaller random seed is used, to a positive slope, when the larger random seed is used. Unlike the first two cases, changing the random seed for the 1000 iteration-per-year run appears to have little effect on either the general appearance of the time series plot or the magnitude of the predicted concentrations (Figures B.5, B.5d). Table C.4a does, however, point out an increase in the average annual variability of approxi- mately 3.5 percent when the larger random seed is used. (See Appendix D for a discussion of the periodicity present in the 1000 iteration— per-year runs.) PlanninQAScenarios Testing the effects of planning scenarios is one of the primary advantages of modeling. Tables C.5 and C.5a summarize the results of eight experimental scenarios. Point Source Flux The first two involve changing the nature of the point source flux term by: a) reducing point source flux variability, and b) upgrading all wastewater treatment facilities to phosphorus removal status. The first case attempts to experimentally test whether or not regulations 45 designed to tighten up operational efficiency, rather than to set upper bounds on discharge, will have any effect on in-lake phosphorus concentration. Figure B.6 indicates no great deviations from the 100 iteration-per-year base run (Figure B.3). In fact, the two runs are identical in many respects. Most importantly, they share a common average annual concentration prediction (18.2 ug/l) as well as a common average annual coefficient of variation (14.3 percent). It should be re-emphasized at this point that this analysis of point source flux variability may be questionable since the characterization of input variabilities is in such doubt. The second case is designed to test the effect of upgrading all wastewater treatment facilities by hypothesizing phosphorus removal treatment for those plants which do not currently employ this treatment (approximately 47% of the total point source contribution). It should be noted at this point that three treatment plants are not included in this treatment change over. Influent/effluent data are not available for these plants, which in fact contribute less than 4% of the total point source load. Figure 8.7 displays a remarkable resemblance to both the 100 iteration- per-year base run (Figure B.3) as well as the run using one half of the point source variability (Figure B.6). The major differences are a slightly lower mean annual concentration prediction (17.4 pg/l vs. 18.2 ug/l) and a mean annual coefficient of variation about 4 percent higher than the 14.3 percent variability of the base run. The effects of these hypothesized changes on the total point source flux is graphically represented in Figure 8.8. Total point source flux is defined here to be phosphorus flux from all wastewater 46 treatment plants with an average daily flow in excess of 1 million gallons per day (mgd). There are 91 such facilities in the Lake Ontario basin which account for well above 90 percent of the actual total point source phosphorus load (DePinto, 1980). Industrial sources are not included, as their contribution is relatively minor (Chapra and Sonzogni, 1979). The reduced variability has very little effect on the total point source flux. A reduction of less than two 9 g/yr to 1.67 x lOa percent in the average annual flux from 1.70 x 10 g/yr results. The upgrading to all phosphorus removal treatment, however, has a marked effect; reducing the mean annual point flux value by 27 percent. Erie Concentration The third experimental scenario measures the effect of alternate concentration values used to characterize the flux from Lake Erie. The concentration values used for all simulation runs are those taken from Niagara River-Welland Canal data. Chapra (1979) contends that excessive shoreline erosion in the area of the Niagara-Welland complex may contribute a large portion of particle-bound phosphorus to the lake. How much of this particulate fraction is actually available at mid-lake sites is questionable, however. Chapra therefore suggests that a more accurate estimate of this flux source may be derived using eastern Lake Erie concentration values. The time series plot of this run (Figure 3.9) is again very similar in general appearance to the 100 iteration-per-year base run. Since flux from Lake Erie is the largest single source, it is not surprising 47 to note that changes in this term result in distinct changes in the final concentration prediction distribution (see Table C.5). The mean summarizing the eastern Lake Erie concentration data is 23 percent lower than that used for the Niagara-Welland data (17 ug/l vs. 22 ug/l), and results in an average annual concentration prediction 1.7 ug/l less than the 18.2 ug/l mean value for the Niagara-Welland run. In addition, the eastern Lake Erie data is 15 percent more variable than the Niagara-Welland data. As a consequence, the mean annual coef- ficient of variation for the entire experimental period rises by 2 percent. Land Use Fractions The five remaining experimental planning scenarios test the effects of a) altering the balance of land use fractions, and b) altering the rate of that change (Table C.5a). Doubling, or even tripling the rate of current projected land use pattern shifts has virtually no effect whatsoever on the concentration predictions over the 40-year test period. Graphical representations of these runs are not included in Appendix B. They are identical to the 100 iteration-per-year base run in almost every respect. One slight difference is evident, however. As the rate of change increases, the average annual coefficient of variation decreases slightly from 14.3 percent to 14.1 percent (doubling), and again to 13.8 percent (tripling). The last three experiments hypothesize land use shifts to all urban, or all agriculture, or all forest by the end of the 40-year 48 period (Table C.5a). Beginning with the current land use fractions, one land use type is incrementally enlarged until the entire lake basin is covered with this land use at the end of the test period. The other two land use types are incrementally decreased until they occupy none of the lake basin at the end of the test period. The results of these experiments (Figure 8.10) dramatically point out the distinct dichotomy between disturbed and undisturbed ecosystems as far as their contribution to lake phosphorus concentration is con- cerned. The two disturbed systems, urban and agriculture, contribute to increasing in-lake concentrations as their respective fractions increase. Agricultural expansion impacts the concentration more quickly than urban expansion. That is to say, the rate of concentra- tion increase is higher than that for the urban experiment. The projected increase is 1.5 ug/l every 5 years for agriculture, as opposed to 0.4 ug/l every 5 years for urban. Within-year variability for the agriculture projections is more than 1-1/2 times greater than the 13.6 percent average annual coefficient of variation for the urban projections. Increasing the undisturbed system fraction, forest, results in a decrease of the concentration predictions of 0.7 ug/l every 5 years. The average annual variability of these predictions is a very low 10.4 percent. Variability Table C.6 summarizes eight representative single year concentra- tion distributions; four from the 100 iteration-per-year base run, and four from the run using all normal distributions to characterize the 49 model parameters. It is these single year distributions which distinguish stochastic methods such as Monte Carlo simulation from strictly deterministic techniques. While deterministic time series models are able to generate estimates of variability over years, only stochastic processes allow specification of variability for any one year. These distributions (Figure B.ll and B.12) provide a great deal of information concerning the nature of the model prediction. In addition to the customary measure of location, this information allows some feeling for spread as well as third and fourth moments. In the case of the 100 iteration-per-year base run, annual variability is quite large. Actually, for the entire 40-year run, the average annual coefficient of variation ranges from 9.6 percent to 28.4 percent. Skewness may be easily estimated by calculating the mode to mean ratio, R, as follows: R = (1 + cv2)"°5 The base run distributions consistently demonstrate R-values very close to 1.0. Some positive skewness can be seen in the sample distributions. The run using all normal distributions manifests much lower annual variability. For the entire 40-year run, the average annual coefficient of variation ranges from 6.8 percent to 11.0 per- cent. In this second case, the R-values approach l.O even more closely. The histograms bear this out, being remarkably normal in appearance. One of the major questions to be answered is whether or not this within-year variation displays any tendency to increase or decrease 50 over time. Table C.7 summarizes the simulation results of this experi- ment. Testing four model sensitivity runs results in very small dif- ferences in annual variability. Over the 40-year test period, none of these runs demonstrate much of a tendency to increase or decrease pre- diction variability over time. Regression statistics for prediction standard deviation and time reveal slight decreasing tendencies on the order of 10'6 mg/l every year; -13 x 10'6 being the single largest rate of change. Even the most radical planning scenario, the shift to all agricultural land use, produces only a mild 13 x 10'5 mg/l annual increase in the average annual standard deviation. CHAPTER V ANALYSIS AND DISCUSSION A primary concern of this investigation, as with all modeling exercises, is how well this tool characterizes the system it is in- tended to characterize. The degree to which the accuracy of such a tool may be measured is limited, especially in the predictive work it is most often employed. However, three criteria which may be used for such an evaluation are: 1) Does the simulation generate valid parameter values? 2) Using these parameter values, does the model predict reasonable values for the dependent variable? 3) Does the model react to perturbations with consistency? The first of these may be easily evaluated. The fifteen para- meters that are randomly generated from distributions characterized by literature data (see Chapter III) are of course accurately repre- sented by the simulation. The ten remaining major model parameters, which are calculated parameters, also agree closely with the findings of other Lake Ontario studies (IJC, 1978; Sonzogni et al., 1978; Chapra and Sonzogni, 1979; Chapra, 1980; Simons and Lam, 1980; and Fraser, 1980). The second point is somewhat more difficult to address. The de- pendent variable, in-lake phosphorus concentration, is certainly characterized within reason by the model expression. The model 51 52 prediction agrees with all the major Lake Ontario studies listed above. However, prediction behavior over the entire 40-year test period must be assessed. In the few years immediately preceding 1980, there seems to be a "significant decreasing trend in total phosphorus concentrations" (IJC, 1977b). The 100 iteration-per-year base run of this simulation agrees very closely with the magnitude of these concen- trations, and continues the decreasing trend in the 40-year prediction period. Major responsibility for this decrease is suggested to result from changes in the sedimentation rate of total phosphorus (Fraser, 1980). This of course demands a much clearer picture of the nature of the system's sensitivity to the rate of phosphorus sedimentation than that which the current data provide. If this is the case, how- ever, more sophisticated modeling treatment of the settling velocity term (vs) is necessary (if meaningful projections are to be made). The final point is the most difficult to assess in the present context. Experimental perturbations, detailed in Chapter IV, affect the model predictions in different ways, and to different degrees. That is to say, the simulation does not react to these perturbations with any recognizable consistency. The results, however, may indeed be consistent with the behavior of the natural system. This dilemma is due in part to the relatively recent origins of studies of this type that stress prediction uncertainty. There is no cohesive body of experimental information on which to base a set of expectations concerning responses to certain given perturbations. 53 Model Sensitivities The first set of sensitivity experiments indicates no substantial effects caused by changing the number of annual iterations performed. Some change in the overall appearance of the time series plots is evident, however, due primarily to the increased probability of generating extreme values when greater numbers of iterations are per- formed. Aggregate statistics, however, belie this difference. Revising the simulation to employ only normal distributions to characterize model parameters results in a slightly elevated prediction range and a significantly reduced average annual coefficient of varia- tion (.084 vs. .144). This lower variability and higher prediction average may be explained by the fundamental differences between the manner in which normal and log-normal distributions are treated. Random log-normal parameter values can only be positive by definition. Generation of random normal parameter values, however, involves generating a random standard normal variate which will be negative 50 percent of the time. Since negative concentrations and fluxes are not easily interpreted, all but the positive random numbers are screened out before this standard normal random variate is adjusted using the sample mean and standard deviation. As a result, random values generated from the same population, characterized by each of these two probability functions will result in two different sets of summary statistics due to the inherent differences in the density functions and the simulation's treatment of them. Whether or not this revision to all normal distributions results in a more realistic representation of system behavior is not clear, however. Adequate data for comparison are 54 not available. This revision, however, does seem to increase the simulation's sensitivity to changes of the random seed. Although there are certain minor changes in the 500 and 1000 iteration-per-year runs due to changes in the random seed, none appear as significant as those changes in the run using all normal distributions. There is an 8.8 percent increase in the average annual concentration and a complete reversal of the decreasing trend accompany the experimental change of the random seed. This suggests that perhaps the use of normal distributions is less characteristic of the six parameters, previously described by log-normal distribu- tions. The interrelationships among all the parameters, however, are so involved that such a suggestion cannot possibly be validated without considerable future investigation. Since distribution selection is at the heart of Monte Carlo simulation, model sensitivity to various distributions is critical. Future Monte Carlo investiga- tions of Lake Ontario should focus on better characterization of each parameter by a well defined parametric distribution. Planning Scenarios Experimental testing of regulations that would require greater consistency in the operation of wastewater treatment plants seems to alter the magnitude of the predictions very little. In fact, a sub- stantial reduction in treatment plant effluent concentration results in only a relatively small, although distinct, 1 ug/l reduction in average annual in-lake phosphorus concentration. In this hypothesized planning scenario, plants accounting for about half of the total point 55 source load are experimentally upgraded from 47 percent phosphorus concentration reduction (average secondary treatment) to 79 percent concentration reduction (average phosphorus removal treatment). It is difficult, once again, to know with any certainty if these experimen- tal predictions are truly indicative of the system's reactions to these point source flux perturbations. Point source flux is a minor contributor at 17.5 percent of total flux, and as such would not be expected to impact the overall prediction to a major degree. The sparse information currently available on the projected impacts of point source controls on future in-lake phosphorus concentrations for Lake Ontario (see Chapra, 1980), however, suggests more substantial impacts than are evidenced by this study. It is painfully obvious that much more effort needs to be spent in gathering data to better characterize the relationships between treatment plant operation, effluent concentration, and in-lake concentration. In addition, it is quite surprising that the multi-year data that have been gathered are neither centralized nor aggregated in any useful form. As an example, no estimates of treatment plant operational variability are available. In order to simulate the natural variability in the point source flux term on an annual basis, these estimates of how great the variations in concentration reduction are from year to year are vital. It seems a rather simple matter to calculate a standard deviation for the concentration reduction performance over a period of years. Extended studies of this kind which include measures of variability, however, are non-existent (Elridge, 1980; Eastman, 1980; Palancic, 1980; King, 1980; Berthouex, 1980, Heidtke, 1980; and 56 Hore, 1980). As introduced in Chapter IV, the use of eastern Lake Erie concentration values to characterize phosphorus flux to Lake Ontario via the Niagara/Welland Complex inlet results in a substantial 2 ug/l decrease in the average magnitude of the annual concentration predic- tions. This inflow is the single greatest source of phosphorus, representing 42 percent of the total flux to the lake. It is not surprising, then, to find the model prediction directly affected by a reduction in the concentration value used to generate estimates of flux from Lake Erie. Since the use of either concentration value constitutes a major assumption (and a difference of 1.0 to 1.7 billion g/yr), fundamental understanding of why these two terms are different through critical assessment of shoreline erosion at the inlet and bio-availability at mid-lake locations is essential to meaningful planning applications. The final planning scenarios involve altering the distribution of land use types and the rate at which that distribution changes. The current projections for land use over the next 40 years, however, manifest only minute annual changes of l/lOO to 5/100 on one percent. Therefore, even though diffuse source flux represents 32.5 percent of total flux, doubling, or even tripling, the rate of change has no impact on the predicted concentrations over the test period. Only the drastic and unrealistic shifts to single land use types stimulate noticeable changes. These last changes, however, serve to highlight the distinct dichotomy between the behavior of disturbed versus undisturbed systems, and the difficulty in distinguishing any further 57 specificity in land use classification. This is to say that, although more detailed distinctions between row crop and non-row crop, pasture, fallow, and feedlot, or between deciduous and coni- ferous forest, or between industrial and residential land uses may be desirable from the planning perspective, no data are currently available to support any distinction beyond disturbed versus undis- turbed. Variability Prediction variability within years and over time is the heart of this study. Results of variability experiments are presented in Chapter IV. Table C.7 clearly points out the difference between modeling with uncertainty analysis and modeling without uncertainty analysis. Traditionally, the prediction is characterized by a single value; usually the mean. Through the use of Monte Carlo simulation techniques, a fully descriptive prediction distribution is obtained. This distribution not only allows some estimation of location, but also allows estimation of prediction precision. Qualitatively, this may be accomplished through visual examination of the prediction histogram. Quantitatively, statistics such as standard deviation (5), coefficient of variation (CV), and the mode to mean ratio (R) may be used to estimate higher order moments. The tendency for variability to increase or decrease over time is unclear from the information generated by this study. Linear regres- sion statistics calculated for the annual prediction standard devia- tions indicate slightly decreasing slopes; the largest of the 58 sensitivity runs being 13/1000 of one microgram per year. Even the planning scenario which hypothesizes the shift to all agricultural land use, the most radical change, results in an annual change in standard deviation of only 13/100 of one microgram per year. Summar In summary, there are three concrete findings that can be made directly from the results. They are: 1) Upgrading all wastewater treatment plants to phosphorus removal status substantially reduces the magnitude of the overall concentration predictions. , 2) The flux from Lake Erie is the single greatest flux source. The decision to use either the Niagara- Welland or the eastern Lake Erie concentration data, therefore, is critical to the final predictions. 3) The only distinction which can be meaningfully made with regard to land use type as it affects phosphorus flux is that between disturbed (urban and agriculture) and undisturbed (forest) systems. From these findings, several recommendations may be made to aid in the development of priorities for future investigations. First, if uncertainty analysis is to play a major role in improving the predictive capabilities of models, then measures of variability must become a routine element in data collection and presentation. Short- comings in this area are particularly evident in the context of wastewater treatment plant performance. Second, the controversy over 59 which Lake Erie concentrations best reflect the actual impact of flux from this source on Lake Ontario must be resolved. Until this question is satisfactorily addressed, Lake Ontario water quality predictions will be subject to this additional uncertainty; a sub- stantial difference of 1.0 - 1.7 x 109 grams per year. Third, existing data relating land use type and phosphorus runoff concentra- tion must be augmented if meaningful land use planning is to be accomplished. If no such effort is undertaken, all modeling distinc- tion of land use types beyond disturbed and undisturbed will continue to be artificial. FINAL COMMENTS In many ways, this research has uncovered many more questions than it has answered; highlighted more difficulties than it has overcome. It is an interim work, meant to be used as a stepping stone, and to continue the construction of a firm foundation from which this relatively new field may continue to grow. The ultimate objective of these exercises is to provide the environmental planner with a valuable tool which will enhance the quality of the planning decisions by presenting one more piece of information to be factored into the decision-making process. Those who concern themselves with the model development area exclusively often lose sight of this objective, and begin to view model develop- ment as an end in itself rather than as a means to an end. It is essential to keep in mind the eventual real world application. Models are subject to data constraints, computation inadequacies, and training insufficiencies. Often, logical and efficient models are taken too soon from their pristine birthplaces and forced to fend for themselves in the somewhat imperfect real world. After the dust settles these well-intended tools are rejected as impractical or impossible to apply. Those that survive have been developed within the limitations of existing data restrictions. Of course, there is an important and valid function for those models that are developed beyond existing constraints. This simulation, for example, is not yet ready to be 60 61 used for sophisticated planning projections. It has, however, per- formed the necessary function of pointing out priority research areas, and thereby helped focus the scope of future investigations. Continued refinement is essential for this model to realize its ultimate objective. Above all else, this work is intended to showcase uncertainty analysis, in the form of Monte Carlo simulation, through the use of a practical example. It is abundantly clear that the Lake Ontario data base is not yet comprehensive enough to support a model of even this modest detail with uncertainty analysis. It is equally clear, how- ever, that Monte Carlo techniques in specific, and uncertainty analysis in general, possess enormous potential as modeling aids to decision- making. BIBLIOGRAPHY DeF BIBLIOGRAPHY Beaulac, M.N. 1980. Nutrient Export Coefficients: An Examination of Sampling Design and Natural Variability Within Differing Land Uses. Masters Thesis, Michigan State University. 243 pp. Berthouex, A.M. 1975. Modeling Concepts Considering Process Perfor- mance, Variability, and Uncertainty. In Mathematical Modeling for Water Pollution Control Processes, p. 405-440. Keinath, T.M. and M.P. Wanielista (eds.), Ann Arbor Science Publications, Inc., Ann Arbor, Michigan. Berthouex, P. 1980. University of Wisconsin-Madison, Department of Civil and Environmental Engineering. Personal Communication. Chappelle, D.W. 1972. Lecture Notes: Model Building, Hypotheses, Development of Theory. Department of Resource Development, Michigan State University, East Lansing, Michigan. Chapra, S.C. 1979. National Oceanic and Atmospheric Administration, Ann Arbor, Michigan. Personal Communication. Chapra, S.C. 1980. Simulation of Recent and Projected Total Phos- phorus Trends in Lake Ontario. J. Great Lakes Res. 6(2): 101- 112. Chapra, S.C. and K.H. Reckhow. 1979. Expressing the Phosphorus Loading Concept in Probabilistic Terms. J. Fish. Res. Board Can. 36: 225-229. Chapra, S.C. and W.C. Sonzogni. 1979. Great Lakes Total Phosphorus Bud et for the Mid 1970's. J. Water Pollution Control Fed. 51 (10 : 2524-2533. Cornell, C.A. 1972. First-Order Analysis of Model and Parameter Uncertainty. International Symposium on Uncertainties in Hydrologic and Water Resource Systems. Vol. 3: 1245-1274. Tucson, Arizona. DePinto, J.V., J.K. Edzwalk, M.S. Switzenbaum and T.C. Young. 1980. Phosphorus Removal in Lower Great Lakes Municipal Treatment Plants. United States Environmental Protection Agency, in press. 62 1. 7st .—I. :1 Lehn 63 Eastman, J. 1980. Michigan State University, Department of Civil and Sanitary Engineering. East Lansing, Michigan. Personal Communication. Eldridge, S. 1980. Michigan Department of Natural Resources, Water Quality Division, Lansing, Michigan. Personal Communication. Fiering, M.B. 1967. Streamflow Synthesis. Harvard University Press, Cambridge, Massachusetts. Fraser, A.S. 1980. Changes in Lake Ontario Total Phosphorus Concen- tration 1976-1978. J. Great Lakes Res. 6(1): 83-87. Goodman, E. 1979. Department of Electrical Engineering and Systems Science, Michigan State University, East Lansing, Michigan. Personal Communication. Hastings, N.A.J. and J.B. Peacock. 1975. Statistical Distributions. Halsted Press, New York. Heidtke, T. 1980. Great Lakes Basin Commission, Ann Arbor, Michigan. Personal Communication. Hore, R. 1980. Ontario Ministry of the Environment, Toronto, Ontario, Canada. Personal Communication. IJC. 1977a. International Joint Commission, Land Use and Land Use Practices in the Great Lakes Basin, Summary Report--Task B. Windsor, Ontario, Canada. IJC. 1977b. International Joint Commission, Great Lakes Water Quality, Appendix B. Windsor, Ontario, Canada. IJC. 1978. International Joint Commission Technical Group to Review Phosphorus Loadings, Fifth Year Review of Canada - United States Great Lakes Water Quality Agreement - Phosphorus Loadings. Report of Task Group 3, Great Lakes Regional Office, Windsor, Ontario, Canada. IMSL. 1979. International Mathematical and Statistical Library Reference Manual. Edition 7, Vol. 2, Chapter G. Kendall, M.B. and A.S. Stuart. 1958. The Advanced Theory of Statistics. Vol. 1, Distribution Theory. Charles Griffen & Co. King, 0. 1980. Michigan State University, Water Resources Research Center, East Lansing, Michigan. Personal Communication. Lehman, J.T. 1978. Aspects of Nutrient Dynamics in Freshwater Communities. Ph.D. dissertation, University of Washington. 64 Lettenmaier, D.P. and J.E. Richey. 1979. Use of First-Order Analysis in Estimating Mass Balance Errors and Planning Sampling Activities. In Theory and Systems Analysis in Ecology, p. 80-106. Halfon, F. (edT), Academic Press, New York. McGrath, E.J. and D.C. Irving. 1973. Techniques for Efficient Monte Carlo Simulation. Office of Naval Research, National Technical Information Service, AD 762: 721-723 (3 volumes). Palancic, S. 1980. United States Environmental Protection Agency, Facility Planning Branch, Chicago, Illinois. Personal Communica- tion. Reckhow, K.H. 1978a. Sampling Designs for Lake Phosphorus Budgets. Paper presented at the 1978 American Water Resources Association Symposium on the Establishment of Water Quality Monitoring Programs. Reckhow, K.H. 1978b. Empirical Lake Models for Phosphorus: Develop- ment, Applications, Limitations and Uncertainty. In Perspectives on Aquatic Ecosystem Modeling. Scavia, D. and A. Robertson (eds.), Ann Arbor Science Publishers, Inc., Ann Arbor, Michigan. Reckhow, K.H. 1979a. Quantitative Techniques for the Assessment of Lake Quality. United States Environmental Protection Agency, EPA-440/4-79-015. Reckhow, K.H. 1979b. Uncertainty Analysis Applied to Vollenweider's Phosphorus Loading Criterion. J. Water Pollution Control Federation. 51(9): 2123-2128. Reckhow, K.H. and S.C. Chapra. 1979. A Note on Error Analysis for a Phosphorus Retention Model. Water Resources Research. 15(6): 1643—1646. Reckhow, K.H., M.N. Beaulac, and J.T. Simpson. l980a. Modeling Phosphorus Loading and Lake Response Under Uncertainty: A Manual and Compilation of Export Coefficients. United States Environ- mental Protection Agency, EPA-440/5-80-Oll. Reckhow, K.H., V.D. Lee, and S.C. Chapra. 1980b. An Examination of Lake Model Prediction Uncertainty Using First Order Analysis and Monte Carlo Simulation. Paper presented at the winter meeting of the American Society of Limnology and Oceanography, January, 1980, Los Angeles, California. Reckhow, K.H., R.E. Ancil, M.N. Beaulac, V.D. Lee, and R.H. Montgomery. 1981a. Methods for the Analysis of Error and Variability in Lake Trophic State Determination. National Oceanic and Atmospheric Administration Completion Report, Contract Number O3-78-b01-109, Appendix F. 65 Ibid. 1981b. Appendix G. Ibid. l981c. Appendix H. Ibid. 1981d. Appendix I. Ibid. 1981e. Appendix J. Scavia, D. 1980. Uncertainty Analysis of a Lake Eutrophication Model. Ph.D. dissertation, University of Michigan. ScaVia, 0., W.F. Powers, R.P. Canale, J.L. Moody. 1980. Use of First-Order Error Analysis and Monte Carlo Simulation in Time- Dependent Lake Eutrophication Models. Great Lakes Environmental Research Lab. (NOAA) Contribution #234. Ann Arbor, Michigan. Simons, T.J. and D.C.L. Lam. 1980. Some Limitations of Water Quality Models for Large Lakes: A Case Study of Lake Ontario. Water Resources Research. 16(1): 105-116. Simpson, J.T. and K.H. Reckhow. 1979. Assessing Phosphorus Loading on Lakes. Paper presented at the North American Lake Management Conference, Michigan State University. East Lansing, Michigan. Snodgrass, W.J. 1974. A Predictive Phosphorus Model for Lakes - Development and Testing. Ph.D. diSsertation, University of North Carolina, Chapel Hill, North Carolina. Sokal, R.R. and F.H. Rohlf. 1969. Biometry. W.N. Freeman & Co., San Francisco. Sonzogni, W.C., J. Montieth, W.N. Bach, and V.G. Hughes. 1978. United States Great Lakes Tributary Loadings. PLUARG Technical Reports, IJC, Great Lakes Regional Office, Windsor, Ontario, Canada. Weibull, W. 1951. A Statistical Distribution Function of Wide Applicability. J. App. Mech. 18(3): 293-297. World Weather Records. 1941-1970. Volume 1, North America. APPENDIX A APPENDIX A GRAPHICAL DATA INPUTS Each of the twelve figures contained in this appendix displays the data used to formulate the parametric distributions which charac- terize the model parameters. The block histograms represent the actual data used in the distribution selection process. The continuous curves superimposed on these histograms are the graphical results of randomly generating values from parametric distributions defined by the mean and standard deviation of each of these data sets. 66 67 .mcmemcmumz cane: see» meowumcucmucou weocsm mscozamoge quo» Ap\mev :omumcucmucou .H a. m. a. e. m. s. m. N. . N.— ~.~ o Aouanbaug ._.< 88: 68 .mumsmcmumz Facaupzuwcm< Eoce meowuecucmucou meoczm mzsognmoza peach A_\me cowumcucmucou O.“ m. . 0 No W. m. l :0 Me No #0 o“ baud \ ADUBD A cm om .N.< 95m: .mumzmcmuez cmumeOd Eocw meowamcucmucou eeocsm maeosamosm page». .meemcsamd A_\mev :owuaeuemucou ewe. esc. 69 ril— l q a as L .3 11 cm a b n a U ,3 Ion .A L o: r em A 3 .mmm: use; nmecmnc: ob mcmcam05u< we“ soc» x=_d macozamoga ~muo» .egemcsm_d AL»- e\mv xapu N N. 7O Aouanbaug 71 .mmma vac; Pmcaupsuwca< ou mcmzamoEH< we» eocw x=_d macosamoza _muo» .m;\wc:mwd Acxlme\mv xapd as. me. we. no. mo. me. so. me. No. He. _ - Kauanbadj 72 .mmma new; umumwcom ow mcmzam05u< one soc» xzpd mzeocamoga _mbop c»- e m x: we. mo. so.A N \mw. P; we. go. Kauanbadj .3 9:5: 73 o.N .mumwczm meme 0» newnmob team: uwcmgam059< Pouch mo gm a time So: .3 A a? __\_ a: NH Aouanbadg .2 88.; 74 .mmwceu:o_cp cemem owccuzo axe; soc» woo; twee: Aczwme OPOPV 30“; o m o N O O H 4 .3 2%: Aouanbadj 75 mm mm on .v:e__m3-mcmmmmz um mwcu meme soc» 30pm —m:::< .m.< weaned e»\ e o_ ea_d as A mam oPNHV we me :2 N Kauanbadg 76 .ccm_~m3lmccmmw2 um m_cm mxmm Ease :o_umcucwucou mscoeumoge Page» .owlewcamwd Ap\mev :owumcucmocoo mmo. cmo. mNo. o_o. ago. . {J . \ l . 11- J a .D n .3 lNfiUJ rA i m 77 om Acx\ev xuwuo_m> .zpwuo—o> mumpuumm newsmaa< ;F.<,mc:owd A: l: J a b m. [NU 3 .A r m 78 Amcav we?» ON C. .me: 55:38 “81:863. .Npé 3:3... o.: C.3 :6 o.m w o.m , a i] / \ J m l3 J a b l N w u 3 A 1 m If: APPENDIX B APPENDIX B GRAPHICAL OUTPUTS Graphical interpretations of the simulation results comprise this appendix. Figures B.3, 4, 5a, 6, 7, 9, 13, and 13a have shaded areas which represent the estimated uncertainty (#1 standard deviation) about the predictions. Figures 8.11 and 12 display this same uncertainty through the use of histograms of representative single-year prediction distributions. 79 80 .Ameoeeaceee oo_v peach use .mwcm axe; .uwcmzam02u< .ucwoa .wmswwwo ”memm> so go; mwxzpd .—.m we:a_d Amczv meek omom OHON ooow awo— — _ — _ — q q — u—mm:amo2#< l m. 3% l o; \(lfé’fl l m-# l o.N mmnueHo l m.N l c.m l m.m maxm />\l!\/)\I\/>\/\/\/\\II)I\\/I\/\I) 0.: 4 ewe mcowumcmum oo~ um mcowuuwvmce cowumcucmucou mzcozqmosm eo mcmm> oe .m.m weaned 7&3 we: ONCN OHON OCON can; 1 — — u .“m” L n~ .0 ...... aa aa a 0.0 xxx .33 ........... .0: l W 1.23 mummmmu ..... 00.0 Hz” ....... Q 1 «NW mummizz : .. mmmxnmmmmmn. .0 mmmmm”. :3”. E U Q. ...... O. ..... b 3.3 ”333330 r. ..Avuxxcz: 1 ”3333““. .mm. $33.; WC 0 .Hmmwmmmmmmuxx mm a a ..... a a a a tame a. 3:33. :0. xxx“ x 3.2.”. O mam“ r. 230” U ...... H. QOmm )1 0 2233.3. I 0~ n .:..333 5 ..:”: / row. I\ l ON L mm 83 .emm> ewe m:o_pccmuH com um mco_uuwemca :owuccucmucou mzeosemoza mo memm> oe .e.m beamed Amczv weep omow oz; Doom omen: — q _ q 4 as hey “we?“ a m ”of” ”hummer”? u”mmmmwmwmemmmmm. w u 22333.2}”USER”.n ........ .HDHJHHUUH ........ I. H morn: 20:03. Nahum 000.0 ...... m: W mm 3me ..xumuflnz .33“. 333:...2333 ..... MN m as .ewmmmmm. a. .98an e m 900 .. mam”. m... 3 ”mafia. .H ...... o ”3 ....... .. U a .35. ..:... x l S O: “qunxummn x . \.l A mmmmmemc.... am anaemia. U .33.. .....D...0 f\ .333. :...9. 1 cm ”we? ”m3 3me new. flame. .n: 1 AN 84 omom c~om .me> me mcowumgwu~ com um m:o_uu_umgg cowuwgucmucou oom~ 7.3 8.: ocom _ o O O O o o o o o o o o o o d .m<.m mgzmwu h_ m~ m~ _m (l/fin) uo;1941uaauog 85 .me> gm; meowuwgmucm ooofl um mcowuuwumga cowumgucmucou wago;Qmo;a mo mgmm> ow omom ofiom Amng wevb ocom omm_ q u q .m.m mg=m_k n_ «J 0 U 3 a m“ U 1. J P m. 0 U m~ MW 6 / om mm 86 omcm ...... ...... cccccc oooooo ........ ......... .......... ......... ooooooo nnnnnn .me> gm; mcowuwgmum coo“ um mcowuumnmga cowumgucmucou Amgzv camp oficm ooow ommm — u a u ccccccc oooooooooooooo .......... oooooooooooooooooooooooo ................. . ooooooooooooooooooooooooooo oooooooooooooooo ..... ...... ooooooooooooo .......... ..... ..... ooooooooooooooo cccccccccccccc uuuuuuuuuuuuuu .............. oooooooooooooo ooooooooooooooo ........ 3333333 .............. .mm.m mgzmwm nfi mH m_ (1/5“) u0;1941uaau03 om Hm 87 ..:..m; E 20:29: 82 a. 58.80538 u we: L5... 20.528: 8522888 .33 82.: A53 we: omom Sow ooom om? _ _ _ _ L 2 3 O U 3 o o w o O .4 O 1 m: 00 O O O O O W O 1 o oo o m. o o u o o x) o 1 m o o S U o o /\ o o o o o o o o .. om o o o O O O 88 . m > ~\~ acmms me> me mcowumgwu_ ooo~ an mcowuu_vmga cowumgucmucou .um.m mg:m_m :m 3 O - U mm 3 a U 1 J 9 .- ..+ mm“ L. O U n - 5 2 U mm Amng mew» omom oflom oocm com” 1 _ . 4 L o o o o 1 o o o o o o o 01 o o o o 1 o o o 1 on 89 .me> Lma mco_pmgmu~ ooofi Lou ummm Eoncmz meLmA mcwmz m:o_uuwwmgg :o_pmgg:mu:ou .vm.m mg:m_u Amng we?» omow o~om ocow can: 7 q d _ 1 n_ o o_o.u . o o 1 mfi o o o 0 five 4 ¢_ 0 o 1 om o o o o (l/fin) u011241u33u03 90 .x: mugzo L00 xuw~_amwgw> N\H o:_m: mcowuuwumga cowumgucmucou ngozamwma mo mmmePmm .o.m mgszu Amgzv we?» omcm ofiom coom com” 1 _ u - my oooooooooooooooooooooooooooo ooooooooooooooooooooooooooooooooo ..... cccccccc “HIM“. “TV. @— uuuuu ----------------- nnnnnnnnnnnn nnnnnnnnnnnn ..... oooooooooooooooooooooooooo ooooooooo ........................ ....... ........ (l/5”) u011911uaou03 .................. ................. ..... ..... ooooo 91 .xapm “cram LOb Pw>OEmm mzLozamocQ ppm :uwz mcomuUwUmLa comeLHcmucOU maLonamozm $0 mew> De 3...: 2.: omcm Odom ooom amm— _ ...... ooooooooooo oooooooooo ....... ooooooooooooooooooooooo ...... nnnnnnnnnnnnnnnnnnn ........ ........ ........ ....... ----- ...... uuuuuu ..... ..... .~.m m.=m.. @— hfi (1/5”) UOLleJnuaouog ..IM: lo~ 92 .m:0mwgqucu ; mcowyuwumga xzpu mugzom unwoa we mgx ow .m.m mgzmmm Amng msmp omom Sow coon coma - _ _ _ A _ _ _ 4<>02mm 1 ~._ mazozmmoza 44< .1 «J H“ . n 1 m _ X 1 . m : _ 6 n0 ./ K 1 m; U >P~4~m u4<= \, 1 ¢._ \\. \/ > \l/II\\’ \ \ I \I’ l \u" \I, I/\ (.k\ (| r'|.~\ II/\\ \>/ \ < I /\ /\ I. hofl ( £ 1 m._ zsz mm ow .m.m mgsmwm Amng mswh owom ofiom ooom can; 1 q A _ ...... oooooooooooooooooo ........ ooooooooooooo O? 6 ) u011241uaou03 EKE p ooooooooooooooooooooooooooo ........... O o: e; D o: 6' Br 0 0% éagsj (u ...... .3333 ”m”... \L ............. ........ 94 .mQXH mco F—< o» mu$wzm mm: ccmg ”m:o_uumcwgm :omumgucmucou ngozamo:a we mg» oe .op.m mgzomm Amng mswp omom o_o~ ooom omma 11 . . 41 Hmmmou o o . 0...... .. I. 0 .00 O O o o O O... O ..0. on 000000 c o 0 five 0.0 m o o o 1 a a u.u 72mm: D 00 0 000000 a n .u o o n D D D o D m DOD D a o n a nu o .u .u n o .u m m o n. m D O 0 m0 1 .u o 00 O 00 o o o mazhgaummw< m— cm mm (1/5“) UOL19J1U33U03 95 .omom .oHON .coom “coma mgmm> Low mcowuowcmga cowamgucmucou gmm>1w~mcrm ooH A_\mav comumgucmucou mm ow mg 1 2‘ E A—\mzv cowpmgucmucou mm mm o. m~ _ FA+FAQ1O a: m~ om o_ mm om (0202) [ouanban (0002) Kauanbadj o A~\mav cowgmgucmucou mm r « m~ E m Ha ._2. Ap\onv cowumgucmucou mm om m~ A: m; pp.m mgzmwg (0102) Kauanban (0661) Kauanban .88 .38 .ooow .82 m.::; Low 23035339 F9502 :< 93m: mcowuuwcmga :omumgpcmucou c3 .35 €sz 96 Ap\m:v covumgucmucou A_\m:v cowumgucmucou cm mm ow m“ 1 fi fi rl1 1. m .1 m 4. u. m 1 o~ a n a a u 1 w 1 mm 3 m: K K TL 2 1 cm 0 1 cm 0 O ’11“ 1 mm Ap\m:v cowumgucmucou A_\m:v cowamgucwucou om m. mm om m~ .II+IgIIfi ><14 Flwlu. 110 >111. L m I. m 11. w a 1 c. .D 1 o0 .m n a a U M 3 1|. 1 m_ ,A 11. 4 m" ,A m... a r. 1 1 on 6 1IL [L cm mw _Il MW 97 .mcowuznwgumwo Pwsgoz Fp< acmms mcowuu_umga cowpmgucmucou ngosqmosa mo mgmm> oe .m~.m mgzmwu Amng ms_h omom ofiom ooom com, a _ q _ \L ...... nnnnn ............ ........................ ‘33 o 0 4*? L S (L/5”) uoylexquaauog ooooooooooooooooooooo 98 .mcowuzawgumwo .magoz p_< mcwma meowuumumga cowumgwcwucou .mmH.m mgzmwu Amng mam» ONON ofiom ooom omm_ T _ A - N 1 m. 0000000000000 ooooooooooooo .o l (L/fifl) u011941uaou03 oooooooooooo ...... eeeee APPENDIX C APPENDIX C TABULAR OUTPUTS Tabular presentations of summary statistics for simulation results are included in this appendix. Many of these statistiCs are not very meaningful as far as interpreting annual prediction varia- bility. Statistics such as the 40-year mean of annual means, and variability about that mean are included for the sole purpose of relative comparison. It is helpful to use these values to compare, on a relative basis, different sensitivity or scenario experiments. Tables C.6 and 7, however, directly address the focus of the study by presenting specific annual variability information. 99 100 Table C.l: Statistical Summary of 100 Iteration Base Run - Model Terms Phosphorus Concentration i = .0182 0_' = .1430 Prediction (mg/1) - s; = .0005 s = .0026 CV; = .0275 y-intercept = .0186 R; = .9989 slope = -.0194 r = -.4586 Settling Velocity i = 16.0367 y-intercept = 16.1854 Prediction (m/yr) s; = .4503 slope = -.0073 CV; = .0281 r = -l884 R; = .9988 Areal Water Load i = 12.0348 y-intercept = 12.0414 Prediction (m/yr) s; = .0460 slope = -.0003 CV; = .0038 r = -.0783 R; = 1.0000 Phosphorus Loading 2 = .5108 y-intercept = .5247 Prediction (g/mz-yr) s; = .0142 slope = -.0007 CV; = .0279 r = -.5594 R; = .9988 Table C.2: 101 Terms (g/yr) Diffuse Source Flux (32.5%) Point Source Flux (17.5%) Atmospheric Flux (8.0%) Flux From Lake Erie (42.0%) Total Flux (100%) 3.161 x 109 9 X ll s- .276 x 10 X y-intercept = 3.44 ' 1.703 x 109 9 X II .043 x 10 5)? y-intercept = 1.69 - 9 x 7.762 x 10 8 Si .287 x 10 y-intercept = 7.77 i 4.064 x 109 9 s; .044 x 10 y-intercept = 4.07 ‘ 9.702 x 109 9 X ll s- .268 x 10 x y-intercept = 9.97 CV; R- slope CV; R; slope = -.013 .0882 .9885 = -0.01 .0252 .9990 = .0006 ' .0370 .9980 = -.0003 ‘ .0108 .9998 = -.0001 .0278 .9989 Statistical Summary of 100 Iteration Base Run - Flux -.57 .17 -.012 -.022 -.564 102 Table C.3: Statistical Summary of 100 Iteration Base Run - Load Terms (m3/yr) Atmospheric Water Loading (7%) Water Load From Lake Erie (81%) Water Load From Lake Ontario Basin Tributaries (12%) Total Water Load (100%) 1.659 x 1010 .033 x 1010 .0199 .9994 1.849 x 1011 .005 x 10“ .0027 1.0000 2.721 x 1010 .056 x 1010 .0206 .9994 2.287 x 1011 .009 x 10H .0039 1.0000 y-intercept slope r y-intercept slope 1‘ y-intercept slope Y‘ y-intercept slope r Water .64 .0002 .085 9.83 .0011 .048 .73 .0005 .1140 .29 .00003 .040 103 Table C.4: Statistical Summary of Sensitivity Experiments - Phos- phorus Concentration Prediction (g/m3) Number of Iterations 100 x = .0182 CV = .144 s; = .0005 § = .0027 CV; = .0275 y-intercept = .0186 RR = .9989 slope = -.00002 r = -.4586 500 i = .01872 _V'= .144 s; = .00090 E = .0027 CV; = .0479 y-intercept = .01878 R; = .9966 slope = -.000003 r = -.O4l4 1000 x = .01899 "11‘ = .143 s; = .00095 E = .0027 CV; = .0499 y-intercept = .0186 R; = .9963 slope = .00002 r = .2282 104 Table C.4: Continued Distribution Selection All Normal Distribu- tions 2 = .0194 —V'= .084 s; = .0021 E = .0016 CV; = .062 y-intercept = .0207 R; = .9940 slope = -.00006 r = -.6367 105 Table C.4a: Statistical Summary of Sensitivity Experiments - Phosphorus Concentration Predictions (g/m3) Random Seed All Normal Distributions: RS = 123457.00 R5 = 987225.00 y-int = .0207 r = -.6367 y-int = .0204 r = .4575 slope = -.00006 EV'= .0837 slope = .00004 EV’= .0802 1000 Iterations Per Year: 500 Iterations Per Year: RS = 123457.00 R5 = 123457.00 y-int = .0186 r = .2282 y-int = .01878 r = -.0414 slope = .00002 W = .143 slope = -.0032 CV = .144 RS = 135432.00 RS = 987543.00 y-int = .0186 r = .18712 y-int = .01917 r = -.045054 slope = .00001 CV = .1441 slope =-.OOOOOlSCV = .14277 RS = 987225.00 .1589 y-int .0187 r S1ope .00001 0V. .1482 106 Table C.5: Statistical Summary of Scenario Experiments - Phosphorus Concentration Predictions (g/m3) Cut Point Source x = .0182 CV = .0143 Flux Variability _ by Half s; = .0005 s = .0026 CV; = .027 y-intercept = .0186 R; = .999 slope = -.0002 r = -.4339 Upgrade All Point x = .01736 EV'= .1494 Source Treatment _ to Phosphorus Removal 5; = .00051 s = .0026 CV; = .02911 y-intercept = .01779 R; = .9987 slope = -.00002 r = -.4808 Use Eastern Erie x = .0165 _V'= .1636 Concentration Value _ for Calculation of s; = .0005 s = .0027 Flux From Lake Erie CV; = .0298 y-intercept = .0169 R; = .9987 slope = -.00002 r = -.4383 Table C.5a: Concentration Predictions (g/m Land Use Shifts Double Rate of Fraction Shifts Triple Rate of Fraction Shifts Constant Rate Shift To All Urban Constant Rate Shift To All Agriculture 107 .01823 .00049 .0272 .9989 .01822 .00050 .0273 .9989 .02199 .00128 .0582 .9949 .02257 .00257 .1139 .9809 C') < ll UH II y-intercept slope Y‘ V UH II y-intercept slope l" C UN 11 y-intercept slope r V E y-intercept slope Y‘ .1406 .0026 .1380 .0025 .1363 .00301 .2244 .00519 Statistical Summary of Scenario Experiments - Phosphorus 3) .01865 -.00002 -.4895 .01868 -.00002 -.5197 .02025 .00008 .77249 .01833 .00021 .9399 Table C.5a: Continued Constant Rate Shift To All Forest 108 .01577 .00163 .1034 .9842 EV' .1039 § .00162 y-intercept slope 1" .01858 -.00014 -.98295 Table C.6: 109 Statistical Summary of Single Year Prediction Distribu- tions - Phosphorus Concentration (mg/l) 100 Iterations 1990 2000 2010 2020 XI XI XI X1 100 Iterations 1990 2000 2010 2020 2 X1 XI XI Per Year = .01884 = .01839 = .01858 = .01867 Per Year .02051 .01831 .01849 .01923 All Normal Distributions .00376 .00276 .00329 .00204 .00183 .00170 .00162 .00170 CV; CV; CV; CV; CV; CV2 CV; CV; .1998 .1500 .1771 .1093 .0894 .0927 .0877 .0883 .9430 .9672 .9547 .9823 .9881 .9873 .9886 .9884 Table C.7: Phosphorus Concentration (mg/l) lOO Iterations Per Year 500 Iterations Per Year 1000 Iterations Per Year 100 Iterations Per Year, All Normal Distribu- tions 100 Iterations Per Year, Shift To All Agricul- ture 5x 5 range CV; CV range 5)? 5 range CV; CV range g... X 5 range CV; CV range Si 5 range CV; CV range 52 5 range CV; CV range 110 .0026 .0017-.0054 .143 .096-.284 .0026 .0021-.0034 .144 .llO-.180 .0027 .0022-.0032 .143 .118-.167 .0016 .0013-.0023 .084 .068-.110 .0052 .0019-.0103 .224 .100-.389 y-intercept§ slope r y-intercept§ slope Y‘ y-interceptg slope Y‘ y-intercept§ slope Y‘ y-interceptg slope r Statistical Summary of Prediction Variability Over Time - .0026 -.000002 -.0295 .0028 -.000013 -.2692 .0028 -.000004 -.1784 .0017 -.000005 -.2593 .0025 .00013 .7505 APPENDIX D APPENDIX D 1000 ITERATIDN-PER-YEAR PERIODICITY The persistence of the remarkable periodicity which appears in the 1000 iteration-per-year runs is quite perplexing. On a random basis, the appearance of peaks is not surprising. However, the occurance of three distinct peaks 12-1/2 to 13 years apart, regardless of the changes to the simulation, is more than just a little sur- prising. After thoroughly checking the computer code for possible errors and examining the random number generators for possible cycling, the conclusion remained that this is not representing a random process. Changing the random seed does not affect this persistent irregularity (see Figures B.5a and B.5d). A similar peak appears in the 500 iteration-per-year run, but does not persist when the random seed is changed (Figures 8.4 and B.4a). The next solution involved altering the second Markov constant (MKC2) in the lag-one flow model used to generate random lag-one correlated flow values, representing the flow from Lake Erie. Since flux from Lake Erie is the single greatest influence on phosphorus concentration in Lake Ontario, changing the nature of flow value generation should change the nature of the predictions. By increasing MKC2 by 5 percent, the random portion of the flow value is increased. This change indirectly affects the 111 112 overall variability of the entire system, and might be expected to result in peaks of greater amplitude or greater frequency. Neither expectation is realized, however. The peaks persist with the same familiar amplitude and frequency (Figure B.5b). The last attempt at solving this mystery resorts to questioning the nature of the model expression itself. The model is explicitly lag-one in character. Implicitly, however, the model lags back, to a diminishing degree, many years. That is to say, if the simulation incorporates 75 percent of the previous year's predicted concentration into the current pre- diction, next year's prediction will not only contain 75 percent of this year's prediction, but implicitly will contain 75 percent of 75 percent of last year's prediction. In this manner, even the tenth year's prediction will have incorporated in it about 8 percent of the first year's prediction. The term that dictates how great a propor- tion of the previous year's prediction will be incorporated into the ’(l/T +vS/z). Since current year's prediction is the log term, e detention time, I, is not highly variable (CV = .14) and lake mean depth, 2, is represented as a constant, the settling velocity term, Vs’ was reduced in magnitude by one-half. If the periodicity is inherent in the model expression, the period of this cyclic phenomenon should change. Once again, however, no change from the established pattern occurred (Figure B.5c). This anomaly serves to point out how very sensitive complex simulations can be to the most unexpected elements. Unfortunately, the mystery is only manifested by the 1000 iteration run. Storage 113 requirements demand 240,000 binary bits of central memory for each of these runs, making it a very expensive mystery to solve. Future efforts should focus on the possible cyclic nature of the synergistic effect of two or more randomly generated variates. In specific, there is one point which begins each peak. Each of these initiating points leaves a wide gap between it and the previous point, seemingly disregarding the lag effect that should eliminate, or at least make highly improbable, the occurrence of such points. There is, of course, a finite probability that any given random number generator will generate a cluster of high values. This prob- ability, however, is certainly not great enough to cause a one-year increase of such magnitude. Several random number generators would have to generate clusters of high values coincidentally, thereby driving the overall annual prediction artificially high. In addition, a low log term value would contribute to this gap by decreasing the percent of the previous year's prediction that is incorporated into the current year's prediction, and in this way indirectly giving greater weight to these new artificially high values. This hypothesis can only be viewed as a desperate attempt to explain a perplexing problem. Even this, however, explains only how these peaks might occur. The question of why they occur with such regularity remains unanswered. APPENDIX E APPENDIX E SIMULATION DETAILS A list of parameters and two flow charts accompany the computer listing of the FORTRAN code for the simulation. Cost reduction measures should be incorporated for longer runs (more years or more iterations per year) or for extended application. The major cost- saving device which might be considered is the elimination of the 28 dimensioned variables. At only 100 iterations per year, this necessitates the storage of 2800 values each year. The actual values are of little use other than for use in de-bugging the program. To simplify, and eliminate the need for this storage, a simple summation statement would require only one storage location for each parameter while retaining the ability to describe each year with summary statistics. In addition, some savings may be realized by employing on-line optimization routines and selecting less costly queues and rate groups for the lengthier runs. 114 115 START 1. Generate Random Parameters (one set of 15) 2. Calculate Prediction for One Year E.2. Blocks #1 & 2 are Described in Figure 3. Repeat Blocks #1 & 2 J Times for J Estimates of Year I Phosphorus Concentration 4. Calculate Summary Statistics Which Describe the Year I Prediction Distribution 5. Increment Land Use Fractions 6. Repeat Blocks #1-5 40 Times , For a Complete Run (1981-2020) END Figure 5.1. Flow Diagram of Simulation. 116 a in . to. ¢o_.aa.t.a.e us.~ co..:oage 0..:etu.s 2: I9: ..:-I: IRE... wee ave-av. 3, to. co.u:a...u.a s..uo_o. ee._~3~u 05 It. 33.3. 3.1.3 0.8 “um—um .EgumgomP< cowuumumgm yo cowumpcomogamm ovumEmmewo .<.~\m> . _\_.-e 5.3.6.93 :...9- 32.0.3 6 o Au.a.A~\w> o »\—.101—. e s . «a ~11... d l a _ 73 .32 c3... 28.. ..8 no: 3: 35.01.48azsa q _ 3. 0:33. 33353.3 :0... cc. 3;- 3.. 13 can. an 8.29 Y.- 822. .:.: v 30.: I. a .N.m 8238?; fl 832323 :.::. so» vague—em $07.3: ..:.5 3:55 :a .:: 03: to: ..:: v3 .8670 :2 3 voo— .~30_xa. .:.. 3.53 2...! .5» 25.: .:::. .33 to. I3 89...... ~85 3!: Sn :35 .c 5323.3 .5. €23.80... :8: 6329...: 9.8.2: 23.. \1 611111111111111 L ~ A.S.uau. 0..» 2.3 l9: .5: .38 to. 38.. an co.. $352.5 203—3. : :.:: .:.: 25.38... -uo .5. go..- 3.13 2.: an :.::... v:- 30368 n :23 lam 2:: .8: toga—05500 95-no— . 3:38 .39.}. affinvefi ..:-6:33 .33.... “$.53 3: 2:8 .3: 8.: 03a» 39:: 0.8 :28 :o_.=a_t.u_u u=,u.o_ Lego: 3.5.383. ..:: Ir: 32.3, 162:... occ :28 95m in: «:3 .3: 33.53. 9.3 .595 :.::. .19; ”05:5 v9.2 59:; .::. .25 duo: on: YE; _- a u~__a...c_ .111111111a‘ .832:me 82.55250 0.: 05 lot 83...: 18:9. 28 guy—om 25:09: a»... we: ::.:. 03382.: 1:05 3 333 out: gang“ co gong ._a_._=x 4 ..aa._o. .:.. 3.53 33:9 5. «~32. n 30.: gm 3.2, 2.: Eve: .:: r. 2:: E: -abcouccu 63.8.8. acct x.a.3_=x 822:3»... .8: 5.3.5.: 3.3.5 9: ...9.; 33;: I85... 95 out—om \q \ 533.5»... .:.: 3.9.335- 9»: Y..— 53 lo... 2;: 59:3 v.5 guy—em 83:: .3: 3:.— vuamance 35.539; 95 3 5:95:39: 02...: 53 23:3: «Flag?- cOZD-r: om: .:..— .sacce Leo>-o~ vc~ .8 :— oEtccawctou E.2 /\ 3:8..35..3..e ..c -::; 3: cc.— ». 2: .c .25.. it. 0...: 56...... etc goo—om PARAMETER LIST SOURCE PROGRAM-MES Parameter Name XSEED Z T PLAST LUFCFU LUFCFA LUFCFF DSEED N LSA XM, M S URBCON AGRCON FORCON AFXURB AFXFOR m Dimensionless Meters Dimensionless Grams/m3 Dimensionless Dimensionless Dimensionless Dimensionless Dimensionless Square Meters Dimensionless Dimensionless Dimensionless Appropriate Appropriate Grams/m3 Grams/m3 Grams/m3 g/mZ-yr g/mz-yr 24.93%; Changes Random Seed Lake Mean Depth Counts Iterations Previous Year's Concentration Prediction Annual % Change of Urban Land Area Annual % Change of Agricultural Land Area Annual % Change of Forested Land Area Random Seed Counts IMSL Iterations Lake Surface Area Year Counter Iteration Counter Random Number Output Vector Mean of Parameter Standard Deviation of Parameter Runoff Concentration-Urban Runoff Concentration-Agricultural Runoff Concentration-Forest Atmospheric Flux-Urban Atmospheric Flux-Forest 118 SOURCE PROGRAM-MCS - Continued Parameter Name AFXAGR ATMFLX ONTFLO PFXPRI PFXSCD PFXPRE PNTFLX EREFLO LSTFLO MKCT MKC2 ERECON VS TAU ATMLOD DIFFLX EREFLX TOTFLX SIGLOD 05 L yflitg g/mz-yr Grams/yr m3/yr g/yr g/yr g/yr g/yr m3/yr m3/yr m3/yr m3/yr 9/m3 m/yr Years m3/yr g/yr glyr g/yr m3/yr m/yr g/mz-yr Grams/m3 Notes Atmospheric Flux-Agricultural Atmospheric Flux-Total Water Load-Ontario Tributaries Point Source Flux-Primary Treatment Point Source Flux-Secondary Treat- ment Point Source Flux-Phosphorus Removal Point Source Flux—Total Water Load-Lake Erie Previous Year's Flow Markov Constant Markov Constant Influent Concentration-Lake Erie Settling Velocity Hydraulic Detention Time Water Load-Atmospheric Diffuse Source Flux Flux From Lake Erie Total Phosphorus Flux Total Water Load Areal Water Load Areal Phosphorus Flux Areal Phosphorus Flux 119 SOURCE PROGRAMS-MCS - Continued Parameter Name LOGTRM TRMONE TRMTWO PDYN PLAST STDEV Dimensionless 9/m Q/m glm g/m g/m 3 3 3 3 3 Units m Lag Magnitude Term Present Year Contribution Previous Year Contribution Time Dependent Concentration Prediction Previous Year Concentration Prediction Standard Deviation of 40 Annual Prediction Averages SORTIT Subroutine N I J GROUPIT Subroutine J JCLS Subroutine PPSELCT Subroutine SUMSTAT I T XMEAN Y X VAR CV 120 Dimensionless Dimensionless Dimensionless Dimensionless Dimensionless Dimensionless Q/m3 Dimensionless Appropriate Appropriate Appropriate Appropriate Appropriate Appropriate Appropriate Array Length Array Counter Array Counter Do-Loop Counter Class Size Array Do-Loop Counter Prediction Total Do-Loop Counter Prediction Total Mean Squared Deviations Sum of Squared Deviations Variance Coefficient of Variation Mode to Mean Ratio O HO mm (A 0 km Z 20 H CO 3 Z O U A XU J 1 Am a U 42 w OH (3 0 62 m H H m m 0. a U> b u bk >0 m (H 24 X 44 IDQm m w»: 000 H DPdMH m Hdmg>0 > umoDIQZPZW 4 OH < <0 mzuaomm m MDU 40 O MU 106C m k UZKumm AUwawthU mommnzo O wo>u~u ooooou KOCZHHHAfim OQOVVvaQQ £4 HOXXOO o (JZWJJKHUMH m OZKLhVMJOD OJZUHLDDxmwm 57oDO+XSEED q 2 l 1 WC... NOODOHH d9 HHHXH NH D o > A o A o A o \ C \ n C W D m D N u N Z w I m I U I U z \ U \ U \ U \ m \ C m G M O M 0 0&0 v C”. - C: v Qv v .1 Va W vm vm H 2 GHZN DMZ cum: 0 a CW AAONFJAOfi damn JAG UN ZHUofiZHuo 2H30c2~m monwvmdcovmmvwvxmoovx KOODZOIO$ZOoDGZKIFOQK DHQ 04 d Cch 04 O m< IOJU H JU I‘JU H J: UHDJQM HJKUHDJMU HJXU 42n D \ m N N Z w \ 0&0 V v H D «0 .JAKCI!“ *.OTOB+AFXFOR(I)*o6355+AFXAGR(I)*.2962)*LSA v 44 uuv m EH JAhA oOzH-H < m o-o 01o CEO CZD\K a» >0 Q h w§o we. U+00Q m \DOUZJ UMP UMP Mthx 40.1: "zoom-b 0 (Int...) coming come; u, 4mm: zoomma COOK. comma OOOmoXQ Eat VDOD¥H DOV A DOV A Dov A__] mUHI DGVUU CO "H DO “A OD "Hg" Or- 0 ODDJIIOHODJ Vacs; yucca; v.— ZLumH _Jc>c> E‘AqmochHUocsfnomooZv-MZA .‘ZHI lLOCDZ’HLLQODZHCKWDDZHUQDDZHCZQH «(I3 F—me’OVF-XWU‘vamxa'HOUWXNOOVO. w «H ZNNDOZKQHOHXKQWODXKNWOMXUX U< w 0N6 JOQQN mmmhw UkQ®¢ mLhJ ha u Jaw 4nd me Ammdk 4 Nb wnth MHHJXVUHHJXVUHHJXUJA a; < 4 . m m H m H m N U D H H H EUUUUEU a U U U U ESENTFD RY THIS ALGORITHM WHICH GENERATES RANDOM LAG-ONE H VALUES. R 0 (l4 124 5 E*LSTFLO)+(MKC2*SNRN) CO I! ll UKKOO .J ZHZ‘I...’ JILHCOOZWRU. LJHJI—UULDCZngF- .1 ll - GOOD L'JHV \ hJDDO ()0 V3 Luv-‘00 AXJ 3: mmoo 021.1. v C~¢P~Hm o m chC‘DO‘ one: C ..:—«000‘ F- DJ .J 5' xDNN ..JA LL ::nnxo 0 00-0" Lu O ll 2" (1’ DJ < L!) U) N c-q PARAMETERS ARE ALSO REPRESENTED BY NORMAL RANDOM 1AINING I. T RE 2.017. SD: .004 MEAN E.ERIE .0229 SD: .002 EDQNQR) ECON (GIMS) ELLAND MEAN R ? 2 U a LT.O.) GO TO 130 U‘O 4m UH¢ «muduu UMHQSM C SAM 150 125 D m H O 0 ¢ r O a H b O A O A O A h m h m 0 c: o o O O 2 D Z A Z L." 0 (DA 0 . o- 0 cf C2D CD A La A>- m+ o w on U o\ wwh m cm mscno Mid V: on: 3+ 0E0 000: 0 Q Md VMFVDOV A VZJM #4 00 ”H J" .>' ..Ja: GOOD—J v EWAU Z ACDDXAC zqu—a LfiZIIH—JDOZHO OKVDWQO vZDOOvJ 0 ll mquOADr—ooooz A>Pmo ~<HQEVUFHQEMUva~v D¢U&& IZHO mawhu «av: UAOCZO.’ Fifi-Z.- mHumw AUDKZ 3" + 4' ADOPK JZHAA H OJ IIUII ll "OX9” H H’- k JJ\ ADAAAOLLAAAA H HJHHHHl—HHHH v§ vvvavaUA XWXXO ll PJZUJOu—a J¢JJOAHHKZBU mnkdeflflhOhZ LLOOJFDVHHKDZS> H .mOHmvvommo Ovmhmcdadhha 126 m m C INCREMENT THE RANDOM SEED SUPPLEMENT. E XSEED=XSEED+10 EEUUE CONTINUE DECEMBER 1980 MCS E DEVELOPMENT RAM DUO” 2>>~2 ~..JZOH mew“— HD-ct 00-«(9- LID—JV? .JED .1'-U>- (014:: Uh—cd mu: (DH 2’ H022) OM UMH Z: r—LL umuo b<fi4 DMOZ (3»ch o: 09-1 Gwen- 33 < mom; H8: m>cru hind—l ICILJ'Q t—O.>-U 127 (PDYNQKodCLSvPLASTvSTDEV) C‘A QC‘ OH 0.8 06 LJ HHH ZZVIIV “OZ“ZLJ I—H>- >3” 3030ch 2’ X/lOO ozaoa—nr—cz mu; ~+0—m: mzu xz >> MANN D>DD I-Lut-h- (DOC/)0) o.— o o thh m 90):!) (#44 .JCI)_J.J IL<0.0. OJ .- 0 (£wa .1 E.J.! UMUU 'DJ'J") 0L) o O ¥7¥¥ O O o O ZXZZ >- 0>->- OZDQ O_>-0.0. VOVV i—vF-F- UP-H< ..:—OLI- Luv-23m (00:05: (10013 0.:/20m ..J—l—I.’ .J..J..J.J (<44 UUUU 128 MEAN OF DISTRIBUTION OF PREVIOUS YEAR"S PREDICTION L") N \D A O a: r- O 2 O 0 L'J D L...) a m U o < 0‘) o OZ 0 0— V0!- “- (If-J < o .14! o .12 t—i—>Zafl- 0.0 (Dc/W.J.? (A .— <- DJ 4 I 3: H ..- V) ..J D .1 c d U 0 Lu U t— Z < H LLJ )— O. 2) Lg O C! m a a: (no 3 HN (I) O IN C) U 2 HI < I . :3—4 (I) CD I- CLCh 2 OH H O c: ..HO 0. a: 1:4 a: mu LJ h)- h— :- LJ Or: I O ‘1 mu. (2’ I < t—(n 0. Z (no .1 DH <1 2&- 2 ML) 0 H u—o (no #- Hm CL In: 0 F-Q. EE'JEEUUE TINUE 2C) 02 ULaJ O D E 129 O h 2 W H u 0 h x O < Z J m C m D- A < ¢ h u > m m U U 4 I J h J c < O o I n U m H O 0 D o a O— U A H c D > m z m D u 4 U D b O u < H H P > I Z w m U o a > w < P O m m I m m I < O u > A m U U m a o 2 m U H 0 n O u D Z (0 (I) N H D w 2 J h m I O U C U AC 0 m H 3 h H a m o h 0 D \U D M; U X D U am D UM H o o m x4 h I> D 2 a n H LaJ>- DJ >— A 00) Z ZOE m D A O #2 O 4 H O. O. H Lu 00 < mum v A v a NH I mum m D 2 w Hh 0am h h o > A O mu >- «22> a: H H D O 'J a: Vi—I m QOH O P v D aav 0 me. b m2 w m 3 H oHZI 20mm VH mu: 0 > o wv>U U>me ZHUUN u m D HZJZCH Ionaz omum m 2 Q 00>Q3 Xhlv w HOJDIH H m “HAD m o ODD: PM OO< *- Z 2 H50." F‘LDCULIJO U om F D H O Hfiu uuumwmvmu HUH km 0 h HC ZNAADDN N um oz>>z m D woos» H322 H »o z w~>ZZHOIHIZ (Kl-D O OthZO-I D H COLLZ'BDDOOOIOMOLLJDUJZ PK>J LAO w)- OZH EDT- OJ- 4 H D mum mm (DUO! QH Oc’hJ Dc: CED> 0*- t— 10H mm (AH (02 OH 20. thJD G O—DJLIQ’. Ln.‘ mount U ZLL b—OIJDLLO— HO U0 00¢! .— H DU‘ 1— DZ DLJH o-V: oo UZ>>Z ctr- dethZ 030- OJ—Q Id :q 2') CDT—U V3.1 0.0 OLdOEH D DO:>.J 0 U i O H m m < 0 U l— >- (I) < m 4 O a t m o 3 o m 0 m m 4 m h U U CDC) 0 D D D D O O D D F- (l) 5" 7A xq N ¢ 0 m D N ¢ 6 D m H U OD ¢N N N N N n W n n ¢ H x Z xH X U U OH 00 O O O O O O O O O U D zm n—r- »- 0— r— .— r— I- Ir— 0— v— a: c >4 a U U DU 00 O O O O o O O O O U H m A a? U0 0 o 0 U U 0 U 0 U H 4 k A H 0 .J 0— X A AA A A A A A A A A AH .— :3 (I) m ho mm A m 0 o H N n ¢ m. D O m 0 HO HHH HH HH HH NH NH NH NH NH NH O < m (LH COO (3+ DO 06 0+ 00 0+ 0+ 6+ DO I! ..J H 3» 00A .A an OA on on ca 0A 0A OH I U U H 02 o 0 OH 00.1 on .c CL“ 00 DA om om 0H U 0. mo «>0 OHUH UH UH UH UH UH UH UH UH mm 3 a U4H wCH H4Um cw Um om Um mm mm um CM 04 O D XIU' a o o- . ...J ..: ...J 94 .4 .4 04 04 04 0U 4 5—7 U H HAAU AU AU AU AU AU AU AU AU A7 2 O ox zchnHHchjoH§OHfioH70H3OH1OH5OH79HHOdoq Dimc HO4HHHHno~uoHHovH9vHDHHDHHDHHoHnovAo:ogquHn HH A ZZamzAmznmZAmznmzamzamzamZAmZQm.moam my DMO40>>H >-N >4") >3 FU‘ >-\D >45 >-cc >0“ >-H « #2 t- t—Z oZOHDQDHOOHOQHOOHODHOOHOQHODHODHODHOHOPHbdhdm o;UHmeOJnt-QJ/H—OJ/M-O.VH—QJ/H—CLWHQMO-mwhavn—CLWHZO—Zt-ZI227: Mi 4 HH4 H4 H4 H4 H4 H4 H4 H4 H4 H4 H HZHKHKPQ :Houomuuonuomuouuouuomuouuomuouuomuomomomomouz moeve—H5owso~5o~ficwdoudouficwfiowfio—fioamaumuaumu O C C: D D O D D D O O DO 5 m D D N a- \D m C N 0’ \D co C‘O H H H N N N N N '0 '0 7’? If) G" 0L0 LO LO DECEMBER 1980 A MCS V I PRO R M OUR E DE ELOPMENT IVE S TY ._ < {/52 II- LIUD WMUU zwm U 3.4le- W OO< ' CDC/3 #- ufi Pm Z>>Z H<¢JLIJZ O—D 2< D CFO OommH m>4»- 0&2 UOH 4 4P0 (ZU UUH H< mup HHU} k mum ZUU HOP HUU D I O¢< mZm mcq 3H0. (nt— (H mHm H>m IUH FOL vJCLSoPLASTuSTDEV) 0A mH D> 0+ HH QA H U HHZ 27H > 132 )-XMEAN)**2 OH OCH CH? Ho)- \HC hHQ 0 0A OH H+ “H A GZH <<fi >w. HEA HXN “1“” AND R VALUES FOR YEAR *iIS) UU. IIX «\cxhxn U‘C>¥ o OWm>mlco 0v HCZHOU HHUAHHU\ DUocHe HH> QDH DmON§Z €HZH®+Z OZQ-CA'HZD H>-o> Ho—qv-qra: 222:: ”H EU H H:n¢nh~o~~ndv H t—z HHmHth DH 0 OZOHO O O moaomz MOHOPUXD>UXCXU>MUEQUQmmU m 0 DH HH 11- H ICHI mm 13 GRN 1111 "111111 9310 U NIV. LIBRARIES VWWWWWWWW 6912888