. .V.. ...V . . .... . .. ’9’ :' "OI’ 2 C .V.. - \ ta .fisfifinflfidmmwflmaflnlz 4.3... . .V. V V “H.2vI‘EIVVDQ-ki‘vtvfl. .01! 03”“ . o it? 'EJQ’vVfl- §otvttkdlcfl11vrfazn§ 6 0:91.» .u.» 0... t ..V.... illbt‘:: ..vv 7 thl!v.. r kaiava.:tug9\h‘fit J i . .1332: . . n .. . $37.» um. . n... 1.3.... .Whnz. they... .V.”.v...:2vz».i I... .. .. . v1.3..¢.1¥$§$.32n8§: ht... o . .n ~ . r... - III'OI) (9'91!sz ' t‘atw. « c» ‘suvfifist w Ir} A . ’0 V o V’I'F”D.:P' 0""..17 fl 0 . A IL. I f; . 7 ........ - I . . 3'1...st \ fv§?z.fi..eflv hufivoosu...i.d« $ V to: .u at 3V0; V {69. '3. VI: Iv .V.. o. 7". \ .i I” L vh' I A- ‘\ , 3!“; .. .5. 2V . . m . w; :33. V t V ’O’E- prQ! IL sv . I. i. z g I\ gfiggz @2355. . .flgfiflfian8rlrmamrafl . i .59 E] A t ‘ - I r: 9- wt : to u c v 9.. r- .1. . r . r g 2‘ i a. i “on“ v0 . 2. ti. a.).u:b.1£fils.§§nqlfi uuu ’1 ... X.» vvvv r o ‘ I '9. 'V‘L-Q DquOOVCIV ‘9 § .. tlbn i u‘olt : ’04. ’ollv4tti-to...oot . ! 0 pvt-Rolexiu‘tt 0%. N! Ighvfi :v$§b? V~\Oz«§:§!L$f A ao‘~ . . t . at. It. «85) q I v “wt-o: :2‘”... It. :2 3.311.. . . V ‘ t. 3r!" {la-[C “Viofir'y‘tt‘nztstbn’t‘o. . - g I Iéz: I ‘\\. I 'Io§{i00~VH-¢.:v.yt ooo W ~~ .OA loglhufimozyfl.\flnn ’1 Jt at. 3 33. {I nhshiflldnhfifi... . tit: . 3,... .65 3.5%.... V p n .0 31V. Div‘il‘t D l. .1. C . n N. . chflwflo ‘ A. a :c‘ £0.z0lflw-!AORL?O«D V« ‘0; . . t )3 0vh‘u1 r v. . . .. $13.3...2LFL‘3yufi .r. you: .1 .nhrvit... .. bk”. b.0111 a vafltufl¢d..r.vvo..vrr1!..0o§+ ‘13... I‘lfln'fivlitwduwé inwovt‘cfiol V .Lv’glfiyvfloLH-Idaproflh t‘vll r v d?! .. 0. v! shill-nu goimlnniog7do‘. tzzutévns 3..ta-n.¢¢4-fio . I A. O 2‘ Cl thl‘h‘: D b.-HO'1‘ gt". 3" -\.§ \ l r I‘I hr 2| v ... . .44.. hast». Inn}. :1.) .88.. .n 2 :quS‘nxgm... 1;. ti..t.....u .a. 1.....uWnnu fimfihhrun 4 . . :..4 ix. . 1......» E. a 1%... x V $1 43... 52.: . 2...! has. . 13:13.... 3. V .19 2% 1:... 2. 2.2 2 . . .8. .fiiqfie.u.z. 3.3.93 firth“.vvtvt.AvtostlIOI'lv'ongzgvlnflcfiu2lro‘mfi I 0.90 ‘0\ N3)“. 1.: I .3in t 1 .. :3 "1.6.4‘ 56.91.0214 . Efttonl n . . Drnrlzv‘gt '0’ :0...) Oh up til. Iv . 5.; S......u.u.u«i:.%......b.zi. Edgfiggfififigugl: I... . t x x V. .V.. - L! 2 . . Lbs... :4. 5.33.... «a 7 zvov'v‘tvv nn 001v I ‘00.! t Ii ' ‘0' tt .11" 13:0! ‘ a 5.1.7 500. 3.)): )\ 1 09.“?! .7. : . {c I 311:»!!‘07'19l‘50r0ufl it’ll}; .5 I9. 9. \ .1122. x. 1190. it v landfill... . . ..§.u.¥3§8.£uvfl¢yv .9 K. .7: c V5339! 12.-.. 2 V. 3.31:. .. y ' .V . volt......t.tir.. obuafiv...‘ {5%. .. .. a. . 1 s V .. V V Leta“: ignnvlvalivh‘uvruvvvu Xi..." dc p. o :4.“th “a; ‘Jflvflu .£ t i. 111...... 3... lg!!!) V . .V ., H.341... Vnw..Vu.u2 ..!th.7£!$!r.€$vi.$¥€: . 3.11 a? . V. v . IV)»- it'll q‘;,l".|vl'u"' I . ‘ § - i. .i . ‘u‘r . v I! i!’¥7l’!lt:£L 58‘ 1?: it ch Ola l.)9’|§t‘li‘t§.99lt V0632? I V... 7"..f’\v‘!v‘ ;.Ih.p'l'..’.£'0 ‘ 1 . lite»... .yoinu'ttota.vhvb . E if... 211.41.)!!! . - -3: n: tittiiohflv.’ . . ! .vnigp! 5.23.1393? 8.5. $314.. . . V I' . V V 3 #1 IAN... . 1 Q ‘Vingfifiihygi53 \I II I‘E". V o 311! .l. .1 I 3’ cl ‘ v I 7 §3\L\\4‘ «(W1 \i \Q. It?! .0: ... \I .u I. .« flu.»i1)VJvallDu-PII iixggig 13-4-1982 iv\..}ku.$v\l.H.}.\4 3‘..I.V.\L”R\~(t ‘5} i . 0“ I. E \(l (v iglgglvu1\!18|11 V. I . i’glvvlotfi. 30900! I! g 14!!!! I p it! .. {a - . 1:11;...» 31... In..." rt§:§i{ 00% A {iii-1% 138 i lth'b’liil‘f’! 1.1091! c \ l V 14- :! I 'Y" .t" \‘Lsynf‘i‘i‘ \4 J. .. u 3. .V.... .3113. fig. o 8. itituloo} . ls.£s)liz..\.€sVunn\ “3.10.3: V V 0' Lira! nDDIDv 10“; . N“. 3.1. I117! \ .1JRJH1 s... 5.. V - t . fleci 3. ft O V. 0039!.) 2": 1'0!" 1:80.01... 0% 3|“ ‘1‘“. 31V..\\)$$\\1\$). igii . VNV) 8‘ v . .Clvnrv.l70§.vt0l097‘..!o.£-D.O I; '7 g .wggal‘. 003N111: 3‘8“!- a .5231... r:b.lyi.{nu’t-htlv:§rofouhruut{ {tux .. 2.193..\ ..§:.1§\V(Onu)711f.. ‘ syngizv \1511i0‘2‘213 .Vv {viliialflgciog 15‘; \tlnvu‘i}! lizaz$§l§13$iv1\t§1.$!fi \b\§§.;i V if...!...!lfii¥a:!!a$¥¥9¥. 31.3.3.3 g! V.. .. sifilw. )2. 2.... 131r1x3%3\§a‘%2$§§y4 .33).? IV, bfivgia “ é . .V V -1!»...:i!s 3 .32.... .3 91?...) {lg} 2159:. 112.2. . . V V V oli-:io_{,vvh3hvflli.¢o 3.3.1.030: fig; .V .1. gxizxzif¥f¥ ..»\\ 2(5) . « l .t . o . . 3.3.15 v.p.r:..v..oov?.ll.rvaiob V .5 stai\..2€w1>?)€hnnfluuva\m¢.x §)V.i§{§$\fisxv}8vlusa\§al { .l! 1: #5 V . I'V‘nvfi9EiKY4'il‘OOt'll Ei‘fé \ I 55"; ‘ §\z\§fl\tiflfipfi‘5xii-finVflk-s“aifi'§32\ 1 in "itAIIDM'ICAI; I!!F..ri.£.1yil..1l| pgif‘og 1 h. - 1/ V . 28w}? . 1 LE? .2. n 1; 1C|)§p illicitllr GO“; Vt} 31).. ’11.! 35).. 37; s.‘ v} . J} Li: ...u...¥:tu£»§.lvv.88.ta€§lv . 3.?35nunflflsddu‘ $222.33 2.7!... 1. 1.3.27... 1.....bsfimu?! Moos. . V an»! . . )Illtf. 241(5) 5'51) .I\. it!" . I} V . . t fad->0 anpztvt$floOrri$ib§i):9rth'v-O&I 5 3.. 2.53.! .V.)..Nh‘d ta“. .1. I)?” $th! )1. if .2 Q It 1.x t. ittdfiflh’m 1% I :0§:.£LO§O§UA Qua”) ... .. V. ¥§;V§.3 {,u)\ at“? . . .. I. it \ f )xb .V- it I . 33 V ’31:... it! I n x V. .V.... ....... ‘2‘: .12. §¥1§1§§x 51.x)»; r...» \ 23.! g 36k. 0"»? . boil l V .2 g 1 2 z. I ~ ..-¢Hfi:hfl.hflfl$n.£n¥2§fifi4ti .1? ngfix: a V512. 277.144 0» 2 . V V. .2. Stagnfi $91.. I‘:I:§rit“vv§#§£ 11“)? 12537333178} .‘vvl. !.\u.11.}\\ 8a).. .. n s . )\ \szalikwl.‘ v» b E. .64... .5.an ‘1'; \Q. V 1591.121}? 5 . . 1 t‘ V . \lsxflzksxhauflfisw Qua . .. .1 u . . . ‘ It i3.) '11:). 2; r1) )1 i9§§u 13.354...” ‘1! . v v ¥\$§§tli\<£€.¢rui§£\|3\1\ . v » I'D. .{r‘iyh y I.) ii‘rvg‘lvl‘l V V 3:§\‘§§ 1 V . a s . . i r .. V. V V V V . . V . . . . V . V V 1.. fli2§341¥3fxls rt gig...» r119!!!) Off“ :4. a!!! g??l\\“\"‘l~fihfllht.3\'fil u‘ ‘ ‘9’ . i.&§16}. v ‘h-‘l¢ltllllf‘f t" ‘0 éf‘ii’fi A V . l ‘ A I [I 3‘1"." {a E?! {if 59!- )Vb'viig giigzi" ;|(.>).v:..{$vl‘fit{£ilfkll.‘l V V12. . r n .3}... .41 V V V . Vuflg...g. . . frihs. highgrxi: (I .c c 132.13)...‘ )131‘1111153 1 3134-333.” iii-32 (13!. 1V V 33.1.55; wiw“.fi§ . .D \II! . ‘1’? \‘ i l‘% . iij‘i1'\\¥§f\r.uwx‘i 31311.3!) 1 v v . . V V 4 is: . .1)!"ng Li.~3~d(.\lil 9 |1§I-\€IJ\ )1" cl ; .V V. £7. frail}. .r. itiii . $1436.”: {‘wfigi‘llz‘gii! I‘liirl‘tll... \\\’7\§v§\....3!7 . 4.. V' :0; ul { IiflV111509>rohIp|¥D. 25.. 5121?: 911113.41 iii!!! ’32}! . A! SPEE’IK.‘ 11".”hgl‘h P. .LWo‘XZiAIQlll. l1... g‘fi%1\t\¥\fi\z;% 5.3)) 1“! a!" 5‘ V V vtt 27.1!!va {Oil-tyr’tbl: If .00. if xix )i 1))! 1 V tit. . fifia" Ltll‘l‘ltu.“ol. )' ’0 V .511 I ‘flzz‘iig \§\\szigg!§ Vatift. VI . V (VI. 1!.» Vol 111.‘ Vt; r.¥.VvDI(.>OI.:I|r‘ It’Vt‘; V 50:31:13.1; 113x459]! 3.11.. Vz\‘i\7}11‘5‘§ . I . . V 5.)!) )IE 1I| )P? E viii-135’ . fl V. 1.7. LVK V . ”EV .V. .Vl.l¢lrrl)‘VPllVlv!i!-I fir?!) llrtfif lvvglo. .t’gigfiggi V vglar'l(%lt|$7\f§lfill§ - r {‘0 {K . I If. I. ELI. full»?! fat; ) §\\3\.\\2“\1\Ci 13‘ 1 12.53. 3(\Y}l1\1\‘.§j. glilgezhl‘ \ I . V V . 11%;}; . .£.Vif(11>lrnEIfnlllllEV-D.IDP ) fill . .- 1 .. x 2 .4: V s 1 .9 1 x .1453!!!- 13iglmi t! 35‘5i(\\2§\§12).( . 23v §x.R\§\VY‘\$\l€4\i . \(‘\§: 13315244501\31$23\2\§5)\ . . . 8. on; \lgxa?\i\11\1n 1))?\\ I (v3- ‘1“ 33%;.312\§p§\1§1 . 35g 731‘: {xflaiwt izizgsz. 4| \.l...\ .. :. I.\ a. V V itil‘)‘glfl»‘11\114\sll\\ivfn {3151.1 V... .VS..\‘7.V.V\q.w§i V1.4. . 2...: 3‘ 3.15}. I. < 5‘5! $151.}. V V . $213.33.; )2 7113!}{5153liqgfiaszu ‘ . 1.11 I. \ .1)... is) 1V] I u. C V. . . . . . . . V i2i1\?§1111 \111Ifi\$!\.lli)$ \.I» ‘1... \t _ VVIV. . V . V Mb III} 11.1.... .v.VV. ....V V. V .V THCSIS LIBRARY p o . D ' iiCI’" ‘_‘.;r‘.r‘,‘.‘~3i c. 531's, riff) 1 a a l! E'_ .9: i415 Kn!" {f‘i‘b '; ’V.‘ 0 0 tr . . hq~. av-w ,'-.¢'sr-"~‘ofi E,u“~:‘1.-:’°€*u a.- £304 6 U" ' \r J V This is to certify that the thesis entitled COMPARISON OF FIRST-ORDER ERROR ANALYSIS WITH THE MONTE CARLO METHOD FOR A LAKE ONTARIO TIME- DEPENDENT PHOSPHORUS CONCENTRATION PREDICTION MODEL presented by Ralph Emerson Ancil has been accepted towards fulfillment of the requirements for M.S. Resource Development degree in flwém Major professor Date “1M"- 8/ 0-7639 ll!!!IllIllIll]llllllllllllllllflIll L 3 1293 00837 3973 RETURNING MATERIALS: IVIESI_I Pla ace in book drop to LIBRARIES re ethiS CEh COUk tr‘f 0"1 “ ym rrrrrrrr d. _F____uINEStw I'II be careioohgdfbkr reurt neaedftr thedate stamped beIow. :274 ' COMPARISON OF FIRST-ORDER ERROR ANALYSIS WITH THE MONTE CARLO METHOD FOR A LAKE ONTARIO TIME- DEPENDENT PHOSPHORUS CONCENTRATION PREDICTION MODEL By Ralph Emerson Ancil A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of MASTER OF SCIENCE Department of Resource Development 1981 67/”? ‘:.. ABSTRACT COMPARISON OF FIRST—ORDER ERROR ANALYSIS WITH THE MONTE CARLO METHOD FOR A LAKE ONTARIO TIME- DEPENDENT PHOSPHORUS CONCENTRATION PREDICTION MODEL By Ralph Emerson Ancil The problem of quantifying model prediction error, or uncer— tainty, is the subject of much ongoing research. This study compares two methods of prediction error quantification for a Lake Ontario eutrophication model, namely, the Monte Carlo method and first—order analysis. The Monte Carlo method accounts for uncertainty by char— acterizing each known error source by a frequency distribution. Values are obtained randomly from the distribution for each term which Simu- lates the inherent variability. First-order analysis defines uncer— tainty by its first non-zero moment, the variance. Uncertainties are propagated through the model by using the first-order terms of the Taylor series expansion. The variances are then combined to yield the total prediction error. The comparison includes using the same parameter values; testing sensitivities; and running the Monte Carlo simulation with all-normal stochastic inputs. The results Show that for similar input values, similar predictions but different errors are obtained. ACKNOWLEDGMENTS This research was made possible in part by funds from the National Oceanic and Atmospheric Administration, grant #03-78—b0l-l09. I would like to acknowledge the fine work done by Barb Visser, Shelley Saub, and especially Sue Watt, for typing the final draft of this thesis, and to Paul Schneider for his excellent graphics. Thanks are also extended to V. David Lee, my friend and col- league, for providing me with the information of his research which is used in this study. And thanks are also extended to Dr. Daniel Chappelle and Dr. Carl Ramm, two of my committee members, for their helpful comments and criticisms. I especially wish to acknowledge Dr. Kenneth H. Reckhow for his guidance and patience during the course of my studies. My association with him has been an invaluable contribution to my education. Finally, the indispensable assistance, patience, and encour- agement of my dearest friend, my wife Clarissa, is acknowledged with deepest gratitude. It is to her that this thesis is dedicated. TABLE OF CONTENTS LIST OF TABLES LIST OF FIGURES LIST OF SYMBOLS INTRODUCTION Chapter I. THE LAKE PHOSPHORUS MODEL II. THE MONTE CARLO METHOD AND ITS APPLICATION TO THE LAKE ONTARIO MODEL . The Monte Carlo Method Definition . Distribution Selection Application to the Model Loadings . . . . Settling Velocity . Areal Water Loading Hydraulic Retention Time Other Parameters III. RESULTS OF THE MONTE CARLO SIMULATION Major Observations Base Run . . . All-Normal Run . Minor Observations Base Run Fluxes Base Run Water Load . Higher Iterations Per Year . Flux Sensitivities Land Use Fraction Sensitivities Summary . Page Chapter IV. FIRST-ORDER ERROR ANALYSIS V. RESULTS FOR FIRST-ORDER ANALYSIS The Model Prediction The Error Equation Major Observations Minor Observations Summary . VI. COMPARISON OF FIRST-ORDER WITH MONTE CARLO . The Numerical Results The Procedures . . . . Theoretical Considerations . Practical Considerations Summary . . . FINAL COMMENTS APPENDICES . A. MONTE CARLO APPENDICES . B. FIRST-ORDER ERROR APPENDICES . BIBLIOGRAPHY Page 67 88 114 3> >>J>Z|> LIST OF TABLES Statistical Summary of Major Monte Carlo Runs Base Run Modified Base Run Comparison of Numerical Results . Comparison of Values Statistical Summary of lOO Iteration Base Run~-Model Terms . . . . . . . . . . . . . Statistical Summary of Base Run Fluxes Statistical Summary of Base Run Water Load Statistical Summary of Higher Iterations . Statistical Summary of Flux SensitivitieS--Phosphorus Concentration Predictions . Statistical Summary of Land Use Shifts Page 37 44 54 56 7O 71 72 73 76 80 Figure 3.1 3.2 LIST OF FIGURES 40 Years of Phosphorus Concentration Predictions at lOO Iterations Per Year . . . . . . . 40 Years of Phosphorus Concentration Predictions Using All Normal Distributions . . . . Theoretical Behavior of Predictions and Errors . Base Run Predictions Parameter Error Input Variable Error Base Run—-Total Error 40 Years of Phosphorus Concentration Predictions at 500 Iterations Per Year Concentration Predictions Using Larger Random Seed for lOOO Iterations Per Year . . . . . 40 Years of Phosphorus Concentration Predictions Using l/2 Variability for Point Source Flux . 40 Years of Phosphorus Concentration Predictions with all Phosphorus Removal for Point Flux . 40 Years of Phosphorus Concentration Predictions Using Eastern Erie Concentration Data . . 40 Years of Phosphorus Concentration Predictions: Land Use Shifts to All One Type . . . . vi Page 20 21 38 38 42 42 43 74 75 77 78 79 8l LIST OF SYMBOLS Inter-year prediction mean Inter-year standard deviation Inter-year coefficient of variation Inter-year mode to mean ratio Intra—year coefficient of variation Steady—state model Decay factor Initial phosphorus concentration Previous year's phosphorus concentration Mean loading Mean surface overflow Mean apparent setting velocity Mean depth Mean hydraulic retention Steady-state error Fractional steady-state error Model error i the diminishing term FSSE + ME ith error (as variance) Variance of prediction Correlation coefficient of Xi with Xj' Variance of Xi INTRODUCTION With the increasing burden on the earth's water resources, the resource manager whether in the private or government sector, is called upon to balance competing interests, evaluate different sources of information and to anticipate the consequences or impact of proposed uses and activities on the resource base. To do this, the manager requires some knowledge of the natural sciences to under- stand the biological, physical and chemical processes that shape the environment and to comprehend its fragile and interdependent char- acter. Yet the water resource manager must also be familiar with the social sciences since it is in this sphere that decisions concerning the application of knowledge gleaned from the natural sciences are made. Questions of water allocation, efficiency and distribution are economic and political. Thus the planner stands at the inter- face of the natural and social disciplines. Furthermore, these two realms of thought may suggest conflicting answers to water develop- ment or management problems. What may be "good" from a strictly scientific viewpoint may not be ”good” economically or politically. Also, uncertainty may arise regarding projects or activities, the nature and effect of which may be subtle and long—term. It is not surprising then that to make sound decisions, the water resource manager requires detailed, accurate information, as well as the ability to anticipate the possible consequences of various courses of action, that is, to forecast or predict impacts. There are several different ways prediction can be accom- plished. One common way is a qualitative study of the problem resulting in a projection of possible scenarios. Another is simply to extrapolate from past experience if the present problem is analo- gous to previous ones. The conceptual model offers another approach of envisioning complex interactions among a system's components. This latter method is often closely followed by a quantitative model which expresses the component mathematically and yields a quantita— tive prediction. The complexity of these models ranges from simple input-output models (Vollenweider, l969, l975) to sophisticated treatments of biological systems (Thomann et al., l977). However, since the modeling process involves a simplification of reality, the predictions necessarily reflect a level of uncer- tainty or error. Any reduction in this uncertainty will likely increase the cost Of the modeling process in both money and time. For example, increasing the complexity of the model may reduce its prediction uncertainty but will also increase the data requirements. The costs of data acquisition, model calibration and validation, and computation will increase. The modeler must, therefore, determine what level of uncertainty is acceptable under the circumstances and consistent with the available resources of time and money. Further- more, the modeler's personal and professional reputation may be risked on the reliability of the results. Clearly, then, to the water resources manager who uses quantitative models, the quality of the results is of great importance. This concern has recently led some researchers to investigate ways of quantifying model uncertainty and thereby provide this needed information (Reckhow, l979; Chapra and Reckhow, l979). The purpose Of this paper is to compare two such methods for prediction error quantification and to evaluate their relative advan- tages and disadvantages for a time-dependent phosphorus concentration prediction model of Lake Ontario. The thesis proceeds in the following order. The reader is acquainted first with the model and its derivation. This is followed by a discussion of the Monte Carlo procedure and the model predictions of one simulation are analyzed. Then the alternative method of first- order error analysis is considered along with a presentation of its model predictions. Finally, the results of these two different approaches are compared and their relative strengths and weaknesses are discussed. The appendices are referenced in the text to supple— ment the information of the main text or to provide mathematical proofs, only the results of which are used in the texts. CHAPTER I THE LAKE PHOSPHORUS MODEL The lake phosphorus model used in this paper is a “black box“ model which means it treats the physical phenomena empirically, rather than theoretically. In other words, no attempt is made to individually account for all the processes that may be involved in producing the final product, instead, all the processes are aggregated into a few parameters which may or may not have a physical basis. To the water resources planner, the empirical approach has the advantage of rela- tive mathematical simplicity over the more sophisticated theoretical models and is, therefore, more compatible with available financial resources, as well as the planner's mathematical ability. A survey of this kind of lake phosphorus model has been made by Uttormark and Hutchins (l980) and by Reckhow (l979). The underlying assumption of this model is that given a lake's size, mean depth, phosphorus loading, and flushing rate, phosphorus concentrations can be predicted. The model was derived by first cal- culating a phosphorus mass balance for lakes. In this calculation it is further assumed that phosphorus losses occur through the outlet and through internal losses such as sedimentation and that the net internal loss is directly proportional to the amount of phosphorus in the lake. A phosphorus mass balance for a simple lake system is given by: Another dP _- V at — M —<5PV - QP l.l where, P lake phosphorus concentration (mg/l) lake volume (106m3) annual mass rate of phosphorus inflow to lake (lO3kg/yr) O 2 < H annual volume rate of water outflow from lake (l06m3/yr) sedimentation coefficient (per yr). Q II expression for the phosphorus mass balance is given by: dP _ V a? — M - VSPA — QP l.2 where, A = lake surface (bottom) area (kmz) v5 = apparent setting velocity (m/yr)- Equation l.2 differs from Equation l.l in that the sediment Sink term (0 PV) is treated as an areal sink term (vSPA) and hence the rate of phosphorus deposition to the sediments is a function of the bottom (surface) area. Equation l.l has the steady—state solution (g; = O): hfi 1.3 T where, L = annual areal phosophorus loading (g/mZ-yr) z = lake mean depth (m) o = hydraulic retention time (yr) T = hydraulic retention time (yrs). The time-dependent solution is: l _ P. = L (l-e-LT + GMT) + P. e (I/I + O)At l.4 '1 Z l-I 2+— '1' where, P1 = ith phosphorus concentration prediction (mg/l) P1_] = previous phosphorus concentration prediction (mg/l) At = time interval (yrs). Let 2 = qS = surface overflow rate (m/yr) and Equation l.3 may be restated as: P: . I.5 P = = . l.6 Note that the difference between these two expressions (Equations l.5 and l.6) is in the term for the settling velocity. In Equation l.5 the settling velocity is assumed to be depth-dependent, whereas in Equation l.6 it is assumed to be constant. A third steady-state phosphorus model is based on the frac- tion of influence phosphorus retained in the lake and is given by: M - OP _ o Rp - _—_PF_—__ l.7 where, Rp = retained influent phosphorus M = L - A P0 = average outflow phosphorus concentration (mg/l) thus R:LA-3P_O_:]_(9_)EB 18 P LA LA A L and since Q/A = Z/T (Z/T) PO Rp —l--——E———— . l.9 If P0 = P (lake and outflow concentration are equal), then the model becomes: = = __. - l.lO PO P 2 (l Rp). And, finally, by setting Equation l.3 equal to Equation l.lO, it can be shown that _ T o _ l Rp _ l + T - l + I/TO ° 1.1] These then are the basic models for lake phosphorus loadings. The question becomes which one to use. In considering the choice of an appropriate model, one needs to consider the following (Reckhow, T979): l. Consider the data base used to fit the model. Does it contain lakes similar to yours (similar, in Size, depth, climate, etc.)? 2. Has the model been used successfully on Similar lakes? 3. Are the model's limitations clearly documented? In this paper the time dependent solution of Equation l.6 is used to model phosphorus behavior in Lake Ontario and is given by: P_ _ L (1 _ e-(l/r + vS/2)At) + Pi_]e_(1/T + vS/z)At. l.l2 Although specification in cross-sectional modeling is usually not an issue (Reckhow, l979), this model is chosen in order to conform to prevous work done by Chapra (l977). CHAPTER II THE MONTE CARLO METHOD AND ITS APPLICATION TO THE LAKE ONTARIO MODEL The Monte Carlo Method Definition The Monte Carlo method iS a procedure for internalizing the variability, and thus the uncertainty, of model parameters. By characterizing the variables as distributions rather than as single values (constants) the errors or variability associated with the variables is incorporated in the model. By repeatedly obtaining random values from these distributions, the model prediction also becomes a distribution of values reflecting the combined uncertainties of the independent variables. Exactly how the distributions are described varies from one case to another. In a normal distribution, for example, the arithmetic mean and the standard deviation are ade- quate descriptors. For the log-normal case, however, the geometric mean may be preferred in addition to the standard deviation or some measure of non-normality such as the coefficient of skewness. There are, however, sufficient statistics for this instance, too (Benjamin and Cornell, l970). 10 Distribution Selection Clearly the most critical aspect of the Monte Carlo method is the selection of the distributions which characterize the variables. There are both qualitative and quantitative ways to help the modeler make the appropriate distribution choice. Qualitative basis.—-The qualitative basis for the selection of distribution depends upon the extent to which the qualitative nature of the process is known. One such way is based on the behav- ioral similarity of the one process to some other known process (McGrath and Irving, l973). Such a Similarity allows for the assump- tion that the two processes may be characterized by the same family of distributions. Another qualitative approach is based on the understanding of the underlying theory. For example, the failure of electrical components has been widely believed to follow exponential or Weibull distributions (Weibull, l95l). The deviations of shots from a bull's- eye is described by a Maxwell distribution in three dimensions and a Rayleigh distribution in two dimensions (Kendall and Stuart, 1958). Other qualitative considerations which may help in distribution selection include a knowledge of whether or not the variable is dis- crete or continuous, symmetric, bounded, and so on. Quantitative baSiS.--The amount of data is most important in distribution selection for it determines the quality of the empirical determination. Frequently, a small data base will allow the modeler “qr ll to make an appropriate choice of distributions. But if not, addi- tional data must be obtained to properly select the distribution, especially if the model is highly sensitive to distribution choice. This, of course, must be balanced against the cost necessary to pro- cure the additional data. There are a number of techniques available to the modeler to analyze the data. The frequency histogram is one such technique which ‘ __.- .__._ involves plotting the frequency with which a class of values occurs in the sample data. This provides a visual aid to determining the possi— ble shape of the distribution. Of course, the fewer the data used, the less likely low probable events will be represented and the more misleading the histogram can be. In that case the histogram should be used cautiously and supplemented with some other technique. Probability paper can also be used to help select the appro- priate distribution. This paper comes with scaled ordinates such that the cumulative distribution function of the probability law plots as a straight line. Thus comparison of the proposed distribution with the data is reduced to a comparison between the cumulative fre- quency polygon of the data and a straight line. The closer the data are to the straight line, the better the fit. Probability paper exists for a number of distributions including the triangular, exponen- tial and the normal (Benjamin and Cornell, l970). Parameter estimation is another common method for use with parametric distributions. A parametric distribution is one which is represented as a function with one or more parameters. Simple l2 parametric distributions have only one or two parameters such as the normal distribution. Complex parametric distributions, on the other hand, lack well-defined physical interpretations and are less easily expressed as a function and usually involve three to five parameters. Examples of these include the Weibull, Johnson and Pearson families of distributions. After selecting a distribution there are a number of tech- niques which can be used to test the goodness-of—fit of the choice. This is particularly critical for those variables for which the Monte Carlo result is highly sensitive to distribution choice. Two Of the most popular tests are the Chi-square and the Kolmogorov-Smirnov tests. In general, the more data used with the tests, the greater the likelihood of rejecting an inadequate distribution choice. Like- wise, the less data used, the less likely of rejecting an inadequate choice. However, with smaller data sets the modeler can use the W—test for the normal distribution and the WE-test for the exponential distribution. Application to the Model Lee (l980) performed the Monte Carlo simulation on the time- dependent form of this model. For the sake of completeness and to facilitate comparisons, the characterization of the distributions and the manner in which the values are Obtained from those distributions are briefly described below for each term in the model. The data used in the simulation are gathered from the literature, that is, from secondary sources. (For greater detail and for the computer algorithm, the reader is referred to Lee, l980.) W l3 Loadings The loading term (L) is the single most complicated expression Of the model. It is a combination of several terms, namely, diffuse source flux (DIFFLX); atmospheric source flux (ATMFLX); point source flux (PNTFLX); and Lake Erie source flux (EREFLX). These terms are added together and divided by the lake surface area (LSA) to give the total load to Lake Ontario: L = (DIFFLX + ATMFLX + PNTFLX + EREFLX)/LSA. The flux terms are themselves composed of several terms which are considered below. Diffuse flux (DIFFLX).--The diffuse source flux is the sum of three land use runoff concentrations: agricultural, urban and forested. Each concentration is represented by a log-normal dis- tribution characterized by its mean and standard deviation derived from data taken from the literature (Beaulac, 1980). For more details concerning the selection of these data, see Reckhow et al. (l98l). The sum of these three concentrations is then multiplied by a randomly obtained value from the normal distribution for the Lake Ontario tributary flows (excluding the Niagara/Welland Canal flow). The distribution is characterized by a mean and standard deviation calculated from the flow of several tributaries (Chapra, l98l). The cross-correlations among tributary flows are sufficiently high to allow their treatment with a single distribution (see Appendix A.l). 14 The land use fractions are changed over the 40 year interval by increasing or decreasing the relative size of each land use frac- tion according to projected land use information (IJC, l977). The projected difference for each land use fraction for the first 20 year period (l98O - 2000) is divided by 20 to yield an average annual change, which, when progressively summed over the 20 year interval I accounts for the total change in the land use fractions. The same method holds for the second 20 year interval. Atmospheric flux (ATMFLX).--The atmospheric source flux is the sum of the phosphorus contributions in the atmosphere from these three land uses. The three values are randomly obtained from three log-normal distributions defined by their means and standard devia- tions. The data used reflect total bulk loads (Reckhow et al., l980). The relative weighting for each land use is determined by using an average land use fraction which is itself calculated according to the average relative size of each land use fraction over the entire 40 year period. Point flux (PNTFLX).--Point source flux is the sum of phos- phorus contributions attributable to primary, secondary and phos- phorus removal wastewater treatments. Each treatment is represented by a normal distribution defined by a mean determined from survey data (DePinto et al., l980). The variability of these point flux contributions is estimated by standard deviations about the mean concentrations for each treatment type (Reckhow, l978). l5 Lake Erie flux (EREFLX).--The Lake Erie flux is calculated by multiplying a randomly Obtained value from a normal concentration distribution by a lag-one autocorrelated flow value. This concentra- tion distribution is defined by a mean and standard deviation from annual data (Chapra, l98l). The flow data are represented by a lag- one autocorrelated Markov flow model. The model uses 80% of the I previous year's flow, 20% of the historical mean and a random compo- nent based on the standard deviation and correlation coefficient of the historical data to calculate the lag-one flow value (see Appendix A.2). Settling Velocity The settling velocity term is procured by randomly selecting values from a normal distribution defined by its mean and standard deviation calculated from Lake Ontario data (Chapra, l98l). This term also contains the model standard error which is included in the standard deviation; its value is about 2.7 g/m3. It is an estimate of the lack of model fit due to the assumption of a constant settling velocity. Areal Water Loading The areal water loading term is defined as as = Q/A where, Q is the sum of the Ontario tributary flows, the Lake Erie flow, and the atmospheric water loading. l6 The calculations for the Ontario tributary flows are discussed above under Diffuse Flux and the Lake Erie flow is considered under Lake Erie Flux. The atmospheric water load from precipitation is calcu- lated from historical rainfall data (World Weather Records), obtained from theameteorologicalrecords for the Lake Ontario basin. Pseudo- random values are generated from a normal distribution defined by the mean and the standard deviation derived from the data. Hydraulic Retention Time The hydraulic retention time (T) is represented by a normal distribution defined by its mean and standard deviation calculated from lake volume and outflow data provided by Chapra (l98l). The twenty years of outflow data were each divided by the respective year's lake volume to arrive at a distribution of hydraulic retention times. Other Parameters The time interval, At, is set to one year, for all iterations since only annual data are used. In terms of many planning considera- tions this is a shortcoming of the model,however, in terms of uncer- tainty analysis it is not a problem. The previous phosphorus prediction, P1_], is randomly selected from a normal concentration distribution and is used to calculate the present year's, Pi’ phosphorus prediction. It is initialized at .0200 mg/l. The mean depth is set at 89 meters (Snodgrass, l974), and is used for all runs. CHAPTER III RESULTS OF THE MONTE CARLO SIMULATION Major Observations The results of the Monte Carlo simulation indicate that the parameter values that are randomly generated from distributions characterized by literature data are, of course, accurately char- acterized in the simulation. The model parameters, which are calcu- lated also agree closely with the findings of other Lake Ontario studies. The dependent variable, in-lake phosphorus concentration, is also reasonably well represented in the simulation. The model pre- dictions agree with major Lake Ontario studies. Some highlights of these results are given below. (For a complete description, see Lee, 1980). Base Run The fifteen parameters of the Monte Carlo Simulation include six log-normal distributions and nine normal distributions. The base run for the Simulation is a 40—year, lOO iterations per year test (see Table 3.l and Figure 3.l). The phosphorus concentration predic— tion mean for the forty year period is .Ol82 milligrams/liter (mg/l), with a standard deviation of .0005 mg/l and a coefficient of variation of 2.75 percent. This coefficient measures the year-to-year variation l7 . "my 18 in the annual phosphorus concentration predictions. The l4.3 percent mean annual coefficient of variation and the .0026 mg/l mean standard deviation estimate within-year variability which reflect the combined input errors and variabilities. (For additional results, see Table A.l). All-Normal Run The lOO iterations per year run using all-normal distributions is given in Table 3.l and Figure 3.2. The average annual standard deviation dropped to .OOl6 mg/l and its corresponding coefficient of variation dropped to .0840. The inter-year standard deviation Spanning the 40—year test, increased substantially to .002l mg/l. Its corre- sponding coefficient of variation increased to .0620. The mean increased only Slightly from .Ol82 to .Ol94 mg/l. Lee suggests that this lower variability and slightly higher prediction may be due to the manner in which the normal variates are Chosen. Unlike the log-normal case, negative values can be generated by the normal distributions. However, since negative concentrations and fluxes are difficult to interpret, only positive values are used. This results in two different sets of summary statistics. Since ade— quate data for comparison are not available, it is unclear whether or not this revision to all-normal distributions is more realistic. It should be noted, though, that the simulation may be sensi- tive to changes in the random number seed. For example, increasing the magnitude of the random seed from l23457.DO (double precision) to 987225.DO results in an average annual concentration prediction about 19 TABLE 3.l.--Statistical summary of Major Monte Carlo Runs Base Run E. = .Ol82 CV = .l44O (lOO iterations S; = .0005 E = .0027 per year) __= y-intercept = .Ol86 CVx .0275 Slope = -.00002 R; = .9989 r = -.4586 All Normal Run R' = .Ol94 ‘7V = .0840 52' = .002l S" = .OOl6 CV— = 0620 y—intercept = .0207 x . Slope = -.00006 R; = .9940 r = -.6367 20 aivlhllII-Ill .AommFV mm; ”moczom .me> can mcovpmgmpH 00— pm mcowpuvumca cowbmcpcmucou ngocgmosm do mcmm> o¢-I._.m mczm?d Aucxv :E_H :..:m 0:: ..::x :..... 1 _ _ A ..... x: .mm m w .H x. mm x 32m: .3. :_ .3 no.“ 3 :3. 5.0.”:an .. :.:O . m m. .H ......... x. U ........... ........... c. n I. :m 21 I: .Aowmrv we; "mocsom .mcowpznwcpmwo FmEcoz PF< mcwm: mcowpovqum cowumcpcmucoo ngocamoca do mcmm> ov--.~.m wczmwd Amcxv ;=_p swam c.cm econ ::r_ _ . _ _ x.— T :. ms U D 2» U ..4 a P I o_ H O . U ”W. 9 No.” D o . . ,;. so a .099 a . U ”9&me I S ( :.. .9 _~ 0 a 22 9% higher than for the smaller seed (.0211 mg/l vs. .0194 mg/l). The average annual standard deviations,however, are about the same (.0017 vs. .0016); the average annual coefficient of variation for the larger seed is somewhat smaller (.0802 vs. .0840). The tendency of the pre- diction to increase or decrease over time may also be altered. In this case, it changes from a negative to a positive slope. These changes suggest, according to Lee, that normal distribu- tions are less characteristic of the parameters previously described lllnflflllflfllnw by log-normal distributions. But the interrelationships among the parameters are so involved that such a suggestion requires consider- able additional research to validate including more experiments to determine whether or not these results are consistently obtained. Hence, the problem of apparent sensitivity is not yet firmly estab- lished. For further details, see Lee, 1980. Minor Observations Base Run Fluxes The phosphorus fluxes generally exhibited low variability for the base run. The Lake Erie flux, which comprises 42 percent of the total phosphorus input to Lake Ontario, had the lowest variability with a Coefficient of variation Of 1.1 percent. The second largest source of flux was diffuse flux, which constitutes 32.5 percent of the total and has a coefficient of variation of 8.8 percent which, though low, is the most variable of all the fluxes. Point source flux is third in influence with a coefficient of variation of only 2.5 per- cent. Atmospheric flux, constituting 8 percent of the total, has a 23 variability of 3.7 percent. (For further details, see Table A.2.) Base Run Water Load Water loading is comprised of the atmospheric, Ontario basin, tributary and Lake Erie sources. Atmospheric water loading, which is the smallest component, has a coefficient of variation of 2.0 percent. The variability of the load from the Ontario basin tributaries is approximately the same at 2.1 percent. The remaining 81 percent of HIIIIHIHMNIIMI the water load comes from Lake Erie which has the greatest influence on the overall system. It has a coefficient of variation of .3 per- cent. (For further details, see Table A.3.) Higher Iterations Per Year The 500 iterations per year simulation may also be sensitive to changes in the random number seed (123457.DO to 987543.00). How- ever, the summary statistics of each of these runs do not significantly differ (.0191 mg/l vs. .0187 mg/l), and the mean annual variability is only slightly lower (14.3% vs. 14.4%). In both instances, the concentration appears to change very little over the 40—year period (see Figure A.l and Table A.4). The 1,000 iterations per year run, though giving similar results, displayed the problem of periodicity (see Figure A.2). The graph gives an almost Sinusoidal impression. Although several aspects of the simulation were examined, no answer to this problem was found. 24 Flux Sensitivities Reducing the point source flux variability gave no great devia- tions from the base run. Both runs gave the same average annual concentration prediction (.Ol82 mg/l) as well as the same average annual coefficient Of variation (14.3%). See Table A.5 and Figure A.3. Reducing the variabilities of the treatment facility effluent has very little effect on the total point source flux. The upgrading to all phosphorus removal treatment, though, has a noticeable impact: it reduces the mean annual point flux value by 27 percent. See Table A.5 and Figure A.4. The concentration values used to characterize the flux from Lake Erie are those taken from Niagara River-Welland Canal data. Since flux from Lake Erie iS the largest single source, it is quite predictable that changing this term gives distinct changes in the final concentration distribution. The mean for the eastern Lake Erie concentration data is 23 percent lower than that used for the Niagara-Welland data (.0170 mg/l vs. .0220 mg/l), and results in an average annual concentration prediction 1.7 mg/l less than the 18.2 mg/l mean value for the Niagara-Welland run. Also, the eastern Lake Erie data is 15 percent more variable than the Niagara—Welland data. The mean annual coefficient of variation for the entire period increases by 2 percent (see Table A.5 and Figure A.5). 25 Land Use Fraction Sensitivities Altering the balance of the land use fractions has virtually no effect on the concentration predictions over the 40-year test period. The only slight change appears in the annual coefficient of variation which decreases slightly from 14.3 percent to 14.1 percent when the fractions are doubled, and down to 13.8 percent when they are tripled. Shifting land use to all urban, or all agriculture or all forest by the end of the 40—year period did have an impact on the prediction. The results clearly Show the difference between disturbed and undisturbed ecosystems as far as their contributions to lake phos- phorus concentration are concerned. The two disturbed systems, urban and agriculture, contribute to increasing in-lake concentrations as their respective fractions increase. Increasing the undisturbed system fraction, forest, results in a decrease of the concentration predictions of .7 mg/l every 5 years. The average annual variability of these predictions is a very low 10.4 percent (see Table A.6 and Figure A 6). Summary The Monte Carlo results can be briefly summarized as follows. The base run exhibited low variability and therefore gave a small prediction error. The method was sensitive to the choice of distribu- tions. For example, the switch to all normal distributions increased the prediction slightly, lowered the intra-year standard deviation (or variability), but increased the intra-year standard deviation (or I 1' 3mm_ III—mu 26 variability). The method may be sensitive to changes in the random number seed, especially with the all-normal run. It was not highly sensitive to increased per year iterations. The fluxes, too, were relatively insensitive to changes in variability. However, the use of a different data set lowered the prediction mean and increased the error. With respect to the land use fractions, the model was sensitive only to differences between disturbed and undisturbed land uses. CHAPTER IV FIRST-ORDER ERROR ANALYSIS An alternative method for estimating the error associated with model predictions is known as first-order error analysis. This method is characterized by two basic features: Single moment treatment of the uncertain, or random, component and the first-order analysis of functional relationships among the variables. The first characteristic implies that the random variable X is described by its first and second moments, ml and m2, which are x given by: +w J _ xng(x)dx = E[xn] = m: 4.1 where, g(x) is the probability density function and E[xn] iS the expectation of xn (Benjamin and Cornell, 1970). Thus the first moment of X, ml, with respect to the origin iS the mean. The second moment of X, mi, is the variance. When the moments are found with respect to the mean, they are called central moments and are defined as: 27 28 +00 I (x-x)ng(x)dx = E[(x-x)n] = 11:. 4.2 In this case the first central moment, Mx’ is zero and the variance is the first non-zero moment (Benjamin and Cornell, 1970). The second characteristic implies that only the first-order or linear terms in a Taylor series expansion will be retained when dealing with system or functional relationships among random variables. In the case of a function, y = f(x), of a random variable x, the question arises as to how to determine the moments of the dependent variable. In this regard, it should be observed that if the coefficient of variation is small, x is very likely to lie close to mX and thus a Taylor series expansion of f(x) about mX is suggested as an approxi- mation to the moments of y (Benjamin and Cornell, 1970). For a single variable relationship the Taylor series expansion may be written as: If the error (x-xO) is small and if the higher-order derivations are not large, i.e., if the relationship is not substantially non-linear in the area of interest, the above equation reduces to: 29 f(x) ; f(xO) + d§(x) (x-xO). 4.4 x x O Taking expectations of both sides yields the first-order approxima- tion to the first moment of Y, the mean: ll? E[f(x)] f(E[x]) = f(x 4.5 O)° The second term drops out since the E[x-xO] = 0. Similarly, finding the variance of both sides yields the first- order approximation for the variance of f(x): (x—xO) 4.6 and after carrying out the multiplication and rearranging terms gives: x .4.7 XO Since the first two terms are constants (hence, zero variance), they dx Var [f(x)] 2 Var [:f(x0) + df(x) + df(x) x dx dx Var [f(x)] 3 Var [:f(x0) — x0 df(x) do not affect the variance of f(x) and thus, Var [f(x)] 5 Var r'df(x) -x 5 df(x) de x0] [dx A similar result holds for the multi-variate case starting with 2 Var [x]. 4 8 x0 ' a Taylor series expansion of partial derivatives and finding the first- order approximation of the expected value of 30 Y = f(x], x , . . ., xn) which gives 2 E[Y] E[f(x], x2, . . ., x )] n f(E[x], x2, . . ., an) = f(X109 X203 - - 0: xn0)° 4'9 where, XiO is the mean of the ith random variable. Because the second term drops out again, E(xi-x10] = 0, there is no way of accounting for the covariance of the variables unless the second-order approximation is determined: + NIH II M 3 M E[Y] = f(x10, x20, . . ., an) where, afZ/axiaxj x0 is the second partial derivative of f(x], x2, . . ., Xn) with respect to x1 and Xj evaluated at their means X10, X20, . . ., XnO. If the coefficients of variation of the random variables, x1, and the non—linearity in the function are not large, this second term is negligible (Benjamin and Cornell, 1970). The first-order approximation of the variance is: n n Var [Y] E z 2 3f __ 9L i 1 j=l 3x1 XiO ij Cov[x1, xj]. 4.11 on llflfllflIm-flq'mw 31 However, if the Xils are uncorrelated, then: n Var [Y] 3 2 [:3f q . X . T=1 o 2 . Var [x.]. 4.12 T XiOI I This equation shows that each of the n random variables contributes to the dispersion of Y in a manner proportional to its own variance and proportional to the factor, (Bf/8x1 x10)2, which relates changes in x to changes in Y, a sensitivity factor (Benjamin and Cornell, 1970). The total prediction error includes the model error, the con- tributions to the uncertainty from the parameter error and the inde- pendent variable error. These are combined as follows: = 52 + $2 + $2 4.T3 2 St A p v where, St = total error of the prediction SD = standard error of the estimate for the model SD = parameter error contribution Sv = input variable error contribution. The standard error of the estimate was approximated by a non-linear regression routine and includes errors in the variables. It is referred to hereafter as the model error and is treated as a constant in the error equation. However, since this routine was initialized from an estimate based on the steady-state form of the model, but is applied to the time-dependent form, a non-quantified source of error is 32 introduced. (See Appendix B-l for a discussion of the non-linear regression routine.) The parameter error and the input variable error are treated by first-order analysis using the equation with the covariance term (Equation 4.11) since some of the xi's are corre— lated. This equation may be rewritten as: n n n Var[v]sz(%§—)Zs§+z 2 2.331 i=1 i i=1 j=l+l 3x. 3x. I .1 . Si Sj Rij' 4,14 The parameter, vs, is not correlated theoretically with any other variable and thus its error contribution reduces to 2 _ §P__2 2 Sp - (av ) Si’ 4.15 s The mean and variance for vS were also estimated from the non-linear regression routine mentioned above (see Appendix B-l) since no data exist for Vs' Recalling the model (Equation 1.12), this gives: ap _(e‘Vs/Z"/T-T)L BVS— ( )2 + (P. _ L )(-1/ze'vs/Z - 1/T) 1-1 v + 5 Q5 .4.16 Vs + qs This quantity is squared and then scaled by the variance of Vs' The variables L, qs, and r are correlated. Their correlation coefficients were derived from linear regression using the fourteen years of data from 1960 to 1973 for these variables. These data are 33 taken from the work of Steven Chapra (Chapra, 1981). The means and variances were also calculated from these data. The input variable P]._1 is not theoretically correlated with any other variable in the dependent form (Equation 1.12), though linear regression with the other variables yields a relatively high correlation coefficient. This is believed to be due to the small amount of data and the low variability of the data. Therefore, the correlation coefficient was assumed to be zero. The total input variable error is 2_ ggz . 2 3P 2. 2 SP 2 3V — (3L) 5 (L) + (3‘15 s (45) + (3,) s (T) 8P 2 2 DP OP + (api-1) 5 (P1_]) + 2 56;' DE- 5(95) 3(L) C(anL) 8L 5? + 2 §P__ §P_ S(qs) - 5(1) r(qS,T) 4 17 qu at (The term 2 is the mean depth and is set equal to 89 meters.) The partial derivatives for the above equation are: .23 = l-e-vS/Z - Ii: 4 18 BL v + q ' S S 23 =l;. (5V5/2_;/T'l) 4.19 aqs (V + q l 34 _§E = _ L . -2 . -v /Z - 1/T 3T (P1_1 v + q T e s 4.20 s s §§}——- = e'Vs/Z ' I/T. 4.21 1-1 The full equation for the total error of the prediction is: .2 2 = (emvs/Z - 1/T--1)L ____J;____ (-1/ze-Vs/z-1/T) St ( + )2 + (Pi-1 v + q ) qs s s 52(VS) 2 2 + fi-v S/Z- I/T #1) 2 1- —v /z-1/T 2 ° S (95) + V 4 q S (L) s s 2 + [:: ) T-2 -v /z-l/T 32(T) Lq S + (e'Vs/Z'l/D2 . s2(P._,) 4.22 + 2L(e'Vs/Z‘I/T-T) (l-e'Vs/Z’1/T) (vS qS)2 (Vs + qs) 5(95) 5(L) r(<15, L) + ZLI-e-vS/Z-I/T) .(p_ _ L )I-Ze-VS/Z-1/T VS + qS 1-1 VS + qS ° S(q ) - S(T) ° r(qs,'t) + ME. The values used for the base run first—order analysis are summarized below: 2 Error type Mean S S Parameter vS 19.1910 1.1963 1.4311 Input Variables qs 10.660 1.4608 2.1339 L .6352 .0811 .0066 T 7.9402 1.0421 1.0860 P1_] .0206 .0027 .OOOOO729 Model ME .0032 .00001024 The correlation coefficients are: qS with L: .6822 qs with t: -.9902 L with T: -.7078. CHAPTER V RESULTS FOR FIRST-ORDER ANALYSIS The Model Prediction The results of forty iterations representing forty years of phosphorus concentration predictions are presented in Table 5.1 and Figure 5.2. The model is initialized at .0206 mg/l and the first iteration predicts .0208 mg/l. From there the predictions grow to .0213 mg/l after which there is no change. The decay factor is the key to understanding these results. The equation is: -v /z-1/T 1._1e s , 5.1 The term e'Vs/Z - I/T is the decay factor. In the second term, the decay factor reduces the previous prediction by a fraction of a fraction. The base run values for VS’ 2, and I yield a decay factor of about .71. The prediction, then, for each iteration is 71 percent of the previous value, while the first term remains constant at approximately 29 percent Ofiflmasteady-state value. Thus the first term constitutes an increasingly larger proportion of the prediction while the second term continuously diminishes. The effect is to move the prediction ever closer to the steady-state value at which point the model ceases to be dynamic (see Appendix B-2). 36 37 TABLE 5.1.--Base Run Prediction Error Coefficient of Variation ”(.0206)* (.0027)* (.1311). l. .0208 .0038 .1827 2. .0209 .0042 .2010 3. .0210 .0044 .2095 4, .021] .0045 .2133 5. .0212 -0046 .2170 6. .0212 .0046 .2170 7, .0212 .0046 .2170 8. .0212 .0046 .2170 9. .0212 .0046 .2170 10. .0213 .0046 .2160 40. .0213 .0046 .2160 'T .0213 .0046 .2147 3 .0001 .0001 .0058 CV .0053 .0313 .0271 *Initial values. WWI» 38 Behavior of Predictions and Errors B ll=inHiaHzed aboveequHinunivalue B : initialized below equilibrium value E : equilibrium value Figure 5.l.--Theoretical Behavior of Predictions and Errors. Base Run Predictions 21.4- E; '{g 20.64 § 1 l 1 o 4 a 12 ' 40 iterations Figure 5.2.-—Base Run Predictions. 39 The behavior of the model predictions is easier to see by using a different form of the model, namely, Pi = SS + (P0 - SS)k' 5.2 where, SS =-V;—;La;— , the steady state P0 = initial phosphorus concentration value k = 1/evs/Z T I/T ith iteration. .1. ll (See Appendix B-2 for this derivation.) There are three points to note here. First, the predictions asymptotically approach the steady-state, SS and Limit Pi = SS. (See Appendix B-3.) Second, the predictions grow or decay depending upon the difference between the initial and steady—state values. A negative difference, that is, an initial value below the steady-state level, results in growth and a positive difference, that is, an initial value above the steady-state level, results in decay (see Figure 5.1). For example, consider the above equation (Equation 5.2) with the following values, 40 P. = .0200 + (P - .0200)(.71)I l 0 where SS = .0200 k = .71 If PO is initialized at a value below .0200, say at .0180, the dif— ference between PO and SS will be negative and .0200 + (.0180 - .0200)(.7T)I .0 II .0200 - .0020 (.71)‘. The amount subtracted from .0200 decreases with each iteration i, thus causing the prediction to grow. If, however, PO is initialized at a value above .0200, say at .0220, the difference between PO and SS will be positive and .0 II .0200 + (.0220 - .0200)(.7T)l .0200 + .0020 (.71)l. The amount added to .0200 decreases with each iteration i, thus causing the prediction to decay. Third, this equation can be solved for i and thereby the number of iterations required to attain the steady-state value, within any desired level of precision, can be determined. This amounts to calculating the number of iterations needed to attain a desired phosphorus concentration. Thus, 41 P1 - SS 1 = 1n P0 - SS . 5.3 - (vs/z + I/T) (See Appendix B—4). The Error Equation Major Observations The forty error estimates corresponding to the forty pre- dictions are given in the second column of Tables 5.1. Figures 5.3 and 5.4 give the graphs of the parameter and the input variable errors. The error is initialized at .00272 and jumps to .00382 at the first iteration. It then grows steadily to .0046 after which it remains constant (see Figure 5.5). These results are more easily accounted for by restating the error equation in a more manageable form in terms of its basic components. The first component of the error equation to note is the summation of all those terms which are essentially the same as the steady-state error terms. Indeed, these terms differ only by the amount (1 — k2). These terms are referred to hereafter as the fractional steady-state error and notated as FSSE. The second component to note is the summation of all those terms which are scaled by the expression, 42 Parameter Error €H39 - _.- ° ug/l 10- C) O T 01 (D ()1 j Concentration - cn on an I v 1 1 r \/\_4 577O 4 8 12 4O Iterations Figure 5.3.--Parameter Error. Input Variable Error 12(1— 5 .00. g 80- 4;; 60~ 40 I J 1 \/\__1 O 4 8 12 ‘10 lterations Figure 5.4.--Input Variable Error. ° 119/1 Concentration 50 1 4.0 3.0 2.0 _ 43 Base Run W/f/I Model Error W/f/ma/ Mode/ Error 1 L I \/\.__J 4 8 12 16 4O lteralions L- Figure 5.5.--Base Run--Total Error. 44 TABLE 5.2.—-Modified Base Run (without model error) Prediction Error Coefficient of Variation (.0206)* (.0027)* (.1311)* 1. .0208 .0020 .0962 2. .0209 .0016 .0766 3. .0210 .0013 .0619 4. .0211 .0011 .0521 5. .0212 .0011 .0472 6. .0212 .0010 .0472 7. .0212 .0010 .0472 8. .0212 .0010 .0472 9. .0212 .0010 .0472 10. .0213 .0010 .0469 40. .0213 .0010 .0469 1' .0213 .0011 .0495 S .0001 .0002 .0092 CV .0053 .1774 .1860 *Initial values. 45 This is the difference between the previous prediction and the steady-state value. This difference diminishes with each iteration and hence is referred to as the "diminishing" factor and notated as Di where "i" is the ith iteration. The third element of the error equation may be thought of as a ”reduction" term consisting of the decay factor k2 and the previous error, T 2 2 _ 2 2 ( 5pj~—-) S (P. 1) - k S (P. 1). 5.3 This term retains a fraction of the previous error, namely, k2, and loses an amount equal to 1-k2. As with the model, the decay factor plays the central role. A fourth component to note is the model error, ME, which is a constant in the error equation. Using the above defined abbreviations and letting 52(P1_1) = E1._1 and SE = E1, the error equation becomes, 2 5.4 E1 = FSSE + ME + Di + k E1_]. The base run error estimates apparently approach an equilibrium value at which point the error equation is run longer dynamic. This means that the amount lost by the reduction term each iteration must be exactly compensated for by the other terms. Then at infinity, E = szw + FSSE + ME + Om 5.5 00 46 If R = FSEE + ME + Ow, then 2 Em = k Em + R. 5.6 Since the sum of the remaining terms must equal the amount lost to have equilibrium, then R = (1-k2) E and 2 2 Em=-5—3— + R = 2 + 1 R =-—————. 5.7 l-k l-k (For a rigorous mathematical treatment, see Appendix B-5). Since at infinity the diminishing term, Di’ has vanished altogether, R = FSSE + ME and En: FSSE + ME 5.8 1—k2 Thus the error converges to the sum of the fractional steady-state error and the model error divided by the constant 1-k2 (see Appen- dix B-6). The component form of the error equation, E1, requires a knowledge of the previous error, Ei-l' This equation has a more 47 general form which allows for calculation independent of the previous error by using the generator form for an infinite series (see Appen- dix B-7), E =__F_(1-k21)+E0k21 TDifliz-LL 59 l 1-k2 T-k2 where, E1 = ith error or variance F = FSSE + ME E0 = the initial error k = decay factor Di = ith diminishing term and E = (E - F ) k2] + F + D (T-RZI ' ° 2 2 2 l-k l-k 1-k = F2 + (50 '._E.§)k21 + D. (1-k21) . 5.10 1— I‘k I'k ]_k2 In this form it is easier to see just how the error grow or decays. It depends upon the difference between the initial and final errors. If the difference is positive, the error will decay toward the final value F/(l-k2) and if it is negative, the error will grow toward the final value. The error is based upon the final value, plus or minus an ever Shrinking fraction of this difference. 48 Minor Observations Diminishing factor.—-It was found that the Di term contributes a negligible amount to the base run total error. Its omission allows for a simpler equation which gives excellent approximations —ii—) k + ———§ . 5.11 a N This also allows for a Simpler program and hence is easier to model (see Appendix B—6). Calculation of i.--Using the above equation, one can calcu- late how many iterations are required to attain the equilibrium value within any desired level of precision. This amounts to calculating the number of iterations needed to attain a desired error level, F Ei - /(1-k Eo ' F/(1-k2) -2 (vs/z + I/T) 2) 5.12 i = 1n (see Appendix B-7). Initial value.--In reviewing the base run errors, one notes the jump from the initial value of .0027 to .0038. This is accounted for by observing that the first iteration will produce an error at least equal to the FSSE. The greater the difference between FSSE and E0, the greater the increment at the first iteration. 49 Model error.--The base run was made with an independent model error term, but to facilitate comparison with the Monte Carlo simula- tion, the program was run without it (see Figure 5.5 and Table 5.2). Calculation of the final value without the model error is straightforward. One need only remove the constant ME, yielding, Ew= FSSE . 5.13 l-kz For example, in the base run, the final value is 215.918 . 10.7 (in variance form). The final value without the model error is, 7(]_k2) 2 T-k2 E = 215.918 - 10' - .0032 (I) -7 4.48 . 10 T-k2 9.050 . 10‘7 The standard deviation is .0010. Symmetry.--Figure 5.5 shows that the base run and modified base run standard deviations are approximately symmetrical. This is because the initial standard deviation is about halfway between the two different equilibrium values. Moving this initial value closer to one equilibrium or the other will cause the error to con- verge to that equilibrium value more quickly. 50 Coefficient of variation.--Because both the model prediction and the error converge, the coefficient of variation, CV, also approaches a final value, namely, CVco = Eco P; i 2 = F/(l-k ) ——§§———— . 5.14 where, E» = the variance at infinity Poo = the prediction at infinity, that is, the steady-state, SS. F = the sum of the fractional steady-state error and the model error (see Appendix B—8). However, the coefficient need not approach this final value in the same well-behaved manner that the error and the prediction do, that is, it can have values within the iterations which exceed the final value. This is due to the differential rates of convergence of the prediction and error equations and due to the manner in which they are initialized. Relation to steady-state error.--Some relationships of the time-dependent error to the steady-state error are given below: a. FSSE > FSSE T-R2 51 b. FSSE < SSE c. FSSE < SSE T-k2 d NE;_ > ME T-k2 e. FSSE + ME 2 > SSE + ME 1-k 1-k when fl) 2(I/k - I)’ SSE The derivations are presented in Appendix B-9. Summary The phosphorus prediction model is relatively simple and its analysis is straightforward. The equation limits to the steady-state and hence the predictions asymptotically approach it. The predic- tions grow or decay depending upon whether the initial value is less or greater than the final value. By using the generator form of the equation, these characteristics can be determined without actually running the model and the programming needs are Simplified. Also, the number of iterations required to attain a particular phosphorus concentration can be determined. The error equation, though more complicated than the model, has several similarities to it. The errors approach a final equilibrium 52 value which is a fraction of the steady-state error and they grow or decay depending upon whether the initial error is smaller or greater than the final error. As with the model, these characteristics can be determined by using the generator form of the error equation. Also, the number of iterations needed to attain a particular error can be calculated. Additionally, other features of possible interest can be calculated such as the steady-state error, the effect of removing the model error, and the coefficients of variation. CHAPTER VI COMPARISON OF FIRST-ORDER WITH MONTE CARLO The Numerical Results The 100 iterations per year base run phosphorus concentration prediction for the Monte Carlo simulation averaged .Ol82 mg/l over the 40 years. The error of the prediction is .0026 mg/l mean standard deviation which measures within—year variability. The coefficient of variation is about 14.3 percent. The inter-year standard deviation of the mean is .0005 mg/l and the correspondent coefficient Of variation is 2.75 percent (see Table 6.1). The base run phosphorus prediction concentration for the first- order analysis, using only the mean, averaged .0213 mg/l over the 40 iterations. It ranged as low as .0208 mg/l to .0213 mg/l. The mean standard deviation associated with this prediction which is an esti- mate of the combined uncertainties of all the model parameters is .0046. The error ranged from .0038 to .0046. The coefficient of variation ranged from .1827 to .2170 and its 40 year average is .2147. The inter-year standard deviation is .0001 and its correspondent coefficient of variation is .0053. The two means for the two different base runs are substantially different: .0182 to .0213. Both the first-order error and the first- order coefficient of variation are larger than the respective Monte 53 54 TABLE 6.1---Comparison of Numerical Results Monte Carlo First-Order Base Run x' .0182 S" .0026 CV .1430 R .9989 S; .0005 CV; .0275 All Normal i .0194 S .0016 CV .0840 R .9940 S; .0021 CV- .0620 X .01872 .0027 .1440 .1966 .OOO9O .0479 1000 Iterations .0213 .0046 .2147 .99996 Base Run .0001132 .0053 Monte Carlo Values Without Model Error .0182 .0002 .0117 .9998 .0000 .0000 Without Model Error .0213 .0011 .0495 .9996 .0001132 .0053 Monte Carlo Values with Model Error .01899 .0027 .1430 .9963 .OOO95 .0499 .0182 .0047 .2582 .0000 .0000 .0000 55 Carlo values (see Table 6.1). For Monte Carlo the interval, mean plus or minus the standard deviation, is .0156 - .0218, whereas for first-order the interval is .0167 - .0259. These differences in variability can be accounted for by considering the values used in calculating the predictions. The first—order analysis uses generally higher variances for Vs’ qs, L, and Pi-l' A comparison of these different means and standard devia- tions is given in Table 6.2. In the Monte Carlo Simulation, the model error was not included as an independent term. Rather, it was incorporated in the parameter error, that is in the standard deviation of vs. Hence, to facilitate a comparison, two modifications of the first-order program were made. First, the program was run without the independent model error term, but with the base run values. These results are compared in Table 6.1. Clearly, the removal of the model error significantly drops the first-order error from .0046 to .0011. (The prediction, of course, remains unchanged.) This error is substantially smaller than the Monte Carlo base run error of .0026. The second refinement for comparison was to run the first- order program with all the values used for the Monte Carlo runs (Table 6.2), as well as without the independent model error term. (This includes use of the parameter error, that is, the standard deviation of vS which incorporates the model error.) Again, a sub- stantial drop occurs for both the prediction and the error. The pre- diction, .Ol82, is exactly the same (to within four decimal places) as 56 TABLE 6.2.--Comparis0n of Values Monte Carlo Values First-Order Values vs: x = 16.0367 vs: x = 19.1910 S = .4503 S = 1.1963 CV = .0281 CV = .0623 R = .9988 R = .9942 qS: i = 12.0348 qS: x = 10.6650 S = .0460 s = 1.4608 CV = .0038 CV = .1370 R = .99998 R = .9725 L: i = .5108 L: i = .6352 s = .0142 s = .0812 CV = .0279 CV = .1279 R = .9988 R = .97595 T: i = 7.5950 T: x = 7.9402 s = 1.065 s = 1.0421 CV = .1402 CV = .1312 R = .9712 R = .9747 Pi-l: R = .0182 P1_]: i = .0206 x = .0005 S = .0027 CV = .0275 CV = .1311 R = .9989 R = .9748 57 the Monte Carlo base run and quite close to the Monte Carlo all- normal prediction of .0194. The error decayed to about one eighth the Monte Carlo value and Similarly the coefficient of variation was lowered to approximately 14 percent of the Monte Carlo value. Montgomery et al. (1980), in a comparison of these same methods for the steady-state form Of this model, found that both techniques gave similar prediction errors for similar input values. Finally, increasing the iterations to 500 to 1,000 per year gave no significant differences in predictions or errors from the Monte Carlo base run. For example, the 500 iteration mean annual prediction is .01872 and its standard deviation .0027. Compare with base run values in Table 6.1. The Procedures Theoretical Considerations The first-order analysis generates a series of predictions and errors which converge to their final values. The Monte Carlo method, on the other hand, generates a distribution of values from which predictions and standard deviations are derived for the phosphorus concentration. One important question in comparing the two methods is whether or nor the errors track the same central tendency and dispersion sta- tistics. Scavia et al. (1981) in a similar comparison found that the two variances were indeed qualitatively different. In modeling vari- ous aquatic plants and animal species of Saginaw Bay, the first-order predictions generated means which tracked the typical species, whereas 58 the Monte Carlo method tracked the composite average of the total population. Consequently, the first-order variance represented the variability around the typical species while the Monte Carlo variance reflected the variability of the total population. Also, the first— order means tracked the Monte Carlo medians as the distributions became skewed (see also Burges and Lettenmaier, 1975). The Monte Carlo values, then, were qualitatively more realistic even though quantitatively they were generally Similar to the first-order values. In this study, however, there is only one subject modeled and, hence, both methods track the same variability, i.e., the vari- ability of the phosphorus concentrations. Furthermore, the output distributions are not highly skewed allowing for no conclusion regard- ing whether or not the first-order mean is the Monte Carlo median. Both the means and the errors are qualitatively the same. Another important issue in comparing the methods is whether their assumptions are violated. This is especially important for first-order analysis since its assumptions are stricter. It is assumed that the linearization adequately approximates the true mean within the region of interest. The non-linearity within this region must be sufficiently small to allow for such an estimation. Another way of stating this is to say that the method assumes the distribu- tions are not highly non-normal and, hence, that the first and second moments adequately describe the distribution. In a comparison of sensitivity analysis with Monte Carlo simulation, Gardner et al. (1981) found that for complex, non-linear WW» 59 models, sensitivity analysis was a poor approximation because the region of interest displayed high variability. Sensitivity analysis is similar to first-order analysis in that both depend upon a lineariza- tion by use of a Taylor series expansion to approximate the true model. Unlike sensitivity analysis which examines small perturbations in the parameters by means of the partial derivatives, first-order analysis incorporates variances and covariances and therefore includes the effects of simultaneous parameter variability. Gardner et al. found this to be the major failure of sensitivity analysis to their stream ecosystem model. The variability of their parameter values ranged up to 3,000 percent, while in this study variability is no more than 30 percent and in most cases much less than that. Since the predictions and errors in the linearization method converge to a final value, they can never generate bell-shaped curves and hence they are not normal distributions. Indeed, the predictions and the errors resemble a cumulative distribution function and in a sense, they are the accumulated summations of the previous calcula— tions (see Appendix B-6). Because they asymptotically approach a final value, they suggest an asymptotic distribution such as the Type I extreme value or Gumbel (Gumbel, 1958) distribution. In this case, one wishes to know the limiting distribution of the largest of n values of predictions as n gets large. This implies that the largest of n observations is also the mean of the n observations. Such a distribution has been used before in modeling hydrologic phenomena such as peak annual flows (Benjamin and Cornell, 1970). This, however, requires further investigation. 60 Another point to be considered is whether or not the devia- tion from normality is sufficiently small such that a normal distribu- tion approximates the true distribution. The coefficients of variation in the first-order results are generally lower than the Monte Carlo values. One measure of non-normality is the mode to mean ratio given by, R =(1+ CV)'I°5 1 mm. _-—_ _ where, CV = coefficient of variation. Most of the R values for the first-order analysis were 99 percent or more of normal by this measure. Another point of comparison lies in the different sensitivi- ties of the two methods. The Monte Carlo procedure is sensitive to the change in distributions, and possibly to the change in the size of the random number seed. The method is not as sensitive to changes in point source flux variability as present information indicates (Lee, 1980), but it does respond to the upgrading of all treatment levels and especially the switch in concentration values for the Lake Erie flux which is consistent with'Huasteady-state results (Mont- gomery et al., 1980) obtained for a similar test. Its sensitivity to land use changes is restricted primarily to differences between disturbed and undisturbed uses which is all that is reflected in the data. The method is not sensitive to increased iterations per year. 61 First-order analysis, on the other hand, is highly sensitive to the removal of the model error as well as the changes in the means and variances as the switch from base run to Monte Carlo values indi- cates. Further study of this model with the Monte Carlo procedure should include three main points. First, the reason for the increase in average annual prediction concentration, and in the inter-year variability when all normal distributions ameused is not clear. Are these increases due to the inadequacy of the normal distribution to characterize the parameterscw‘are they primarily due to the manner in which the variates are chosen? Closely related to this and in need of clarification is the reason for the apparent possible sensi- tivity of the all normal run to changes in the random number seed. Future research should focus on whether or not these results are consistently obtained, and if so, why. Second, the relative insensi- tivity of the model to changes in point source variability requires explanation. Third, the model error should be incorporated as an independent error term. An additional consideration for further research is the prob- lem of periodicity of the 500 and especially 1000 iterations per year runs. The reason for this is unknown. For the first-order error analysis, future research should focus on two main areas. First, the proper output distribution needs to be determined and its impact on the assumptions of first-order clarified. Second, higher order terms of the Taylor series expansion 62 should be examined to determine whether there is a Significant change in error estimates. Secondarily, it would be interesting to examine more closely the behavior of the coefficient of variation. Practical Considerations Generally, the initial investigator time for first-order analysis involves the derivation and verification of the partial derivatives, a process which can be laborious and lengthy, depending upon model complexity. In some cases there may be no analytical solution to the partial derivatives and one can only approximate the solution through finite differences (Garen, 1979; Coleman and DeCoursey, 1976; and Mein and Brown, 1978). With the Monte Carlo method, the amount of time required for the initial investigation depends upon the difficulty of properly characterizing the probability density functions. The more easily these can be described, the less investigator time is required. The computation costs, however, especially on large frame computers, can be quite high, depending upon the complexity of the model. Frequently, as in this particular study, the probability density functions are easily described for the Monte Carlo method. The major cost was computational. In the first-order analysis, the derivation and verification of the partial derivatives was also not difficult, but the programming was much Simpler and hence the compu- tation costs were much lower. It should be noted, though, that the Monte Carlo method was significantly more complex in its treatment of 63 the loading term L which contributed to its larger cost of computa- tion. Nonetheless, the predictions and error estimates of the first- order method can be run on a pocket calculator which is more available and less expensive than is a computer. Summary The comparison of these two procedures on this model indicates that the predictions are generally similar but that the errors can differ substantially. Although the Monte Carlo all-normal mean annual error (intra-year variability) is smaller than its base run counter- part, it is significantly larger than the corresponding first-order value even when the same data are used. The exact reason for this is unclear. The first order predictions and errors have non-normal dis- tributions whose exact characterizations require additional investi- gation. However, it was found that both errors are qualitatively the same. The comparison also indicates some practical differences in the two methods. The first-order prediction and error equations, for example, can be easily programmed into a calculator which reduces computation cost and may increase its utility. The Monte Carlo procedure, however, requires less initial investigator time since the distributions are easily characterized. The modeler should remember, then, that the first order method replaces a non-linear model with a linear approximation. This approxi- mation is good only within the region of low variability where the function is not highly non-linear. The philosophy behind this approach 64 is the belief that an approximation of the whole is better than a detailed account of only a part (Cornell, 1969). Detail on one level is necessarily sacrificed for generality at another. A highly refined, complicated model may not be a good candidate for this technique. For example, Gardner et al. (1981) chose a very complex, non-linear model which required the numerical approximation of 48 sensitivity equations. Most likely this is not susceptible of first- order analysis. The Monte Carlo method, though, has no such restric- tions and treats the non-linear model as it is. However, it is critical that the distributions be correctly characterized. In the face of inadequate data, this is not always easy. FINAL COMMENTS With the increasing dependency upon quantitative models, the need for estimating the uncertainty or reliability of model predic- tions will also grow. It is hoped therefore that research in error estimation methods will continue and that this study will have con- tributed in some measure to this body of knowledge. However, it is also hoped that as resource managers make greater use of mathematical models, that the broader, non-quantitative aspects of the management decision-making process not be overlooked. It is all too easy in this age of specialization and narrow quantifica- tion to neglect issues which require for their solution philoSOphical insight, rather than numerical approximations. In the context of manifold, complex questions that the water resource manager faces today, it must be remembered that modeling, though useful, is no substitute for sound judgment, and next to a perspicacious understand- ing of the larger social and scientific issues, must always remain ancillary. 65 APPENDICES 66 APPENDIX A MONTE CARLO APPENDICES 67 .h ~ __._ .. APPENDIX A MONTE CARLO APPENDICES A.1 Correlation Coefficients Correlation coefficients are used to measure the strength of the relationship between two variables. In linear regression analysis, it indicates the goodness-of-fit of the regression line to the data. The correlation coefficient used in linear regression is the Pearson Product-Moment Correlation Coefficient and is defined as: n 1'1 n l? <2 1 2 n i=lxi2- 1=1X1 2%] [T=TY12- (1511(4) 2%,]; In correlating the annual mean flows for the Monte Carlo Simulation data were taken from the calendar years 1935-1969 for the Genesse, Oswego, Black, Conaseraga, Fish and Moose Rivers. The correlation coefficients are presented below. Pearson Product-Moment Correlation Coefficients Moose Fish Conaseraga Black Oswege Genesee .51 .69 .90 .71 .82 Oswego .60 .74 .84 .73 Black .79 .95 .62 Conaseraga .44 .62 Fish .72 68 69 A.2 Markov Flow Model Within the Monte Carlo simulation, a Markov flow model is used to generate the historical flow pattern for the Niagara River/ Welland Canal flow. Such a model has both a deterministic and a random component. It assumes that the current flow depends on the previous year's flow plus a stochastic term. The equation is: 9, = i (l - r1)+ r1 q, _ 1 + ti ° s ° (1 - r12) where: qi = ith year's flow i = historical mean r1 = lag-one auto correlation coefficient S = standard deviation t = ith standard normal random variant. It is assumed that the flows are normally distributed. Data were taken from surveys of 23 calendar years of flow data (Chapra, 1981). A statistical analysis of the flows yeilds the following results: 1.961 . 10H m3/yr mean = SS = 1.051 . 1010 m3/yr The lag-one autocorrelation coefficient is r1 = .7998. Hence, the equation is: 9 8 q. = 1.961 . lO 1 (T - .7998) + .7997 q1._] + t1 . 1.051 . 10 . k “41 - .79982) . 70 Table A.1: Statistical Summary of 100 Iteration Base Run - Model Terms Phosphorus Concentration Prediction (mg/l) Settling Velocity Prediction (m/yr.) Areal Water Load Prediction (m/yr.) Phosphorus Loading Prediction (g/mZ-yr.) s- X CV- x .0182 .0005 .0275 .9989 16.0367 .4503 .0281 .9988 12.0348 .0460 .0038 1.0000 .5108 .0142 .0279 .9988 V 5 y-intercept slope r y-intercept Slope l" y-intercept slope r y—intercept slope l" = .1430 = .0026 .0186 .0194 =-.4586 16.1854 -.OO73 -.1884 12.0414 -.0003 -.O783 .5247 =-.OOO7 =-.5594 Source: Lee (1980) Table A.2: Statistical Summary of Base Run Fluxes (g/yr)* 71 Lake Erie Flux i = 4.064 x 109 CV)I .0108 _ 9 s; - .044 x 10 R; .9998 Diffuse Source Flux x = 3.161 x 109 CV; .0882 _ 9 S; - .276 x 10 R; .9885 Point Source Flux x = 1.703 x 109 CV; .0252 _ 9 s; - .043 x 10 R;< .9990 Atmospheric Flux x = 7.762 x 109 CV; .0370 _ 8 - S; - .287 x 10 Rx .9980 Total Flux x = 9.702 x 109 CV; .0278 S; = .268 x 109 R; .9989 Source: Lee (1980) *Table corresponding to p. 23. WW.» 72 Table A.3: Statistical Summary of Base Run Water Load (m3/yr.)* Atmospheric Water Loading x = 1.659 x 1010 CV; = .0199 _ 10 _ Si — .033 x 10 R; - .9994 Water Load From Lake 2 = 2.721 x 1010 CV;< = .0206 Ontario Basin Tributaries 10 s- = .056 x 10 R- = .9994 X X Water Load From x = 1.849 x 1011 CV; = .0027 Lake Erie 1] S- = .005 x 10 R- = 1.000 X X Total Water Load x = 2.287 x 10H CV; = .0039 _ 11 _ S; - .009 x 10 R; — 1.0000 Source: Lee (1980) *Table corresponding to p. 23. 73 Table A.4: Statistical Summary of Higher Iterations Number of Iterations 500 1000 .01872 .00090 .0479 .9966 .01899 .00095 .0499 .9963 CV § y-intercept slope 1 0| cm; m:o_accmu_ oom um m:o_uuwcmcm :o_gmcu:mo:ou mzco:;mo;a do mccw> oc-u_u/\ oc::_d Amc>q wa_p omcm 0—ON :ccm c9..— — d. _ d L 5. “mo. .333”: .................. . 3 .3333 000 awn““unwrmmqwunuux. nuu . ... .HH.U.H”:”nunnuunnunnununnnnu: 3 .m mum. fond..- 559w...duficch. I E .u_. m “mmfl ::gmmmu:::. .nxu. an: xxxmm: 1 mm ::. gmmmmmmp o :a J 3.0 H mm .fizfixf. umme.. Mu mm 533mm...” ..::u”. 0 .3 in” mm. .. 2 U 0. .mmmummmm” mm 1 . 17 x mummmflmnz: mu .uummmmmmmm. H mu .Hm” ..... UPC. 3 (\ 3.5 o I 8 menu. ...... ..... . 0 fine“. at. flame .6 g : 75 Aowmfiv mo; ”wULDOm cam» cm; m:o_gmcwu_ ooofl cog comm Eccccz Emcee; ccwm: mco_uuwcmc; :o_umcu:mu:co 11.N.< mc:c__ Amcxv mswh omom ogom cccm coo. _ 1 _ . I 2 3 O U o a 00 000 o W o oo o o o A E W. o o 0 U4. 0 O O O O U 00 O 6. rm 0 mu 0 o o o o O 1 cm 0 O o o o o L 76 Table A.5: Statistical Summary of Flux SensitivitieS--Phosph0rus Concentration Predictions (g/m3) Cut Point Source x = .0182 CV = .0143 Flux Variability _ by Half s; = .0005 s = .0026 CV;= .027 y-intercept = .0186 R; = .999 slope =‘u0002 r =-.4339 Upgrade All Point - = .01736 CV’= .1494 Source Treatment X to Phosphorus Removal S; = .OOO51 S = .0026 CV;= .02911 y-intercept = .01779 R; = .9987 slope = -.00002 r =-.4808 Use Eastern Erie ; = .0165 _V'= .1636 Concentration Value _ for Calculation of s; = .0005 s = .0027 Flux From Lake Erie CV—= .0298 y-intercept = .0169 R; = .9987 slope = -.00002 r = -.4383 Source: Lee (1980) 77 Aommfiv mm; “mucsom .x:_d mocao :.o co» Xu___ac_cm> N\~ o:_m: meowuu_umc; cowumcucmucou mzcocamocg do mmmW>.CMIT.m.<.9::3__ Amcxv mswp omen c_om ccom ELL 1 _ u - oooooooooo .......... nnnnnnnnnnnnnnn .................... .............. ........ on? ”a “on: I: b 1 (1/5“) uoiieuiuaauog .................. ................. ..... ..... ooooo 78 Aommfiv mm; ”mucsom .xapd ucwoa cob _m>oswz mzcozamoza .Pw ;u_3 mcomuowcocm :o_umcu:mu:ou mzcozamozm do memo» ecur.¢.< mczawd Amcxv chh owcm oqom coca ova. — u — q I c— ...... ooooooooooooooooooooo .............................. --------------------- ..mm. .26; m can...» MW WW mm 5.. a. 2 oooooooooooooo (1/5”) uollealuaouog ................... 79 .AomeV mob ”mucsom .cuma :o_uwcu:mu:ou mwcm :cmumcm ccwm: m:o_uowuwca :o_umcu:mo:ou mzcocamogg do meow» cerium.< mczawd Amcxv ma_p snow .Zcm coom coa— — q _ _ 6.. s 6 a s ....... .................... (1/5”) uolieuauaouog .3333 ”$1 5. ..... ..... ..... ..... ..... 80 Table A.6: Statistical Summary of Land Use Shifts Constant Rate Shift x to All Urban 5;. CV- x R- x Constant Rate Shift x to All Agriculture _ S CV; Rx Constant Rate Shift 2 to All Forest 5.. x CV; Rx .02199 .00128 .0582 .9949 .02257 .00257 .1139 .9809 .01577 .00163 .1034 .9842 CV S y-intercept slope 1" 0| oo SS + (PO — SS) Limit k], i + w Because both PO and SS are constants and because Ooo Limit Pi = SS. i+oo 93 8.4 The Solution for the ith Iterations We wish to know how many iterations, i, are required to reach convergence or to be within a certain distance of convergence. Since: = _ 1 Pi SS + (PO SS)k then: _ _ i Pi - SS — (PO SS)k and Pi-Sszki P - SS 0 (1/eVS/Z + l/T)1 p, — SS 1n .;L_____ P - SS .£l______-__= j -(vS/z + 1/T) 8.5 The Error Equation In order to see how the error equation behaves, it is helpful to rewrite it in a more tractable form. Let k = 1/e"s/Z + I/l, then 94 2 2 st = 0-1) -, L 22+ 2(k-1) L 2 [2L (,k-1g , (-k__>82(vs) vS as (vs+qsl , t 33. . L (k-11 S(T) S(qsl r(T.95) 2 2 T (vs+qs) + 2 (l-k) _1<__ S(L) S(_~.) ram] (vs+qS) T 2, 2 + (P1_1 - L > [ k2 82(VS) + k2 82(T)] .565 74 :71— + k2 52 (Pi_1) + ME The terms in the first set of brackets are the steady-state error terms (referred to hereafter as SSE). They are scaled by the decay factor (k-l)2 and the result iS a fraction of the steady—state error which isrnfiateihereafter as FSSE. The next two sets of brackets are scaled by the difference between the previous model prediction and the steady-state prediction. Since the phosphorus prediction model _.1;_. Vs+qs to zero. Hence these terms diminish with each iteration and are converges to the steady-state, the difference, Pi-l - converges therefore referred to as the ”diminishing” factors notated as Dj's. 96 The term k2.S2 (p1_1) reduces the previous error by an amount equal to 1-k2 and retains only k2 of the previous error. This is referred to as the "reduction" term. The abbreviation ME is the model error and is a constant. Combining the FSSE and ME to form one constant, and also combining the constants of the Di's gives a simpler equation. It Should be observed that the diminishing factors can be rewritten according to Appendix B.4. Since I 1. Y .1 _ P. - ———*——— = P — ~—————— k then 1 Vs+qs ( o Vs+qs ( L L i-I P. — ,- = P - -—————- k 1-1 vs+qS < o Vs+qs ) The diminishing factors can therefore be written in terms of a constant, the difference between the initialized value PO and the steady- state value, which decays according to ki'l. The error equation for the base run can be rewritten as: = (.OOOOIO6880) + [(-.OOO6754555)kI_1I . [—.0000013731] + [(-.0006754555)2k21'21 ° [ 0002292119] + k2.s2(P._1) 2 St This can be further simplified to: SE = (.0000106880) + (.0000000009)k" 1 21-2 2 2(P. + (.OOOOOOOOOl)k + k .s 1_l) More generally SE = F + (A + 8k1‘1)k"l + k2-32(P,_1) where ‘w mummy 97 F = FSSE + ME A = coefficient for k1."1 B = coefficient for k21'2 or SE = F + D, + k2.52 (Pi_1) where ; D = (A + BI When these conditions are met, and because of the faster rate of convergence for the error equation, the coefficient of variation, CV, will always have a normalized value greater than one and will asymptotically decay toward one. The second case is similar to the first in that both the prediction and the error grow, that is, the first condition is met but the second condition is not satisfied. Instead, the initial normalized error is smaller than the initial normalized prediction, E0 _P(_)_ Fyli-kzi < 55 ° In this case the error equation will eventually cross the prediction equation due to the fact that it converges more quickly. Before the 105 crossing, CV < l and after it CV > I. When the two functions equal each other exactly then Ei 31- and F/(i-kZT 55 CV = l The point at which these functions are equal is the point at which the CV equals the limit. This can be easily calculated: El :31 F/TT‘E27' s l-k21 +k21 = l- k + 3Q ki ' F/(fl"ZTU: ss "k21+EQ_ki ml” “'27 3 =32 -1 F/ l- k7 ss k = (PojSS-l) (Eo/F/(I -kz)- 1) . Po/SS - l : 1 , 1 n (EC/F/(l-k2)-l) -vS/Z + l/T) The third case arises when the following two conditions are met: l. both the prediction and the error decay, that is, MED m > 1 E0 > l and F/(i-kZT 2. the initial normalized error is smaller than the initial normalized prediction, JIIIIWWEW“‘WW 106 mi‘i’zvgt When these conditions are satisfied the coefficient of variation, CV, will always have a normalized value less than one and will asymptotically grow toward one. The fourth case is similar to the third case except that the second condition is not met. Instead, the initial normalized error is greater than the initial normalized prediction, mid-1m>§%- In this situation, as in the second instance, the error equation will cross the prediction equation. Before the crossing CV > 1 and after it CV<1. When the two functions are exactly equal, CV = 1. After crossing the limit, the CV reaches a minimum and asymptotically approaches the limit from below. The calculation of i is the same as before. The fifth case occurs when the prediction decays but the error grows, Pi -—§ > 1 and E' < 1 F/ l-k then CV < 1. 107 The CV will asymptotically approach the limit from below. The sixth case is the reverse of the above, namely, when the prediction grows but the error decays, -E% < 1 and E' E711{127'> 1 then CV > 1. The CV will asymptotically approach the limit from above. Only in the second and fourth cases do the fUnctions equal one another and hence, only in these two instances do the coefficients of variation equal and exceed the limit before they truly asymptotically approach their final values. 108 8.9 Relationships of the Time-Dependent Error to the Steady- State Error A comparison of the steadyestate error terms with those of the time-dependent form shows that the time dependent error is equal to the steady-state error scaled by the decay factor, i.e., it constitutes a fraction of the steady state error: FSSE = (1 — k)2 SSE Since O< 1 hence F350 > FSSE. l-kL The same argument holds relation ME > ME. l-k2 It should be noted that the time-dependent error can shown below: FSSE + ME2 ,, SSE l-k 1-k —£E§- - ME SSE- l—k 109 for the model error ME, giving the if the model error is sufficiently large, be larger than the steady-state error as 110 k 2 k ME , 2k SSE 1-k2 1+k —BEL- > 2(1/k — 11. SSE Thus, when the ratio of the model error to the steady—state error is greater than 2(1/k — 1), the equilibrium value of the time-dependent form will be greater than the steady—state value. ratios are 1.914 > .9802. Relationships of the Dynamic Error to the Steady-State Error 1. FSSE > FSSE 1—k 2. FSSE < SSE 3. FSSE < SSE l—k 4 ME2 > ME l—k 5. FSSE ME + ————- > SSE + ME 1-k2 1-k2 when _ME_' > 2 (l/k _ l), SSE In the base run, the 111 8.10 The Calculator Program The results of this study were calculated on a Texas Instruments programmable calculator inodel 58C. The workspace was partitioned into 320 programmable steps (0 — 319) and 20 data registers (0 - l9). Listed below are the data register constants, including their base run values, and the 294 step program. Data Registers Base Run Values Roo P1 R01 R02 R03 R04 R05 R06 ME .0032 R07 r(qS,L) .6822 R08 r(L,T) - 7078 R09 r(qS,I) —.9902 R10 exp (-vS/z - l/T) -19.1910/89 — 1/7.9402 = -.3416 R11 vS + qS 19.1910 + 10.6650 = 29.8560 R12 32 of vs 1.4311 R13 52 of qS 2.1339 R L .6352 14 Prediction Error 112 Data Registers Base Run Values R SD2 of L 0066 15 ‘ R16 1 7.9402 R17 so2 of 1 1.0860 R18 P161 .0206* R19 so2 of P1.”l .00000729* The Program FRC1 00 StO 18 RC1 14-% Rcl 11 x (I — Rcl 10 Inv 1n) + Rc1 18 x Rcl 10 Inv 1n - = StO 00 R/S 1.. I , 2 ‘3 vS ((Rcl 10 Inv ln - 1) x Rcl 14 T'RC1 11 x + g (RC1 18 — RC1 l4 %-RC1 11) X .0112 +/- X RC1 10 ‘E L Inv 1n) x2 x Rcl 12 = R/S ' VP];1 + Rcl 10 Inv ln x2 x Rcl 19 qS + (Rcl 14 x (Rcl 10 Inv ln - 1) +~Rcl 11 x2 ) x L x Rcl 13 + ((1 — Rcl 1o Inv ln) +—Rcl 11) x2 x Rcl 15 T + ((Rcl 18 - Rcl 14 %-Rcl 11) x Rcl 10 Inv 1n f-Rcl 16 x2 1 x2 x Rcl 17 Input Variables L. * Initial Value Error 113 — r(qS,L) + 2 x Rcl 14 x (RC1 lO Inv ln - 1) f—Rcl 11 x2 x (1 - Rcl 10 Inv 1n) f—Rcl 11 x Rcl 15 x m x Rcl 13 x x Rcl 07 £3 r(qsr) + 2 x Rcl 14 x (Rcl 10 Inv 1n - I) f-Rcl 11 x2 E x (Rcl 18 - Rcl 14411) x Rcl 1o Inv ln + Rcl 16 E; x2 x Rcl 13 x * Rcl 17 x * Rcl 09 '5 r(L,T) + 2x (1- RC1 10 Inv 1n) % Rcl * (Rcl 18 — Rcl 14 % Rcl 11); x Rcl 1o Inv 1n + Rcl 16 x2 L x Rcl 15 x x Rcl 17 x x Rcl 08 1.4 + Rcl 0.6 = Sto 19 ms RST (The program corresponds to the error equation in a straightforward manner using calculator keyboard notation.) BIBLIOGRAPHY 114 BIBLIOGRAPHY Beaulac, M. N.; 1980; Nutrient Export Coefficients: An Examination of Sampling Design and Natural Variability Within Differing Land Uses; Master's Thesis, Michigan State University, East Lansing, Michigan. Benjamin, J. R. and C. A. Cornell; 1970; Probability, Statistics, and Decision for Civil Engineers; McGraw-Hill Book Company. New York. pp. 180—186. Berthouex, P. M.; 1975; Modeling Concepts Considering Process Per- formance, Variability and Uncertainty. In Mathematical Modeling for Water Pollution Control, pp. 405-440. T. M. Keinath and M. P. Wanielista, editors. Ann Arbor Science Publishers, Inc. Ann Arbor, Michigan. Burges, S. J. and D. P. Lettenmaier; 1975; Probabilistic Methods in Stream Quality Management; Water Resources Bulletin, 11, 115- 130. Chapra, S. C.; 1977; Total Phosphorus Model for the Great Lakes: ASCE Journ. Econ. Eng'g. Div. 103; 147-161. Chapra, S. C. and K. H. Reckhow; 1979; Expressing the Phosphorus Loading Concept in Probabilistic Terms; Journ. Fish. Res. Board Can. 36: 225-229. Chapra, S. C.; 1981; National Oceanic and Atmospheric Administration, Ann Arbor, Michigan. Personal communication. Coleman, G. and D. G. DeCoursey; 1976; Sensitivity and Model Variance Analysis Applied to Some Evaporation and Evapo Transportation Models. Water Resources Research 12 (5): 873-879. Cornell, C. A.; 1969; A Normative Second—Moment Reliability Theory for Structural Design; Seminar No. 6, December 15, 1969; Soil Mechanics Division, University of Waterloo, Waterloo, Ontario, Canada. 115 116 Cornell, C. A.; 1970; A First-order Reliability Theory for Structural Design. In Structural Reliability and Codified Design, M. C. Lind, editor. Waterloo University, Waterloo, Ontario. Cornell, C. A.; 1972; First-order Analysis of Model and Parameter Uncertainty. In Proceeding of International Symposium on Uncertainties in Hydrologic and Water Resourceg§ystems, pp. 1-31, University of Arizona, Tucson, 1972. DePinto, J. V., J. K. Edzwalk, M. S. Switzenbaum and T. C. Young; 1980; Phosphorus Removal in Lower Great Lakes Municipal Treatment Plants. United States Environmental Protection Agency, in press. Fiering, M. G.; 1967; Streamflow Synthesis. Harvard University Press, Cambridge, Massachusetts. Gardner, R. H., R. V. O'Neill, J. B. Mankin and J. H. Carney; 1981; A Comparison of Sensitivity Analysis and Error Analysis Based on a Stream Ecosystem; Ecological Modelling 12 (1981): 173-190. Garen, D. C.; 1979; An Analysis of the Effects of Parameter Uncer- tainty on Runoff Predictions Using the Stanford Watershed Model. Master's Thesis. University of Washington. Gumbel, E. J.; 1958; Statistics