DEVELOPMENT OF A DYNAMIC SIMULATION MODEL FOR PLANNING PHYSICAL DISTRIBUTION SYSTEMS: VALIDATION

Thesis for the Degree of Ph.D.
MICHIGAN STATE UNIVERSITY
PETER GILMOUR
1971

This is to certify that the thesis entitled DEVELOPMENT OF A DYNAMIC SIMULATION MODEL FOR PLANNING PHYSICAL DISTRIBUTION SYSTEMS: VALIDATION presented by Peter Gilmour has been accepted towards fulfillment of the requirements for the Ph.D. degree.

Major professor

Date

ABSTRACT

DEVELOPMENT OF A DYNAMIC SIMULATION MODEL FOR PLANNING PHYSICAL DISTRIBUTION SYSTEMS: VALIDATION

By Peter Gilmour

Only recently have the potential cost saving and the competitive advantage of an integrated physical distribution system been realized. The aim of a recently completed research project at the Michigan State University Graduate School of Business Administration was to develop a general model which would enable the user to evaluate total cost and service capability interactions within the physical distribution system over the long term. This dynamic simulation model has been developed and is named the Long Range Environmental Planning Simulator (LREPS).

Simulation, as a managerial decision-making tool, has greatly increased in acceptance and use over the past decade. Problems have been approached which until this time were considered too large to be manageable, so the extremely complex problem is quite often analyzed through the use of a simulation model. Because of the large investments of time and money needed to develop a simulation model of a complex situation, little energy is often left to consider the question of the validity of the final model. This dissertation is a formal study of the validation of computer simulation models in general, and in particular an analysis of the performance of the LREPS model.

The concept of validity for a computer simulation model is rather naturally divisible into design validity and output validity. While design validity is the establishment of the reasonableness of the basic underlying processes of the model, output validity is the acceptability of the form of the model's endogenous data streams. The argument that the validity of a theory (or model) is based not on the realism of its assumptions but on the accuracy of its predictions is accepted. Although this means concentration on output validity, design validity is not ignored.

Testing for design validity can take the form of determining the model's face validity, that is, testing in a rudimentary way to see if the model "makes sense" in relation to the available knowledge of the situation being modeled. This type of testing is a coarse screening device at stages during the model's development and a test at its initial completion.

Three major procedures are applied to establish a model's output validity:

1. Analysis of the stability of the model over the long term. Stability is the ability of the model to generate endogenous data streams which show persistent behavior.

2. Comparison of the output of the model for some past time period with the actual historical data that was recorded for that time period.

3. Comparative analysis of the data streams generated by the model before and after significant changes in the model's major assumptions.
The output of a simulation model should not be related to the nature of specific assumptions contained in the model. A reasonably comprehensive subset of possible statistical techniques is examined for use in each of these three validation procedures. Considered are (1) Graphical Analysis, (2) Analysis of Variance, (3) Multiple Comparison, (4) Multiple Ranking, (5) The F Test, (6) Correlation, (7) Regression Analysis, (8) Sequential Analysis, (9) The Kolmogorov-Smirnov Test, (10) Response Surface Analysis, (11) The Chi-square Test, (12) Theil's Inequality Coef- ficient, (13) Spectral Analysis, and (14) Factor Analysis. Due to the rather stringent assumptions included in many of the other techniques, and also due to the fact that spectral analysis considers the effects of autocor- relation, the results of this technique were relied upon most heavily. Peter Gilmour The LREPS model was subjected to the proposed validity testing. Initial face validity testing established the acceptability of a wide range of cost and service data streams to the management of the industrial sponsor. Now the three general procedures could be applied to determine the output validity of LREPS. l. The model was found to be stable over the long run. 2. The ability of the model to duplicate actual historical data was not established. 3. The model output was not significantly related to the nature of the two major assumptions embodied in the model. The only unfavorable results for LREPS was the failure to establish the predictive ability of the model. Availa- bility of sufficient historical data obtained at an adequate time increment was a necessary condition for the satis- factory completion of this validation procedure. Data of the required quality was not available. Establishing the validity of a computer simulation model is a difficult task. These three validation pro- cedures do provide a general method which, together with the particular knowledge required for face validity testing, can be used to perform this task. DEVELOPMENT OF A DYNAMIC SIMULATION MODEL FOR PLANNING PHYSICAL DISTRIBUTION SYSTEMS: VALIDATION BY I Peter Gilmour A THESIS Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Department of Management 1971 1" 7 K2 Copyright by PETER GILMOUR 1971 ACKNOWLEDGEMENTS This dissertation developed from a need to establish the validity of the LREPS model. Since this model was con- structed by a team of doctoral candidates under the direction of Professor Donald J. Bowersox, acknowledgement must be given to 0. Keith Helferich, Edward J. Marien, Michael L. Lawrence, Richard T. Rogers, and Fred W. Morgan, Jr., for assistance, both direct and indirect, in the progress of this study. I would also like to express my appreciation to the Johnson and Johnson Domestic Operating Company for support of the LREPS project. My dissertation committee was composed of Dr. Richard F. Gonzalez, Professor of Production Management, Dr. Donald J. Bowersox, Professor of Marketing and Transpor- tation and Co-chairman of the committee with Dr. Gonzalez, and Dr. Thomas J. Manetsch, Associate Professor of Electrical Engineering and Systems Science. Dr. Gonzalez, as my academic advisor, has been of great assistance in planning my progress through the doctoral program. His knowledge of the technical aspects of computer simulation and the procedural aspects of writing a doctoral dissertation has speeded my task. Dr. 
Bowersox negotiated the LREPS project with the industrial sponsor, and for the opportunity to participate in the project I owe him thanks. Dr. Manetsch provided assistance from his experience with continuous simulation model building.

Finally, I would like to thank my parents, the Reverend and Mrs. William F. Gilmour, for the direction they provided, and my wife, Laurie, for typing the drafts, correcting the grammar, and for her patience.

TABLE OF CONTENTS

ACKNOWLEDGEMENTS . . . ii
LIST OF TABLES . . . vii
LIST OF FIGURES . . . ix

Chapter

I. VALIDATION
   Introduction
   Philosophy of Validation
   Experimental Design and Validation
   Scope and Method
   Organization

II. VALIDATION TECHNIQUES . . . 15
   Introduction . . . 15
   Sequential Analysis . . . 15
   The Chi-Square Test . . . 17
   Regression Analysis . . . 18
   Analysis of Variance . . . 20
   The F Distribution . . . 21
   Multiple Comparison . . . 22
   Multiple Ranking . . . 23
   Theil's Inequality Coefficient . . . 25
   The Kolmogorov-Smirnov Test . . . 28
   Response Surface Analysis . . . 28
   Spectral Analysis . . . 32
   Correlation . . . 39
   Factor Analysis . . . 41
   Graphical Techniques . . . 42
   Application . . . 42

III. VALIDATION OF RECENT COMPUTER SIMULATION EXPERIMENTS . . . 46
   Introduction . . . 46
   Computer Models of the Shoe, Leather, Hide Sequence . . . 46
   Simulation of Information and Decision Systems in the Firm . . . 47
   Portfolio Selection: A Simulation of Trust Investment . . . 49
   Simulation of Market Processes . . . 51
   Industrial Dynamics . . . 53
   Computer Simulation of Competitive Market Response . . . 55
   Model Classification . . . 58
   Unifying Validation Concepts . . . 60

IV. THE MODEL . . . 65
   Introduction . . . 65
   The Systems Approach . . . 65
   Model Structure . . . 67
   Subsystem Detail . . . 72
   Validation . . . 78

V. STABILITY OF THE MODEL . . . 83
   Introduction . . . 83
   Graphical Analysis . . . 84
   Correlation . . . 86
   Theil's Inequality Coefficient . . . 88
   Spectral Analysis . . . 90
   Stability of the Model . . . 94

VI. THE MODEL'S PREDICTIVE ABILITY . . . 97
   Introduction . . . 97
   Graphical Analysis . . . 99
   Analysis of Variance . . . 101
   Multiple Comparison . . . 104
   The F Test . . . 106
   Correlation . . . 108
   Regression Analysis . . . 111
   The Chi-Square Test . . . 111
   Theil's Inequality Coefficient . . . 113
   Spectral Analysis . . . 115
   Factor Analysis . . . 124
   The Model's Predictive Ability . . . 127

VII. SENSITIVITY OF THE MODEL'S MAJOR ASSUMPTIONS . . . 129
   Introduction . . . 129
   Graphical Analysis . . . 132
   Analysis of Variance . . . 137
   Multiple Comparison . . . 141
   The F Test . . . 143
   Correlation . . . 143
   Regression Analysis . . . 145
   The Chi-Square Test . . . 148
   Theil's Inequality Coefficient . . . 150
   Spectral Analysis . . . 153
   Factor Analysis . . . 173
   Sensitivity of the Model's Major Assumptions . . . 175

VIII. A GENERALIZED VALIDATION PROCEDURE . . . 178
   Introduction . . . 178
   Selection of Statistical Validation Techniques . . . 178
   Comparative Value of Results . . . 181
   Validity of LREPS . . . 185
   A Generalized Procedure . . . 188

BIBLIOGRAPHY . . . 194

LIST OF TABLES

Classification of Computer Simulation Experiments . . . 59
LREPS Face Validity . . . 80
Means, Variances, Skewness, and Kurtosis . . . 86
Test of Correlation Coefficients . . . 87
Coefficients of Determination . . . 88
Autocorrelation . . . 89
Test of Predictive Quality . . . 89
Inequality Proportions . . . 90
Means and Variances . . . 102
Skewness and Kurtosis . . . 103
Test of Means . . . 105
Multiple Comparison Test of Means . . . 107
Test of Variances . . . 109
Test of Correlation Coefficients . . . 110
Coefficients of Determination . . . 110
Test of Regression Lines . . . 112
Chi-Square Test . . . 114
Test of Predictive Quality . . . 116
Inequality Proportions . . . 117
Similarity Matrix for Factor Loadings . . . 126
Means . . . 138
Variances . . . 138
Skewness . . . 139
Kurtosis . . . 139
Test of Means . . . 140
Multiple Comparison Test of Means . . . 142
Test of Variances . . . 144
Test of Correlation Coefficients . . . 146
Coefficients of Determination . . . 147
Test of Regression Lines . . . 149
Chi-Square Test . . . 151
Test of Predictive Quality . . . 152
Inequality Proportions . . . 154
Similarity Matrices for Factor Loadings . . . 174
Use of Statistical Techniques . . . 180
Assumptions of Statistical Techniques . . . 184
Indices of Validity . . . 190

LIST OF FIGURES

LREPS Systems Design Procedure . . . 10
Successful Application of Response Surface Analysis . . . 31
Unsuccessful Application of Response Surface Analysis . . . 31
The Box-Wilson Method of Steepest Ascent . . . 33
General Description of Firm-Distribution Audit . . . 68
Stages of the Physical Distribution Network . . . 69
LREPS Systems Model Concept . . . 73
Sales Weight--Three Products for 10 Years . . . 85
Estimated Power Spectrum--Sales Weight for Product 1 . . . 91
Estimated Power Spectrum--Sales Weight for Product 2 . . . 92
Estimated Power Spectrum--Sales Weight for Product 3 . . . 93
Simulated and Actual Dollar Sales--Product 1 . . . 100
Estimated Power Spectrum--Actual Dollar Sales for Product 1 . . . 119
Estimated Power Spectrum--Simulated Dollar Sales for Product 1 . . . 120
Coherence of Actual and Simulated Dollar Sales--Product 1 . . . 122
Phase of Actual and Simulated Dollar Sales--Product 1 . . . 123
Gain of Actual and Simulated Dollar Sales--Product 1 . . . 125
Plan A and Plan B--Dollar Sales for Product 1
Plan A and Plan C--Dollar Sales for Product 1
Plan A and Plan D--Dollar Sales for Product 1
Plan A and Plan E--Dollar Sales for Product 1
Estimated Power Spectrum of Plan A--Dollar Sales for Product 1
Estimated Power Spectrum of Plan B--Dollar Sales for Product 1
Estimated Power Spectrum of Plan C--Dollar Sales for Product 1
Estimated Power Spectrum of Plan D--Dollar Sales for Product 1
Estimated Power Spectrum of Plan E--Dollar Sales for Product 1
Coherence of Plan A and Plan B--Dollar Sales for Product 1
Coherence of Plan A and Plan C--Dollar Sales for Product 1
Coherence of Plan A and Plan D--Dollar Sales for Product 1
Coherence of Plan A and Plan E--Dollar Sales for Product 1
Phase of Plan A and Plan B--Dollar Sales for Product 1
Phase of Plan A and Plan C--Dollar Sales for Product 1
Phase of Plan A and Plan D--Dollar Sales for Product 1
. . . . . . . . . . . Page 133 134 135 136 156 157 158 159 160 161 162 163 164 165 166 167 Figure 7.17. 7.19. 7.20. Phase of Plan A and Plan E--Dollar Sales for Product Gain of Product Gain of Product Gain of Product Gain of Product 1 Plan 1 Plan 1 Plan 1 Plan 1 A and Plan and Plan and Plan and Plan xi B--Dollar C--Dollar D--Dollar E--Dollar Sales Sales Sales Sales Page 168 169 170 171 172 CHAPTER I VALIDATION Introduction The development and use of a mathematical model has become a popular means by which a solution to a problem is attempted. But when the quantitative relationships in the model become so complex that a mathematical solution is not possible or extremely difficult to obtain, computers and numerical methods offer a feasible alternative. This approach is simulation. The aim of computer simulation can basically be described as system design or system analysis. System design (a normative approach) is an attempt to find the combination of exogenous variables and parameter values that will Optimize a specified endogenous variable, possibly subjected to the attainment of specified limits on other endogenous variables. System analysis (a positive approach) is an explanation of the relationship between the endogenous variable and the controllable exogenous variables and para- meters. Simulation allows the analyst, in his drive for greater realism, to develop a much more detailed and com- plex model than he could using an analytical technique. But a simulation model is a symbolic or numerical abstrac- tion of the real process, and the danger exists that the limitations and assumptions of the method will become hidden (or not adequately considered) by its complexity. A simulation model may be constructed of a firm's physical distribution system. Sales forecasts and product line at that time form an integral part of the model. If the model is used over a period of years without updating these fac- tors, the output of the model may well be of no value to the firm. Validation of the operation of a simulation model is as desirable as the validation of the operation of any other scientific experiment. While the basic problem of validation is no different for a simulation experiment, the complexity of the model is such that the processes by which its validity is established are quite different. With most scientific experiments it is rather easy and inexpensive to carry out several independent replications. Due to the complexity of most simulation models, the expense of performing more than one experiment is often prohibitive, while longitudinal observations during this one experiment are autocorrelated. The time and effort needed to develop and make operational a computer simulation model are at present so great that the problem of its validation has generally been neglected. A common attitude seems to be that crude judgmental and graphic methods1 are preferable to completely ignoring validation. Philosophy of Validation To validate a model in a strict sense means to prove that the model is true. That truth is a rather elusive concept can be seen in the difficulty one has in developing a set of criteria for differentiating between a model which is "true" and one which is "not true." 
Fortunately most simulations are seldom concerned with proving the "truth" of the model (an exception might be Clarkson's model to simulate the behavior of a bank's investment trust officer).2 Popper,3 therefore, suggests that efforts should be concentrated on determining the degree of confirmation rather than verification. Models should be subjected to tests, the results of which could be negative with respect to the aims of the model. Each such test passed will add confidence to our assumption that the model behavior confirms the behavior of the real system. "Thus instead of verification, we may speak of gradually increasing confirmation of the law."4 Van Horn describes validation as the "process of building an acceptable level of confidence that an inference about a simulated process is a correct or valid inference for the actual process."5 The focus for validation should be to understand the input-output relationships in the model and to be able to translate "learning" from the simulation to "learning" about the actual process. Naylor and Finger6 basically agree and provide some insight as to how this focus can be operationalized. The computer simulation model and its output are based on inductive inferences about behavior of the real system in the form of behavioral assumptions or Operating characteristics. The real situation under study is usually so complex that the construction of an exact model is not possible. Another factor besides complexity which makes computer simulation the desirable method of analysis is the random nature of one or more of the exogenous variables. Therefore: The validity of the model is made probable, not certain, by the assumptions underlying the model. . . . The rules for validating computer simulation models and the data generated by these models are sampling rules resting entirely on the theory of probability.7 Three major methodological positions on validation are summarized by Naylor and Finger: rationalism, empiri— cism, and positive economics. Rationalism . . . Models or theory are a system of logical deductions from a series of synthetic premises of unquestionable truth. Validation is the search for the basic assumptions underlying the behavior of the system. Empiricism . . . The opposite View to rationalism is that empirical science is the ideal form of knowledge. The model should be constructed with facts, not assumptions. So any postulates or assumptions which cannot be independ- ently verified should not be considered. Positive Economics . . . This view championed by Milton Friedman is that the validity of a model depends upon its ability to predict the behavior of the dependent variables and not on the validity of the assumptions on which the model rests. These three positions are combined by Naylor and Finger into a multi-stage verification procedure, each stage of which is necessary but not sufficient. Stage 1 is the formulation of a set of postulates or hypotheses describing the behavior of the system. This involves specification of components, selection of variables, and formulation of functional relationships using observation, general knowledge, relevant theory, and intuition. Stage 2 is the attempt to verify the assumptions of the model by statistical analysis, and the final stage is to test the degree to which data generated by the model conforms to observed data. The multi-stage verification procedure attempts to include all major ways in which to build con- fidence in a model. 
A final view on validation is that of Fishman and Kiviat8 which is a narrower concept because they divide simulation testing into three parts. (1) Verification insures that a simulation model behaves as an experimenter intends. (2) Validation tests the agreement between the behavior of the simula- tion model and a real system. (3) Problem analysis embraces statistical problems relating to (the analy- sis) of data generated by computer simulation. Experimental Design and Validation It is difficult to distinguish where experimental design ends and validation begins. The process of computer simulation experimentation is interative: model construc- tion, model operation, validation, and experimental design. If the validation criteria are not satisfied, the process is repeated making adjustments until validity is indicated. The aim of a simulation experiment may be stated as the desire to explore and describe the response surface over some region in the factor space (system analysis) or to optimize the response over some feasible region in the factor space (system design). In order to achieve this aim in the most economical manner, careful attention must 10 be paid to experimental design. The types of experiments for which the model is used will depend upon the particular requirements that the model was designed to meet.11 But the types of problems that can be associated with experi- mental design are universal. A single run of a computer simulation provides an estimate of pOpulation parameters. Because the model contains exogenous random variables, this estimation, or sample of one, will not exactly equal the pOpulation para- meter. However, the larger the sample or the more runs that are made, the greater is the probability that the sample averages will be very close to the pOpulation averages. The convergence of sample averages to population averages with increasing sample size is called stochastic convergence. Because stochastic convergence is slow, methods other than increasing the sample size may be required. Another problem is that of size. The number of cells required for a full factorial experiment becomes very large even with few levels of a moderate number of factors. If a complete investigation of all factors is not essential, fractional factorial designs can ameliorate the problem. Yet another common problem associated with experi- mental design arises from the desire to observe many different response variables in a given experiment. It is often possible to bypass the multiple response problem by treating an experiment with many responses as many experiments each with a single response. Or several responses could be combined (e.g. by addition) and treated as a single response. However, it is not always possible to bypass the multiple response problem; often multiple responses are inherent to the situation under study. Unfortunately, experimental design techniques for multiple response experiments are virtually nonexistent.12 This dissertation will be concerned only with validation. The other elements of the interative process of computer simulation experimentation are discussed in detail elsewhere.13 Sc0pe and Method From the rather diverse views on validation examined earlier, a position must be taken. The validity of a computer simulation model can be shown by the model's ability to satisfy three distinct validation procedures. The output of a simulation model is in the form of a time path for each of the endogenous variables. 
The first validation procedure is to determine if these time series are statistically under control. Being under con- trol broadly means that over the long run the time path will show convergence prOperties or else the rate of change of the endogenous variable under study will be proportional to or acceptable to the rate of change in all other endogenous variables. Simulation models can be broadly classified as positive or normative. Positive models must by definition show reasonable correspondence to the real system, while normative models indicate a desirable level of operation for the real system which may or may not be currently achieved. But is it reasonable for a model to show the desired state and not to indicate how to reach this state from the current real state? If the normative model was built by changing starting conditions and parameter values of the positive model, management would be provided with the means to move from the current actual position to the more desirable normative position. The normative model should then be built from the basis of the positive model. For the positive simulation model, then, the second vali- dation procedure is to compare the model output over a past time period to the actual historical data from the same time period. The assumptions upon which a model is based often cannot be examined beyond the level of face validity. But the sensitivity of these assumptions can be examined, and this is the third validation procedure. If the values of the key endogenous variables are sensitive to the nature of the assumption, then managerial knowledge and intuition must be applied to confirm the assumption, or else the model must be restructured to eliminate or replace the assumption. Organization A research project to develop a long-range planning model for physical distribution has been established at the Michigan State University Graduate School of Business Administration. The project has two broad aims: to develop the model, which has been done, and to use the model and adaptions to it to provide management with information about the physical distribution system over the long run. Five dissertations will describe in detail the project develop- ment as shown in Figure 1.1. The scope of this dissertation is delineated in the figure although other aspects of the project will be briefly described for the sake of continuity. Attitudes towards the validation of computer simula- tion models and the general position to be taken in this 10 START A V RES OBJS F PROB DEFN & FEAS STUDY V MATH MODEL DEVELOPMENT [ L I‘ COMPUTER MODEL FORMULATION / p///// 2 222 2 2:121?) ///// (W 3 //fl///// / ,\ PROCESS EXP DESIGN & MODEL USAGE BLOCK [ 1 I/O CRI BLOC PARAM. C ) Figure l.1.--LREPS Systems Design Procedure.l 1D. J. Bowersox, et a1., Dynamic Simulation of Physical Distribution Systems, Monograph (East Lansing, Michigan: Division of Research, Michigan State University, Forthcoming). 11 dissertation were discussed in this introductory chapter. Many different statistical methods can be used in order to eStablish the validity of a computer simulation model. A reasonable subset of these statistical techniques is discussed in Chapter II without attempting at this point to establish the relative merit. Chapter III is a brief description of several celebrated simulation models and an evaluation of the attempts made by the model builders to validate their models. The simulation model is described in Chapter IV. 
The degree to which the model's face validity has been established is discussed. Also given in this chapter is the manner in which the model and its output will be used in order to satisfy the general validation procedures outlined in Chapter I.

The next three chapters deal in detail with each of these three general validation procedures. From the set of statistical techniques detailed in Chapter II are selected those most suitable for stability analysis (Chapter V), for the comparison of simulation output and actual data (Chapter VI), and for sensitivity analysis of the model's major assumptions (Chapter VII). After a technique is found to be suitable for a particular validation procedure, the results of its application will be analyzed in the light of the assumptions inherent in the technique.

The final chapter (Chapter VIII) is a summary statement of the validity of the simulation model. The question of establishing a general validation procedure for computer simulation models is also explored.

CHAPTER I--FOOTNOTES

1. One basic procedure is to determine the model's face validity. This is a necessary, but not sufficient, condition for validation, which is discussed at some length in Chapter IV.

2. G. P. E. Clarkson, Portfolio Selection: A Simulation of Trust Investment (Englewood Cliffs, New Jersey: Prentice-Hall, Inc., 1962).

3. K. R. Popper, The Logic of Scientific Discovery (New York: Basic Books, 1959).

4. R. Carnap, "Testability and Meaning," Philosophy of Science, Vol. 3, No. 4 (October, 1936).

5. R. Van Horn, "Validation," The Design of Computer Simulation Experiments, ed. by T. H. Naylor (Durham, N.C.: Duke University Press, 1969), pp. 232-251.

6. T. H. Naylor and J. M. Finger, "Verification of Computer Simulation Models," Management Science, Vol. 14 (October, 1967), pp. 92-101.

7. Ibid., p. 93.

8. G. S. Fishman and P. J. Kiviat, Digital Computer Simulation: Statistical Considerations (Santa Monica, Calif.: The Rand Corporation, RM-3281-PR, 1962).

9. Ibid.

10. T. H. Naylor, D. S. Burdick, and W. E. Sasser, Jr., "The Design of Computer Simulation Experiments," The Design of Computer Simulation Experiments, ed. by T. H. Naylor (Durham, N.C.: Duke University Press, 1969), pp. 3-35.

11. R. T. Rogers, "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Experimental Design and Analysis of Results" (unpublished Ph.D. dissertation, Michigan State University, Forthcoming).

12. Naylor, Burdick, and Sasser, p. 30.

13. D. J. Bowersox, et al., Dynamic Simulation of Physical Distribution Systems, Monograph (East Lansing, Michigan: Division of Research, Michigan State University, Forthcoming).

CHAPTER II

VALIDATION TECHNIQUES

Introduction

Three types of analysis for the validation of the simulation model are to be performed:

1. Stability testing.

2. The comparison of actual historical data with the simulation output for the same time period.

3. The comparison of two simulation data streams in order to test the sensitivity of some major assumptions made during model development.

Many statistical and graphical techniques have been proposed and used in an attempt to validate the output of computer simulation models.1 In order to determine which of these techniques will be most suitable for each of the three types of analysis, the nature of the techniques must be examined. This chapter presents what is hopefully a large subset of all possible validation techniques.

Sequential Analysis

Most decision-making procedures are carried out with the sample size predetermined and fixed. It is possible that this sample size is larger than it need be, resulting in superfluous information and unnecessary expense. But this can be avoided if, after each observation is examined, the decision is made to:

1. Accept the hypothesis.

2. Reject the hypothesis.

3. Postpone a decision on the hypothesis and make another observation.

Together with this variable sample size, managerially determined values of $\alpha$ (the producer's risk) and $\beta$ (the consumer's risk) are required to make the system operational. The decision rules for testing $H_0: \mu = \mu_0$ and $H_1: \mu = \mu_1$ are:

1. If $\dfrac{\prod_{i=1}^{y} f(x_i, \mu_1)}{\prod_{i=1}^{y} f(x_i, \mu_0)} \leq \dfrac{\beta}{1-\alpha}$, accept $H_0: \mu = \mu_0$ (reject $H_1$);

2. If $\dfrac{\prod_{i=1}^{y} f(x_i, \mu_1)}{\prod_{i=1}^{y} f(x_i, \mu_0)} \geq \dfrac{1-\beta}{\alpha}$, reject $H_0: \mu = \mu_0$ (accept $H_1$);

3. If $\dfrac{\beta}{1-\alpha} < \dfrac{\prod_{i=1}^{y} f(x_i, \mu_1)}{\prod_{i=1}^{y} f(x_i, \mu_0)} < \dfrac{1-\beta}{\alpha}$, take another observation;

where y is the number of observations taken.2 This general statement of the procedure for sequential analysis provides a method for deciding at the ith observation whether to stop sampling and accept or reject the hypothesis under consideration or whether to continue sampling by making the (i+1)th observation.
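These decision rules can be illustrated with a short computational sketch, written here in Python. The sketch assumes a normal density for f(x, μ) with known standard deviation; the observation stream and the values chosen for α and β are hypothetical and are not drawn from the LREPS study.

    # A minimal sketch of the sequential decision rules above (Wald's
    # sequential probability ratio test) for a normal mean with known
    # standard deviation.  All data and parameter values are hypothetical.
    import math
    import random

    def sprt_normal_mean(observations, mu0, mu1, sigma, alpha, beta):
        """Return ("accept H0" | "reject H0" | "continue sampling", n_used)."""
        lower = math.log(beta / (1.0 - alpha))      # accept-H0 boundary
        upper = math.log((1.0 - beta) / alpha)      # reject-H0 boundary
        log_lr = 0.0                                # running log likelihood ratio
        for n, x in enumerate(observations, start=1):
            # log f(x; mu1) - log f(x; mu0) for the normal density
            log_lr += ((x - mu0) ** 2 - (x - mu1) ** 2) / (2.0 * sigma ** 2)
            if log_lr <= lower:
                return "accept H0", n
            if log_lr >= upper:
                return "reject H0", n
        return "continue sampling", len(observations)

    random.seed(1)
    sample = [random.gauss(10.4, 2.0) for _ in range(200)]   # true mean 10.4
    print(sprt_normal_mean(sample, mu0=10.0, mu1=11.0, sigma=2.0,
                           alpha=0.05, beta=0.10))

Because the two boundaries depend only on α and β, the same routine carries over unchanged if another density is substituted for the normal in the likelihood ratio.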
Sequential Analysis Most decision-making procedures are carried out with the sample size predetermined and fixed. It is possible that this sample size is larger than it need be resulting in superflous information and unnecessary expense. But this 15 16 can be avoided if after each observation is examined the decision is made to: 1. Accept the hypothesis. 2. Reject the hypothesis. 3. Postpone a decision on the hypothesis and make another observation. Together with this variable sample size are required managerially determined values of d (producers risk) and B (consumers risk) to make the system operational. The decision rules for testing Ho:u=uo and H1:u=u1 are: Y H f(Xi11-J1) i=1 8 ‘ 1. If < ——— ; Accept H :u=u (Reject H1), — o o y l-a .11 f(XirUO) 1=1 Y H f(XiIU1) i=1 1-8 2. If 1 ; Reject Ho:u=uo (Accept H1), y a .H f(xi,uo) 1=1 II f(Xirll1) B 1=1 l-B . 3. If -——— < < ——— ; Take another observation, l-d y a .H f(XlIUO) 1=1 . 2 where y is the number of observat1ons taken. This general statement of the procedure for sequential analysis provides a method for deciding at the ith observation 17 whether to stOp sampling and accept or reject the hypothesis under consideration or whether to continue sampling by making the (i+l)th observation. At observation i the division of the iedimensional space of all possible observations into the three mutually exclusive and exhaustive sets is the basic problem of sequential analysis. The method has several applications for the analysis of the results of computer simulations. Procedures for testing the position of the true mean in relation to a hypothesized mean and for comparing the means of k experi- ments with a control mean have been developed by Paulson.3 A heuristic approach to Bechhofer and Blumenthal's method4 of selecting the population with the largest mean is described by Sasser, Burdick, Graham, and Naylor.5 The Chi-Square Test The Chi-square statistic can be used to measure the discrepancy between observed and expected frequencies. If x2=0, perfect agreement between observed and expected frequencies exists while the larger the value of x2, the greater the discrepancy between the two. 18 The sampling distribution of x2 is approximated by _ _ 2 the Chi-square distribution Y = YQ()(2)35(V 2) e 7x _ _ 2 Y xv 2 e %X o where v is the number of degrees of freedom and y0 is a constant related to v such that the total area under the curve is unity. When using the Chi-square Test, expected frequencies are develOped from a hypothesis Ho‘ It is reasonable to expect the calculated Chi-square value to be less than a critical value such as x2 which is the critical value 95 at the .05 significance level. If this turns out to be the case, H0 is accepted at this level of significance. Other- wise it is rejected. Caution should be exercised if the correspondence 2 between observed and expected is too close. If x is less than x2 0 at the .05 significance level, the agreement is 5 too great for the degree of significance chosen. Regression Analysis6 It is often meaningful to be able to express the relationship between the variable under study (the dependent I variable) and other variables which have influence over it (the independent variables). The most commonly accepted method of determining this relationship is that of least squares. A line, curve or plane is fitted to the data in such a manner so as to minimize the vertical squared dif- ference between the plotted data value and the value 19 determined by the function being fitted. 
Regression Analysis6

It is often meaningful to be able to express the relationship between the variable under study (the dependent variable) and other variables which have influence over it (the independent variables). The most commonly accepted method of determining this relationship is that of least squares. A line, curve, or plane is fitted to the data in such a manner as to minimize the squared vertical difference between the plotted data value and the value determined by the function being fitted. The result is then the "best fitting" line, curve, or plane. While this function shows the relationship between the independent variables and the dependent variable, it also enables predictions of the dependent variable to be made.

The approach is illustrated by the simplest example of fitting a straight line to n pairs of values of two variables x and y. Let $e_i$ be the error or difference between the true sample value of y and the value of y ($\hat{y}$) determined by the function of the straight line being fitted ($\hat{y} = a + bx$), i.e., $e_i = y_i - \hat{y}_i$ (i = 1, ..., n). Over all observations, $\sum e_i^2$ must be minimized:

$\min \sum e_i^2 = \sum (Y_i - \hat{Y}_i)^2 = \sum (Y_i - a - bX_i)^2$

Take the partial derivatives with respect to a and b, set them equal to zero, and obtain the normal equations:

$\sum Y_i = na + b \sum X_i$

$\sum X_i Y_i = a \sum X_i + b \sum X_i^2$

Solve for $\hat{a}$ and $\hat{b}$:

$\hat{b} = \frac{n \sum XY - (\sum X)(\sum Y)}{n \sum X^2 - (\sum X)^2}$, $\qquad \hat{a} = \bar{Y} - \hat{b}\bar{X}$

which are the least-squares point estimators of a and b. The least-squares line fitted to the data is then $\hat{y} = \hat{a} + \hat{b}x$.

Analysis of Variance

Analysis of variance is used to test whether two or more samples differ significantly with respect to a particular (usually qualitative) property. If observations are classified on the basis of a single property, the ratio of the variance between the groups to the average variance within the groups (the F ratio) is used to determine if a significant difference does exist between the groups with respect to this property.

To test the null hypothesis, $H_0$, that the expected profit from each of a number of plans is equal, this decision rule is set up: if $F \geq F_{\alpha;\, k-1,\, k(n-1)}$, where α is the significance level, k is the number of plans considered, and n is the number of replications per plan, reject $H_0$; otherwise accept it. If $H_0$ is accepted, the differences in expected profit between the plans are due only to random fluctuation; if $H_0$ is rejected, further analysis (such as multiple comparison or multiple ranking) is needed to quantify this significant difference between plans.

Given7

$X_{ij}$ = total profit from the ith replication of the jth plan

$\bar{X}_{.j}$ = average profit for the jth plan over all replications

$\bar{X}_{..}$ = grand average profit for all plans over all replications

the analysis of variance table is:

Between plans:  $SS_{plans} = n \sum_{j=1}^{k} (\bar{X}_{.j} - \bar{X}_{..})^2$,  degrees of freedom $k - 1$,  $MS_p = \frac{SS_{plans}}{k-1}$

Error:  $SS_{error} = \sum_{i=1}^{n} \sum_{j=1}^{k} (X_{ij} - \bar{X}_{.j})^2$,  degrees of freedom $k(n-1)$,  $MS_e = \frac{SS_{error}}{k(n-1)}$

Total:  $SS_{total} = \sum_{i=1}^{n} \sum_{j=1}^{k} (X_{ij} - \bar{X}_{..})^2$,  degrees of freedom $nk - 1$

The value of F obtained ($MS_p / MS_e$) is compared to the appropriate value from the F table in the manner indicated.

The F Distribution

To compare the variances of small samples, the F distribution is used. The density function is

$f(F) = C \, F^{\frac{1}{2}(n_1 - 2)} \, (n_2 + n_1 F)^{-\frac{1}{2}(n_1 + n_2)}$

where $n_1$ is the number of degrees of freedom for the $\chi^2_{n_1}$ distribution, and $n_2$ is the number of degrees of freedom for the $\chi^2_{n_2}$ distribution. The F statistic is equal to the ratio of the sample variances. Given a level of significance and the two sample sizes (from which the degrees of freedom can be determined), the critical value of F can be read from a table of the F distribution. By comparing the value of the F statistic to the critical value of F, the hypothesis that the variances are significantly different can either be accepted or rejected.
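The computations laid out in the analysis of variance table, together with the F comparison, can be sketched as follows. The plan profits are hypothetical, and 4.26 is the tabulated critical value F(.05; 2, 9) appropriate to k = 3 plans and n = 4 replications.

    # A minimal sketch of the one-way analysis of variance above:
    # k simulated "plans", n replications per plan, between-plan and
    # error mean squares compared through the F ratio.  Data hypothetical.
    def one_way_anova(plans):
        k = len(plans)
        n = len(plans[0])                         # equal replications assumed
        grand = sum(sum(p) for p in plans) / (k * n)
        plan_means = [sum(p) / n for p in plans]
        ss_plans = n * sum((m - grand) ** 2 for m in plan_means)
        ss_error = sum((x - m) ** 2
                       for p, m in zip(plans, plan_means) for x in p)
        ms_plans = ss_plans / (k - 1)
        ms_error = ss_error / (k * (n - 1))
        return ms_plans / ms_error

    plans = [[102.0, 98.0, 101.0, 97.0],          # plan A profits, 4 replications
             [110.0, 107.0, 112.0, 109.0],        # plan B
             [103.0, 100.0, 105.0, 99.0]]         # plan C
    f_ratio = one_way_anova(plans)
    print(f"F = {f_ratio:.2f}",
          "reject H0" if f_ratio >= 4.26 else "accept H0")

If the F ratio exceeds the critical value, the multiple comparison or multiple ranking procedures described next would be applied to quantify the differences between the plans.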
Multiple Comparison

Analysis of variance uses the F Test to determine if a significant difference exists between a statistic from different samples. If homogeneity does not exist, the method of multiple comparison quantifies the difference, while the method of multiple ranking8 (to be discussed) directly identifies the "best" sample or plan on the basis of the measured statistic. Both multiple comparison and multiple ranking must follow analysis of variance for another reason--the computational reason that both these methods use the mean square of the error. Use of confidence intervals rather than tests of hypotheses is a characteristic of the method.

Tukey9 developed simultaneous confidence intervals for the differences between all pairs. Continuing with the notation used in the section on analysis of variance, the confidence intervals are:

$(\bar{X}_{.i} - \bar{X}_{.j}) \pm q \sqrt{MS_e / n}$,  i, j = 1, 2, ..., k

where the q statistic can be obtained from tables and v is the number of degrees of freedom. If Student's t statistic is used, the intervals are not all simultaneously true at the stated confidence level:

$(\bar{X}_{.i} - \bar{X}_{.j}) \pm t \sqrt{2 MS_e / n}$,  i, j = 1, 2, ..., k

A somewhat different approach is taken by Dunnett.10 Instead of taking all possible pairs, he compares the control statistic (usually a result of the present operation of the system under study) to all alternative values of this statistic:

$(\bar{X}_{.j} - \bar{X}_{.c}) \pm d \sqrt{2 MS_e / n}$,  j = 2, ..., k

where $\bar{X}_{.c}$ is the control sample statistic (mean) and d is Dunnett's t statistic with k(n-1) degrees of freedom for a one-factor experiment.

Multiple Ranking11

This is a method to find the "best" plan. It is a more direct method than multiple comparison, answering questions such as, "With what probability can it be said that a ranking of sample means represents the true ranking of the population means?"

Bechhofer, Dunnett, and Sobel describe a two-sample multiple decision procedure for ranking means of normal populations with a common unknown variance. Take a first sample of $N_1$ observations from each of the k populations or plans under investigation. Calculate the mean square of the error ($MS_e$), which is an unbiased estimator of the population variance having k(n-1) degrees of freedom for n = $N_1$. Now take a second sample of $N_2 - N_1$ observations from each of the k populations, where

$N_2 = \max \{ N_1, \; [2 MS_e (h / \delta^*)^2] \}$

and $[2 MS_e (h / \delta^*)^2]$ is equal to the smallest integer greater than or equal to the rational number $2 MS_e (h / \delta^*)^2$. The values of h are tabulated, and $\delta^*$ is the smallest difference between expected values that is acceptable. So if $2 MS_e (h / \delta^*)^2$ is less than or equal to $N_1$, a second sample is not taken, and $N_2$ is set to $N_1$. The next step is to calculate the overall sample mean ($\bar{X}_j$) for each population:

$\bar{X}_j = \frac{1}{N_2} \sum_{i=1}^{N_2} X_{ij}$,  j = 1, 2, ..., k

Denote the ranked values of $\bar{X}_j$ by $\bar{X}_{[1]} \leq \bar{X}_{[2]} \leq \cdots \leq \bar{X}_{[k]}$. Rank the populations according to the observed $\bar{X}_j$ and select that with the largest, $\bar{X}_{[k]}$.

Theil's Inequality Coefficient

When comparing predicted results against actual outcomes, it is desirable to be able to establish the quality of the prediction. One way to do this is to calculate Theil's Inequality Coefficient.12 The mean-square prediction error for a set of n observations is equal to

$\frac{1}{n} \sum_{i=1}^{n} (P_i - A_i)^2$

where $(P_i, A_i)$ stands for a pair of predicted and observed values. Theil calls its square root the root-mean-square prediction error (RMS). This term is expressed in the same dimensions as the predictions and realizations. If the RMS prediction error is divided by the square root of the mean square successive difference of the realizations, the result is the inequality coefficient (U) of the n pairs $(P_i, A_i)$:

$U = \frac{\sqrt{\frac{1}{n} \sum (P_i - A_i)^2}}{\sqrt{\frac{1}{n} \sum A_i^2}}$

If U = 0, the forecasts are perfect, as $P_i = A_i$ for all i. While it should be observed that U = 1 indicates a prediction error equal to that obtained by the naive method of no-change extrapolation, it should also be noted that U has no finite upper bound. Worse methods of forecasting than simple extrapolation are possible. Comparison of the technique being used and extrapolation provides valuable information.

Because the denominator of the inequality coefficient is a factor only to provide the proper unit of measurement, attention can be centered on the numerator. The square of the numerator can be decomposed into three terms, each of which expresses the extent to which a particular kind of prediction error is present:

$\frac{1}{n} \sum (P_i - A_i)^2 = (\bar{P} - \bar{A})^2 + (S_P - S_A)^2 + 2(1 - r) S_P S_A$

where $\bar{P}$ and $\bar{A}$ are the means,

$\bar{P} = \frac{1}{n} \sum P_i$,  $\bar{A} = \frac{1}{n} \sum A_i$,

$S_P$ and $S_A$ are the standard deviations,

$S_P = \sqrt{\frac{1}{n} \sum (P_i - \bar{P})^2}$,  $S_A = \sqrt{\frac{1}{n} \sum (A_i - \bar{A})^2}$,

and r is the correlation coefficient of the predicted and realized changes,

$r = \frac{\frac{1}{n} \sum (P_i - \bar{P})(A_i - \bar{A})}{S_P S_A}$

Errors leading to positive values for the first term of the decomposition are errors of central tendency; errors leading to positive values for the second term are errors of unequal variation; and errors due to incomplete covariation result in positive values for the decomposition's final term. If each of these three terms is divided by their sum, the resulting inequality proportions--$U^m$ the bias proportion, $U^s$ the variance proportion, and $U^c$ the covariance proportion--provide additional information as to the quality of the prediction and an indication as to the direction in which effort should be applied for improvement:

$U^m = \frac{(\bar{P} - \bar{A})^2}{\frac{1}{n} \sum (P_i - A_i)^2}$,  $U^s = \frac{(S_P - S_A)^2}{\frac{1}{n} \sum (P_i - A_i)^2}$,  $U^c = \frac{2(1 - r) S_P S_A}{\frac{1}{n} \sum (P_i - A_i)^2}$
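The inequality coefficient and its three proportions can be computed directly from a pair of series, as in the sketch below; the predicted and realized changes used here are hypothetical.

    # A minimal sketch of Theil's inequality coefficient and its
    # decomposition into bias, variance, and covariance proportions.
    # P and A are hypothetical predicted and realized period-to-period changes.
    import math

    def theil(P, A):
        n = len(P)
        mse = sum((p - a) ** 2 for p, a in zip(P, A)) / n
        U = math.sqrt(mse) / math.sqrt(sum(a * a for a in A) / n)
        p_bar, a_bar = sum(P) / n, sum(A) / n
        s_p = math.sqrt(sum((p - p_bar) ** 2 for p in P) / n)
        s_a = math.sqrt(sum((a - a_bar) ** 2 for a in A) / n)
        r = sum((p - p_bar) * (a - a_bar)
                for p, a in zip(P, A)) / (n * s_p * s_a)
        um = (p_bar - a_bar) ** 2 / mse            # bias proportion
        us = (s_p - s_a) ** 2 / mse                # variance proportion
        uc = 2.0 * (1.0 - r) * s_p * s_a / mse     # covariance proportion
        return U, um, us, uc

    P = [1.2, -0.5, 0.8, 1.9, -0.2, 0.7]           # predicted changes
    A = [1.0, -0.8, 1.1, 1.5,  0.1, 0.6]           # realized changes
    U, um, us, uc = theil(P, A)
    print(f"U = {U:.3f}  Um = {um:.3f}  Us = {us:.3f}  Uc = {uc:.3f}")

By construction the three proportions sum to one, so a large bias or variance proportion points directly to the kind of prediction error toward which improvement effort should be applied.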
The Kolmogorov-Smirnov Test

The Kolmogorov-Smirnov Test13 is a nonparametric test to determine if a given sample is a sample from a particular distribution function. A Chi-square Test can also be developed to supply the same information.

Order the given sample, $X_i$, in ascending order. Find $F(X_i)$ for each $X_i$ as the area below $X_i$ in the theoretical distribution being considered. Where

$F_n(t) = \frac{\text{number of } X_i \leq t}{n}$,

$F_n(X_i+)$ is the right-hand limit at $X_i$ of $F_n(t)$ and $F_n(X_i-)$ is the left-hand limit at $X_i$ of $F_n(t)$. $D_n$ is then equal to the maximum of the absolute values of $F(X_i) - F_n(X_i+)$ and $F(X_i) - F_n(X_i-)$. Now, given $\lambda = \sqrt{n}\, D_n$, where n is the sample size, look up P, which is tabulated. Finally, the null hypothesis that the sample is a sample from this theoretical distribution is rejected if P is no larger than a preassigned number α.

Response Surface Analysis14,15

When the response y is a continuous function of a single factor x, the method of response surface analysis is relatively easily applied to find the maximum or minimum of this function in the practical range of interest. Several conditions must be satisfied before this technique can be effective. It must be assumed that the response function can be approximated by a simple polynomial over the range of interest and that the function has only a single maximum (or minimum) within this range. So the key to this method is seen to be the managerial skill with which the relevant area of interest is selected. The general area of the extreme point must be known. The general aims of the procedure are to find the extreme point and also to determine the sensitivity of the response function in the area of the extreme point.

Make several observations of y for different values of x within a selected subregion.
Within this sub- region, if it is assumed that the response function can be approximated by a straight line, the lepe Of this line can indicate in which direction x should change for the next Observation of y. If the slope is relatively steep, the indication is that the extreme point of the function is still a reasonable distance (in terms of x) away, while if the lepe is small, the extreme point is either very near or very far. SO ifthe slope is small, several more Observa- tions of y are taken for a given change in x. If this new lepe declines, the Optimum is indeed close by; but if the new slope increases, the Optimum is still some distance 30 away. When the area of the extreme point is reached, a second-degree polynomial (y = a + bx + cxz) is fitted to the Observations made in the region. The first derivative Of this function will provide the extreme point, and the second derivative will provide the relative sensitivity of the function in the area of the extreme point. The size of the change in x used is important. This change is determined from a general knowledge of the process being examined. But at the same time, the change in x must be such that the resulting change in y is greater than can be explained by experimental error, otherwise, a poor estimate of the slope will result. When fitting the polynomial, the size of the change in x should be held constant. When the response y is dependent on more than one factor, the principles of the method remain the same, but now more than one path to the extreme point exists. The question now becomes how to reach the region of the Optimum most economically. Considering two factors, one method is to hold the first factor constant and vary the second until the response is at an Optimum. Hold the second factor constant at this level and vary the first until the response to it is Optimal. Continue this procedure until the response is Optimal for both factors simultaneously. This method and a response surface for which the method would not work are shown in Figures 2.1 and 2.2 respectively. 31 Factor A Factor Figure 2.1.--Successful Application of Response Surface Analysis. Factor A & Factor B Figure 2.2.--Unsuccessfu1 Application of Response Surface ' Analysis. 32 The Box-Wilson Method16 of steepest ascent over- comes this disadvantage to the one-at-a-time method. The greatest ascent at any point is obtained if movement is made in a direction perpendicular to the contour line through that point. TO find the contour, a small number Of observations must be made in a subregion which is con- sidered near the maximum and to these points is fitted a linear function or plane. Movement is made in a perpen- dicular direction, and if a marked gain in the reSponse function is observed, further observations are made in this new region and a new plane is fitted. This procedure is repeated until the fitted plane levels out (the increase in the response along the path of steepest ascent is diminishing) at which point the response surface is mapped with a second degree equation. Classical methods then determine the extreme point and its sensitivity. The method is illustrated in Figure 2.3. Spectral Analysis Because all data generated by time series is autocorrelated to some degree, a method of analysis which will account for this autocorrelation is desirable. After transforming the data from the time domain to the fre- 17'18 is a method by which quency domain, spectral analysis the autocorrelation can be quantified and evaluated. 
Information about the magnitude of deviations from the average level of a given activity and information 33 Factor A / Factor B Figure 2.3.—-The Box-Wilson Method of Steepest Ascent. about the period or length of these deviations can be Obtained. Let {X teT} be a generating process or ensemble t! from which a sample time series {X t=l,2, ... ,n} is t’ taken. Due to the stochastic nature of the system, analysis Of {Xt} cannot determine exactly the value Of the series at any particular time, but the approximate structure of the generating process can be determined. This is done by Obtaining estimates of the parameters which describe the generating process: 34 Mean Of the Process E[Xt] “t Variance Of the Process OX2 Euxt - ut)2] Autocovariance of the y(t,s) Process between Observations at times t and s E[(Xt-ut)(Xs-us)] These parameters can be estimated from M independent samples from {Xt} i.e., {Xt, k=l,2, ... , M}. One great advantage of computer simulation is that in order to cut across the ensemble at a particular t in this fashion all that is required is an alteration in the value of the seed of the pseudorandom number generator. As an example, cut across the ensemble at t = t0 in order to calculate the ensemble average estimating Estimates of OX2 and y(t,s) are Obtained in the same way. But spectral analysis is usually performed on time series which have first and second moments that are not a function of time.19 There is no trend inthe mean or variance of the series, and the autocovariance is a function of the time lag only. Such a series is called stationary. From a single time series can be obtained 35 n — l x = — Z X n t=1 t n 52 = i Z (Xt- X)2 t=1 and l n-I _ Ct = 3:?I 2_ (Xk - X)(xt+T 'X) t—l where y(0) = 02 and CO = 52 which can be used as estimators for EIXt] = u 2 _ 2 and Euxt- 11) (XS- m1 = Y(t-s) for all t,s y where T = t - s. T The power spectrum is defined as the Fourier cosine transformation of the autocovariance (X) ¢(w) = Y + 2 E y cos (wt) O < w < n O T=1 t - _ 36 The spectrum can be regarded as the "decomposition" of the variance of a time series. This is because the auto- covariance is recovered from the spectrum by the inverse transformation 1 n = F J ¢(w) cos (wT) dw : = 0.1.2, ... and in the special case when T==0, yo is equal to the variance (02). From the power spectrum is Obtained the squared amplitude associated with oscillations at different frequencies w, and the process is thus characterized in terms of independent additive contributions to the variance from each m. So in order to Obtain this information, an estimate of the power spectrum must be Obtained. Estimators of power spectra usually have the form f(wj) = loco + 2 i=1 ATCT cos (ij) where f(wj) is an estimate of the power spectrum averaged 11' 0 over a band of frequencies centered at wj, and ”j = 51' j = 0,1,2, ... , m, II are weights, and m is the number Of frequency bands to be estimated. The values of m and n should be selected with care in order to balance the con- flicting requirements of resolution and statistical stability. Granger and Hatanaka20 recommend a sample size of at least one hundred. 37 The spectrum is analyzed by plotting f(wj) against wj' Two important statistical prOperties are associated with the spectrum if xt is normal. The first is that Spectral estimates at nonadjacent frequencies are statistically independent. SO confidence intervals can be used. 
Two important statistical properties are associated with the spectrum if $X_t$ is normal. The first is that spectral estimates at nonadjacent frequencies are statistically independent, so confidence intervals can be used. The second is that if the control or theoretical spectrum $\phi(\omega_j)$ is reasonably smooth, the distribution of $K f(\omega_j) / \phi(\omega_j)$ is approximately $\chi^2_K$ with $K = \frac{2n}{m}$ degrees of freedom. With this knowledge, confidence intervals can be constructed around $\phi(\omega_j)$,21 the succession of which at frequency points $\omega_j$ (j = 0, 1, ..., m) combine to form a confidence band. Now the question, does the spectrum for any plan under consideration lie within the confidence band of the control spectrum, can be answered.

An extension of this type of analysis is the comparison of two spectra. The ratio $P_j = \phi_1(\omega_j) / \phi_2(\omega_j)$ of the two spectra is now of interest instead of $f(\omega_j) / \phi(\omega_j)$. Define $R_j$ to be equal to $f_1(\omega_j) / f_2(\omega_j)$ and obtain the F statistic $F_{k_1, k_2} = R_j / P_j$, where $k_1 = k_2 = \frac{2n}{m}$ degrees of freedom. The 95% confidence interval for $P_j$ is then

$P(F_{.975;\, k_1, k_2} < R_j / P_j < F_{.025;\, k_1, k_2}) = .95$

and solving for $P_j$,

$P\left(\frac{R_j}{F_{.025;\, k_1, k_2}} < P_j < \frac{R_j}{F_{.975;\, k_1, k_2}}\right) = .95$

sets up the simultaneous confidence band

$P\left(\frac{R_j}{F_{.001;\, k_1, k_2}} < P_j < \frac{R_j}{F_{.999;\, k_1, k_2}}\right) = .95$

If P = 1 lies within the desired simultaneous confidence band for P for all values of $0 \leq \omega \leq \pi$, the hypothesis that the two spectra under consideration are not significantly different can be accepted.

Spectral analysis has been used to decompose the variance of a time series into its frequency components. A rather different application of the technique is to obtain an estimate of the variance as a whole for a given time series. Because of the autocorrelation, $S^2$ does not have a Chi-square distribution with (n-1) degrees of freedom; but as $\sigma^2$ can be expressed in terms of φ, so can $S^2$ be expressed in terms of f. Blackman and Tukey22 state that

$S^2 = C_0 = \frac{1}{m} \left[ \frac{f(0)}{2} + \sum_{j=1}^{m-1} f(\omega_j) + \frac{f(\pi)}{2} \right]$

follows a Chi-square distribution with K degrees of freedom, where

$K = \frac{2 \left[ \frac{f(0)}{2} + \sum_{j=1}^{m-1} f(\omega_j) + \frac{f(\pi)}{2} \right]^2}{\frac{[f(0)]^2}{2} + \sum_{j=1}^{m-1} [f(\omega_j)]^2 + \frac{[f(\pi)]^2}{2}}$

For the comparison of two time series, the F statistic is

$F = \frac{(n_1 S_1^2 / \sigma_1^2) / k_1}{(n_2 S_2^2 / \sigma_2^2) / k_2}$

A confidence interval can be set up for any desired level of significance about this statistic, and then statements about the two variances can be made after solving for $\sigma_1^2 / \sigma_2^2$.

Spectral analysis is a significant method of analysis of the output of computer simulations because it does account for autocorrelation.

Correlation

Correlation theory can most easily be examined in terms of regression analysis. When all observations fall on the regression line developed from the data, perfect correlation exists between these variables. For two variables x and y, direct correlation exists if as y increases so also does x, while inverse correlation exists when x increases with a decrease in y. Perfect correlation occurs when both the amount and direction of change are identical for both variables, or the regression equation of x on y is identical to the regression of y on x.

The standard error of estimate is a measure of dispersion about the regression line. This statistic has the same properties as the standard deviation. The standard error of y on x is

$S_{y.x} = \sqrt{\frac{\sum (Y - Y_{est})^2}{N}}$

A good measure of linear correlation is the coefficient of correlation (r). The total variation in y can be expressed as the sum of the unexplained variation and the explained variation:

Total variation in y = $\sum (Y - \bar{Y})^2 = \sum (Y - Y_{est})^2 + \sum (Y_{est} - \bar{Y})^2$

From this expression r is developed as plus or minus the square root of the explained variation as a fraction of the total variation. An advantage of r is that it is dimensionless.

$r = \pm \sqrt{\frac{\sum (Y_{est} - \bar{Y})^2}{\sum (Y - \bar{Y})^2}}$,  $-1 \leq r \leq 1$
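The standard error of estimate and the coefficient of correlation can be obtained directly from a fitted least-squares line, as the following sketch shows for a hypothetical set of paired observations.

    # A minimal sketch of the correlation measures just described: a
    # least-squares line is fitted, and from it the standard error of
    # estimate and the coefficient of correlation r are computed.
    import math

    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]             # hypothetical data
    y = [2.1, 2.9, 4.2, 4.8, 6.1, 6.9]
    n = len(x)

    # least-squares slope and intercept
    b = (n * sum(xi * yi for xi, yi in zip(x, y)) - sum(x) * sum(y)) \
        / (n * sum(xi ** 2 for xi in x) - sum(x) ** 2)
    a = (sum(y) - b * sum(x)) / n
    y_est = [a + b * xi for xi in x]

    y_bar = sum(y) / n
    total_var     = sum((yi - y_bar) ** 2 for yi in y)
    explained_var = sum((ye - y_bar) ** 2 for ye in y_est)

    s_yx = math.sqrt(sum((yi - ye) ** 2 for yi, ye in zip(y, y_est)) / n)
    r = math.copysign(math.sqrt(explained_var / total_var), b)

    print(f"y = {a:.2f} + {b:.2f}x   S_y.x = {s_yx:.3f}   r = {r:.3f}")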
22. R. B. Blackman and J. W. Tukey, The Measurement of Power Spectra (New York: Dover Publications, Inc., 1958).

23. G. U. Yule and M. G. Kendall, An Introduction to the Theory of Statistics (London: Charles Griffin and Co., Ltd., 1950), p. 638.

24. A large bibliography is provided in: H. H. Harman, Modern Factor Analysis (Chicago: The University of Chicago Press, 1960).

25. R. M. Cyert, "A Description and Evaluation of Some Firm Simulations," Proceedings of the IBM Scientific Computing Symposium on Simulation Models and Gaming (White Plains, N.Y.: IBM, 1966).

CHAPTER III

VALIDATION OF RECENT COMPUTER SIMULATION EXPERIMENTS

Introduction

As indicated in Chapter I, the analyst's approach to the question of the validity of the results of his simulation experiment is fundamentally determined by his basic point of view as to the aim and method of execution of his experiment. The type of model built, which is a function of the analyst's outlook and training, is a primary factor in the nature and extent of the validation procedure employed for the results of the model. This chapter will examine the procedures used to validate the results of some of the better known and better documented simulation experiments of the recent past.

Computer Models of the Shoe, Leather, Hide Sequence

Cohen (1960) constructed two simulation models to describe the aggregate behavior of shoe retailers, shoe manufacturers, and cattlehide leather tanners between 1930 and 1940. This aggregate behavior was described in terms of selling price, production or sales, and receipts. While the first model (Model II) was a "one-period-change" model determining values for these endogenous variables only one time period in advance, the second model (Model IIE) was a "process" model which determines endogenous variable values for an arbitrarily large number of future time periods. Cohen's model is discrete and dynamic with a time increment of one month.

Visual comparison of the time paths of the model predictions of selling price, production, and receipts with the actual historical time paths of these variables comprised the only validation of the model.

The simulation runs for both Models II and IIE generate time paths for the endogenous variables which, although not in complete agreement with observed time paths, indicate that our models may incorporate some of the mechanisms which determine behavior in the shoe, leather, hide sequence.1

Both models produce time paths which fluctuate around the observed time paths. For most variables, the amplitude of the oscillations is greater for Model IIE than for the actuals, with Model II having the largest amplitude. However, none of the time paths for either Model seem to be either explosive or overly damped.2

The findings are also similar for average price. The time paths of both Models II and IIE are reasonably on course with observed values, although Model II shows even wider fluctuations about the actuals than for the preceding prices.3

Simulation of Information and Decision Systems in the Firm

Continuing a research effort started principally by Cyert and March,4 Bonini (1963) constructed a computer
In order to Show the effects of organizational, informational, and envirOnmental factors upon the firm's decision making process, Bonini decided that price, level of inventory, cost, sales, profit, and amount of pressure would be an adequate endogenous variable set to represent the behavior pattern of the organization. The model was used as an exploratory device to describe the relationship between various informational flow patterns and the firm's decision process. From these relationships design changes for the firm could be recommended. Bonini was not concerned with modeling an actual firm. He was concerned with a comparison of the behavior of his theoretical firm after a proposed change with the original behavior. This comparison involved analyzing two sets of six time series (one time series for each Of the variables price, level of inventory, cost, sales, profit, and amount of pressure before and after the proposed change). Because these time series did not exhibit any tendency to Obtain steady-state or equilibrium values over time, Bonini settled for a measure of central tendency (the arithmetical mean), a measure Of dispersion (the standard deviation), and a measure of trend (the least- squares regression coefficient) to describe the output time-series of his model. 49 Bonini determined the requirements for the length of these time series in the following fashion: On the one hand, the run Should extend over sufficient Simulated periods so that extreme values in the time series can be averaged out (that is, so there will be relatively small sampling error associated with the above three measures). On the other hand, limitations on computation time would argue for keeping a reason- ably short number of periods. In addition, if we are going to apply our results to real organizations, we would be more interested in the immediate and short-run effects (Of particular changes) than in what might be the average level over, say 20 or 30 years. In view Of these considerations, I have chosen 108 time 5 periods . . . as the length for the simulation runs. Portfolio Selection: A Simulation of Trust Investment Clarkson (1962) developed a simulation model to duplicate the procedure by which a trust officer in a bank selected stock for any particular client's portfolio. The model combines a set of decision rules which are selected on the basis on information available about the client's financial situation and requirements. The output of the model is not a data stream but a selection of a variable number of shares of a variable number of stocks, given the client's position. Clarkson applies two types of testing procedures to his model: those pertaining to the output of the model alone and those pertaining to the decision processes incorporated in the model. For testing output Clarkson notes, Since the problem of determining the type Of error when comparing generated to actual output has not yet 50 been solved, statistical tests on the goodness of fit of the generated output are not very meaningful. The only statistical test that has much meaning is to test whether the generated data give a significantly 'better fit' than that which would be produced by some random or naive mechanisms.6 He tested the model against a "random selector" from the total population. Stocks were being selected at random without replacement from the list of total stocks available. This list contains M stocks Of which W have been selected by the trust Officer for the particular port— folio under consideration. 
Z is defined as the number of these stocks selected by the trust officer which occur in a sample of n stocks drawn at random from the list without replacement. Z is called the hypergeometric random variable:

    P_Z(k) = [ C(W, k) · C(M − W, n − k) ] / C(M, n),   k = 0, 1, 2, ..., n,

where C(a, b) = 0 for b > a.

Clarkson rejected the hypothesis that this probability was equal to the percentage of matching or "correct" responses generated by the model. The size of the list was reduced to include only those issues which displayed the characteristics desired by the client, and the hypothesis was still rejected. Naive decision rules replaced the random selection procedure, and the hypothesis was still rejected. The decision rules considered were:

1. Rank growth stocks on the basis of growth in price over the last 10 years.
2. Rank growth stocks on the basis of growth in earnings over the last 10 years.
3. Rank growth stocks on the basis of growth in sales over the last 10 years.
4. Rank growth stocks on the basis of low yield over the last 10 years.
5. Rank yield stocks on the basis of high yield over the last 10 years.

His objective, Clarkson contends, is to simulate investment behavior, to select the correct portfolios with the same processes and for the same reasons as the investment officer. Therefore, the need to test the decision processes exists. Turing's test7 was used: Can an impartial observer discriminate between the output of the model of human behavior and the output of the actual human behavior?

Simulation of Market Processes

Balderston and Hoggatt (1962) constructed a computer simulation model of the West Coast lumber industry. The emphasis of the model is not to describe the real firms making up this industry, but to study the dynamic behavior of firms in a two-stage market from the viewpoint of an economic theorist. The model is driven by wholesalers, to whom suppliers provide and from whom customers purchase. While flows of information, material, and money move vertically through the market, no horizontal movement is allowed. At the end of each market period decisions about output and price and entry and exit to the industry are made.

Concern for the validity of the model centered on the question of viability. Viability, as used by Balderston and Hoggatt, does not require equilibrium of the endogenous time paths, but only requires that "behavior should persist over a significant time interval."8 Persistent behavior means that the time path is stable--stable in the sense that it settles into a state which exhibits properties of convergence, or stable in the sense that change over time is steady with proportional (or acceptable) changes in the other endogenous variables. This is the extent to which the original study considered the model's validity.

Hoggatt in a later article9 applied G. E. P. Box's10 method of system analysis to the model. At this time more sophisticated validation techniques were introduced. Hoggatt states that he would consider the model valid if it "duplicated [the] trends and frequency response of [the] real system"11 rather than aiming to have the model duplicate the time paths of the real system. In order to measure the frequency response of the model, he used the autocorrelation function.

Industrial Dynamics

Industrial Dynamics was developed by Forrester (1962) from his original dynamic simulation model of a firm's production-distribution system. Forrester has tried with limited success to convert his model building techniques into a general management philosophy.
He describes Industrial Dynamics as the study of the information-feedback characteristics of industrial activity to show how organizational structure, amplification (in policies), and time delays (in decisions and actions) interact to influ- ence the success of the enterprise. It treats the interactions between flows of information, money, orders, materials, personnel, and capital equipment in a company, and industry, or a national economy. Industrial Dynamics provides a single framework for integrating the functional areas Of management-- marketing, production, accounting, research and development and capital investment. It is a quanti- tative and experimental approach for relating organiza- tional structure and corporate policy to industrial growth and stability.12 The greatest contribution of the Industrial Dynamics models was to point out the extraordinarily large fluctua- tions that can occur in the inventory held at the retail level when a change in customer demand is reflected through the lagged order delivery sequence: retailers-distributors- factory warehouse-factory-factory warehouse-distributors- retailers. From this basic production-distribution model many possible changes can be tested: limit factory capacity, 54 eliminate the distributors, add additional sectors such as a market sector, include advertising. How well the model serves its purpose is Forrester's test of its validity.. The purpose of Industrial Dynamics is to design better management systems; therefore, validity can only be tested after an Industrial Dynamics approach has been applied to a situation and the results measured in some concrete terms such as increased profit. Defense of the model prior to use can only be given in terms of an individual defense of each detail of structure and policy so that in sum the total behavior of the model shows performance characteristics associated with the real system. The validity of the model at this stage as a description of a specific system can only be examined relative to the system boundaries (Are the boundaries suitable relative to the objectives of the experiment?), to the interacting variables, and to the values of the parameters. If the similarity of the model output to the actual characteristics of the system is not sufficient, these three factors must be examined and changed., These views on validity can be summarized in the following quotations: Validity as an abstract concept divorced from purpose, has no useful meaning.13 The ability Of a model to predict the state of the real system at some specific future time is not a sound test of model usefulness.l4 55 Data may serve to reject a grossly wrong decision- making hypothesis, but they can scarcely prove a correct one. Forrester believes the final test for validity is whether the actual system is being controlled to agree with the model. Computer Simulation of Competitive Market Response In order to define and analyze management problems involving the environment of the firm, Amstutz (1967) developed a simulation model of competitive market response. The objective of the study was to model the firm and the environment external to the firm so that the total effect of changes in variables which can be controlled by management could be measured. Amstutz set up his system structure in terms of three sets of elements. Active elements are human. They can originate and react to signals. 
The eight active elements involved in the model are the producer, his competitors, distributors and whole- salers, salesmen, retailers, consumers, government Officials and research workers. "Elements of flow are the vehicles 16 These are the Of interaction between active elements." elements management can manipulate in order to try and achieve his Objectives. The elements of flow are product, information and capital. The last set of elements are the passive elements (time delays, dissipators and storage) which describe the channels through which the flow elements 56 move between the active elements. By means of this formula— tion the dynamic effects of the origination of a signal by management can be examined. The tests Amstutz carried out in an attempt to analyze the worth of his model were of two types--reliability testing and validity testing. The purpose of reliability testing is to determine if the results of the model are reproduceable. Are the results Obtained on sequential runs sufficiently alike to justify the assumption that they are two samples drawn from the same population of data? Validity testing is concerned with "truth." As there is no Objective measure of truth, Amstutz argues that a subjective evaluation of the consistency Of the model's performance with theory and prior knowledge must be made. Validity Of a model can be established only by examining the realism of the assumptions on which it is based.17 Evaluation of the model's performance is possible using the Turing test. If a person knowledgeable in the area to be modeled cannot distinguish the model from the real system when provided with responses from both, then the model is realistic. Other tests for validity can be performed once the validity of the assumptions on which the model is based has been established. 57 Tests for Viability . . . This is a very gross test which is usually satisfied without explicit consideration. Does the model generate behavior which persists over a significant time interval? Tests for Stability . . . Variables and processes which are stable in the real world must also exhibit stability when modeled. Tests for Consistency . . . Consistency between model behavior and behavior observed in the real world. The extent to which the assumptions of the model agree with known facts must be tested as must the internal consistency or "deductive veracity" of the model--does the model "make sense." This testing may be done subjectively as "face validity" testing (does the model appear to be satisfactory), or analytically with sensitivity analysis. Duplication of Historical Conditions . . . The fourth set of tests proposed by Amstutz. Prediction of Future Conditions . . . The ability of the model to predict cannot be tested until after the passage of time over which the predictions were made unless "pseudo predictions" are made of past results. Amstutz carried out these tests in the following manner. Reliability was tested by calculating "interrun deviations" when changing the seed in the random number generator. Subjectivity and "eyeball" testing confirmed viability, stability, and consistency requirements. To 58 determine the extent to which the simulated exogenous time paths matched historical data the absolute error between simulated and actual was summed and averaged. The predic- tive ability of the model was not examined. Model Classification In order to summarize the views on validation Of these seven model builders, it might prove instructive to classify their models. 
The models will be classified as discrete or continuous, positive or normative, and behavioral or physical. A discrete time model is structured using difference equations, while a continuous time model is built with differential equations. A positive or descriptive model is one which attempts to replicate a real system; no consideration is given as to the adequacy or value of this real system. A normative model attempts to produce the optimal conditions for the system under study. Explorative models generate solutions in search of this goal. Positive is to normative as "what is" is to "what ought to be." The last classification dichotomy is behavioral-physical. If any part of the model is an attempt to duplicate human behavior, the model is classified as behavioral; otherwise it is physical (see Table 3.1). The next task is to use this classification scheme to determine if those who build the same type of model hold similar views as to the procedures by which their models can be validated.

TABLE 3.1.--Classification of Computer Simulation Experiments. (Models classified: Cohen; Bonini; Clarkson; Balderston and Hoggatt; Forrester; Amstutz; LREPS. Classification categories: Discrete, Continuous, Positive, Normative, Behavioral, Physical.)

Unifying Validation Concepts

From the study of these six models general concern is directed in varying degrees to two distinct types of validation--validation of the basic underlying processes of the model and validation of the data stream output of the model. Because the basic design and assumptions used in any model are certain to differ from those used in any other model, design validation procedures must of necessity be tailored to the particular model under consideration. This type of validation is probably best carried out by interactions between the model builders and those who are familiar with the real system being modeled, both during and after construction of the model. After completion of the model, the Turing test can be used to increase confidence in the validity of the basic design. This type of model validity will be called design validity; validity of the output data stream will be called output validity.

This study will not consider design validity to any great extent for two reasons. First, as indicated, design validity is a concept specific to the particular model at hand; and second, if the model satisfies the requirements of output validity, it is not unreasonable to assume that the basic processes of the real system must have been modeled reasonably accurately. Friedman adds weight to the decision not to consider design validity. He believes that the validity of a theory is not based on the realism of its assumptions (complete "realism" is unattainable), but on the accuracy of its predictions.

Design validity is the point at which many normative model builders (in particular Forrester) stop. They argue that a normative model is not built to represent the actual system, but to represent the system the way it should be. Missing from this argument is a rational method of moving from the actual state to the desired state. A functional normative model might well be one which first models the actual system (at which point output validity testing can be carried out) and then the desired corrections are made from this basis.

The Cohen and Bonini models, and even the more recent Amstutz model, after a rather thorough description of validity testing, use subjective and basic statistical tests for validity.
It is reasonable to conjecture that in general validation of currently built simulation models is not carried out at a much, if any, higher level of sophistication. Balderston and Hoggatt's original analysis for validity is also rather limited and basic, although Hoggatt's later analysis is the most sophisticated of those employed in the models discussed. Data produced from a strictly behavioral model such as Clarkson's is very limited. His analysis is quite adequate for the purpose of his model. 62 Two main points arise from this examination of some of the most well known Simulation models. The first point is that regardless of the type of simulation used or the aims of the analyst, much of the activity that has to be carried out in order to validate the model is the same. The second point is the Obvious need for the use of more extensive and more reliable techniques in the validation process. CHAPTER III-~FOOTNOTES 1K. J. Cohen, Computer Models of the Shoe, Leather, Hide Sequence (Englewood7Cliffs, N. J.: Prentice-Hall, I960), p. 60. 2Ibid., pp. 62-63. 31bid., p. 63. 4R. M. Cyert, E. A. Feigenbaum, and J. G. March, "Models of a Behavioral Theory of the Firm," Behavioral Science, Vol. 4, No. 2 (April,1959), pp. 81-95. 5C. P. Bonini, Simulation of Information and Decision Systems in the Firm (Englewood Cliffs, N. J.: Prentice-Hall, 1962), p. 52. 6G. P. E. Clarkson, Portfolio Selection: A Simulation of Trust Investment (Englewood Cliffs, N. J.: Prentice-Hall, 1962), p. 55. 7A. M. Turing, "Can a Machine Think?" The World 9f Mathematics, ed. by J. R. Newman (New York: Simon and Schuster, 1956), pp. 2099-2123. 8F. E. Balderston and A. C. Hoggatt, Simulation of Market Processes (Berkeley, California: Institute of BuSiness and Economic Research, 1962), p. 33. 9A. C. Hoggatt, "Statistical Techniques for the Computer Analysis of Simulation Models," Appendix in Studies in a Simulated Market. L. E. Preston and N. R. Collins (Berkeley, California: Institute of Business and Economic Research, 1966), pp. 92-122. 10G. E. P. Box and K. B. Wilson, "On the Experi- mental Attainment of Optimum Conditions," Journal of the Royal Statistical Society, B, XIII (1951), pp. 1-45. 11Hoggatt, p. 94. 12J. W. Forrester, Industrial Dynamics (Cambridge, Mass.: The M.I.T. Press, 1961), p. 13. 63 64 lBIbid., p. 115. 14Ibid., p. 115. lsIbid., p. 118. 16A. E. Amstutz, Computer Simulation Of_Competitive Market Response (Cambridge, Mass.: The M.I.T. Press, 1967), p. 18. l7Ibid., p. 369. CHAPTER IV THE MODEL Introduction When large amounts of money and manpower have been applied to a project over an extended period of time, there is a natural reluctance (maybe not explicitly stated or felt) to subject the finished model to scrutiny, the result of which may indicate the worthlessness of the expenditures. Because our industrial sponsor did not discourage critical examination of the completed model, this dissertation is a formal analysis of the model's validity. Rather than narrow the focus to the validity of one specific model, validation of simulation models as a class will be examined with particular reference to this one model. A description of the long-range environmental planning simulator for a physical distribution system (LREPS) follows. The Systems Approach During the post-war period there has been an increasing use of quantitative analysis (usually discussed as operations research or management science methods) of .industrial problems in order to supply an added dimension in: the decision making process. 
Use of these techniques 65 66 in physical distribution has on the whole been applied to isolated segments of the entire system.1 Only recently has the firm's fixed facility network, transport capability, inventory allocations, communications, and unitization (material handling, packaging, containerization) procedures been conceptualized as an integrated physical distribution system.2 Suboptimization can occur without an orientation toward an integrated system. For example, suppose a cor- poration is organized into four functional areas: purchas- ing, finance, manufacturing, and sales. The responsibility for physical distribution activities is allocated as follows: inbound materials under purchasing, branch plant shipments and order processing under finance, traffic and shipping under manufacturing, and inventory control and public warehousing under sales. If planning is not carried on from the point of view of the corporation as a system, suboptimization might occur if purchasing determined the quantity of raw materials required solely on the basis of price per unit. This would probably mean large inbound shipments and non-optimal raw material inventory due to high storage costs. Many other Situations can OCCur where the Optimal action for a particular corporate functional area is suboptimal for the company as a whole. Recognition <3f the possibility of this type of suboptimization has led to the establishment of integrated physical distribution Systems by many corporations. 67 The argument for integration using the systems concept could be extended. Why not integrate the functions of the firm? Why not integrate firms into a model of the economy? Given the capacity limitations of the present generation of computing machinery, the trade-off exists between cost benefits from the "systems effect" and loss Of ability to represent the system components accurately in the required detail. At the desired level of detail a great deal of effort had to be expended in order to ensure that the size of LREPS did not exceed the capacity of the available computing machinery. Integration beyond the level of the physical distribution system would have required a lower level of model refinement. But the systems concept is a vital development which will be extended with future technological advances. Model Structure The actual physical distribution system is modeled in terms of the general structure given in Figures 4.1 and 4.2. The five basic components of an integrated physical distribution system (the fixed facility network, transport capability, inventory allocation, communication, and unitization) are evaluated at three stages in the channel structure. These three stages are: l. The manufacturing control center (MCC) which produces a partial product line and distributes 68 PHYSICAL DISTRIBUTION SYSTEM MANUFACTURING CONTROL CENTERS (MCC) MULTI-LOCATION EACH PRODUCES LESS THAN FULL LINE EACH PRODUCT IS PRODUCED AT MORE THAN ONE MCC REPLENISHMENT CENTERS (RC) MULTI-LOCATION EACH STOCKS ALL PRODUCTS MANUFACTURED AT MCC DISTRIBUTION CENTERS (PDC) (RDC) MULTI-LOCATION FULL LINE - PRIMARY DC (PDC) FULL OR PARTIAL LINE - REMOTE DC (RDC) CONSOLIDATED SHIPPING POINT (CSP) TRANSPORTATION COMMON CARRIER - TRUCK, RAIL, AIR INVENTORY STOCKS AT RC, PDC, RDC COMMUNICATIONS COMPUTER, TELETYPE, MAIL, TELEPHONE UNITIZATION AUTOMATED OR MANUAL PRODUCT PROFILE MULTI-PRODUCT LINE KEY PRODUCT GROUPS FOR EACH CUSTOMER CLASS OF TRADE MARKET PROFILE MULTI-CUSTOMER CLASSES OF TRADE TOTAL U.S. 
MARKET COMPETITIVE PROFILE MULTI-COMPETITORS Figure 4.1.--General Description of Firm-Distribution Audit.l 1D. J. Bowersox, et a1., Dynamic Simulation of Physical Distribution Systems, Monograph (East Lansing, Michigan: Division of Research, Michigan State University, Forthcoming). 69 STAGE 1: MANUFACTURING CONTROL CENTERS AND REPLENISH- MENT CENTERS STAGE 2: DISTRI- BUTION CENTERS PDC PARTIAL LINE STAGE 3: DEMAND UNITS PD REGION PD REGION J ----- INFORMATION FLOW PRODUCT FLOW REGION..THE REGION IS DEFINED BY THE ASSIGNMENT OF RDCS AND DUS TO A.PDC. ’ MCC.....EACH MANUFACTURING CENTER PRODUCES A PARTIAL LINE. RC......REPLENISHMENT CENTERS STOCK ONLY PRODUCTS MANUFAC- TURED AT COINCIDENT MCC. RDC.....REMOTE DISTRIBUTION CENTER. FULL 0R PARTIAL LINE. . PDC.....PRIMARY DISTRIBUTION CENTER. EACH PDC IS FULL LINE AND SUPPLIES ALL PRODUCTS TO DUS ASSIGNED TO THE PDC REGION: PRODUCT CATEGORIES NOT STOCKED AT THE PARTIAL LINE RDCS IN THE REGION ARE ALSO SHIPPED BY THE PDC. DU......THE DEMAND UNIT CONSISTS OF ZIP SECTIONAL CENTER(S). CSP.....CONSOLIDATED SHIPPING POINT. . . . . 1 Figure 4.2.--Stages Of the Phy51ca1 Distribution Network. 1D. J. Bowersox, et a1., Dynamic Simulation of Physical Distribution Systems, Monograph (East Lansing,. MichIgan: Division of Research, Michigan State Univer31ty, Forthcoming). 70 these products from the adjoining replenishment center (RC). 2. The distribution center (DC) which provides a product selection at a location from which customer service requirements can be satisfied. 3. The demand unit (DU) which is an individual customer's demand or the agglomeration of several customers' demands. The items manufactured at the MCC move to the customer through the distribution centers. Four different types of distribution center exist at the DC stage. Primary distribution centers (PDC) handle a full line of the firm's products and have the potential to serve all the demand units in a defined region of the total market area. Remote distribution centers full line (RDC-F) also handle all of the firm's products, but service only a pre- assigned subset of the DU's within the PDC market region. A remote distribution center which handles only a fraction of the firm's total product line is called a remote distri- bution center partial line (RDC-P). The last type Of DC is the consolidated shipping point (CSP) which is an RDC-P which handles no products, but functions as a point at which the demand of several DU's is agglomerated and served from a PDC. The PDC'S are capable of serving the same demand units as an RDC-P, but cannot serve the demand units affiliated with an RDC-F. 71 This model structure presents the physical distri- bution system at an integrated level, a level which allows the accumulation of information pertinent to the particular project in progress, but which also allows the same model (with minor modification) to be used in a wide range of other applications. Consideration of the physical distribution system as an integrated unit offers management financial advantages. Can the Operations research techniques used to analyze the elements of the system be extended to the system in its entirety? Usually not. The interaction between the elements of the system normally introduces a degree of complexity such that analytical procedures cannot be used. Fortunately numerical procedures exist which provide a method for study- ing this class of larger, more complex, problems. Such a numerical procedure is simulation. 
Simulation as a tool is less accurate and more costly than an analytical technique, but it is feasible. As a design specification of the project was for a ten year time horizon, the model must be dynamic--dynamic because information is required of the system at all points along the time horizon, not just the end. The effect of a decision at time n is dependent upon the timing and nature of the decisions made prior to time n. LREPS has the facility to change over time both the endogenous variables, using internal feedback mechanisms, and the exogenous variables, which represent the system's environment. A dynamic simulation model is desired which will analyze the cost and service trade-offs between the elements or subsystems of the physical distribution system caused by any given sequence of decisions made over a long-range planning horizon.

The two main aspects which set the model apart from previous studies are the consideration of both spatial and temporal dimensions of the physical distribution system in one model and the concept of flexibility. The description of the model subsystems to follow will indicate the method of including both spatial and temporal considerations. Due to the stochastic nature of the system being modeled, several acceptable outcomes are possible from a given managerial decision. The flexibility of one particular outcome is the degree to which it is representative of the whole range of acceptable outcomes.

Subsystem Detail

The model3 is constructed in three main parts: the Data Support Subsystem, the four subsystems which comprise the actual operating model (the Demand and Environment Subsystem, the Operations Subsystem, the Measurement Subsystem, and the Monitor and Control Subsystem), and the Report Generator Subsystem. This structure is shown in Figure 4.3.

Figure 4.3.--Model Structure (Data Support, Demand and Environment, Operations, Measurement, Monitor and Control, and Report Generator Subsystems). Source: D. J. Bowersox, et al., Dynamic Simulation of Physical Distribution Systems, Monograph (East Lansing, Michigan: Division of Research, Michigan State University, Forthcoming).

The Data Support Subsystem generates the input tape for the model. Contained on this tape are the constant exogenous variables for a particular experiment using the model and also the amount and timing over the ten year planning horizon of changes in controllable variables. The controllable variables are order characteristics, product mix, new products, customer mix, facility network, inventory policy, transportation, communications and unitization.

The second main segment of the model contains a mathematical representation (difference equations) of demand generation and allocation, the driver of the model, and the five elements of the physical distribution system: transportation, inventory control, facility location, unitization, and communications.

The Demand and Environment Subsystem subdivides the national sales forecast to the individual demand units, generates actual customer orders by product, allocates these orders to the demand units, and assigns a distribution center to service each demand unit. To avoid dealing with individual customers, demand was summarized by Zip Sectional Center. The product orders representing this demand were drawn in blocks at random from the order matrix until the demand unit's daily sales forecast4 was satisfied. Blocks on the order matrix contain orders for a stratified sample of fifty products, or about 12% of the total product
These orders can be constructed to be representative of historical conditions, or “pseudo orders" can be gener- ated. Testing new product lines, changing demand patterns, or observing the dynamics of alternative inventory policies is possible by generating "pseudo orders" with the desired characteristics. Finally, demand units are assigned to distribution centers according to one Of these decisions rules: minimum distance, minimum transit time, minimum transportation cost or a heuristic combination of these three factors. The Operations Subsystem uses the information supplied by Demand and Environment and processes the product and information flows through the physical distri- bution system. Orders arrive each day at the distribution centers from the demand units. If inventory on hand is sufficient to meet this demand, the order is prepared and shipment is made, but if inventory on hand is not sufficient, a backorder is created, and at the time indicated by the inventory policy in use, an order is sent to the replenish- ment center. This transmittal time for the order, together with order processing and preparation time, the delay to the next scheduled shipping time, and the transit time to the distribution center, make up the reorder cycle. The average customer order cycle time, a measure of the system's aservice capability, can then be calculated as the total of customer order transmittal time, customer order processing 76 and preparation time, the mean reorder cycle time, and the customer transit time. One of three inventory policies trigger the reorder cycle--a daily reorder point system, an optional replenishment system or a hybrid combination of these two. Communication policies can be tested by varying the distribution from which the transmittal time is selected. An order system based on mail, for example, would be represented by a distribution of order transmittal times with a larger mean and variance than would an order system using a teletype. The Measurement Subsystem develops cost, service, and flexibility measures of the activity levels of the Operations Subsystem. Fixed facility investment cost, tranSportation cost, communications cost, average inventory carrying cost, reorder cost, and throughput or unitization cost per distribution center are summed to the total cost associated with the physical distribution system. The annual fixed facility investment cost is Obtained by depreciating the dollar investment for the facility over its functional life span. The dollar investment is assumed to be constant for a given size and type Of facility. To determine transportation costs, both inbound from the replenishment center to the distribution center and outbound from the distribution center to the demand ‘units, the appropriate freight rate for the distance is Inultiplied by the weight. The freight rates were determined 77 by regression analysis in order to account for such factors as freight class, weight breaks, regional differences, negotiated rates and average shipment size. The number of orders and lines processed are used to determine com- munication costs for each network link and each facility size, again by regression analysis. Inventory costs (aver- age carrying cost and reorder cost) are determined for a sample product category and then extrapolated up by the appropriate sample to product line ratio. Average throughput costs per unit of volume moved through distri- bution centers of each size and type have been calculated. 
Throughput cost for the distribution center is then volume times the appropriate cost per unit. Also calculated in the Measurement Subsystem are such service characteristics as the number of stockouts, total order cycle time, and the percentage of demand satis- fied within a specified number of days' transit time. The Monitor and Control Subsystem provides an alter- native to specifying all changes in controllable variables in the Data Support Subsystem prior to the actual running of the model. In Monitor and Control, desired and actual levels Of cost, service, and flexibility are compared at.specified stages over the time horizon, and modi- 2fications are made automatically to the physical dis- ‘tribution system on the basis of the size of the ‘Lariance. The modification might take the form of an 78 expansion, addition or deletion of physical facilities for future periods or it might be an alteration of the sales forecasts for future periods. The final main segment Of the model is the Report Generator Subsystem which organizes the output data of the model into management reports. Validation Effort to validate a computer simulation model can be directed in two ways--to validate the design or method Of construction of the model and tO validate the output of the model. As indicated in Chapter III, too much emphasis has been placed on design validity in the past. This dissertation will concentrate on methods to establish the output validity of computer simulation models in general, and in particular the LREPS model. Given this emphasis, it is still important to recognize the need to test for design validity during the process of constructing the model and as an initial procedure upon its completion. This testing involves checking the functioning of the model and its components for reasonableness. DO the values Of the endogenous variables fall within acceptable limits? This procedure is sometimes known as determining the model's face validity, that is, determining the extent to which the assumptions of the model agree with known facts and also the internal cIonsistency or "deductive veracity" of the model. In 79 other words, the model must "make sense." Table 4.1 con- tains the face validity analysis for LREPS. A comparison of simulated versus actual data for an information category is designated "within limits" if the variance is less than 5%. The third output validation procedure proposed is to examine the sensitivity of the major assumption employed by the model. To the extent Of the analysis of data streams before and after a change in these assumptions, this is output validity. But the determination of the particular assumptions to be examined is a problem of design validity. Gross malfunctions of a particular model can be discovered by analysis for face validity or design validity. Once the model has satisfied these criteria, the more general and SOphisticated procedures for establishing output validity can be applied. These methods as applied to the LREPS model are now briefly discussed (the next three chapters take up each of the methods in greater detail). Data streams for several endogenous variables need to be generated by the model over an extended time period. This is so the stability or viability of the model over the long run can be established. Do the data streams examined show persistent behavior over this time interval? 60 TABLE 4.1.--LREPS Face Validity. 
Information Category           Simulated Versus Actual    PD Stages
Cust Sales                     Within Limits              DU, DC and Domestic
Cust Dollar Sales/Order        Within Limits              DC and Domestic
Cust Wt Sales/Order            Within Limits              DC and Domestic
Line Items per Order           Within Limits              DC and Domestic
Cust Serv--
  NOCT-Avg                     Within Limits              DC and Domestic
  NOCT-Std Dev                 No Data Avail.             DC and Domestic
  T4-Avg                       Within Limits              DC and Domestic
  T4-Std Dev                   No Data Avail.             DC and Domestic
Dollar-Preps                   No Data Avail.             DC only
Order Preps                    Within Limits              DC only
DC-MCC Reorders                Within Limits              DC only
DC Stockouts                   No Data Avail.             DC only
DC Avg IOH                     Within Limits              DC only
Cust Ship Accums               Difficult to compare because of small sample averages in cust order blocks
MCC Ship Accums                Within Limits              MCC only
Total Product Demand           Within Limits              Domestic only
Total PD Cost--                Within Limits              DC and Domestic
  Facilities                   Within Limits              DC and Domestic
  Transportation
    Inbound                    Within Limits              DC and Domestic
    Outbound                   Within Limits              DC and Domestic
  Inventory                    Within Limits              DC and Domestic
  Communications               Within Limits              DC and Domestic
  Throughput                   Within Limits              DC and Domestic
Cum Wt Indices                 Within Limits              DU, DC and Regional

The second validation procedure requires a measure of the extent to which the model is an accurate representation of the real system. Time paths of selected endogenous variables, which are representative of the physical distribution system's behavior, will be generated by the model over a past time period. Statistical analysis of this data with actual historical data over the same time period will provide the required measure.

Two critical building blocks in the model are the use of a stratified sample of fifty products to represent the total product line and the method of generating demand unit orders. The model should be constructed so that reasonable changes in these two procedures do not have a significant effect on the model output. To carry out this third validation procedure, analysis of selected endogenous data streams before and after the change will be required. An example of such a change is the alteration of the composition or size of the stratified sample.

As indicated, the methods used, and the results obtained, with these three types of analysis will be examined in later chapters.

CHAPTER IV--FOOTNOTES

1 For example: Transportation--W. H. Hausman and P. Gilmour, "A Multi-Period Truck Delivery Problem," Transportation Research, Vol. 1, No. 4 (December, 1967), pp. 349-357. Warehousing--A. A. Kuehn and M. J. Hamburger, "A Heuristic Program for Locating Warehouses," Management Science, Vol. 9, No. 11 (July, 1963), pp. 643-666. Inventory--A. F. Veinott, "The Status of Mathematical Inventory Theory," Management Science, Vol. 12, No. 11 (July, 1966), pp. 745-777. (This article includes an extensive bibliography.)

2 D. J. Bowersox, E. W. Smykay, and B. H. LaLonde, Physical Distribution Management (New York: The Macmillan Company, 1968), Chapter 5.

3 A more detailed description of the model can be obtained from the monograph "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Formulation of the Conceptual Approach and Research Design," which is in process at the Graduate School of Business Administration, Michigan State University.

4 The daily sales forecast for the demand unit is a function of population, retail sales, personal income and effective buying power associated with the Zip Sectional Center.
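Before turning to the stability analysis, a concrete illustration may help. The short Python sketch below applies the face validity screen summarized in Table 4.1: a simulated figure is labeled "Within Limits" when it differs from the corresponding actual figure by less than 5%. The category names and values are hypothetical, and reading the 5% "variance" as a relative difference is an assumption.

```python
def face_validity(simulated: float, actual: float, tolerance: float = 0.05) -> str:
    """Classify one information category for a face validity table."""
    if actual == 0.0:
        return "No Data Avail."          # no basis for a relative comparison
    relative_difference = abs(simulated - actual) / abs(actual)
    return "Within Limits" if relative_difference < tolerance else "Outside Limits"

# Hypothetical simulated-versus-actual annual totals for a few categories.
categories = {
    "Total Product Demand": (1030000.0, 1000000.0),
    "Transportation (Inbound)": (486000.0, 512000.0),
    "Average Inventory on Hand": (61500.0, 60000.0),
}

for name, (sim, act) in categories.items():
    print(f"{name:30s} {face_validity(sim, act)}")
```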
CHAPTER V

STABILITY OF THE MODEL

Introduction

The first aspect of validity to be subjected to detailed analysis is long-term stability. Stability is the ability of the model to generate endogenous data streams which show persistent behavior over the long run. Over this time period the data streams will exhibit convergence properties, or the rate of change of each endogenous variable being examined will be proportional to, or acceptable to, the rate of change in all other endogenous variables. The ten-year planning horizon of LREPS is considered "long-term."

This type of analysis follows naturally the establishment of the model's face validity. While face validity is a statement of the model's reasonableness over the short run (preliminary runs of any model are usually not for the entire planning horizon), the analysis of this chapter is a statement of the model's reasonableness over the long run.

Endogenous data streams of sales weights for the three products are examined. This analysis is carried out in two ways. The first way is to study the time series or data stream and then make statements as to the reasonableness of its variability over the time horizon. Spectral analysis is used for this purpose. The second type of analysis is to lag the original time series by k units and then compare this lagged time series with the original set of observations. This comparison should indicate a reasonable correspondence between the two data streams. For this particular analysis a 10-unit lag was selected, as a large proportion of the variance of the time series could be expected to occur over a two-week period.

Graphical Analysis

Gross instability of the endogenous data stream under consideration is indicated rather clearly when the data is graphed. But it must be pointed out that the amount of variability contained in the data can appear to increase or decline with a contraction or expansion of the range of the ordinate. Figure 5.1 is the graph of sales weight for each of the three products over a ten-year period (Product 1 is plotted with "+'s," Product 2 with octagons, and Product 3 with triangles). No inordinate amount of fluctuation is observable from this graph.

Figure 5.1.--Sales Weight--Three Products for 10 Years.

Parameters of the data streams are of relatively little value because of the averaging effect over a large number of observations and also because a comparison of two different data streams is not being made. Recognizing this fact, the means, variances, skewness, and kurtosis of the three data streams are given in Table 5.1. The means and variances are of limited value as absolute quantities. A normal value for kurtosis is 3, and the symmetry measure of a symmetrical distribution is 1. The distribution for Product 1 is remarkably symmetric. The distributions of the other two products are nonsymmetric and leptokurtic ("humped" to a degree greater than normal).

TABLE 5.1.--Means, Variances, Skewness, and Kurtosis.

Sales Weight     Product 1     Product 2    Product 3
Mean                530.95        326.61         1.78
Variance         100162.81      89553.83        13.79
Skewness              1.00          2.05         2.81
Kurtosis              1.23          6.37         9.40
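A minimal Python sketch of the summary statistics reported in Table 5.1 follows. It uses the ordinary moment-ratio definitions, under which a normal curve has kurtosis 3; the symmetry measure used in Table 5.1, which takes the value 1 for a symmetrical distribution, evidently follows a different convention, so this sketch is only an approximation of the original procedure. The data streams are hypothetical stand-ins for the three product series.

```python
import numpy as np

def stream_moments(series):
    """Mean, variance, and the ordinary moment-ratio skewness and kurtosis
    of an endogenous data stream (kurtosis of a normal curve = 3)."""
    u = np.asarray(series, float)
    mean = u.mean()
    dev = u - mean
    var = np.mean(dev ** 2)
    skew = np.mean(dev ** 3) / var ** 1.5
    kurt = np.mean(dev ** 4) / var ** 2
    return mean, var, skew, kurt

# Hypothetical daily sales-weight streams standing in for Products 1-3.
rng = np.random.default_rng(0)
streams = {
    "Product 1": rng.normal(530.0, 315.0, 2590),
    "Product 2": rng.gamma(2.0, 165.0, 2590),
    "Product 3": rng.poisson(1.8, 2590).astype(float),
}

for name, data in streams.items():
    m, v, s, k = stream_moments(data)
    print(f"{name}: mean={m:9.2f} var={v:11.2f} skew={s:5.2f} kurt={k:5.2f}")
```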
Correlation

The amount of correlation between a time series and the same time series with observations lagged by k units is of interest. This can be shown by the coefficient of determination (r²), which expresses the percentage of the total variation in the original variable which is "explained" by the regression line of this variable on the lagged variable. Also conveying the same type of information is the autocorrelation of a time series at time t and at time (t + k). The autocorrelation of order k is given by

    ρ_k = Cov(u_t, u_{t+k}) / [ Var(u_t) Var(u_{t+k}) ]^(1/2).

The first task is to examine the coefficient of correlation (r). The range of this coefficient is from −1 to +1, or from perfect negative correlation to perfect positive correlation. The values of r for original data on lagged data are given in Table 5.2, as well as the results of the null hypothesis that r is significantly different from zero. In order to accept the null hypothesis with 95% confidence, r must be greater than 0.197.1 The hypothesis is rejected for Product 3. This product is a slow mover, and so the variation in sales weight between a given time and a time two weeks later could be considerable (for example, a positive sales weight against no sale or zero sales weight). So this result appears reasonable.

TABLE 5.2.--Test of Correlation Coefficients.

Sales Weight      r         H0
Product 1         0.6162    Accept
Product 2         0.5421    Accept
Product 3         0.1075    Reject

The values of the coefficient of determination are given in Table 5.3. A moderate amount of the total variation in the original data for Products 1 and 2 is explained by the lagged data--enough to suggest the absence of instability over two-week periods.

TABLE 5.3.--Coefficients of Determination.

Sales Weight
Product 1     0.3797
Product 2     0.2939
Product 3     0.0116

Usually the presence of autocorrelation is a burden to the analyst of time series. But for the present purpose, autocorrelation indicates an inherent relationship between observations in the time series at point n and those at point (n + k). The existence of such a relationship limits the susceptibility of the time series to excessive fluctuation. The autocorrelations of order (k = 10) for the three data streams are listed in Table 5.4.

TABLE 5.4.--Autocorrelation.

Sales Weight
Product 1      0.0044
Product 2     -0.0199
Product 3      0.0394

Theil's Inequality Coefficient

The quality of predicted results, given the availability of the actual outcomes, is measured by Theil's inequality coefficient. If the coefficient is zero, the forecasts are perfect; and if the coefficient has a value of one, it means that the forecasting method has generated results no better than those obtained by no-change extrapolation. The inequality coefficient has no finite upper bound. Forecasting outcomes to be equal to those which occurred two weeks previously is not a good forecasting technique, and the results of this test are not expected to be good. But if the inequality coefficient has a value close to one, it means that the variation occurring in the time series over a two-week period is minimal and also that movement within the series is gradual. The coefficients given in Table 5.5 show that this is indeed so. As expected, the covariance proportion accounts for all of the disparity between forecast and actual (Table 5.6).
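The following Python sketch illustrates one common formulation of Theil's inequality coefficient that is consistent with the properties quoted above (zero for a perfect forecast, one for a forecast no better than no-change extrapolation, no finite upper bound), together with the decomposition of the mean squared error into bias, variance, and covariance proportions. Several variants of the coefficient exist, so the exact form is an assumption, and the data are hypothetical.

```python
import numpy as np

def theil_u(predicted, actual):
    """Theil's inequality coefficient in one common form: the RMSE of the
    forecast relative to the RMSE of a naive no-change forecast, so that
    U = 0 is a perfect forecast and U = 1 is no better than no-change
    extrapolation."""
    p, a = np.asarray(predicted, float), np.asarray(actual, float)
    rmse = np.sqrt(np.mean((p - a) ** 2))
    naive = np.sqrt(np.mean(np.diff(a) ** 2))   # error of the no-change forecast
    return rmse / naive

def inequality_proportions(predicted, actual):
    """Decompose the mean squared error into bias, variance, and covariance
    proportions, which sum to one."""
    p, a = np.asarray(predicted, float), np.asarray(actual, float)
    mse = np.mean((p - a) ** 2)
    sp, sa = p.std(), a.std()
    r = np.corrcoef(p, a)[0, 1]
    bias = (p.mean() - a.mean()) ** 2 / mse
    variance = (sp - sa) ** 2 / mse
    covariance = 2.0 * (1.0 - r) * sp * sa / mse
    return bias, variance, covariance

# Hypothetical check: forecast each observation by its value ten periods
# earlier, as in the lag analysis of this chapter.
rng = np.random.default_rng(1)
series = 500.0 + rng.normal(0.0, 300.0, 2590)
lagged, original = series[:-10], series[10:]
print("U =", round(theil_u(lagged, original), 4))
print("bias, variance, covariance =",
      [round(v, 4) for v in inequality_proportions(lagged, original)])
```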
TABLE 5.5.--Test of Predictive Quality.

Sales Weight
Product 1     0.7222
Product 2     0.9609
Product 3     1.1886

TABLE 5.6.--Inequality Proportions.

Sales Weight    Bias      Variance    Covariance
Product 1       0.0000    0.0000      1.0000
Product 2       0.0000    0.0000      1.0000
Product 3       0.0000    0.0000      1.0000

Spectral Analysis

The techniques discussed up to this point in the chapter have been applied to analyze the relationship of the observations in a time series at point t with observations in the same time series at point (t + k). The other form of testing for long-term stability is to inspect the variability contained in the original data stream. Examination of the power spectrum of this data stream allows the determination of the extent to which particular frequency bands contribute to the total variance. If the graph of the logarithm of the power spectrum does not violate Granger and Hatanaka's2 simultaneous confidence interval at some specified confidence level, then the original time series can be said to exhibit stability for that time period.

Figure 5.2 is a graph of the logarithm of the power spectrum of 2590 observations of the sales weight for Product 1 against 120 frequency levels. Figures 5.3 and 5.4 are similar graphs for Products 2 and 3 respectively.

Figure 5.2.--Estimated Power Spectrum--Sales Weight for Product 1.

Figure 5.3.--Estimated Power Spectrum--Sales Weight for Product 2.

. . . manner, although the means and variances of the data streams are given in Table 6.1 and their skewness and kurtosis in Table 6.2. It should be noted from Table 6.1 that the simulated inventory on hand for Product 3 is maintained at zero units. Product 3 is a slow mover, and on the infrequent occasions when this product is demanded, it is placed on back order. The information of this table shows large discrepancies between actual and simulated means and variances for all products over the three variables.

TABLE 6.1.--Means and Variances (simulated and actual dollar sales, sales weight, and inventory on hand for Products 1, 2, and 3).

Skewness is a measure of the departure of a distribution from symmetry. This measure would take on the value zero if the distribution was symmetrical. Most of the time series considered are not very symmetrical (Table 6.2). Kurtosis is a measure of the "hump" of a single-humped distribution. This measure centers on the value 3, platykurtic distributions having a kurtosis value less than 3, and leptokurtic distributions having values greater than 3. While this measure is of little value for the study at hand, most of the time series considered are platykurtic.

TABLE 6.2.--Skewness and Kurtosis (simulated and actual dollar sales, sales weight, and inventory on hand for Products 1, 2, and 3).
Analysis of Variance

A one-way analysis of variance is conducted to test the null hypothesis that the mean of the simulated data stream is not significantly (at a 95% significance level) different from the mean of the actual data stream. Table 6.3 contains the results of this analysis.

TABLE 6.3.--Test of Means (F statistics and decisions for dollar sales, sales weight, and inventory on hand, Products 1, 2, and 3).

The decision to reject the null hypothesis is made if the calculated F value (MSp/MSe) is greater than the tabled F value for the appropriate degrees of freedom. If the null hypothesis is rejected, the means at this level of confidence are significantly different. The model indicated that inventory on hand for Product 3 should be maintained at a zero level, so an F value could not be calculated. In all cases tested the null hypothesis was accepted at the 95% confidence level.
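A minimal sketch of the one-way analysis of variance described above, written in Python with SciPy; the data are hypothetical stand-ins for one simulated and one actual daily data stream.

```python
import numpy as np
from scipy import stats

# The simulated stream is treated as one group and the actual historical
# stream as the other; the F ratio of the between-group to within-group
# mean squares tests the equality of the two means.
rng = np.random.default_rng(2)
simulated = rng.normal(1280.0, 520.0, 103)   # e.g. daily dollar sales from the model
actual = rng.normal(1310.0, 610.0, 103)      # e.g. recorded daily dollar sales

f_stat, p_value = stats.f_oneway(simulated, actual)
f_critical = stats.f.ppf(0.95, dfn=1, dfd=len(simulated) + len(actual) - 2)

decision = "Reject" if f_stat > f_critical else "Accept"
print(f"F = {f_stat:.3f}, critical F(1, 204) at 95% = {f_critical:.3f}: {decision} H0")
```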
If the tabled value of F is less than the F statistic, then the hypothesis that the two variances are equal at this significance level is rejected. The number of degrees of freedom in both the numerator and denominator of the ratio of the variances is 102, and the F value at the 95% confidence level is 1.37. The hypothesis that the variances are equal will 107 ti o m E '0 ll m o n _m-.m_n« pomflmm mm.o mH.HN nommmm mm.H 5H.m m uoseonm nowmmm «H.m mm.~mm pomflmm mo.vs eH.mMH nommmm eh.m¢ ee.os m Desmond nooflmm pm.mm om.HomH yommmm mm.mm Hm.omH pomflmm oo.m©H mm.mmmm H #UDUOHm on m < comm co muoucm>2H 0m.‘ . m d om m fl “roams mmflmm mmamm umHHoo .mcmmz mo ummB QOmHHmmEOU mamfluaoznl.v.m mqmfifi 108 only be accepted if both the ratio of actual to simulated and the ratio of simulated to actual variances are less than 1.37. The information in Table 6.5 shows that in no case is this true, and so the null hypothesis must be rejected every time. Correlation The coefficient of determination expresses the percentage of the total variation in one variable which is "explained" by the regression line of this variable on another variable. Taking the square root of the coef- ficient of determination gives the coefficient of correla- tion r. The range of r is -l to +1 or perfect negative correlation to perfect positive correlation. For there to be some degree of correlation between two variables, r must be shown to be significantly different from zero. Tables are available1 which show the value which r must be greater than, at a particular confidence level, to be considered different from zero. At a 95% confidence level this value is r=0.l97. Analysis of the r values is con- tained in Table 6.6. The null hypothesis is that the value cof r is significantly different from zero. As stated previously, the value of the coefficient (of determination is the proportion of the sum of the squared 109 mmflumm Hmsuo< mo mocmwnm> n m mommmm acupom mo mocMflHm> I wmflnom pmpwaseflm mo mocmflnm> I d nomflmm 00.00 pommmm Hm.ma m 556004 «0.0 ummooa 00.0 a m poseoum nommmm 00.00 hammoa 00.0 nomflmm me.a m pmmooa 00.0 homflmm 00.H ammooa 05.0 m m Hosmoum 556000 00.0 nmmoo< 00.0 pomflmm 50.5 m pom5mm 00.0 nommmm 00.H pmmooa vH.0 « H noncoum om owumm om oflumm om oflpmm mama so muoucm>cH pnmflmz mmaom mmamm HmHHon .moocMHnm> mo umwBI|.m.m mqmda 110 mwNo.o mNmo.o m uODcOHm 0000.0 0000.0 oooo.o m noncoum vmm0.o mmao.o omao.o a poocoum comm oo mucuom>oH nomad: mmamm mmamm HmHHoo .ooflpmoHEHmumo mo muomfloflmmmooll.5.m magma uomflmm 0000.0 momflmm 0000.0 0 0000000 006000 0000.0- pomflwm 0000.0 pummmm 0000.0 0 0600000 000000 0000.0- pomflmm 0000.0 uomflmm 0000.0 0 0000000 om H om H om H comm oo muouoo>oH ummflmz memm mmHmm HmHHOD .maomHOmemOU ooflomamunoo mo pmmall.w.c mamms lll Regression Analysis If perfect correlation existed between the values of the simulated endogenous data streams and the actual data, then the regression line of either of these two variables on the other would be a straight line passing through the origin with a slope of one. Another test of the degree of correlation between these two variables is to determine if the regression line of actual on simulated has an intercept significantly different from zero and a slope significantly different from one. 
Regression Analysis

If perfect correlation existed between the values of the simulated endogenous data streams and the actual data, then the regression line of either of these two variables on the other would be a straight line passing through the origin with a slope of one. Another test of the degree of correlation between these two variables is therefore to determine if the regression line of actual on simulated has an intercept significantly different from zero and a slope significantly different from one. The test statistic is distributed as F: its numerator is the difference between the sum of the squared deviations between each actual and simulated datum and the sum of the squared deviations between the regression line and each simulated observation, divided by the number of observations n; its denominator is the residual sum of squares divided by n-1. If this value is greater than the tabled F value (F = 3.97) indexed by the degrees of freedom and the confidence level, then the hypothesis that the intercept is not significantly different from zero and the slope is not significantly different from one is rejected. The results of this test are given in Table 6.8, with the hypothesis being rejected in half the cases.

TABLE 6.8.--Test of Regression Lines. [F values and accept/reject decisions; the tabulated figures are illegible in the scanned original.]

The Chi-Square Test

For the validity testing of simulated against actual data streams, the Chi-square test is not used in the accustomed manner. Whichever is larger, the range of the actual data or the range of the simulated data, is divided into ten equal parts. The number of observations from the actual data which fall into each of these cells becomes the expected frequencies, and the number of simulation observations falling into each cell are the observed frequencies. Summing the squared differences between observed and expected frequencies, each divided by the expected frequency, gives the Chi-square value. This value is compared with a tabled value given a confidence level and degrees of freedom, and if the calculated value is larger than the tabled value, then the hypothesis that there exists a significant correspondence between observed and expected frequencies is rejected. With nine degrees of freedom and a 95% significance level, the appropriate value of Chi-square is 16.9. The values of Chi-square given in Table 6.9 are compared to the value 16.9, and if smaller, then the hypothesis that the actual and simulated frequencies show reasonable correspondence is not rejected.

TABLE 6.9.--Chi-square Test. [Chi-square values and accept/reject decisions; the tabulated figures are illegible in the scanned original.]

Theil's Inequality Coefficient

Theil's Inequality Coefficient U measures the quality of predicted results against actual outcomes. The coefficient has a range from zero to infinity. If U = 0, the forecasts are perfect, and U = 1 indicates a prediction error equal to that obtained by extrapolation assuming no change. From Table 6.10 it can be seen that when considering the simulation output as a forecast of the actual daily observations, the prediction is of rather poor quality. Table 6.11 shows that the disparity between forecast and actual is not consistently due to one particular inequality proportion, although the variance proportion is of less effect than the bias or covariance proportions.

TABLE 6.10.--[Theil's inequality coefficients for each data stream; caption partly and values wholly illegible in the scanned original.]

TABLE 6.11.--Inequality Proportions. [Bias, variance, and covariance proportions; the tabulated figures are illegible in the scanned original.]
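For concreteness, the coefficient and its decomposition into the three inequality proportions can be sketched as follows. One common form of U is used here (the form for which U = 1 corresponds to a no-change extrapolation); the exact variant applied in the thesis may differ, and the data are hypothetical.

    # Minimal sketch of Theil's inequality coefficient and its decomposition
    # into bias, variance, and covariance proportions.  Data are hypothetical.
    import numpy as np

    rng = np.random.default_rng(2)
    actual = rng.normal(1400.0, 850.0, size=103)      # stand-in for the actual stream
    predicted = rng.normal(1380.0, 830.0, size=103)   # stand-in for the simulated stream

    # Inequality coefficient: forecast error relative to the error of a
    # "no change" extrapolation of the actual series (assumed variant).
    num = np.sqrt(np.mean((predicted[1:] - actual[1:]) ** 2))
    den = np.sqrt(np.mean((actual[1:] - actual[:-1]) ** 2))
    u = num / den

    # Decomposition of the mean squared error into the three proportions
    # (population standard deviations so that the three parts sum to one).
    mse = np.mean((predicted - actual) ** 2)
    s_p, s_a = predicted.std(), actual.std()
    r = np.corrcoef(predicted, actual)[0, 1]
    bias_prop = (predicted.mean() - actual.mean()) ** 2 / mse
    var_prop = (s_p - s_a) ** 2 / mse
    cov_prop = 2.0 * (1.0 - r) * s_p * s_a / mse

    print(f"U = {u:.3f}")
    print(f"bias = {bias_prop:.3f}, variance = {var_prop:.3f}, covariance = {cov_prop:.3f}")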
Spectral Analysis

When considering the Fourier representation of a time series, the contribution that a particular frequency or frequencies make to the overall variance of the series is of interest. This type of analysis is possible because the frequency band (w, w + dw) contributes f(w) dw to the total variance (f(w) is the power spectrum as defined in Chapter II). The number of frequency bands or lags m to consider should be less than n/2 (where n is the number of observations in the series), and if n is not large, m should be about n/3 to n/6. For the n = 103 of this analysis, m = 20 was chosen.

Examination of the power spectra of the actual time series and the simulated time series will show which frequencies contribute the most to the total variance. If the frequencies were the same or close for both series, similarity of the original series would be indicated. The log of the power spectrum is plotted against j in Figures 6.2 and 6.3 in order to construct Granger and Hatanaka's3 simultaneous confidence bands, (100-a)% for all j (a = confidence level). Notable "power" exists at frequencies where a smooth curve cannot be drawn easily between the confidence limits.

The shape of the power spectra of Figures 6.2 and 6.3 are quite different. The frequency band centered on the component with a period of about 2.67 days for the actual data shows a significant lack of contribution to the overall variance. For the simulated time series the frequency band centered on the component with a period of about 6.67 days provides significant positive contribution to total variance. No reasonable interpretation can be found for periods of 2.67 or 6.67 days. It is also noticeable that the low-frequency range of the power spectra (within which the "long-run" components are concentrated) did not contribute to the extent that is normally found in economic time series. A detailed explanation of these rather poor results is contained in the final chapter.

[Figures 6.2 and 6.3.--Estimated power spectra of the actual and the simulated dollar sales for Product 1: plots not reproducible from the scanned original.]

A measure of the correlation between the frequency components of two series is given by

    C(w) = [c^2(w) + q^2(w)] / [fx(w) fy(w)]

where c(w) is the co-spectrum, q(w) is the quadrature spectrum, fx(w) is the power spectrum of x, and fy(w) is the power spectrum of y. C(w) is the coherence at w. The range of C(w) is from zero to one and its value can be interpreted as the square of the correlation coefficient. The coherence of actual and simulated dollar sales for Product 1 is not great at any frequency, although a stronger relationship does exist for frequencies of one month, one week and half a week (Figure 6.4). Tests established by Goodman4 hypothesize that the true coherence at all frequencies in Figure 6.4 is zero.

A relationship may exist between one time series at point n and another at point (n+k). A measure of the phase difference between the frequency components of two series is

    phi(w) = tan^-1 [q(w) / c(w)]

From the phase diagram of Figure 6.5 no such relationship appears. There is no trend in the phase diagram which would indicate a time lag, neither are there oscillations about a constant other than zero indicating an angle lag.
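The coherence and phase quantities just defined can be estimated along the following lines. This is a minimal sketch using Welch-type smoothed cross-spectral estimates rather than the thesis's Tukey-Hanning estimator, on hypothetical series; the segment length and all other numbers are assumptions.

    # Minimal sketch of coherence and phase estimation from the co-spectrum
    # and quadrature spectrum, using hypothetical stand-ins for the actual
    # and simulated daily series.
    import numpy as np
    from scipy import signal

    rng = np.random.default_rng(3)
    n = 103
    common = np.sin(2 * np.pi * np.arange(n) / 5.0)      # shared weekly-like component
    actual = common + rng.normal(0.0, 1.0, n)
    simulated = common + rng.normal(0.0, 1.0, n)

    # The cross-spectral density carries the co-spectrum (real part) and the
    # quadrature spectrum (imaginary part); the power spectra come from welch.
    freqs, p_xy = signal.csd(actual, simulated, nperseg=40)
    _, p_xx = signal.welch(actual, nperseg=40)
    _, p_yy = signal.welch(simulated, nperseg=40)

    coherence = np.abs(p_xy) ** 2 / (p_xx * p_yy)   # (c^2 + q^2) / (fx * fy)
    phase = np.angle(p_xy)                          # tan^-1 (q / c)

    for f, c, ph in zip(freqs[:5], coherence[:5], phase[:5]):
        print(f"frequency {f:.3f}: coherence {c:.2f}, phase {ph:.2f} rad")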
A final diagram which may indicate the nature of a relationship between two time series is the gain diagram. The gain Rxy(w) is defined by

    fy(w) Rxy^2(w) = fx(w) C(w)

Gain can be considered as the regression coefficient of process {Xt} on process {Yt} at frequency w (Figure 6.6).

[Figure 6.4.--Coherence of Actual and Simulated Dollar Sales--Product 1; Figure 6.5.--Phase of Actual and Simulated Dollar Sales--Product 1; Figure 6.6.--Gain of Actual and Simulated Dollar Sales--Product 1: plots not reproducible from the scanned original.]

The results of the other eight comparisons of actual time series with simulated time series were of comparable quality to those presented for daily dollar sales of Product 1, and so they are not reproduced here.

Factor Analysis

Cohen and Cyert5 suggest comparison of the factor loadings of simulated results with the factor loadings of actual results as a method of appraising the quality of the simulated output. A factor analysis of the nine actual data streams (dollar sales, sales weight, and inventory on hand, each for three products) produced most meaningful factor loadings with three factors. This was also the case with the nine simulated data streams. It is now of interest to determine the extent to which the three actual factors and the three simulated factors differ in ability to describe the actual and simulated data respectively. Table 6.12 is the similarity matrix for these three factor pairs. Each element in the matrix has a range of values from -1 to 1, significant correspondence between the factors occurring only for values of 0.78868 or greater. The best factor pairings are: actual 1 with simulated 2, actual 2 with simulated 3, and actual 3 with simulated 1. Only the second pairing is significant.

TABLE 6.12.--Similarity Matrix for Factor Loadings. [The tabulated figures are illegible in the scanned original.]

The Model's Predictive Ability

The ability of the LREPS model to predict the behavior of the actual system has not been established. The results presented in this chapter are poor and at times contradictory. But neither has any major defect in the model been established. The only conclusion to be drawn is that the validity of the model's predictive capability has not been established. In order to do this, these same tests must be repeated with a larger number of observations collected at a longer time increment.

CHAPTER VI--FOOTNOTES

1. J. Riggs, Production Systems: Planning, Analysis and Control (New York: John Wiley & Sons, Inc., 1970), p. 70.

2. C. W. J. Granger and M. Hatanaka, Spectral Analysis of Economic Time Series (Princeton, N.J.: Princeton University Press, 1964), p. 61.

3. Ibid., p. 62.

4. N. R. Goodman, Scientific Paper No. 10 (New York, N.Y.: New York University, Engineering Statistics Laboratory, 1957).

5. K. J. Cohen and R. M. Cyert, "Computer Models in Dynamic Economics," The Quarterly Journal of Economics, Vol. LXXV, No. 1 (February, 1961), pp. 112-127.
CHAPTER VII

SENSITIVITY OF THE MODEL'S MAJOR ASSUMPTIONS

Introduction

The third and final part of the validation procedure is to determine the degree to which the characteristics of the endogenous data streams change when the form of one of the model's major assumptions is altered. Assumptions are usually made to simplify the complexity of real situations and so make the modeling process easier. Indeed, model construction may not be possible in many situations without incorporating rather stringent assumptions. But it is undesirable to have the model output dependent on the nature of the assumptions embodied in the model. It seems reasonable that the endogenous data streams of a valid computer simulation model will not change significantly even with rather severe changes to the assumptions which are incorporated into the model. This chapter describes the analysis performed in order to test this statement for the LREPS model.

The LREPS model contains two major assumptions. The first concerns the way in which demand from the consumer level is generated, and the second concerns the selection of products from the total product line over which this demand will be allocated. Both of these assumptions are required because a firm of reasonable magnitude producing consumer products can expect to handle hundreds of thousands of orders for hundreds of different products during the course of a year. The dilemma created is: too much detail cannot be handled by available computing machinery; too much aggregation of this detail will reduce the model's ability to test the effects of such changes as the introduction of new products, different inventory policies or different demand patterns. Solution of the dilemma comes with the introduction of assumptions.

A stratified sample of 50 products from the total product line was selected.1,2 The products in the sample must be representative of the entire product line so that the information generated on the basis of the sample can be extrapolated to the level of the total corporate operation. The sample products were selected on the basis that a product be representative of the company's inventory and movement costs. Products were classified into four categories on the basis of annual dollar sales, with the first category containing "high-movers" and any products management might want to give special consideration.

Rather than attempt to account for each of several hundred thousand individual orders, a random selection is made of a year's invoices. A particular number of individual orders for the sample products is summarized into a block. The number of orders so summarized is called the blocking factor. These blocks are then combined into an order file or order matrix from which a block of orders is randomly drawn to generate the demand for each time period.3

This chapter investigates the effect of four changes in the assumptions for LREPS product analysis and order generation. The normal blocking factor for order generation is 10--blocking factors of 5 and of 20 are considered. A stratified sample of 50 products divided into four categories is used--a new sample of 50 products is generated, and the effect of using only 3 product categories is investigated. So the net result is the comparison of the control endogenous data stream (the output of the model in its unmodified condition) with the endogenous data streams resulting when each of the four proposed changes is put into effect.
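A minimal sketch of the blocked order-generation idea just described is given below. Everything in it -- product codes, order counts, and quantities -- is an illustrative assumption rather than the actual LREPS design.

    # Minimal sketch of blocked order generation: a year's invoices for the
    # sample products are summarized into blocks of a fixed size (the
    # blocking factor), and one block is drawn at random from the order
    # file to create demand for each simulated period.
    import random

    random.seed(0)
    BLOCKING_FACTOR = 10      # normal LREPS value; 5 and 20 are the tested alternatives
    SAMPLE_PRODUCTS = [f"P{i:02d}" for i in range(1, 51)]   # stratified sample of 50 products

    # Hypothetical year of individual orders: (product, quantity) pairs.
    orders = [(random.choice(SAMPLE_PRODUCTS), random.randint(1, 12)) for _ in range(5000)]

    # Summarize consecutive groups of BLOCKING_FACTOR orders into blocks
    # (total quantity per product), building the order file.
    order_file = []
    for start in range(0, len(orders), BLOCKING_FACTOR):
        block = {}
        for product, qty in orders[start:start + BLOCKING_FACTOR]:
            block[product] = block.get(product, 0) + qty
        order_file.append(block)

    # Each simulated period, demand is generated by drawing a block at random.
    demand_today = random.choice(order_file)
    print(f"{len(order_file)} blocks in the order file; demand drawn for {len(demand_today)} products")

A larger blocking factor produces fewer, more aggregated blocks, which is why Plan C below tends to smooth out the variability of an individual product's demand.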
To simplify the presentation of the results of the statistical tests used, these five situations will be designated plans, viz.:

Plan A  The control--no change in model structure
Plan B  Blocking factor of 5 used in order generation
Plan C  Blocking factor of 20 used in order generation
Plan D  3 categories used for sample products
Plan E  New product sample used

Graphical Analysis

An approximate idea as to the degree of change occurring in the model's output data streams with a change in assumptions can be obtained by examination of the graph of these data streams before and after the change. Six endogenous data streams of the unmodified model are obtained: dollar sales for each of the three products for a two-year period and sales weights for the three products over the same two years. Comparison of each of these six Plan A's with each of the other four plans gives a net result of 30 data streams, or 24 one-on-one comparisons of Plan A with another plan. An exceedingly large volume of data is recorded if the results of all tests for all products for both variables are included. In this section and the Spectral Analysis section only the results for dollar sales of Product 1 are presented, and even then the amount of data included is considerable. The results not included do not add any new dimension to the analysis which might justify their inclusion.

Figure 7.1 shows the dollar sales of Product 1 for Plan A (the control) against Plan B. Figures 7.2 to 7.4 are the graphs of Plan A and Plan C, Plan A and Plan D, and Plan A and Plan E. The high degree of intermeshing of each of the pairs of data streams indicates no radical change in results for any of the four plans tested.

[Figures 7.1 to 7.4.--Plan A against Plans B, C, D, and E respectively, dollar sales for Product 1: plots not reproducible from the scanned original.]

Added information can be obtained by a detailed analysis of the graphs as outlined in the preceding chapter. But again this will not be done, as later tests will provide similar information by more reliable methods. The means (Table 7.1), variances (Table 7.2), skewness (Table 7.3), and kurtosis (Table 7.4) are given. These parameters show no remarkable change between the control and any of the other four plans.

TABLE 7.1.--Means. TABLE 7.2.--Variances. TABLE 7.3.--Skewness. TABLE 7.4.--Kurtosis. [Dollar sales and sales weight for Products 1 to 3 under Plans A to E; the tabulated figures are illegible in the scanned original.]
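For concreteness, the overlay comparison and the summary parameters just listed can be produced along the following lines; the two series are hypothetical stand-ins for the Plan A and Plan B streams, not LREPS output.

    # Minimal sketch of the graphical comparison and summary parameters
    # (mean, variance, skewness, kurtosis) for a control stream and one
    # alternative plan.  The series and their parameters are hypothetical.
    import numpy as np
    import matplotlib.pyplot as plt
    from scipy import stats

    rng = np.random.default_rng(4)
    days = np.arange(520)                                         # two years of daily observations
    plan_a = rng.gamma(shape=3.0, scale=470.0, size=days.size)    # control dollar sales (assumed)
    plan_b = rng.gamma(shape=3.0, scale=465.0, size=days.size)    # blocking factor of 5 (assumed)

    plt.plot(days, plan_a, label="Plan A (control)")
    plt.plot(days, plan_b, label="Plan B")
    plt.xlabel("Day")
    plt.ylabel("Dollar sales, Product 1")
    plt.legend()
    plt.savefig("plan_a_vs_plan_b.png")

    for name, series in [("Plan A", plan_a), ("Plan B", plan_b)]:
        print(name, series.mean(), series.var(ddof=1),
              stats.skew(series), stats.kurtosis(series))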
Analysis of Variance

Analysis of variance is used to test for any difference between the mean of the control (Plan A) and the means of the other four plans. The null hypothesis that at the 95% confidence level no difference exists between the control mean and the other means is examined in Table 7.5. The null hypothesis is rejected if the calculated value of F (MSp/MSe) is greater than the tabled value of F for the appropriate degrees of freedom. The null hypothesis is accepted when Plan A is compared with Plans B or C, but is rejected when Plan A is compared with Plans D or E. Remember that Plans B and C involve changes in the order generation process, while Plans D and E involve alterations to the product sampling procedure. Given a change in the method of order generation, a particular product should still be contained in the average order to the same extent. But when the number of product categories or the particular products included in the sample are changed, the extent of a particular product's presence in the average order might well vary.

TABLE 7.5.--Test of Means. [F values and accept/reject decisions for each plan; the tabulated figures are illegible in the scanned original.]

Multiple Comparison

To test for significant difference in a particular statistic between the control and alternative plans, multiple comparison is used. Again multiple comparison is used to confirm the analysis of variance testing of the mean values. The absolute difference between the mean of the control and the mean of the particular alternative plan under consideration must be less than a specified amount; otherwise the null hypothesis that no significant difference exists between the means cannot be accepted. This specified amount is an appropriate Dunnett statistic multiple of the square root of twice the mean square error divided by the number of variables involved. The correct Dunnett statistic is found with a knowledge of the desired confidence level (95%), the number of plans (2), and the degrees of freedom for the mean square error. Table 7.6 contains the results of this analysis. The analysis of variance testing is confirmed only to a moderate degree. General acceptance of the null hypothesis is shown for all plans for Products 1 and 2, while general rejection of the null hypothesis is shown for Product 3.

TABLE 7.6.--Multiple Comparison Test of Means. [Absolute mean differences, the Dunnett criterion, and accept/reject decisions for each plan; the tabulated figures are illegible in the scanned original.]

The F Test

While analysis of variance and multiple comparison have been used to test means, the F distribution is used to test for significant differences between variances. The ratio of the variance of the control (Plan A) to the variance of one of the other plans is distributed as F. This F value, if greater than the appropriate tabled value of F, will cause the null hypothesis that the two variances are equal to be rejected. The correct tabled value of F is selected with knowledge of the desired significance level (95%) and the degrees of freedom of each of the variances (519). The tabled F value of 1.11 is used for the results of Table 7.7. For Products 1 and 2 the null hypothesis is accepted for all plans except Plan C. Plan C uses the large blocking factor, which provides an individual product with a greater probability of being included in the block and therefore decreases the variability (and variance) for the product. Generally, the null hypothesis is rejected for Product 3. This product is a slow mover, and so it occurs in an order with a great degree of irregularity, forcing the variance to be relatively large and unpredictable.

TABLE 7.7.--Test of Variances. [Variance ratios and accept/reject decisions for each plan; the tabulated figures are illegible in the scanned original.]
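A minimal sketch of this variance-ratio comparison follows. The degrees of freedom match the two-year daily series described above, but the data and the computed critical value are illustrative assumptions only.

    # Minimal sketch of the variance-ratio F test described above: the ratio
    # of the control variance to an alternative plan's variance is compared
    # with the critical F value for the appropriate degrees of freedom.
    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(5)
    plan_a = rng.normal(1390.0, 840.0, size=520)     # control stand-in
    plan_c = rng.normal(1400.0, 780.0, size=520)     # blocking factor of 20 stand-in

    ratio = np.var(plan_a, ddof=1) / np.var(plan_c, ddof=1)
    f_crit = stats.f.ppf(0.95, dfn=519, dfd=519)     # 95% point of F(519, 519)

    # Both orderings of the ratio must fall below the critical value for the
    # hypothesis of equal variances to be accepted.
    equal = max(ratio, 1.0 / ratio) < f_crit
    print(f"ratio = {ratio:.3f}, critical F = {f_crit:.2f}, variances equal: {equal}")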
Correlation

The square of the correlation coefficient r is the coefficient of determination, which expresses the amount of the total variation contained in one variable which is "explained" by the regression line of this variable on another variable (Table 7.9). The range of the correlation coefficient is from -1 to +1. If a relationship exists between two variables, the most basic test is to show that the correlation coefficient is significantly different from zero. Given a particular confidence level, the calculated value of r must be larger than a tabled r value4 in order to accept the hypothesis that the value of r is significantly different from zero. At a 95% confidence level this tabled value of r is 0.197. Table 7.8 gives the results of such testing for the correlation coefficient of the control plan and each of the alternative plans. While the correlation coefficients are not significantly different with changes in order generation (Plans B and C), they are significantly different with changes in the product sample characteristics (Plans D and E). This is confirmed by the values of the coefficients of determination.

TABLE 7.8.--Test of Correlation Coefficients. [The tabulated figures are illegible in the scanned original.]

TABLE 7.9.--Coefficients of Determination. [The tabulated figures are illegible in the scanned original.]

Regression Analysis

A regression line passing through the origin with a slope of one indicates perfect correlation between the dependent and independent variable(s). The regression lines of the control values (Plan A) against the values of each of the other plans are constructed, and the slopes and intercepts are tested to determine if they are significantly different from one and zero respectively (Table 7.10). The sum of the squared deviations between each observation of the control (Plan A) and another plan, and the sum of the squared deviations between the regression line and the control observations, are calculated. The difference between these two sums is divided by the number of observations n, and the result divided by the residual sum of squares over n-1. The net result of this calculation is distributed as F. In order to reject the null hypothesis that the intercept is not significantly different from zero and the slope is not significantly different from one, this F value must be greater than a tabled value of F indexed by degrees of freedom and confidence level. The tabled F value is 3.00 for this testing. Table 7.10 shows that this hypothesis is accepted in all but one case, which involved Product 3.

The Chi-Square Test

The Chi-square test is used to compare the control (Plan A) with the other plans. The larger of the range of the control observations and the range of observations of the other plan being considered is divided into ten equal parts. The number of observations from the control plan falling into each cell becomes the expected frequencies, and the number of observations from the alternative plan falling into each cell the observed frequencies.

[Table 7.10, the Chi-square results for the alternative plans, the tables reporting them, and the opening of the Theil's Inequality Coefficient comparison are illegible in the scanned original.]
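Although the detailed results are not legible above, the mechanics of the cell-frequency comparison described for this test (and in Chapter VI) can be sketched as follows, with hypothetical stand-ins for the control and one alternative plan.

    # Minimal sketch of the Chi-square cell-frequency comparison: ten equal
    # cells over the combined range (a simplification of "the larger range"),
    # control counts as expected frequencies, alternative-plan counts as
    # observed frequencies, compared with the tabled value for nine d.f.
    import numpy as np

    rng = np.random.default_rng(6)
    control = rng.gamma(3.0, 470.0, size=520)        # Plan A stand-in
    alternative = rng.gamma(3.0, 465.0, size=520)    # Plan B stand-in

    lo = min(control.min(), alternative.min())
    hi = max(control.max(), alternative.max())
    edges = np.linspace(lo, hi, 11)                  # ten equal cells

    expected, _ = np.histogram(control, bins=edges)
    observed, _ = np.histogram(alternative, bins=edges)

    mask = expected > 0                              # cells with zero expected count are skipped
    chi_square = np.sum((observed[mask] - expected[mask]) ** 2 / expected[mask])
    chi_table = 16.9                                 # tabled value quoted in the text (9 d.f., 95%)
    print(f"chi-square = {chi_square:.1f}; correspondence "
          + ("not rejected" if chi_square < chi_table else "rejected"))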
Theil's inequality coefficients are computed for the control plan against each of the alternative plans. All comparisons for Products 1 and 2 generate coefficients well below 1, with the values for Plans D and E being below those of Plans B and C. The coefficient for Product 3 also indicates better predictions using Plans D or E, but the coefficient values are higher in every comparison than for Products 1 and 2. The covariance proportion consistently accounts for most of the disparity between the actual and forecast results (Table 7.13).

TABLE 7.13.--Inequality Proportions. [Bias, variance, and covariance proportions for each plan; the tabulated figures are illegible in the scanned original.]

Spectral Analysis

Spectral analysis is used to analyze the relationship between the control (Plan A) and alternative plans in the same manner as described in Chapter VI. The log of the power spectrum is plotted and around it are constructed simultaneous confidence bands for all frequencies. The power spectra of the two plans under examination should show similar characteristics; notable "power" should exist at similar frequencies. The correlation between frequency components of the two series is given in the coherence diagram. The relationship between different frequencies of the two series is shown in the phase diagram. Finally, the gain diagram is the graph of the equivalent of the regression coefficient of one process on the other at all frequencies. Again this procedure is presented only for dollar sales of Product 1, as no significant additional information is provided by the analysis of the control and four plans for the other five variables.

Observation of Figures 7.5, 7.6, 7.7, 7.8, and 7.9 shows that in all cases (Plans A, B, C, D, and E respectively) particular frequency bands contribute more to the overall variance than might reasonably be expected. These frequency levels occur at the equivalent of 20 days, 8 days, 4.5 days, 3.5 days, and just under 3 days. The 20 day (four week) and 4.5 day (almost one week) periodicity could well be expected, but an explanation for the other three frequency bands is difficult to find. In any case the main result to be obtained from the power spectra is that the frequency bands supplying notable power are the same for all plans. This implies that a significant difference does not exist between the original data streams.

Figures 7.10 to 7.13 show the coherence of each alternative plan with Plan A, while Figures 7.14 to 7.17 give the phase and Figures 7.18 to 7.21 the gain of all possible comparisons with the control. The coherence diagrams show that the correlation per pair of frequency components is stronger when the control is compared with the alternatives of a new product sample (Plan E) and three product categories (Plan D) than with the alternatives of changed blocking factors (Plans B and C).

[Figures 7.5 to 7.9.--Estimated power spectra of Plans A to E, dollar sales for Product 1; Figures 7.10 to 7.13.--Coherence of Plans B, C, D, and E with Plan A; Figures 7.14 to 7.17.--Phase diagrams; Figures 7.18 to 7.21.--Gain diagrams. The plots are not reproducible from the scanned original.]

The phase diagrams show oscillations about a constant other than zero. This indicates the presence of a fixed angle lag rather than a fixed time lag (i.e., the lag is proportional to the inverse of the frequency, which is the period of the component). Although this fact is of interest in the analysis of the time series, it is only germane to this study to the extent that this angle lag is present for all plans. Values which can be interpreted as regression coefficients are given in the gain diagrams. As with coherence, better results are obtained for Plans D and E than for Plans B and C.

Factor Analysis

The factor loadings of the control (Plan A) are compared to the factor loadings of each of the alternative plans.
A factor analysis of the six streams of data generated by Plan A (dollar sales and sales weight for each of the three products) produced factor loadings of most value with three factors. This was also true for each of the other four plans. These three factors in each case describe the data they represent, and the similarity of this descriptive power between plans indicates a similarity in the basic data. Table 7.14 contains the similarity matrices for the factor loadings of all plans when compared with Plan A. Each element in the matrix has a range from -1 to +1, significant correspondence between the factors occurring with a value of 0.78868 or above. From the table a significant one-to-one correspondence between factors is found for every comparison.

TABLE 7.14.--Similarity Matrices for Factor Loadings. [The tabulated figures are illegible in the scanned original.]

Sensitivity of the Model's Major Assumptions

Some of the early results of this chapter are contradictory. While analysis of variance indicated that the means of Plans D and E were significantly different from the mean of the control, multiple comparison provided results exactly opposite--accept the means of Plans D and E as being the same as the control mean. The F test of variances and the testing of the correlation coefficient rejected about half of the plans as being equal to the control. But all the remaining tests of the chapter accepted the alternative plans as being equal to the control plan. While there was not 100% support from all analyses for the hypothesis that no significant difference exists between Plan A and Plans B, C, D, and E, neither was the hypothesis consistently rejected for any one plan (even when considering only those few tests which rejected the hypothesis for one or more plans). Even prior to an evaluation of the relative merit of each form of analysis, the conclusion that the two major assumptions embodied in LREPS do not have a significant influence on the model's endogenous data streams can be accepted.

CHAPTER VII--FOOTNOTES

1. A. H. Packer, "Simulation and Adaptive Forecasting as Applied to Inventory Control," Operations Research, Vol. 15 (July, 1967), pp. 660-679.

2. O. K. Helferich, "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Formulation of the Mathematical Model" (unpublished D.B.A. dissertation, Michigan State University, 1970), p. 98.

3. Ibid., p. 121.

4. J. Riggs, Production Systems: Planning, Analysis and Control (New York: John Wiley & Sons, Inc., 1970), p. 70.

CHAPTER VIII

A GENERALIZED VALIDATION PROCEDURE

Introduction

Before the tests of the last three chapters can be evaluated as a generalized validation procedure, the results of the application of these tests for the LREPS model need to be more closely examined. The results obtained must be evaluated in light of the relative merit or value of the technique generating them. The merit of a technique is established from the number and severity of the assumptions of the technique.
The selection procedure for techniques to be used for each type of validity testing is given in the next section, and the following section is a discussion of the assumptions contained in the techniques which were selected. Two questions remain: is the LREPS model valid, and has a generalized validation procedure been developed? These two questions are answered in the final sections of the chapter.

Selection of Statistical Validation Techniques

Not all the statistical techniques presented in Chapter II were used for the validation procedures described in Chapters V, VI, and VII. A summary of those techniques which were used is given in Table 8.1. Four techniques were not used at all: sequential analysis, multiple ranking, the Kolmogorov-Smirnov test, and response surface analysis.

Sequential analysis provides a means of reducing computation if superfluous information is available. This technique was not considered because the primary difficulty in the analysis of the LREPS model was that caused by insufficient data. Use of this technique can save time and effort, but does not change the final results obtained.

Multiple ranking is a method to determine the "best" of several plans under consideration. To establish the validity of a model, the important task is to determine if significant differences exist between sets of data. The size of this difference is unimportant; the mere fact that it exists casts doubt on model validity. This technique could be used to advantage during model experimentation.

The Kolmogorov-Smirnov test establishes if a given sample is a sample from a particular distribution. This test could have been used to test the normality of data used in other techniques which assume normality. This was not done, as more powerful techniques not having this assumption were also used.

Response surface analysis is a technique which can be used to approximate the optimal value of a given function. Although this technique is recommended for testing simulation models by Naylor,1 it was determined to be of relevance for design rather than for the validity procedures developed in this dissertation.

TABLE 8.1.--Use of Statistical Techniques. [A checklist of the fourteen techniques of Chapter II against the three validation procedures--stability (Chapter V), predictive ability (Chapter VI), and sensitivity of major assumptions (Chapter VII); the individual entries are illegible in the scanned original.]

All of the remaining techniques of Chapter II were used to test the model's predictive ability and the sensitivity of its assumptions. When establishing the long-term stability of the model, analysis is concentrated on a single endogenous data stream (all other analyses are comparisons between pairs of endogenous data streams). This limits the applicability of techniques for stability testing to the four listed in Table 8.1.

Comparative Value of Results

Because of the reasonably large number of techniques used and because the results obtained from these techniques were sometimes conflicting, the results of a technique need to be weighted by a measure of the technique's merit or value. This measure of value is established by the number of major assumptions which are contained in the technique.
The three most common assumptions in the techniques used are the assumed independence between individual observations, the assumed equality of variance, and the assumed normality of the variables under consideration.2 Analysis of variance, the F test, and multiple comparison include all three of these assumptions. The assumption of independence can be satisfied by the independence of the pseudorandom numbers generated.3 This is not so for the type of testing carried out on the LREPS model. Inequality of variance for analysis of variance has little effect for a reasonable number of plans when the sample size is the same.4 Departure from normality can have severe effects on inferences about variances, but little effect on inferences about means.5

The number of observations in a Chi-square test needs to be large (at least 50) in order for the excess of actual over expected frequencies to be normally distributed. Also the theoretical cell frequency must be an absolute minimum of 5 and a reasonable minimum of 10.6

Theil's Inequality Coefficient is always positive. Because it does not discriminate between the directions of forecast error, the coefficient might not be suitable for some applications.7

The main assumption of factor analysis is that the observed variables are linear functions of the factor variables. All observed variables must also be linearly related to one another.8 This assumed relationship can be relaxed to monotonic, as a straight line can be assumed a good approximation to a monotonic function. While another assumption is that each observed variable must be normally distributed, considerable latitude from this assumption is often possible.

For correlation analysis the number of observations used must be reasonably large (even up to 100) or little reliability can be placed on the interpretation of the coefficient of correlation.9

The spectral analysis performed assumed the stochastic process under consideration to be covariance stationary.10 That is, the second moments of the process are finite and depend only on the time difference between observations, not on the reference time. If the process is not covariance stationary, the trend can be removed by filtering or transforming the time series. An effective method of performing this task is to apply a large-term moving average to the data. The Tukey-Hanning estimate of the power spectrum was used for all analyses, which allows very small leakage from one frequency band to another.11 The effect of the covariance stationarity assumption is then minimized even if the data violate the assumption. But the most important fact about spectral analysis is that the technique does not assume independence of observations. This means that autocorrelated data (the form of the output of most simulation models) can be analyzed effectively.

A summary of these assumptions is shown in Table 8.2.

TABLE 8.2.--Assumptions of Statistical Techniques. [Each technique used is marked against the assumptions of independence, equality of variance, and normality--with qualifying notes such as "number of observations must be large," "theoretical cell frequency minimum of 10," "does not discriminate between signs," "stationary time series," and "observed variables must be linear functions of factor variables"--together with the number of assumptions violated; the individual entries are illegible in the scanned original.]
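As an illustration of the stationarity handling described above for the spectral analysis--removal of a large-term moving average followed by a Tukey-Hanning (lag window) spectral estimate--the following minimal sketch uses a hypothetical series; the window length and all other numbers are assumptions, not the LREPS settings.

    # Minimal sketch: detrend with a long centred moving average, then form
    # a Blackman-Tukey spectral estimate with Tukey-Hanning lag weights
    # (m = 20 lags, as in the thesis).  Edge effects are ignored here.
    import numpy as np

    rng = np.random.default_rng(7)
    n, m = 103, 20
    t = np.arange(n)
    series = 0.05 * t + np.sin(2 * np.pi * t / 5.0) + rng.normal(0.0, 1.0, n)

    window = 21                                          # assumed moving-average length
    trend = np.convolve(series, np.ones(window) / window, mode="same")
    x = series - trend
    x = x - x.mean()

    # Sample autocovariances up to lag m, weighted by the Tukey-Hanning window.
    autocov = np.array([np.sum(x[:n - k] * x[k:]) / n for k in range(m + 1)])
    weights = 0.5 * (1.0 + np.cos(np.pi * np.arange(m + 1) / m))

    # Spectral estimate at frequencies j / (2m), j = 0..m (constant factors omitted).
    freqs = np.arange(m + 1) / (2.0 * m)
    spectrum = np.array([
        autocov[0] + 2.0 * np.sum(weights[1:] * autocov[1:]
                                  * np.cos(2.0 * np.pi * f * np.arange(1, m + 1)))
        for f in freqs
    ])
    print(np.round(spectrum, 2))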
Because the effect of an assumption can vary given the particular analysis, the important consideration is how many of these assumptions are violated for the analysis. [The continuation of this passage--the weights assigned to each technique, the selection procedure for the model builder, and the construction of the validity index from the percentage of favorable results in Chapters V, VI, and VII--is illegible in the scanned original, as is Table 8.3 (Indices of Validity). The text resumes mid-sentence:] ...should time and money permit, he can then move to techniques which violate more assumptions and provide information of poorer quality. With this selection procedure the value of the validity index may tend to vary inversely with the number of techniques used.14 The procedures detailed in this thesis provide a generalized validation procedure, and the validity index provides a basis for intra-model analysis and inter-model comparison.

CHAPTER VIII--FOOTNOTES

1. T. H. Naylor, Computer Simulation Experiments with Models of Economic Systems (New York: John Wiley & Sons, Inc., 1971), pp. 172-175.

2. T. H. Naylor, K. Wertz, and T. H. Wonnacott, "Methods of Analyzing Data from Computer Simulation Experiments," Communications of the ACM, Vol. 10 (November, 1967), p. 703.

3. M. D. MacLaren and G. Marsaglia, "Uniform Random Number Generators," Journal of the ACM, Vol. 12 (1965), pp. 83-89.

4. H. Scheffe, The Analysis of Variance (New York: John Wiley & Sons, Inc., 1959), p. 345.

5. Ibid., Chapter 10.

6. G. U. Yule and M. G. Kendall, An Introduction to the Theory of Statistics (London: Charles Griffin and Company Ltd., 1953), p. 469.

7. H. Theil, Applied Economic Forecasting (Amsterdam: The North-Holland Publishing Co., 1966), p. 28.

8. H. H. Harman, Modern Factor Analysis (Chicago: The University of Chicago Press, 1960), p. 380.

9. Yule and Kendall, p. 231.

10. E. Parzen, Stochastic Processes (San Francisco: Holden-Day, Inc., 1962), p. 70.

11. C. W. J. Granger and M. Hatanaka, Spectral Analysis of Economic Time Series (Princeton, N.J.: Princeton University Press, 1964), p. 60.

12. The results of face validity testing for the LREPS model are shown in Table 4.1.

13. The indices were calculated omitting results pertaining to Product 3 because of this product's instability of demand.

14. The truth of this statement can be established or rejected by sensitivity analysis. If the index does vary in this manner the appropriate corrective weighting system can also be determined.

BIBLIOGRAPHY

Amstutz, A. E. Computer Simulation of Competitive Market Response. Cambridge, Massachusetts: The M.I.T. Press, 1967.

Balderston, F. E., and Hoggatt, A. C. Simulation of Market Processes. Berkeley, California: Institute of Business and Economic Research, 1962.

Bechhofer, R. E., and Blumenthal, S. "A Sequential Multiple-Decision Procedure for Selecting the Best One of Several Normal Populations with a Common Unknown Variance, II: Monte Carlo Sampling Results and New Computing Formulae." Biometrics, Vol. 18, March 1962.
Blackman, R. B., and Tukey, J. W. The Measurement of Power Spectra. New York: Dover Publications, Inc., 1958.

Bonini, C. P. Simulation of Information and Decision Systems in the Firm. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1962.

Bowersox, D. J., et al. Dynamic Simulation of Physical Distribution Systems. Monograph. East Lansing, Michigan: Division of Research, Michigan State University, Forthcoming.

Bowersox, D. J.; Smykay, E. W.; and LaLonde, B. H. Physical Distribution Management. New York: The Macmillan Company, 1968.

Box, G. E. P. "The Exploration and Exploitation of Response Surfaces: Some General Considerations and Examples." Biometrics, Vol. 10, 1954.

Box, G. E. P., and Wilson, K. B. "On the Experimental Attainment of Optimum Conditions." Journal of the Royal Statistical Society, Series B, Vol. XIII, 1951.

Buchan, J., and Koenigsberg, E. Scientific Inventory Management. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1963.

Carnap, R. "Testability and Meaning." Philosophy of Science. Vol. 3, No. 4, October, 1936.

Chu, K. Quantitative Methods for Business and Economic Analysis. Scranton, Pennsylvania: International Textbook Co., 1969.

Clarkson, G. P. E. Portfolio Selection: A Simulation of Trust Investment. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1962.

Cohen, K. J. Computer Models of the Shoe, Leather, Hide Sequence. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1960.

Cohen, K. J., and Cyert, R. M. "Computer Models in Dynamic Economics." The Quarterly Journal of Economics. Vol. LXXV, No. 1, February, 1961.

Conway, R. W. An Experimental Investigation of Priority Assignment in a Job Shop. Santa Monica, California: The Rand Corporation, RM-3789-PR, 1964.

Conway, R. W.; Johnson, B. M.; and Maxwell, W. L. "Some Problems of Digital Systems Simulation." Management Science. Vol. 6, October, 1959.

Cooper, G. R., and McGillem, D. C. Methods of Signal and System Analysis. New York: Holt, Rinehart and Winston, Inc., 1967.

Cyert, R. M. "A Description and Evaluation of Some Firm Simulations." Proceedings of the IBM Scientific Computing Symposium on Simulation Models and Gaming. White Plains, N.Y.: IBM, 1966.

Cyert, R. M.; Feigenbaum, E. A.; and March, J. G. "Models of a Behavioral Theory of the Firm." Behavioral Science. Vol. 4, No. 2, April, 1959.

Draper, N. R., and Smith, H. Applied Regression Analysis. New York: John Wiley & Sons, Inc., 1967.

Duncan, A. J. Quality Control and Industrial Statistics. Homewood, Illinois: Richard D. Irwin, Inc., 1965.

Dunnett, C. W. "A Multiple Comparison Procedure for Comparing Several Treatments with a Control." Journal of the American Statistical Association. Vol. 50, December, 1955.

Fishman, G. S. Digital Computer Simulation: Input-Output Analysis. Santa Monica, California: The Rand Corporation, RM-5540-PR, 1968.

Fishman, G. S. Digital Computer Simulation: The Allocation of Computer Time in Comparing Simulation Experiments. Santa Monica, California: The Rand Corporation, RM-5288-PR, 1967.

Fishman, G. S. Problems in the Statistical Analysis of Simulation Experiments: The Comparison of Means and the Length of Sample Records. Santa Monica, California: The Rand Corporation, RM-4880-PR, 1966.

Fishman, G. S., and Kiviat, P. J. Digital Computer Simulation: Statistical Considerations. Santa Monica, California: The Rand Corporation, RM-3281-PR, 1962.

Fishman, G. S., and Kiviat, P. J. Spectral Analysis of Time Series Generated by Simulation Models. Santa Monica, California: The Rand Corporation, RM-4393-PR, 1965.

Forrester, J. W. Industrial Dynamics. Cambridge, Mass.: The M.I.T. Press, 1961.

Goodman, N. R. Scientific Paper No. 10. New York: New York University Engineering Statistics Laboratory, 1957.

Granger, C. W. J., and Hatanaka, M. Spectral Analysis of Economic Time Series. Princeton, N.J.: Princeton University Press, 1964.

Harman, H. H. Modern Factor Analysis. Chicago: The University of Chicago Press, 1960.

Hausman, W. H., and Gilmour, P. "A Multi-Period Truck Delivery Problem." Transportation Research. Vol. 1, No. 4, December, 1967.

Helferich, O. K. "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Formulation of the Mathematical Model." Unpublished D.B.A. dissertation, Michigan State University, 1970.

Hoggatt, A. C. "Statistical Techniques for the Computer Analysis of Simulation Models." Appendix in Studies in a Simulated Market. L. E. Preston and N. R. Collins. Berkeley, California: Institute of Business and Economic Research, 1966.

Holzinger, K. J., and Harman, H. H. Factor Analysis: A Synthesis of Factorial Methods. Chicago: The University of Chicago Press, 1941.

Jenkins, G. M., and Box, G. E. P. Time Series Analysis: Forecasting and Control. San Francisco: Holden-Day, Inc., 1970.

Jenkins, G. M., and Watts, D. G. Spectral Analysis and its Applications. San Francisco: Holden-Day, Inc., 1968.

Karreman, H. F. Computer Programs for Spectral Analysis of Economic Time Series. Princeton, N.J.: Economic Research Program, Princeton University, Research Memorandum, No. 59, 1963.

Kraft, C. H., and Van Eeden, C. A Nonparametric Introduction to Statistics. New York: The McMillan Co., 1968.

Kuehn, A. A., and Hamburger, M. J. "A Heuristic Program for Locating Warehouses." Management Science. Vol. 9, No. 11, July, 1963.

MacLaren, M. D., and Marsaglia, G. "Uniform Random Number Generators." Journal of the ACM. Vol. 12, 1965.

Marien, E. J. "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Formulation of the Computer Model." Unpublished Ph.D. dissertation, Michigan State University, 1970.

McMillan, C., and Gonzalez, R. F. Systems Analysis: A Computer Approach to Decision Models. Homewood, Illinois: Richard D. Irwin, Inc., 1968.

Naylor, T. H. Computer Simulation Experiments with Models of Economic Systems. New York: John Wiley & Sons, Inc., 1971.

Naylor, T. H., et al. Computer Simulation Techniques. New York: John Wiley & Sons, Inc., 1966.

Naylor, T. H.; Burdick, D. S.; and Sasser, W. E., Jr. "The Design of Computer Simulation Experiments." The Design of Computer Simulation Experiments. Edited by T. H. Naylor. Durham, N.C.: Duke University Press, 1969.

Naylor, T. H., and Finger, J. M. "Verification of Computer Simulation Models." Management Science. Vol. 14, October, 1967.

Naylor, T. H.; Wertz, K.; and Wonnacott, T. H. "Methods for Analyzing Data From Computer Simulation Experiments." Communications of the ACM. Vol. 10, November, 1967.

Naylor, T. H.; Wertz, K.; and Wonnacott, T. H. "Spectral Analysis of Data Generated by Simulation Experiments with Economic Models." Econometrica. Vol. 37, April, 1969.

Packer, A. H. "Simulation and Adaptive Forecasting as Applied to Inventory Control." Operations Research. Vol. 15, July, 1967.

Parzen, E. Stochastic Processes. San Francisco: Holden-Day, Inc., 1962.

Parzen, E., ed. Time Series Analysis Papers. San Francisco: Holden-Day, Inc., 1967.

Paulson, E. "Sequential Estimation and Closed Sequential Decision Procedures." The Annals of Mathematical Statistics. Vol. 35, September, 1964.

Popper, K. R. The Logic of Scientific Discovery. New York: Basic Books, 1959.

Riggs, J. Production Systems: Planning, Analysis and Control. New York: John Wiley & Sons, Inc., 1970.

Robinson, E. A. Multichannel Time Series with Digital Computer Programs. San Francisco: Holden-Day, Inc., 1967.

Rogers, R. T. "Development of a Dynamic Simulation Model for Planning Physical Distribution Systems: Experimental Design and Analysis of Results." Unpublished Ph.D. dissertation, Michigan State University, Forthcoming.

Sasser, W. E.; Burdick, D. S.; Graham, D. A.; and Naylor, T. H. "The Application of Sequential Sampling to Simulation: An Example Inventory Model." Communications of the ACM. Vol. 13, May, 1970.

Scheffe, H. The Analysis of Variance. New York: John Wiley & Sons, Inc., 1959.

Siegel, S. Nonparametric Statistics. New York: McGraw-Hill Book Company, 1956.

Theil, H. Applied Economic Forecasting. Amsterdam: The North-Holland Publishing Co., 1966.

Tocher, K. D. The Art of Simulation. London: The English Universities Press Ltd., 1963.

Tukey, J. W. "Discussion, Emphasizing the Connection Between the Analysis of Variance and Spectrum Analysis." Technometrics. Vol. 3, No. 2, May, 1961.

Tukey, J. W. "The Problem of Multiple Comparisons." Princeton, N.J.: Dittoed Manuscript. Princeton University, 1965.

Turing, A. M. "Can a Machine Think?" The World of Mathematics. Edited by J. R. Newman. New York: Simon and Schuster, 1956.

Van Horn, R. "Validation." The Design of Computer Simulation Experiments. Edited by T. H. Naylor. Durham, N.C.: Duke University Press, 1969.

Veinott, A. F. "The Status of Mathematical Inventory Theory." Management Science. Vol. 12, No. 11, July, 1966.

Winer, B. J. Statistical Principles in Experimental Design. New York: McGraw-Hill Book Company, 1962.

Yaglom, A. M. An Introduction to the Theory of Stationary Random Functions. Englewood Cliffs, N.J.: Prentice-Hall, Inc., 1962.

Yule, G. U., and Kendall, M. G. An Introduction to the Theory of Statistics. London: Charles Griffin and Co., Ltd., 1950.