‘ b .N‘EE‘" I.I r - . 7‘ .I I) I ’3'“, .I'I-xv, 1, .1. 1‘4. €11,515. - . . ‘2‘. m 1' 1 1,: _ 11 . ‘W-Smu 2’. . .’. V 1 L 1. §:.. . Au? "",‘4‘-‘ “‘6 .4 ”“1“" ‘ 1. \tI ’\ 'V' 5 . .5. 1 ' l 1 1" ‘J’A‘éI-l I ‘ ‘ A I. .1 11M. 6-" 1;". *t" i": 'gfi’i‘fi $3 TI ‘3'; .r1uurIL'.)-'P%\lifl.£1.:;¢ \‘V‘. “W” ' V h\:“1‘:." KMA \ :I'-.. .Q I 'I L I If. ' 1 > n 1 "I ‘Sy‘vé gx'ggéqvvs. :3 :I'W 1; $3». 1113' S, 1‘ 3 I ,. ' 3“I- l' ' It} :Lg'ak, "l' 1 ‘-I""‘-' I1 ‘ II '1‘“. “HINT ' " “14"..." _ 53$: 0“": ct" (1'11" ,‘\ _. I’II:v \ I‘M-"x: ‘1‘ 1“ T“ 1\I I“ '1’JE.“"‘U'..L "NF". 5". w 1. 4a-. . milk? .' 51%.... '- 3 VJ, ' .1' n ‘I'l'v't I I l I"! 1‘ 1'” l '1; t' 31;. mm 1 1‘" ”L I” . . '11: 12-15. .:..#\ I . .;‘."\ , . 1 . a . 1 -. a. dz‘r; My. I t Q .I‘ tw‘ l1 . 0“ I II’ II‘ - . w ‘ '1 1.9“}- , \Ip A. 1 1 1 ‘. III Iv -. -< - . -. » '1'! I ‘.I':..".' 7 m l"; n- I“ $i'11", '1 1 ’1‘? -. ":1 "“ .. - fl ‘1? :1? “111“”l “h! 4“” 1' I 1 ,. ' I‘ '. I . ‘ . b.2191 '.. "I ‘1‘ U . I..I"' I ’ ‘ ‘, 4, ' ' ’ W I" l 1 III A'.‘ Q“ ' f;|.nl-r I- . *2 '1. "dul: I I ~Iy‘; H“ -‘I'I um“; I “1"- .' 1:1. 1. "in This is to certify that the thesis entitled A MODEL OF PROGRAM PERFORMANCE IN MULTIPROGRAMMING SYSTEMS presented by Chester Terrance Trahan has been accepted towards fulfillment of the requirements for Ph.D degree in Management Science Mame/t Major professor Date ”/5/ 77 / , 0-7639 © 1978 CHESTE R TE RRAN CE TRAHAN ALL RI GHTS RESERVED A HODEL 0P PROGRAH PERPORHANCB IN HULTIPROGRLHUING SYSTEMS BY Chester Terrance Trahan A DISSERTATION Submitted to Bichigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY Departlent of Management 1978 (1‘ ABSTRACT A HODEL OF PROGRAH PERPORHANCE IN HULTIPROGRAHHING SYSTEES BY Chester Terrance Trahan Good planning information is a continuing necessity for Data Processing management. Along with the need for good performance information in planning to meet demand for computer services, there is a need for individual program performance prediction in computer job scheduling. Some of the information necessary for planning is collected based on performance measurement (usually by system software). This must be augmented by benchmarks, simulation or other techniques to predict system performance under a specific set of circumstances. The use of an analytic model has been proposed as the most flexible method of predicting computer performance for medium scale, priority-interrupt driven Operating systems. To this end an analytic model was developed with cyclic queuing submodels for the Central Processing Unit and the Input/Output units and a deterministic "independent reference" submodel for program paging behavior. The model was programmed in APL [1] and Newton's method and Aitken's pelt; §qgggg [2] algorithm were used to achieve convergence of the model to equilibrium solutions. Chester Terrance Trahan A set of parameterization runs were made on a System/370 nodel 1u8 and their measurements were then used to estimate the system parameters: I/O overhead, paging overhead, Page-in/Page-out ratios, and page "weights" for the paging submodel. A "synthetic" program was written so that computer usage and paging behavior could be controlled internally and real storage, Input/Output behavior and priority could be controlled externally to the program. Several variations of the "synthetic" program were measured and their characteristics estimated. Following parameter estimation, the estimated values were used in the APL model to predict the performance of the experimental programs in an experiment designed as a 2X2X2X2X2 half-replicate factorial. The actual experiments were then conducted and the experimental results compared to the predictions. The results of the experiment showed good prediction in the area of working set sizes and elapsed times and aggregate I/O rates. The results showed a sizable error in page rates, channel utilization, overhead and CPU utilization. The nature of these results was attributed to the inadequacy of the independent reference model in representing paging behavior. These results reinforced Belady and Kuehner's conclusions about the unsuitability of independent references [3] as a model for program paging. Hodifications were then made to the model and the predictions of the Chester Terrance Trahan revised paging model showed good agreement with the experimental data. The revised paging model as well as several previously developed models showed good agreement with paging behavior for programs executing with constrained memory. However, all of the models examined showed a poor fit under conditions of loose memory constraints. BIBLIOGRAPHY Iverson, K.B. A Programming Language, New York, John Riley 8 Sons, Inc., 1962. Isaacson, 2., and Keller, 8.8. Analys;_ g; ggmgrica; methods, New York: John Wiley 8 Sons, Inc., 1966. Belady, L.A., and Kuehner, C.J. "Dynamic Space-Sharing in Computer Systems", Communications 9; the Agg, To Thelma iii ACKNOWLEDGHENTS I would like to thank the chairman of my guidance committee, and present chairman of the Management Depart- ment, Professor Phillip Carter for his encouragement. I also want to thank the members of my dissertation committee, Co-chairmen Professors Richard Henshaw and Herman Hughes, and Professor Gerald Park. I also owe a debt of gratitude to my manager, Paul Beukema, for his help in getting permis- sion for me to perform my experimental work at IBh. East Lansing,hichigan C.T.T. 1978 iv TABLE OF CONTENTS LIST OF TABLES . . . . . . . . . . . . . . LIST OF FIGURES. LIST OF SYMBOLS. . . . . . . . . . . . . . INTRODUCTION . . . . . . . . . . . . . . . . I. Performance Planning . . . . . Scheduling . . . . . . . . . . Priorities and Interactions. Batch versus Timesharing . . . BACKGROUND AND PREVIOUS WORK . . . . . CPU Models . . . . . . . . . . Central Server Network Model . Simple Cyclic Queuing Model. . Simple Flow Model. . . . . . . . Cyclic Queuing with Paging and Overhead Product of Stages Model. . . . . . Multiple Resource Allocation Model "Straightfoward" Queuing Model . Eclectic Model . . . . . . . . . Summary of Queuing Models. . . . Synthetic Workload Benchmark . Models of Paging Behavior. . . . . WOrking Set Model. . . . . . The Lifetime Function. . . . . Markov Models. . . . . . A "Half-Life" Model. . . The Simple Linear Model. The Paging Index . . . . . The Page Survival Index. . . . . . I/O Models . . . . . . . . . . . . Disk Response Model. . . . . . . . Disk and Drum Scheduling Models. . The Disk Seek Model. . . . . . . Revised Disk Response Model. . viii H \l O O O O O I O O N on II. MODEL DEVELOPMENT AND ANALYSIS. . . . . . . . . . . .35 Batch Model Requirements . . . . . . . . . . . . .35 CPU Submodel . . . . . . . . . . . . . . . . . . .38 CPU Completion Time. . . . . . . . . . . . . . .39 CPU Waiting Time . . . . . . . . . . . . . . . .ua CPU Utilization and Elapsed Time . . . . . . . .46 Paging Submodel. . . . . . . . . . . .'. . . . . .47 Page Exception Rate. . . . . . . . . . . . . . .48 Memory Allocaton . . . . . . . . . . . . . . . .u9 "Near-Optimal" Memory Allocation . . . . . . . .53 1/0 Submodel . ... . . . . . . . . . . . . . . . .Su I/O Overview . . . . . . . . . . . . . . . . . .sa I/O Submodel Development . . . . . . . . . . . .55 Disk I/O Model . . . . . . . . . . . . . . . . .57 Direct Access Disk I/O . . . . . . . . . . . .62 Indexed Disk I/C . . . . . . . . . . . . . . .68 Sequential Access Disk I/O . . . . . . . . . .73 1/0 for Paging . . . . . . . . . . . . . . . .75 Submodel Integration . . . . . . . . . . . . . . .76 Model Convergence. . . . . . . . . . . . . . . . .82 III. MODEL VALIDATION AND EXPERIMENTAL DESIGN . . . . . .89 Experimental Plan. . . . . . . . . . . . . . . . .89 Programs for Measurement and Control . . . . . .91 Experimental Variables . . . . . . . . . . . . .92 Instruments and Measures . . . . . . . . . . . .93 Experimental Design. . . . . . . . .,. . . . . .98 Design for Parameter Estimation. . . . . . . .98 Experimental Design for Validation. . . . . 101 vi IV. EXPERIMENTAL RESULTS AND ANALYSIS . . . Parameter Estimation . . . Normal I/O overhead. . . Paging I/O overhead. . . Paging Ratio Estimation. Page Allocation Heights. Miscellaneous Parameters . . . CPU Rate Estimation. . . . . . . Virtual Storage Estimation . . . Storage and Paging Index Estimation. Experimental Manipulations . . . . CPU Variance Manipulation. . . I/O Variance Manipulation. . . Model Predictions. . . . . . . . Model Validation . . . . . . . . Results of Analysis of Variance Confidence Interval Analysis . Sources of Error . . . . . . . . CPU Submodel . . . . . . . . . . Potential Sources of Error-Paging. Post-Experimental Paging Analysis. . Revised Paging Submodel. . . . . . Comparison with Earlier Models . . V. CONCLUSIONS AND RECOMMENDATIONS. . . . . Research Conclusions . . . . . . . . Recommendations for Further Research GLOSSARY OF TERMS. . . . . . . . . . . . . . APPENDIX A O O O C C O O O O O O O O O O O O SELECTED BIBLIOGRAPHY. . . . . . . . . . . . vii 112 112 112 113 115 117 120 121 122 122 127 129 130 130 131 135 137 138 138 139 141 141 142 143 143 145 148 156 171 1. 2. 3. 4. S. 6. 7. 8. 9. 10. 11. 12. A1. A2. A3. A4. A5. A6. A7. A8. A9. A10. A11. A12. LIST OF TABLES I/O Overhead Regression Analysis. . Paging Overhead Regression Analysis Page I/O Ratio Analysis . . . . . . System-wide Paging Ratio Analysis Estimated Working Set Heights . . Working Set Weight Analysis . . . Paging Index Calculations . . . . Experimental Program Estimates. . Model Predictions . . . . . . . . Experimental Runs - Measured. . . Experimental Runs - Calculated. . ANOVA Table of Prediction Errors. Non-paging Runs Repl I - Measured Non-paging Runs Repl I - Calculated Non-paging Runs Repl II - Measured. Non-paging Runs Repl II - Calculated. Paging Runs Repl I - Measured . . . Paging Runs Repl I - Calculated . . Paging Runs Repl II - Measured. . . Paging Runs Repl II - Calculated. Page Allocation Runs ~ Measured . Page Allocation Runs - Calculated . Program Estimation Runs - Measured. Program Estimation Runs - Calculated. viii 113 115 118 119 120 121 124 128 131 133 134 136 156 15 8 160 161 162 163 164 165 166 167 168 169 10. LIST OF FIGURES Central Server Model . . . . Rational Laplace Transforms. Elapsed Time "Cycle"(ET) . . Disk Access Model Diagram. . Indexed Access Timing Diagram. Random Access Timing Diagram . . . Experimental Design for Estimation Experimental Design for Validation Levels for Independent Variables . Multiprogramming Interaction Model ix .13 .16 .40 .59 .60 .61 100 103 108 132 9(V) f(.) w(j) A(j) ALP ARAT(i) ARAT(i,j) ARAT(i,j,k) 3(1) BETA(j) BETA BS(i,k) BSP C(j) LIST OF SYMBOLS The average process time between page faults. A function giving the difference between the average instantaneous I/O access rates on two consecutive iterations of the model. A real number between zero and one. A probabili- ty. The average working set size for program j (pages). ,The paging activity index for program j (references/instruction). CPU overhead for a non-paging I/O cycle (sec- onds). Total access rate for all programs to device 1 (accesses/second). The access rate for program j to device 1 (accesses/second). The access rate for program j to file k on device 1 (accesses/second). The length of a CPU service interval before an I/O or paging operation is generated (seconds). The ratio of page-reads to page-writes for program j. The system ratio of page-reads to page-writes. The average block size (physical record size) for file k of device i (bytes). The block size of a page in the virtual system under discussion (bytes). The completion time for program j (seconds). CA CER (j) CI(i,k) CK (j) CL(i,k) COH(j) CP(j) CPPU) CSU) CS CTIME(j) CYL(i,k) CYT (j) DU) DC DEL DI (ivjlk) DI(i,k) The coefficient of variation for the distribu- tion of I/O requests for disk. The instantaneous page exception rate for program j (exceptions/second of process time). The cylinder location of the mid-point of the index area for indexed file k on device 1. A factor used in calculating the paging response time for program j. The number of cylinders in the index portion of file k on device 1. The CPU utilization for overhead due to program j. The CPU utilization due to program j not includ- ing overhead (COH(j)). The total CPU utilization due to program j including both CP(j) and COH(j). The coefficient of variation for the distribu- tion of completion time for program j. The coefficient of variation for the distribu- tion of disk service time. The total CPU time attributed to program j by operating system job accounting routines. Total problem-state time (seconds). The size of file k on device 1 (cylinders). The ratio of the apparent machine instruction rate to the nominal machine instruction rate for program j. The amount of CPU processing time preoempted by the channel in performing an average I/O opera— tion for program j. The amount of CPU time "stolen" by the channel for each byte of data transferred (seconds). The average CPU overhead for a paging cycle (seconds). The probability of an access by program j to file k on device i. The probability of an access to file k on device xi DS ETU) ETA(j) EXP(N) FAC(j) FN H(j) 10(1) KS LAN(j) LC(i,k) L(i) i by any program during a specified interval. The amount of CPU time "stolen" by the channel to initiate an I/O operation (seconds). The elapsed time per cycle for program j includ- ing initial wait H(j), completion time C(j), and I/O wait time IC(j) (seconds). The average I/o wait time during a paging cycle, ETA(j) = (1 + Beta(j))*PHI(j) (seconds). The paging expansion factor. The tendency of overall system paging rate to increase with an increase in multiprogramming set due to page frame contention. The expected fraction of completion time which a program must wait if completion of it's I/O finds the CPU busy servicing program j. If Cs(j) is the coefficient of variation for the completion time for program j, FAC(j) = (1+C5(j))/2- The average system real-time paging rate. The sum of the individual program real-time paging rates (pages/second). A factor used in calculating the induced comple- tion time due to the completion of I/O for a higher priority program while a lower priority program is in the interrupted state. The average response or wait time for any type of I/O operation by program j including paging (seconds). The average CPU instruction execution rate (instructionS/second). The slope of the expression for seek time. The average seek time per cylinder (seconds/cylinder). The average instantaneous I/O rate for program j. This means that while program j waits for completion of paging or I/O the operation completes with rate LAM(j) (completions/second). The mid-point of the data area of file k of device i (cylinder). The weighting factor for program j in a biased memory allocation scheme. xii M(j) M0(3) N(i,j,k) N(i,k) N(j) OHT(N) PG(Ii) PHI(1) 91(3) PDC” 93(1) RC(i,k) RCU(ivjnk) RCU RCU(j) The "critical memory". The smallest value of program j's working set such that program j's paging rate is less than .5 pages per second. The instantaneous CPU access rate for program j in the absence of paging. MU(j) = CTIHE(j)/(N(j)+1)- The average number of non-paging I/O operations to file k of device i issued by program j during it's execution. The average number of non-paging I/O operations issued by all programs in the multiprogramming mix to file k of device 1 during some specified interval. The total number of non-paging I/O operations generated by program j during it's execution. The total number of programs in the multipro- grammming mix. The total system CPU overhead with N programs in the multiprogramming set. The total number of cycles in program j's execution which terminate in paging exceptions. The number of pages read for program j. The average time it takes for program j to read or write a page from virtual memory (seconds). The average real-time page exception rate for program j. PI(j) = CP(j)*CER(j) (exceptions/second of real time). The total number of page-write operations on behalf of program j during it's execution. The the sum of the page-read and page-write rates for program j. The total paging rate for program j. The number of data records per cylinder for indexed file k cn device 1. The average channel utilization caused by program j accessing file k on device i. The average total channel utilization for some specified channel during a specified interval. The average total channel utilization that is xiii RD 5(1) 5K(iujoki SK(i,k) 5K2(i:j:k) S1(i,j,k) 52(iojlk) T(j) TAU (j) TRD(i:j:K) TS(ic1ok) TS1(i,j,k) apparent to program j. The expected channel utilization on the condition that program j is not using the channel. The period of rotation of a disk device. The length of time required for the platters or disks to complete one revolution (seconds). The total amount of pagable memory. The total amount of real memory minus fixed memory (pag- es). The total amount of virtual memory required by program j for execution--program j's address space (pages). The average seek distance experienced by program j when accessing file k of device i (cylinders). The average seek distance for any program in the multiprogramming mix when accessing file k on device i (cylinders). The second moment of the seek distance for program j when accessing file k on device i (cylinders**2). The average seek distance for program j when accessing the index area of indexed file k on device i (cylinders). The average seek distance for program j when accessing the data portion of indexed file k on device i (cylinders). The amount of time required to move a disk access mechanism exactly one cylinder (seconds).- The aggregate I/O rate for all but program j. This symbol is also used to represent a delay term in calculating initial wait. The average amount of CPU usage due to program j per cycle, including the overhead for paging and/or normal I/O (seconds). The average total response time for program j accessing file k on device i, including wait time (seconds). The average seek time for program j when access- ing file k on device 1 (seconds). The average seek time for program j when access- xiv T52(i:jok) TSC(i,k) TSC1(i,j,k) TSC2(i,j,k) TSD(iojok) TSD1(i,j,k) T"C(iojok) THE (1, j, k) V(j) X1(1) ing the index area of indexed file k on device 1 (seconds). The average seek time for program j when access- ing the data area of indexed file k on device i (seconds). The average channel service time for any program accessing file k on device 1 (seconds). The average channel service time for program j's access to the index area of indexed file k on device i (seconds). The average channel service time for program j's access to the data area of indexed file k on device i (seconds). The average device service time for program j when accessing file k on device 1 directly or sequentially (seconds). The average device service time for program j when accessing the index area of indexed file k on device 1 (seconds). The average device service time for program j when accessing the data area of indexed file k on device i (seconds). The average channel wait time due to blocking by other programs experienced by program j when accessing file k on device i (seconds). The average channel wait time due to the channel being busy when the record to be read or written passes under the read/write heads. This is the wait time experienced by program j when access- ing file k on device 1 (seconds). The initial wait time experienced by program j between the completion of an I/o or paging operation and program j's next access to the CPU (seconds). The average I/o response time experienced by program j for non-paging I/O to all of it's files (seconds). XV INTRODUCTION Data Processing management has a continuing need for planning information. Long and short range planning is necessary so that the data processing needs of the organiza- tion can be met without undue disruption of service and unnecessary expense. Lead times for equipment delivery schedules require that equipment be ordered several months to years before the equipment is actually needed. To assess the impact on customer service and installation stability, the data processing management team must be able to estimate the effect of changes in software or hardware. Software changes can be to application software or systems software. Application software is the collection of programs and procedures in an installation which are written to support the busine.§ fgpgtions of the organization. Programs which do payroll, schedule manufacturing opera- tions, and maintain inventory accounts are examples of application programs. Programs which generally support the functions of the data processing organization are called system software. Program compilers, sort routines, and even the collection of programs which control the elements of the computer system-~the operating system--are all examples of system software. Hardware changes may be made to the central processor (CPU), or to the peripheral or input/output (I/O) devices such as disk drives, tape drives, printers, drums, card equipment and teleprocessing termi~ nals. The change may be either replacement, acceleration of operating speed, or augmentation of the capacity of the device(s) or unit(s) under consideration. Part of the information needed by data processing management is provided by their data processing system in the form of job accounting data collected during system operation, while the systems requirements come from the business plans developed by corporate planners. The remain- der of the information required by data processing manage- ment to fulfill their responsibilities is derived from industry publications, vender literature and proposals, staff research and intuition. Information in the third category tends to be associ' ated with performance, capacity and the capability of computer systems. Typical questions that one hears in this category might be: will the addition of that new disk drive improve throughput by 10% or are other changes required? how many programs should be running concurrently to optimize throughput? will the installation of the new CPU be sufficient to handle a projected 15% increase in workload l:h' r during the next twelve months? will the proposed system meet the performance specifications? how will the schedule be affected if the third shift update is moved to the first shift? The answers to such questions are not only of interest to data processing management and systems analysts but they are of key importance to vendors of hardware and software. Vhile system collected job accounting data can supply the basis for an analysis which may yield the answers to the above questions, it cannot provide answers. The means of deriving answers to the above questions range from intuitive guesses to in-depth analysis and simulation. In some cases decisions are made to increase computer capacity without in-depth analysis because of a lack of analytical skills. In other instances benchmarks are performed as a means of alleviating this situation. A benchmark is the execution of an actual set of programs (currently implemented on an installed or "base" system) on the proposed or "target" system. This approach is usually impractical for the following reaSons: (1) the amount of time available on a target system is sometimes limited, (2) the data files from the system being modeled cannot be removed because they are needed for continuing processing, (3) and the number of disk packs available for extended periods of time is often inadequate for a complete bench- mark. Even if the foregoing problems can be surmounted, the logistics of duplication and transportation of card decks, disk packs and tapes can create additional problems. When one considers the conflicts involved in schedul- ing computer Operators for the benchmark as well as continu- ing production work, it is easy to see why the approach generally taken is the preparation of a ”representative" sample of job streams which are supposed to embody the characteristics of the entire system and can be executed in a limited time-~usually a few hours-~with limited data files. This approach is still quite expensive and is usually not used except in the case where very large compu- ter systems are under consideration. The limited benchmark approach still requires a great deal of preparation involv- ing both computer time and analysts' time. Another expensive approach to performance planning is the use of simulation. This approach has several varia- tions. In one variation, systems analysts build a model of the computer system from a basic computer language such as FORTRAN or PL/I. Another variation requires the analyst to build the model using a generalized simulation language such as GPSS, GASP, or SIMSCRIPT. The systems analyst sometimes chooses to use a basic system model supplied by a software vendor or consulting firm. In this case, the I/O configura- tion, memory size, and CPU speed of the target system are supplied as parameters to the basic system model. The major processes of each program are then modelled as events,i.e. CPU access, I/O access, and "interruption" or pre-emption of the access to the CPU of a less important program (low priority) by a more important program (high priority). This method, while more practical than the first two approaches, requires purchase of the vendor's software or his consulting and educational services and can still be quite expensive. In the real world, simulation has been used extensive- 1y to model large teleprocessing networks. Such networks are not typical of most data processing organizations. Even if the expense of the simulation approach were not prohibi- tive, many users of small and medium sized computers do not have employees with the skills necessary to do this kind of study, nor do they think that the simulation packages available are economically justified. Sgheduligg There is an urgent need need for easier or more convenient performance prediction. Another important area for which a predictive tool is required is computer scheduling. Compu- ter scheduling, as used here, has to do with the sequences and combinations of programs which are determined ggggggglly to the computer, and is to be differentiated from schedules which are determined by the computer's operating system, which will be referred to as "task scheduling" or "dispatching". Scheduling, in the context used in this thesis, is normally a clerical function. The scheduler tries to arrange the sequence of executions so that system resources are balanced and deadlines are met [23]. To do this, he must consider program memory size, peripheral requirements, job dependencies, data availability and other complex factors. The scheduler is really interested in predicting the actual job run time unde; g specifig se; 9; gircumstanceg, but normally uses the average job run time that has been calculated over several executions of the job being scheduled. Presently, most batch job computer scheduling is done using the average elapsed time for each job in the schedule [5,21], even in cases where the scheduling function is computer assisted. That is to say each program or sequence of logically related programs (job) is considered to execute for a fixed amount of time on a given computer system, regardless of the charactersitics of all the other programs in the system at the same time. The inadequacy of the deterministic type of scheduling can be observed from the non-deterministic nature of interactions between the jobs in a computer system at any given time and the effects of priorities. Consider two programs which are assigned priorities or levels of importance such that the lower priority program will be forced to relinquish the CPU whenever the higher priority program is ready to access the CPU (usually at the completion of I/O activity). The lower priority program will have to wait for the higher priority program to relin- quish the CPU and will only have potential access to the CPU during the times that the higher priority program waits for I/O operations. It is easy enough to visualize that if the low priority program always has to wait for the high priori- ty program, and the high priority program never has to wait for the low priority program, the respective behaviors of the two programs will surely be affected if their roles are reversed. Extending the example to the case of several programs will exaggerate the effects of program interactions in the processor. The actual run time for a particular execution of a given job or program will deviate from the calculated average elapsed time unless it is run with precisely the same set of other programs and identical assignment of priorities as when the average was calculated. Interactions within the processor are further compli- cated by contention for channels and I/O devices. Returning to the two program example, suppose each program accesses a unique disk device and has no other I/O. Further suppose that each program has a nontrivial disk utilization, 20% for 'example. If the data which these two programs access are consolidated on one disk drive, it is inconceiveable that the execution or run times of each of the two programs would be unaffected unless other (compensating) changes were also made. There is another type of variation in job durations in a typical business oriented data processing system-~the variation due to transaction volumes in transaction driven applications. For such applications the elapsed job times are proportional to the number of transactions processed if gghgr phigqg 35g gqgal. Since techniques such as trend analysis and exponential smoothing may be used for the prediction of transaction volumes, the focus of this research will be on the prediction of the more complex type of variation in job execution duration-~the variation due to interactions among several programs executing in a computing system. gapgh ygrsus Tipesharinq Computer systems may be classified into two groups according to whether or not the system is pgipagily dedicat- ed to batch multiprogramming or teleprocessing (time-shar~ ing) multiprogramming. Within both groups, the dispatching scheme or the method which is used to allocate the resources of the computer system to competing programs or "tasks", may be a priority system, a time-sharing system or a combination of the two. The priority dispatching scheme has the primary objective of optimizing the throughput (number of units of work completed per unit time) of the mpg; importggt jghg g; paskg. The time-sharing scheme, on the other hand, has the primary objective of optimizing (minimizing) the "response" time or the time it takes for each interaction of a terminal with the CPU. The combination scheme of priority dispatch- ing within time-sharing is an attempt to have it both ways. Although either of the above dispatching strategies may be found in batch multiprogramming systems or in teleprccessing systems, the priority dispatching scheme tends to be associ- ated with batch multiprogramming and time-shared dispatching tends to be associated with teleprocessing systems because this is consistent with the primary performance objectives of these respective systems. For this reason teleprocessing under the control of primarily batch multiprogramming systems may use priority dispatching and batch-oriented processing under the control of primarily time-sharing or "interactive" systems will be dispatched using the time-sharing discipline. Because most of the work on time-sharing systems is not under the control of the data processing installation but is initiated by the terminal user, the prediction of system loads is a statistical problem. It is useless to talk about scheduling a time-sharing system in the sense that "scheduling" is used in this thesis, therefore this research emphasizes primarily batch-oriented multiprogramming systems. 10 This investigation sets forth the development of an analytic model which is readily usable by computer system analysts, computer schedulers, and hardware and software vendors to predict gross computer system performance charac- teristics (CPU, channel, and device utilizations and throughput) as well as elapsed times and CPU, channel and device utilizations by pggqrgg g; jgb. A discussion of the background relevant to computer system performance predic- tion and related models is presented in chapter 1, and the model used in this research is developed in chapter 2. Chapter 3 consists of an explanation of the parameter estimation and validation procedures and the experimental design. A presentation and discussion of the experimental results is given in chapter 4. The research results are discussed and recommendations for future research are presented in chapter 5. I. BACKGROUND AND PREVIOUS WORK 92 1:: I! O a; (D H W) ggptral e ve Network Mode 1H Most of the models developed to predict the perfor- mance of multiprogramming systems are closed, cyclic queuing network models (so-called "central server" models) based on the early work of J.R. Jackson [32] and later extensions by Gordon and Newell [28]. These researchers determined the conditions under which closed form solutions to network queuing models were known to exist. The types of networks which met these conditions were called "separable" and the solutions were said to be in "product-form". The product-form solution states that the equilibrium state probability for the network is the product of the equilibri- um state probabilities of the component service centers in the network. The central server model is based on the assumption that the execution of programs in a multiprogramming system consists of alternate periods during which each program is either receiving or waiting for CPU service and periods during which each program is either receiving or waiting for I/O device service. Another general assumption in the 11 12 central server model is that at the completion of CPU service, each program requests service from the i-th server (I/O device) with probability P(i). A schematic diagram of the central server model is given in Figure 1. §immls Prelim assgiss Amie; One of the better known computer performance models is a model developed by D.P. Gaver, Jr. Gaver assumes an identical probability distribution for the CPU demand of each job and an identical exponentially distributed response time for each I/O device in the system [26]. Parameters in Gaver's model are the number of homogeneous jobs and their CPU service time and the number of homogeneous I/O units and their I/O service time. This model is a specific implemen- tation of the central server model which has two stations, the CPU and the parallel server I/O station. The I/O devices in the Gaver model are treated as a pool from which a request for I/O may be serviced by any device which is idle. An arbitrary CPU service time may be modeled by either an Erlang, Hypoexponential or Hyperexponential distribution. Paging behavior is not explicitely modeled but may be considered to be included in the overall I/O rate. 13 W1 W2 . I u ——)‘ “'1. Central Server Peripheral Servers (Data Channels) Figure 1. Central Server Model 14 §AEB.§ Flow Ages; Fenichel and Grossman's Plow model [20] does not use probability distributions and does not account for direct program I/O but rather assumes a fixed relationship between average compute time and paging. For this model the only I/O considered is paging. The paging response is computed from a response table. Simulation of the operation of the paging device is used to develop a table of response times under different paging rates. The Flow model makes no assumptions about the form of the probability distribution of paging service time. gyglig Queuigg with Paging g_g Overhead Lewis and Schedler's Cylic Queuing model [34] returns to some of the central server assumptions and accounts for I/O and idealized paging behavior using exponential distribu- tions. This model assumes that program execution intervals between page exceptions (requests for I/O) are identically and independently distributed exponential variables. In this way the dependence of paging rates on memory size is avoided. Like other central server models, this model considers the behavior of all programs in the system to be statistically identical. This is equivalent to partitioning the computer's main memory into equal sized segments and having the page replacement algorithm operate lggglly (each program would only steal pages from itself). If this were 15 not the case, independent execution intervals would not hold. This model differs from most of the central server models in that it explicitely includes the CPU overhead for task dispatching and paging I/O. ggod ct 9; Stage Model Using some results by Cox on probability distributions with rational Laplace transforms [15], Basket and Gomez [4], and Muntz [15] extended the class of known closed queuing networks with product form solutions to include certain servers with general service time distributions by approxi- mating the distribution with a combination of exponential stages (see Figure 2). Using the method of stages, closed form solutions are known to exist for: (1) exponential servers with firstvcome-first-served (PIES) service discip- line, (2) general servers with processor-shared (PS) discip- line, (3) general servers with last-comevfirst-served-preemptive (LCFS) service discipline, and (4) infinite servers (IS) with general service distribu- tions. 16 Figure 2. Rational Laplace Transforms 17 13 1:: 1H In 1H Io Io 1w m o a n m 33' L: gearish node; A Multiple Resource Allocation computer model has been developed by Boyd [7]. This model handles the resource requirements of the jobs in the multiprogramming mix in a similar fashion to most central server models. However, it goes much further in the sense that, given the level of multiprogramming, the probabalistic aspects of job selection for execution from the job queue are developed in great detail. The selection criteria is based on the permanent (execution) resource requirements of the jobs that are waiting to be added to the multiprogramming mix. Although this model is referred to as a batch multiprogramming model by the author, in reality it behaves very much like a time-sharing model. In fact, the dispatching strategy for this model is a time-sharing strategy. As a batch multipro- gramming model, this model is representative of an installa- tion where there are a very large number of small jobs, with few if any data preparation constraints, and no precedence constraints (by the assumption of independence). Futhermore there can be little if any external control of the job schedule since job selection and execution are entirely determined by statistical distributions. 18 Another recent computer performance model has been developed by Boyse and Warn [8], and is a time-sharing cyclic queuing model. The CPU intervals in this model may be either constant or exponential and the only I/O modeled is paging. The CPU modeled has parallel paths to the I/O devices (drums) and, because the number of concurrently executing programs is small (3), the I/O response time may be treated as independent of the number of pending I/O requests. The assumptions in this model agree very closely with the features of the system for which it was developed, a dedicated graphics terminal system, and was found to have a very satisfactory fit. Eclectic Model A model by Brandwajn [9] has incorporated several recent developments into the central server model. Brand- wajn includes paging in his model and uses Belady's "life- time function" [6] to determine the effects of memory allocation on paging rates. Brandwajn also uses the "prin- ciple of decomposition" [14] to simplify the calculation of the equilibrium state probabilities of his model. The principle of decomposition states that if the elements of a subnetwork of the overall system have rates of state transi- tions much higher than the remainder of the system, this portion of the overall network will reach equilibrium sooner. This means that the subsystem composed of the CPU 19 and paging device may be separately analyzed under the assumption that the rate of paging is much higher than the rate of direct I/O. The total system is then modeled as a two-server system where one server is the disk I/O device and the other is the composite CPU-paging server. Sggg_gy f Queuing Mgdels All of the above models are limited or inadequate for the purposes of this research because they all use global parameters to determine system behavior but say nothing about individual programs that may be executing at a given time. Futhermore they are really processor models that do not give a very realistic treatment to the input/output effects on the system. The assumption of homogeneity among I/O devices and channels is a serious weakness of these models for configuration and scheduling studies. The homogeneity assumption is tantamount to saying that differ- ent types of I/O devices aren't really very different therefore they can be treated identically. It is known that disk drives don't behave like tape drives, printers, or card I/O equipment. Since the latter devices are dedicated to a particular program at any point in time, the variation in their response times is usually due to the interference caused by devices used by other programs on the same chan- nel. On the other hand, disks are normally shared among programs, thereby causing variation in response due to the potentially random positioning of the disk access arm before 20 I/O can take place as well as queuing time for both the device and channel. The variation in response introduced by shared disks is on the order of several times larger than the time required for data transfer. This research consid- ers another approach which overcomes many of the objections to the central server models. Syn het'g Work oa Benchmagk The Synthetic Workload method [42] of Sreenivasan and Kleinman gives good results but requires the solution of two or more simultaneous, non-linear equations in six unknowns for each job (program) being modeled. Implementation of the Synthetic Workload proceeds as follows. Let x1 represent CPU seconds and x2 the number of EXCP's (approximately equal to the number of I/O operations). Dividing each of these dimensions into L parts over the range of the x1 and 12 values for the actual workload, the percentage of observa- tions in each cell of the total number of jobs will be the joint probability density of the real workload. The synthetic workload may consist of a smaller collection of programs with ghg §_gg jgigp probability funcgigg. If P(i,j) is the probability of the (i,j)-th cell and MTOT' the total number of programs in the synthetic workload, then the number of programs in the (i,j)-th cell of the synthetic workload is given by 21 N'(i,j) = P(i,j) * NTOT' i,j = 1,2,...,L. (1.1.1) Then, if X1(i) corresponds to the mid-point of-the ivth partition of the X1 dimension, the constraint on total CPU time for the synthetic workload is expressed as sum[ X1(i)*N'(i,j) : i,j = 1,2,..,L ] = T. (1.1.2) The joint probability distribution of the real workload is duplicated by NTOT' executions of the same program, a synthetic program [10]. The synthetic program used by Sreenivasan and Kleinman simulates a file update process. Its execution characteristics may be manipulated by varying a set of six parameters supplied to the program by Job Control Language (JCL). P1, P2, P3, P4, P5 and P6, the parameters of the synthetic program, correspond to the number of master records created, the number of detail records created, the number of executions of a "compute kernal" [33] per match of the master and detail files, the number of times the file update is repeated, I/O buffer blocksize, and record size respectively. The functional dependence of x1 and X2 on the six parameters can be expressed as X1=K1+K2*P4+K3*(P1+P2)+K4*P4*(P1+P2)+K5*P2*P3*P4 (1.1.3) 22 X2=2*P4+(2*P4+1)*([P1*P6/P5]+[P2*P6/P5]). (1.1.4) The constants K1, K2, 33, K4 and K5 may be estimated by regression experiments, but since there are more indepenv dent variables than equations, there is no unique solution to 1.1.2, 1.1.3 and 1.1.4. A solution is therefore achieved by choosing integral values for P1, P2, P5 and P6 and iterating on the values of P4 and P3 until the calculated and the "given" values for x1 and x2 agree. This must be done for each of the UTOT' programs in the synthetic work- load. A shortcoming of the synthetic workload is that it requires the actual setup and execution of the synthetic programs under the operating system being modeled. A further complication is the requirement that the target system be available for execution of the synthetic workload. Although the version of the synthetic workload model discussed here can be extended to account for paging and I/O response time, such an extension will increase the computa- tional complexity many-fold. Compared to the representative sampling approach, the synthetic workload does not require as much planning and preparation since only one program is involved and it generates its own data. For benchmarks in which the primary objective is comparison among alternative CPU's, the synthetic benchmark is superior since pglgtive performance is the basis for decision. For benchmarks in 23 which gyghgh alternatives are being considered, the specifi- cations, allocation and distribution cf data for the I/O subsystem becomes much more critical. In this case, exten- sion of the synthetic model and more planning becomes necessary, thus negating some of the advantages of the synthetic workload model. Both the representative sample benchmark and the synthetic benchmark share the disadvantage of being unsuita- ble for scheduling applications; the former because predic- tion involves hghhhhhy hhhhihg the job streams, and the latter because the synthetic jobs run hpproximatghy hhg ghm length 2: time a 211.2 £...eal fishe- f Easing fisheries Most models of paging behavior were developed as an aid in evaluating paging algorithms for operating systems. Denning's "working set" model [18,19] is defined in terms of the collection of pages of a program which have been refer- enced during the process interval [t-TAU,t], where t is an instant in time and TAU is an interval of time. Denning defines the working set size to be the number of distinct pages in the working set. He proposed the use of TAU by the operating system software (and hardware) as a parameter to control page residency. 24 Denning showed the working set size to be an increas- ing function of the parameter TAU and the page fault rate (the rate at which a program tries to access pages which are not present in main memory) to be equal to the negative slope of the working set size. The working set size func- tion depends on some knowledge of the underlying probability distributions of the memory reference patterns. The working set concept (as defined by Denning) has proven useful in the analysis of page management algorithms but does not serve well as a predictor of paging behavior for systems which do not use the working set parameter TAU to control page residency. Thg ifetihg Punct oh A function proposed by Belady and Kuehner, the "life- time function" [6], is based on a model of independent references to the pages in a program. The independent reference model assumes that each memory reference is an independent Bernoulli trial relative to each page of the program, where the probability of a reference to page i is given by q(i). Belady and Kuehner's independent reference model is a special case where q(i) = 1/S for a program with 5 pages. For this program, w pages in main memory results in a probability of (S-w)/S that a page reference will result in a page fault. The lifetime function is defined to be the expected 25 number of consecutive references before a page will be referenced which is not in main memory (thereby causing a page fault). Using the geometric distribution, the lifetime function is the found to have the following form: e(w) = w/(S-w). (1.2.1) Belady and Kuehner then proceed to approximate the indepen- dent reference model by e(w) = A*w**k, 1.5S(j), for all programs j, we are finished and we simply set w(j)=S(j) since the composite constraint is not binding, i.e. sum(w(j) : j=1,2,..Ni is less than available pageable memory R. The fourth case is the most interesting (and probably the most common in practice). In this case at least one of the programs' memory constraint is violated and at least one is not binding. Let Q=[j : 1SjSN,w(j)ZS(j)] and let H:[j 1SjSN,w(j)w(j) for all i in H, implying that DFN("(j))/d'(j)(DFN("(3))/d'(j)=DFN(V(i))/dfl(i) (2-3-13) DFN (11' (j) ) /dW (j) (DPN (W (i) )/dW (i) (DPN (S (i) )/dW (1) (2.3.14) for j in H and i in Q. This reassignment cannot produce a situation where any reallocation from the set [w'(i) : i in Q] to the set [w'(j): j in H] can result in a reduction in PR. We can drop the former set from further consideration and proceed as we did initially. Applying the algorithm to the set [w'(j): j in B], we must have either all constraints bind- ing, no constraints binding, or a combination in which case the foregoing logic is again applied. We eventually arrive at a stage where either Q or H is empty and the algorithm is terminated. 53 Because the paging mechanisms used by most existing operating systems are not optimal we may modify this proce- dure to reflect more realistic operating system behavior. For example, the paging algorithm may be biased toward the higher priority programs in the type of operating system for which this analysis is intended. We may wish to define a scale factor, L(j)=(1/m)**j*a, where m is greater than or equal to 1 and a is greater than or equal to zero. The Kuhn-Tucker conditions then lead to the following: A(j)"'CP(Ii)"‘5U)"‘1-(Zi)"W(i)"""2'-'A(i)"‘CP(i)'"S(i)”"1-(1)"“'(j)"""2 for i,j=1,2,..N. (2.3.15) An example of a biased paging model is one that yields the constant ratio w(j+1)=p*w(j) provided that p is greater than zero and less than or equal to 1, and that A(j+1)*CP(j+1)*S(j+1)=A(j)*CP(j)*S(j). This leads to the parameter L(j)=p**(j-1) and the solution is H1)=(1"P)*R/(1‘P**N)u (2-3-16) W(j)=V(1)*P**(j’1)- (2-3-17) In parameterizing the paging submodel L(j) is one of the values that we could estimate. The actual I/O estimates for the paging model will be calculated by the I/O submodel which will now be developed. I49 JEEViQY. The I/O model used in this research is the composition of a number of simpler models of assumed independent compo- nents. The I/O response time of the composite model will be computed as a sum of random variables with uniform, exponen- tial and Erlang distributions. This model will be elaborat- ed primarily as a disk I/O model since disk activity is the predominant form of I/O activity in most modern operating systems and disk I/O is also that aspect of the system which causes the most difficulty in modeling. Disk I/O is so prevalent because disks are used for paging devices, data staging devices (temporary storage), and permanent storage devices. The Unit Record (card and print) activity is generally "spooled" on disk. This means that card input and output as well as print operations are actually disk opera- tions gggigq use; r ram ggecutiop, with the card input to disk taking place prior to program initiation and card and print output taking place after program termination. The Unit Record (U/R) devices are driven by special system programs called "spoolers" or "readers" and "writers". 55 The operation of the I/O submodel is related to the CPU and Paging submodels as follows: The I/O wait time in the CPU submodel is actually the aggggqg wait time over all the forms of I/O in which the program engages, igglggigq pgqigq. The I/o access rates generated by the CPU submodel are gggpggiggg of the rates for each program's files. The device access rates are the aggregate of the access rates by program. A program's access rate to a given device is the product of the program's total number of accesses to that device times the program's composite access rate divided by the program's total number of I/O accesses. Let the number of accesses to file k on device 1 by program j be given by N(i,j,k). Then the total number of I/O operations by program j is N‘j)=8um "(i,j'k’: i=1'2'ooND;k=1'2'ooNF] j=102r00N: (2.“.1) where ND is the total number of devices and RF is the number of files on device 1. Since the number of accesses by program j to device 1 is N(i,j), the access rate to device 1 by program j is then 56 ARAT(i,j)=N(i.j)/N(j)*BT(j) i=1'200ND and j=1'2'ooNo (2.“.2, The total access rate to device 1 is then ARAT(i)=sum[ARAT(i,j): j=1,2,..N]. (2.“.3) To simplify the computations, the arrival rate of I/O requests at the I/O device queues is assumed to have a Poisson distribution. Even if it was assumed that both the CPU and the I/O devices were exponential servers, there would not be Poisson arrivals in general because the CPU uses a priority-resume service discipline [32]. For the case of card or print I/O we will assume a constant service time. The U/R service time will depend on the number of cards or lines transferred (to disk) per I/O operation and the speed of the I/o devices. The only variation in the response time for these devices will be that due to channel contention. A program which issues an I/O operation to a U/R device will only have to wait on the channel. Because these devices are gggiggtgg, a program accessing them will never find them busy since we have excluded the possibility of buffered gpggations. Another type of dedicated device is magnetic tape. In this case the mean service time depends on the mean block- size of the data transferred and on the tape transfer speeds of the magnetic tape devices. In addition to channel delay, 57 we will also experience 993539; 33;; delay. The latter form of delay is included with channel delay or ignored as negligible since the configurations to which this research applies rarely have more than one tape control unit per channel. The channels to which U/R devices, tapes and disks are attached are almost always unique so that these components of the I/O subsystem may be treated separately in computing the I/O response. The most complex portion of the I/O subsystem is the part which deals with magnetic disks. It is assumed that the size, location and disk identification for each logical file accessed by the programs in the system are known (either from job accounting data or otherwise). Qisk I49 node; Three basic patterns of disk access are considered: (1) uniformly distributed random access over the cylinders of a file, (2) uniformly distributed ipggggg access over the cylinders of a file (requiring prior access to an index before accessing the data record), and (3) sequential access. A disk access mode diagram appears in Figure a. This is only a partial diagram of the variations on the three major modes of disk access. In what follows, fixed record (block) sizes, "verify" option for all writes, a uniform distribution for rotational delay, and a uniform 58 distribution of accesses for random access to both random and indexed files is assumed. It should be noted that some of the equations devel- oped in this section are herdware or ipplementation depen- dent. In particular the access methods are IBM implementa- tions and the disks are IBu 3330 disks. These hardware dependencies will be pointed out where applicable. A timing diagram for indexed access appears in Figure 5, and one for random access in Figure 6. The diagram for sequential access difffers from the latter only by the absence of a significant seek time component. In what follows, the occurance of equations with indices i,j and k can be assumed to imply that the equations will hold for the entire range of each index unless stated otherwise. 59 Sequential Indexed Direct Access Access Access Write Read Read Write Read Write No Core. No Core. No Verify Verify Index Index Rewrite Add Verify Verify No No Verify Verify Verify Verify Figure 4. Disk Access Mode Diagram 60 Cylinder Index Read Figure 5. Indexed Access Timing Diagram wait Seek wait Wait Wait Search & for Cyl. for for for Transfer ) Device Index Channel Record RPS Index TWD TSl TWC RD/2 TWR TSCl Track Index and Data Read Wait Seek Wait Wait Wait Search & for Track for for for Transfer Device Index Channel Record RPS Index TWD T82 TWC RD/2 TWR TSCl wait Wait Wait SearCh & for for for Transfer Channel Record RPS Data TWC RD/2 TWR TSC2 61 Direct and Sequential Read wait Seek Wait wait Wait Read for Cyl. for for for Key 8 Device Channel Record RPS Data TWD TSK TWC RD/2 TWR TSC Direct and Sequential write Wait Seek Wait Wait Wait write for Cyl. for for for Key & Device Channel Record RPS Data TWD TSK TWC RD/2 TWR TSC wait Wait Read for for Key & Record Channel Data RD/2 TWR TSC Figure 6. Random Access Timing Diagram 62 Direct Access Disk I/O In computing the disk file access time it will be assumed that the disk hardware uses the £95g§ieeel peeigi.e eeeeieg (RPS) technology. This means that the channel will be allowed to disconnect from the disk device during both seek egg search operations. A geometric distribution is assumed for the delay due to RPS. Let BS(i,k) be the average block size for file k on device i, which is accessed by program j. Let CYL(i,k) be the size of the file and LC(i,k) its mid-point. We define the channel service time, TSC, due to access to file k on device i as TSC(i,k)=2*RD/128+BS(i,k)/TR, (2.u.a) where RD is the revolution time, TR is the disk transfer rate and the term 2*RD/128 is the time necessary to prepare the Rotational Position Sensing (RPS) device for data transfer. This term causes any expression in which it appears to be hardware dependent. To apply these expres- sions to disk devices other than 185 3330's, this term should be modified. The I/C access rates are expressed by ARAT(i,j,k)=N(i,j,k)/N(j)*ET(j). (2.14.5) Then the channel utilization by device, file and program is 63 RCU(i,j,k)=ARAT(i,j,k)*TSC(i,k) (2.“.6) and the total channel utilization RC0, is given by RCU=sum[RCU(i,j,k): i=1,.ND;j=1,..N;k=1,..NF]. (2.0.7) The channel wait time due to RPS is TWR=RCU*RD/(1-RCU). (2.u.8) To estimate the channel utilization as "seen" by program j, we have RCU(j)=RCU-sum[RCU(i,j,k): i=1,..ND;k=1,..NF], (2.u.9) and therefore we have an expected channel wait time for RPS of TWR(j)=RCU(j)*RD/(1-RCU(j)). (2.u.10) This is true because at the time that program j requests I/O, no other operation by program j can be in progress. Next, the average seek time by program, file and by device is computed. First, the probability of access to each file on each device DI(i,k), is derived as follows: 6Q ARAT(i,k)=sum[ARAT(i,j,k): j=1...N]. (2.u.11) ARAT(i)=sum[ARAT(i,j,k): j=1,..N;k=1,..NF], (2.“.12) and DI(i,k)=ARLT(i,k)/ARAT(1). (2.4.13) Disk files are considered to be organized by "cylin- der", where a cylinder is the collection of all disk records which can be read or written at one physical positioning of the access mechanism or "heads". A movement of the heads from one location to another is called a "seek". Since we are accessing file k with a uniform distribution, the expected value (in cylinders) of the seek distance SK(i,k), is given by E(SK(i,k))=sum[DI(i,l)*lLC(i,k)-LC(i,1)I: l¢k] +DI(i,k)*((CYL(i,k)**2-1)/3*cyl(i,k)). (2.a.1u) This expression is based on the fact that, on the average, the seek distance between two files on the same disk will be given by the distance between their mid-points [an]. For a seek from some point within the §é-§ file the average distance will be given by the second term in the above expression. Similiary we may estimate the second moment for the seek time E(SK2(i,k)) as 65 E(SK2(i,k))=sum DI(i,l)*(LC(i,k)-LC(i,l))*2 : lsk] +sum[DI(i,l)*((CYL(i,k)**2+CIL(i,l)**2)/12 : l¢k] +DI(i,k)*(CYL(i,k)**2-1)/6. (2.“.15) Using this expression and the previously derived mean seek distance, the variance of seek distance may be calculated using the well-known formula, V(SK(i,k)) = E(SZ(i,k)) - E(SK(i,k))**2. In calculating the seek time for the model, it will be assumed that the seek time is a linear function of the distance moved. This assumption is reasonable for most seeks greater than a few cylinders in distance. We define the seek time function as TS=T*U(SK)+KS*SK (2.u.16) where SK is the number of cylinders seeked, K5 is the slope, T is a constant, and U(SK) = 1 for SR greater than or equal to 1 and U(SK) = 0 for SK equal to zero. Then the seek time for an access to random file k on device 1 is E(TS(i,k))=(CYL(i,k)-DI(i,k))*T/CYL(i,k) +KS*(sum[DI(i,l)*lLC(i,l)-IC(i,k)I: 1¢k1 (2.4.17) +KS*DI(i,k)*((CYL(i,k)**2)-1)/CYL(i,k)). Using Water's method [an], and the previous assumptions about the programs in the multiprogramming set and their 66 disk accesses, we calculate the variance of the seek times to random file k on device 1 as V(TS(i,k))=(1-DI(i,k)/CYL(i,k))*DI(i,k)*(I**2)/CYL(i,k) +2*T*KS*DI(i,k)*E(SK(i,k))/CYL(i,k) (2.“.18) +V(SK(i,k))*KS**2. The wait time due to channel blocking caused by other I/O processes is given by Wilhelm's model [HS] as TWC(i,j,k)=sum[ARAT(m,l,k)*TSC(m,k)**2:l#j;m#i]. (2.4.19) The device service time for a random read Operation will be given by TSD(i,j,k)=TS(i,k)+RD/2+THR(j)+TSC(i,k)+TWC(i,j,k), (2.u.20) and service time for a random write operation will be TSD(i,j,k)=TS(i,k)+RD+2*(TWR(j)+TSC(i,k))+TWC(i,j,k). (2.“.21) The utilization of file (i,k) due to program j will be RDU(i,j,k)=ARAT(i,j,k)*TSD(i,j,k). (2.u.22) The respective variances for the read and write operations 67 are V(TSD(i,j,k))=V(TS(i,k)+(Rt**2)/2 +RCU(j)*(RD**2)/(1-RCU(j))**2 (2.u.23) +sum[ARAT(m,l,k)*TSC(m,k)**3:l#j:m¢i]/3+TWC**2, and V (T513011 2).“) )‘V (T5 (1.1!) ) + (R13**2)/3 +u*RCU(j)*(RD**2)/(1-RCU(j))**2 (2.u.2u) +sum[ARAT(m,l,k)*TSC(m,k)**3:l¢j;m#i]/3+TWC**2. From the previous computation of the variance for disk service time, we may now compute the coefficient of varia- tion for the disk service time as CS(i,j,k)=V(TSD(i,j,k))/TSD(i,j,k)**2. (2.0.25) We finally get the disk response time by use of the queuing formula TRD(i,j,k) = TSD(i,j,k)*(1 + 330*(oa**2 + os**2)/2 * (1 - RDU(i,j)) where RHO = 1 + 2 * (RDU(i,j) - 1)/(ca**2 + 1), and CA and C5 are the coefficients of variation of the arrival and service distributions. The term RDU(i,j) is the sum of the utilization of device i by all programs other than program j. A particular application of this formula, for Poisson arrivals (CA=1) is 68 TRD(i,j,k)=TSD(i,j,k)*(1+RDU(ioj)* (1+CS**2)/2*(1-RDU(i,j)). (2.“.26) To get the total response time, we must add the wait time while the disk is available and the channel is busy, TWD(i,j,k)=RCU(j)*(1-RDU(i,j))*TWC(i,j,k)/2. (2.u.27) We now look at a slightly more complex model, a model of access for indexed data. Indexed Disk I/O For the indexed access method we will assume that the index is located on the same device as the data portion of the file and that a seek to this index is required. There are several cases of indexed access: (1) initial load or sequential retrieval, which can be treated using the sequen- tial method to be given later, (2) indexed read-only, and (3) indexed read-write. We will limit the analysis of the indexed method to random reads without the index in memory (core index). Analysis of the other cases may be achieved by simple extensions of the methods used here. We will first calculate the seek time. For a random read there are actually two seeks to be calculated; (1) the seek from the starting location of the access mechanism to the cylinder index, and (2) the seek from the cylinder index to the indexed file's data area. The expected values of 69 these seeks are equal to the seek times from mid-point of the file where the access arm is initially located to the mid-point of the index, and from there to the mid-point of the data area. Designating these seeks as S1 and S2 respec- tively, we have E(S1(koi))= sum[DI(i,1)*lCI(i,k)-LC(i,1)l:1=1,..NF], (2.4.28) E(SZ(k,i))=|CI(i,k)-LC(i,k)I, (2.“.29) where CI(i,k) is the mid-point of the index for file k. Having computed the expected value of the seek distance, we may compute the expected seek times for the index and the data portions of the file as P(TS) = T + KS*E(SK) since the access mechanism will always have to move at least one cylinder. This is true because the index will always be read first and the access mechanism will never be left in the index area. Since we actually have two I/O operations to transfer a single data record, we will compute service times for the two operations separately. First we have the channel busy time for searching and transferring the index TSC1(i,k)=RD/2+2*RD/128+10/2*TR, (2.u.30) and the channel busy time for searching and transferring the 70 data record is TSC2(i,k)=RD+2*RD/128+BS(i,k)/2*TR. (2.u.31) Assuming independence for each segment of the timing diagram, we calculate the channel utilization for indexed reads as RCU(i,j,k)=ARAT(i,j,k)*(TSC1(i,k)+TSC2(i,k))/2, (2.4.32) We then have the delay due to RPS TWR(j)=RCU(j)*RD/(1-RCU(j)), (2.4.3u) and we calculate the channel wait time as before by Wilhelm's method: TWC(i,j,k)=sum ARAT(m,l,k)*TSC(m,k):m#i;l¢j]. (2.u.35) The service time for the indexed access will be TSD1(i,j,k)=TSC1(i,k)+TWR(j)+TWC(i,j,k) +RD/2+TS(E(S1(i,k))). (2.0.36) The device service time for the data access will be 71 TSD2(i,j,k)=TSC2(i,k)+2*(TWR(j)+TWC(i,j,k)) +RD/2+TS(E(52(i,k))), (2.u.37) and the rate of access to device 1 for all programs other than program j is ARAT(i,j)=sum[ARAT(i,l,k):lsj;k=1,..NF]. (2.“.38) The device utilization which program j finds at device 1 will be RDU(i,j)=sum[ARAT(i,l)*(TSD1(1,1) +TSD2(i,l))/2 : 1:3], (2.u.39) and the wait time for program j then becomes TWD(j)=RCU(j)*(1-RDU(j))*TWC(j)/2. (2.u.u0) To complete the elaboration of the model we must find the variance of the service times. It can be shown that the variance of the distance for the initial seek, S1, is V(S1(i,k))=sum[DI(i,1)*(CL(i,k)**2+CYL(i,l)**2)/12:1=1,..NP] +sum[DI(i,l)*(1-DI(i,l))*(CI(i,k)-LC(i,l))**2:l=1,..NF] -2*sum[ DI (i, 1)*|CI(i, 1!) -LC (1,1) I* sum[DI(i,m)*|CI(i,k)-LC(i,m)):m=1,..1-1]:l=1,..NF], (2.u.u1) 72 where CL is the number of cylinders in the index portion of file k on device i. The seek time is then V(T(S1(i,k)))=T**2+V(S1(i,k))*KS**2. (2.u.a2) The variance of the second seek is then calculated as V(52(i,k))=(CL(i,k)**2+CYL(i,k)**2)/12, (2.u.u3) and the seek time for the second seek is V(T(S1(i,k)))=T**2+V(52(i,k))*KS**2 (2.“.04) The variance for the channel service time is then (RD**2)/12 and 2*(RD**2)/12 for the index read and the data read respectively. Since the variance of the rotational delay is (RD**2)/12 we have the following expression for the variance of the device service time for the read of the index: V (T5131 (it jlk) ) =V (T (31 (iljlk’)) +V (THC (j), +V(TWR(j))+(RD**2)/12+V(TSC1(i,j,k)), (2.u.a5) where W(TWC) and W(TWR) are calculated exactly as in the case of the random access method. Similarly, we have the variance of the disk service time for the data read opera- tion as 73 v(TSD2(iojvk))=v(T(S1(iljvk)))+(RD**2)/12 +2*(V(TWC(3))+V(TWR(3)))*V(TSC2(i'j'k))- (Z-Q-“5) At this point we form the coefficient of variation exactly as before and use this in the general queuing formula to get the expected response times TRD1(i,j,k) and TRD2(i,j,k) for the indexed access method. The extension of these methods to other forms of I/O for indexed files is straightforward. For example, to extend the previous analysis to the case where the index is in core we simply calculate the seek time component (52) exactly as we would for a randomly accessed file. Of course the use of a core-index implies something about the memory requirements and also the CPU usage of the program. The use of a core-index makes the program a gifgegege pgeggeg as far as the CPU and Paging submodels are concerned. Sequential Access Disk I/O For a sequentially accessed disk file, we calculate the timing diagram in a similar manner to calculation for the direct access case. The difference is in the calcula- tion of the seek time. First we determine the expected value of the seek distance 74 B(SK(i,k))=sum[DI(i,l)*|LC(i,k)-LC(i,l)l:l¢k] +DI (i'k)/RCI (2.“.“7, where RC is the number of records to be read from each cyclinder of file k. The variance of the seek distance we calculate as V(SK(i,k))=(1-DI(i,k)/RC)*DI(i,k)/RC +sum[DI(I,l)*((CYL(i,l**2+CYL(i,l)**2)/12 +(1-DI(i,k))*(CYL(i,k)-CYL(i,l))**2 -2*DI(i,k)*ILC(i,k)-LC(i,l)I : lik] (2.".48) ‘2*(Sum[DI(i,1)*ILC(i,k)-LC(i,1)l* sum[DI(i,m)*|LC(i,k)-LC(i,m)l:m=1,.l-1,m¢k]:l#k]. We then have.the expressions for the expected seek time, E(TS(i,k))=(1-DI(i,k)+DI(i,k)/RC)*T (2.u.u9) +K*(sum[DI(i,l)*lLC(i,l)-LC(i,k)| lik]+DI(i,k)/RC), and the variance of the seek time, V(TS(i,k))=2*K*T*DI(i,k)*E(S(i,k))/CYL(i,k)+V(SK(i,k))*K**2 + (1-DI (i,k)/CYL (i,k) ) *DI (i,k) * (T**2)/CYL (i,k) . (2.4. 50) With the expected value and variance of the seek time we calculate the coefficient of variation, the disk response time, channel waiting time, and the I/O response time as we 75 did in the direct access case. From the previously derived expressions for access rate by file, device and program, we can calculate the average I/O response time for program j by XI(j)=sum[N(i,j,k)*TRD(i,j,k)/N(j):i=1,..ND;k=1,..NF]. (2.4.51) I/O for Paging The I/O response times for paging operations are computed as if the paging file were comprised of distinct subfiles for each program in the multiprogramming set. The response for each program is computed as the I/O time for random access to its paging subfile with the added condition that the paging I/O is priority scheduled using Head-of-the-Line (HOL) policy. It should be noted that the foregoing comments are specific to the implementation of the particular operating system used in this research. Using PHI(j) as the average time to read or write a page on behalf of program j, TSD(l) as the mean service time for all I/O to the paging data set, and TSD(l,j,j) as the mean service time for access by program j to its portion of the paging file we have the following: 76 PHI (1)) =TSD (1: j,j) +3130 (1:2)) *TSD (1) *CK (j) , CK(j)=CK(j-1)+ARAT(l,j-1,j-1)*TSD(1,j-1,j-1), (2.u.52) CK(1)=(1+CS(1)**2)/2 j=1,..,N, where ARAT and RDU have the same meanings as defined previ- ously. All other values used in determining the response time for paging is computed exactly the same as for a random access data set. This concludes the detailed discussion of the I/O submodel. This discussion was not intended to be exhaustive but does point out the kind of I/O models that can be used to give a more realistic treatment to I/O than is usually found in computer performance models. §g§model Integ at'eg We complete the model by integrating the Paging, CPU, and I/O submodels. This is accomplished by calculating the statistics for the CPU and Paging submodels, and then using these statistics as input to the I/O submodel. The 1/0 statistics are then used as input to the CPU and Paging submodels. This process is repeated until the model state variables converge to an equilibrium solution. Since the model developed herein is comprised of a system of non-linear equations, the geggle Falei, or the 77 algorithm are used to accelerate convergence of the system. The ideas that enable the various parts of the model, the submodels, to fit together will now be examined in some detail. The idea of a CPU cycle is extended to include program CPU execution intervals which terminate in paging operations as well as those terminating in normal (non-paging) I/O operations. Representing the average page read or write time as PHI(j) for program j, and defining the system ratio of page reads to page writes as BETA, we have the expression for the average page wait time during a paging cycle ETA(j)=(1+BETL)*PHI. (2.5.1) With N(j) the number of non-paging I/O operations during the execution of program j, and CTINE(j) the total problem state time during the execution of program j, we have NU(j)=(N(j)+1)/CTINE(j). (2-5-2) The approximate number of page fault cycles is given by PG(j)=N(j)*(1/B(j)*UU(j)'1)- (2-5-3) We then calculate the proportion of all I/O delays due to paging as 78 PG (3)/(P30) +N(j))=N(j) *(1/HU(3)*B (3)‘1)/ N(j)*(1/’HU(j)*B(3)‘1)*N(j) (2-5-‘0 =1’HUU)*E(3) , where B(j) is the average CPU execution time per cycle for program j and PG(j) is the total number of cycles of program j's execution which terminate in page exceptions. Similar- ly, the proportion of cycles due to non-paging I/O opera- tions is N(j)/(135(3)+N(j))=HU(j)*B(j)- (2.5-5) Using the instantaneous page exception rate CER(j) defined previously, the number of paging cycles PG(j) is given by the product of the total number of cycles times the rate of page exception generation per cycle PG (:1) = (N (j) +PG (j) ) *CER (j) *8 (j) . or equivalently, GER (31:9301/30) * (N (3) +PGU) 1:1/301‘5001 . (2.5.6) This implies that the CPU time per cycle B(j) is B(j)=1/ (30 (31*CERUH (2-5-7) and B(j) can be determined. Now that we have an estimate for the average paging response time ETA(j) and the normal 79 I/O response XI(j), we can calculate the average overall I/O response as 10(3)=NU(31*B(3)*XI(3)*(1'UU(3)*B(3))*ETl(j)- (2-5-8) Turning now to a re-examination of the CPU overhead due to I/O and page management, let the normal I/O process- ing cycle overhead be given by ALP(j) and the average paging cycle overhead by DBL(j). The average CPU time consumed per cycle for program j will be TAU(3)=B(3)*HU(3)*B(3)*BLP(3) +(1'HU(3)*B(3))*DEL(31- (2-5-9) For a given program j (remember this also means priority level j), there is only one variable on the right hand side of the above expression, the variable B(j). In taking the output of the CPU and Paging submodels as input to the I/O submodel, the device/file access rates must be disaggregated so the appropriate rates are reflected for non-paging and paging I/O (the paging I/O response is calculated by the I/O submodel just like any other I/O). The normal I/O rates are computed as NU(j)*B(j)/ET(j) and the paging rates are computed as (1-HU(j)*B(j))*(1 + BETA)/ET(j). After computing these I/O components separate- ly, they are recombined as shown previously. One other aspect of CPU service time to be introduced 80 at this point has to do with the elongation of service time due to CPU cycles being "stolen" to accomplish I/O. This effect can become significant at high I/O rates because the amount of CPU time used by the channel is proportional to the amount of data transferred and the number of I/O opera- tions started. This effect will be quantified by defining an "expan- sion factor" CYT, as the ratio of available CPU time with some level of I/O to the maximum available CPU time without I/O. This factor is expected to be different for different programs in the multiprogramming mix. Defining the amount of time stolen by the channel for an I/O operation by program j as D(j), we have D(j)=DS+[(1-B*HU)*BS(j)+B*NU*BSP]*DC, (2.5.10) where DS is the amount of time used by the channel to initiate and terminate an I/O, BS(j) is the average I/O block size transferred for program j, BSP is the size of a page in the system under discussion, and DC is the amount of CPU time used by the channel per byte transferred. For the present research DS = 76 microseconds, DC = 0.15 microse- conds, BS = 568 bytes and BSP = 2048 bytes. These factors are all hardware dependent. Then the amount of CPU time stolen each second for I/O on behalf of program j is given by D(j)/ET(j) so the 81 fraction of CPU cycles available for instruction execution is 1-D(j)/ET(j). A program of higher priority than program j must be in the I/O stage if program j's instruction are executing, so the fraction of CPU time available due to the higher priority program (k) is 1-LAN(k)*D(k). Combining this with the results for a lower priority program, it can be shown that CYT(j)=prod[1-LAH(k)*D(k):k=1,..j-1] (2.5.11) *prod[1-D(k)/ET(k):k=j+1,..,N]. CYT may be computed in sequential fashion in the CPU submo- del by CYT(3)=CYT(j‘1)*(1‘LA5U‘1)*DU‘1))/(1'D(j)/ET(3))o CYT(1)=prod[1~D(k)/ET(k):k=2,..,N], (2.5.12) for j=2,3,..,N. The expected length of a CPU execution interval, whether it is B, TAU, ALP or DEL may be elongated simply by dividing by the corresponding CYT. This completes the submodel integration. At this point all of the essential elements of the model have been put together. The conditions for convergence to an equili- brium and the extent to which these conditions are met by the present model will now be examined. .edel £2mx.rgense If we represent a cycle of iterations of the CPU, Paging, and I/O submodels by the real valued functions f(.), defined over Euclidean N-space, we have f(j)=LAN(j)-LAH'(j) for j=1,..N, (2.6.1) where LAN(j)=1/IO(j) and LAU'(j)=1/IO'(j) are the outputs of the I/O submodel (and consequently input to the CPU and Paging submodel) on successive iterations. LAN and LAN' are interpreted as instantaneous I/O response rates. The equilibrium condition requires that f(j)=0 for j=1,..N. (2.6.2) Since the model is a system of non-linear equations, and the iterative method of alternating between CPU-Paging submodel and I/O submodel calculations produces an oscillat- ing series, we have employed a variation of Newton's Nethod, the Reggie geiei [30], to accelerate convergence. The algorithm is based on the general form 9(3)=X(j)’f(j)/f'(j) j=1u-N. (2-5-3) where x(j)=LAu'(j) is a scalar and f'(.) is the derivative of f(.) with respect to x(j). Representing the vector forms for f, g, y, and x by the upper cases F, G, Y, and x we 83 have the vector condition for convergence of the algorithm: IIG(X)‘G(Y)IISH*IIX‘YII. (2-5-“) for l)X-Y|| 0 for every program j. Furthermore, it can be shown that f(j;x(j)) s f(j+1;x(j+1)). We can guarantee that f(N x(N)=0) < 0 by the imposi- tion of the requirement that the utilization of the disk be 85 less than or equal to 100%. Given the access rates ARAT(1), and a single disk device, we have f(j;x(j))=x(j)+sum[ARAT(l):l¢j]-1/TSD, Sum[ARAT(l)*TSD:l=1,..N-1] < 1, ==> sum[ARAT(l):l=1,..N-1] < 1/TSD, (2.6.6) ==> f(j;x(j)=0)=sum[ARAT(l):l=1,..N-1]-1/TSD < 0. For x(j) = 1/TSD we have f(j;x(j)=1/TSD)=1/TSD +sum[ARAT(l):l=2,..N]-1/TSD (2.6.7) =sum[ARAT(l):1=2,..N] 2 0, since each rate ARAT(l) is strictly positive and converges to zero as x(j) becomes infinitely large. To show that f(j x(j)) S f(j+1;x(j+1)) we have f(j;x(j))=f(j+1:x(j+1))-(x(j)**21*(C(j)+x(j)* (W(j)+IO(j))*ARAT(j)*ARAT(j+1). (2.6.8) Since all of the variables following the negative sign in the above expression are greater than or equal to zero, the necessary condition prevails. Because the functions involved are continuous in the domain of interest, we are assured of a solution. 1/_ " 86 It can also be shown that the functions f(.) are differentiable and that the partial derivatives of f(j) with respect to x(j)--Df(j)/dx(j)--are ergereé Df(1)/dx(1)SDf(2)/dx(2)S...SDf(N)/dx(N)=1, (2.6.9) and the second partials DDf(j)/dx(j) are non-negative everywhere, DDf(1)/dx(1)ZDDf(2)/dx(2)2..ZDDf(N)/dx(N)=0. (2.6.10) These conditions assure us of a unique solution and they also assure us that the derivative of each function is greater than zero and less than or equal to one (1) in the neighborhood of the solution (the derivatives are very nearly equal to one in this neighborhood). We have that Df(1:EP(j))/dx(1)>TSD*(f(1:1/TSD)'f(1:0)) (2.6.11) (g(j:x(j))-g(j:y(j))|=|x(j)-y(j)+f(j:x(j))/f'(j:x(j)) “f(3=Y(j))/f'(j=Y(3))l- (2-5-12) By Taylor's Theorem, f(j=2(j))=f(j=EP(j))+f'(j=EP(j))*(Z(j)'EP(j)): (2-6-13) implying that 87 |g(j:x(j))-g(j:x(j))|=|x(j)-y(j)+CHI*(y(j)-x(j))I =I1'CHII*IX(3)’Y(3)I: (2-6-1“) where CHI is a non-negative scalar which is less than one. Using ||X|| = max[x(j):j=1,..N] as the definition of the norm, we chose RHO so that [IX-EPII < RHO implies by contin- uity that f'(j=EP(j))*“/5 R) is given by Dw(j)/dR=w(j)/R. (3.5.2) This type of "approximate" derivative indicated that the appropriate order of the variables for step-down and step-wise testing was: w, PI, PR, ET, CP, C08, and RCU for the dependent variables, and A, VS, VI, N, and j for the independent variables. A feature of the experimental design is that each run was a measurement of only one program, with the other programs in the multiprogramming set controlling the envi- ronment. The program being measured ran in a "matrix" of copies of a "control" program. The control program was selected to exhibit "average" behavior compared to the 107 experimental programs. A table of level values for each independent variable, and a table of levels for each of the synthetic programs is given in Figure 9. 108 PROGRAM-RELATED'VARIABLES Synthetic Paging CPU I/O Program. Activity Service Response Expected Number Code Index Variance Variance Error 1 000 low high high low 2 001 low high low 3 010 low low high 4 011 low low low 5 100 high high high 6 101 high high low 7 110 high low high 8 111 high low low high ENVIRONMENTAL VARIABLES Pageable Number of Expected Code Memos! Priority Programs Error 000 fixed low low low 001 fixed low high 010 fixed high low 011 fixed high high 100 fixed low low 101 fixed low high 110 fixed high low 111 fixed high high high Figure 9. Levels of Independent Variables 109 The Synthetic Program At this point a discussion of the programs used in the experimental manipulations is in order. The experimental programs are COBOL programs which process a sequential file of "transactions" from disk against a direct update file on disk. The program is written so that it can be run any number of times against the files. Control information is provided by means of execution parameters on a control card. These parameters tell the program how large the records on the disk are, how many records there are in each file, how many passes to make through the transaction file, the average number of times to execute the "compute kernal" between I/O's, the interval of variation to use in computing a random number of kernal executions on each transaction record and the amount of variation to use in accessing its own instructions in memory for paging. Variation of the I/O is provided by the placement of the data sets, the size of the direct files, and the record sizes that are built when the files are created. The files are created by a separate program which creates the direct file sequentially then creates the sequential file by generating random relative record numbers of records in the direct access file. In practice, the variation in I/O response was controlled by data set placement and record sizes were held constant. The compute kernal of the program selected 10 numbers 110 from an array of 1000 numbers and performed non-trivial arithmetic operations on them which were self-checking (they had to match previously computed numbers in the transaction file). This turned out to be so compute-bound that the number of compute passes between I/O had to be held to 2. The actual number of compute passes could be held to exactly two by selecting a variation of 0 or the maximum variation could by generated by selecting 2, causing the program to randomly choose an equally likely integer in the interval [0:4]- The program's executable code (as well as the 1000 entry numeric array) was duplicated five times. Each section of code was sufficiently different that a paging parameter of 0 would result in sequential execution of each block of code processing the data in corresponding data blocks. A paging parameter of 1 caused the program to skip one-half of the code and data in each block. A paging parameter of 2 caused the program to skip one-fourth of the code and data in each block. This particular scheme was chosen because it was felt that sequential execution would result in the lowest paging rates since fewer pages would be referenced during a specific execution interval. The parameter levels chosen for the control programs were 1 for paging index, 1 for CPU variance, and 2 for number of kernal passes. The I/O was determined by making the files 7 and 2 cylinders in size and locating their 111 centers 4 cylinders apart on the same phsysical disk drives. The I/O for the experimental programs was controlled by making the files 9 and 3 cylinders in size and locating their centers 3 and 15 cylinders apart for the high variance and the low variance versions of the program respectively. As in parameterization, the experimental data were collected by SMF and OS/VS1 Utilization Monitor and reduced for input (along with the predicted results) to the MULTI- VARIANCE program. IV. EXPERIHENTAL RESULTS AND ANALYSIS Paramete; Estimatieg EQEBQL £49 9122.229 Hypotheses 1, g, e_g 3, Two replications of these runs were made and the measured and calculated results are listed in appendix A, Tables A1oA4. The data were analyzed using linear regression on the model equation 3.4.4. Initial analysis yielded a significant contribution to variance for all but the last factor. The aggregate file I/O access rate contributed to an increase in ALP but the increase was not significant. This term was subsequently dropped from the model and the data re-analyzed. The statistics for the revised regression analysis is given in Table 1. The statistics in Table 1 supports hypotheses 1 at the .0001 level. The effect of multiprogramming set (N) is signifi- cant at the .0001 level but its directiOn is contradictory to hypothesis 3. Adjustment of the predicted mean so that it is equal to the observed mean yields a positive inter- cept. This reflects the fact that in a lightly loaded system, the page management routines use idle CPU time to search page tables and maintain page queues. 113 Table 1. I/O Overhead Regression Analysis goungg RAH REGR ggsrrs s D §_§g_ 9. ES CONSTANT 2.2733703-03 8.0777652-05 PRIORITY (j) 3.8395762-04 3.4289222-05 ups SET (N) -2.85u969E-05 n.8937953-06 STD DEV OP DEP VARIABLE (OH) = 0.0212 HULT-R-SQUARED=.999Q P(3,12)=6776.6“8 P<.0001 5.32.2121: §§§§§§§TQT CONSTANT P(1,1u)= 753.360 98.18% ADDL VAR P<.0001 PRIORITY P(1,13)= 91.806 1.60% ABEL VAR P<.0001 MPG SET F(1,12)= 30.03“ 0.17% ABEL VAR P<.0001 ADJUSTBENT TO HEAR = 0.008075 229$.Q £19 QEEEQEEQ The same programs that were run in a non-memory constrained environment for ALP estimation were then run with pageable memory constrained to 312K. This amount of memory was deemed appropriate to run three of the programs (each with total memory requirement of 162K) with negligible paging. The paging experiments were started with u programs in the mix and another program was added with each succeed- ing run up to a final run with 8 programs. In the final run the system was under such stress that the monitor data that was being logged was incomplete and it was not possible to get measurements on two of the programs in the mix. when the sequence of runs was repeated, the replication with 8 programs was therefore omitted. 114 The data collected from these runs is reflected in appendix A, Tables AS-AB. Equation 3.4.4 was then used with the coefficients for ALP to predict the overhead due to non-paging I/o. The resultant overhead was subtracted from the measured overhead to form the residue. The data were then analyzed using linear regression with the I/O rates in equation 3.4.5 replaced by the 223129 rates. A surprising aspect of this process was that the overhead with paging was actually lggg; than it was without paging at smaller multi- programming sets (N). Analysis of these runs (Table 2) shows that the intercept for paging overhead is negative. An interpretation for this anomaly is that the page management routines use CPU idle time to perform housekeep- ing functions and look for work to do even in the abggggg 9; paging. When the CPU is highly utilized and no paging is taking place this discretionary work is effectively bypassed. However when paging is taking place this work cannot be bypassed and much of it is performed as a part of normal page I/O processing. Ideally the intercepts for ALP and DEL should cancel each other since no overhead should be expended when I/O is not being done. fiypgthg§g_ 4, §, a.g g. Initial statistics for analysis of the model equation 3.4.5 indicated that the independent variables for priority level j, multiprogramming set N, and the aggregate paging rate T(j). all had effects in the hypothesized direction on the paging overhead parameter DEL. 115 Table 2. Paging Overhead Regression Analysis §QQ§Q§ BA! BEEB §Q§§Z§ -§1. .3593 Q: .§I CONSTANT 6.165776E-03 6.7830703-05 HULT-R-SQUARED=.9993 F(1,6) =8262.718 P<.0001 ADJUSTMENT FOR HEANS=-0.0129923 However the effects were not statistically significant since the significance levels for these effects were .79, .99, and .40 respectively. The constant term was left as the only term in equa- tion 3.4.5, making paging overhead directly proportional to paging rate. The data were re-analyzed using only the constant factor in equation 3.4.5, yielding the coefficients in Table 2. Eéfliflfl Befig §§_imation Since paging was needed to perform this estimation the same set of runs were used for both DEL estimation and paging ratios (BETA) estimation. An immediate source of difficulty is the fact that page-out counts recorded by most operating systems are not the same as those used in this formulation. The theory predicts the number of pages that must be written out to accomodate pages that are read for a specific program. The identification of the program which is losing the page is of no concern. However job accounting routines do not record the identification of the program 116 which caused the page to be written but only adds to a counter of pages lost for the program losing the page. There are two alternatives to this dilemma. Hypothesis 7 could be agggggg and hypothesis 9 tested, or it can be agsu ed that the number of pages written on behalf of a program is equivalent to the number of pages which that program g_g§g§ to be written and proceed with the analysis. Proceeding on the latter assumption, results of the hypothe- sis tests must be interpreted in view of the above discus- sion. Hypgthgggs 1 and g. Hypotheses 7 and 8 were tested by performing a 4X4 factorial analysis of variance on the PO/PI ratios of the same 4 programs for multiprogramming sets N = 4, S, 6 and 7. The within-group sum of squares was used as the error term. Results of this analysis are summarized in Table 3. Since there are no significant interactions main effects may be tested. Hypothesis 7 was not supported since there was a priority effect at the .0009 significance level. The multiprogramming effect was positive (consistent with hypothesis 8) but the significance level was .124. The cell means predicted from the coefficients in Table 3 were used as the BETA(j,N) for the predictive model. Hypothesis 2. Hypotheses 9 was tested by performing a one-way analysis of variance on the system-wide paging ratios computed from the paging estimation runs with 4, 5, 6 and 7 programs in the multiprogramming sets, yielding 8 117 samples. From observations of cell means it was decided to test the hypothesis using orthoganal polynomial contrasts. Hypothesis 9 was not supported at the higher multiprogram- ming levels since a quadratic effect was found at the .0004 significance level. The Analysis of Variance table for this analysis is given in Table 4. Page Allocatiop w iqht This factor tests the Paging submodel's accuracy in predicting working set allocations. The measure for this factor L, is derived from the Kuhn-Tucker conditions for minimization of the total paging rate where program j's paging rate is weighted by L(j). Summing the L(j)s to 1 and assuming that paging indices A(j) are identical, we have the following estimator for L(j): L(ij)=('(j)**2)/CPPU)*S‘III ("(1)**2)/CPP(1131=1o- . J]- (“o 1.1) To make the L(j) comparable across multiprogramming sets, The L(j) calculated by equation 4.1.1 will be multi- plied by the number of programs in the multiprogramming set, N. The page allocation weights can be used in this form since it is the relative sizes of the weights that is important. .J 118 Table 3. Page I/O Ratio Analysis §QQ.§§ g.£ 3.93 .szzs §.2 T3393 9: .§T GRAND HEAN .603356 .085417 HPG SET .329145 .241595 MPG SET .617560 .241595 HPG SET .239244 .241595 PRIORITY -1.104013 .241595 PRIORITY -1.083273 .241595 PRIORITY -.917266 .241595 INTERACTION -1.173274 .683333 INTERACTION -1.567103 .683333 INTERACTION -.423430 .683333 INTERACTION -1.337678 .683333 INTERACTION -1.306999 .683333 INTERACTION -1.669633 .683333 INTERACTION -.464727 .683333 INTERACTION -1.358317 .683333 INTERACTION -.564918 .683333 $3.23.; ALAN smug; 2.12 z 2 GRAND MEAN 11.649 1 49.896 .0001 HPG EFFECT 0.521 3 2.231 .1241 PRIORITY 2.198 3 9.413 .0009 INTERACTIONS 0.263 9 1.128 .3985 ERROR TERH 0.233 16 Both replications of Paging Estimation runs with 4, S, 6, and 7 programs were used giving 44 observations of L(j). i The data for these runs is given in the Appendix, Tables A9 and A10. The equalized estimates for L are given in Table 5 and the ANOVA table for a 4X4 factorial design is given in Table 6. Tests of hypothesis 10 use only the first four programs in the analysis. Examination of the ANOVA table reveals a significant interaction at the .0001 level, hypothesis 10 is not supported. Regression coefficients from this analysis 119 Table 4. System-wide Paging Ratio Analysis SELL SELL !£.!§ £.LL §.2 2:1. N = 4 0.257946 3.380890E-02 N = 5 0.477944 3.876854E-02 N = 6 0.563385 1.867964E-02 N = 7 0.354059 6.027378E-03 §QQE.§ 32.! §QEAE§ 22 z 2 LINEAR 0.014 1 18.436 .0128 QUADRIATIC 0.092 1 121.610 .0004 CUBIC 0.003 1 3.387 .1396 ERROR TERH 0.001 4 cannot be used since priority 5, 6, and 7 programs have not been included. Regression analysis was therefore performed using all 44 observations with covariates: multiprogramming level (N), priority (j), priority squared (j**2), multipro- gramming level squared (N**2), and the cross-product of multiprogramming level and priority. The coefficients resulting from this analysis are given in the bottom of Table 6. These coefficients are used in computing the weights used in the predictive model. Since there are significant deviations from unity in the ANOVA it can be concluded that the paging model is deficient. At this point it will be assumed that the working set weights that have been estimated are not specif- ic to the programs used in the analysis but reflect error in the paging submodel and continue with the analysis. 120 Table 5. Estimated Working Set Heights HULTIPROGRAHHING SET (N) G 2.1325.le (.1) 2 .~ .5. 2 2.140348 0.699155 0.375096 0.355586 1.799340 0.862835 0.377862‘ 0.368515 0.384188 0.323695 0.205488 0.162483 0.552756 0.363280 0.230628 0.175931 0.421240 0.418820 0.367578 0.279153 0.467356 0.360490 0.268770 0.273000 1.054224 0.708645 0.519882 0.411845 1.180548 0.675260 0.475926 0.426566 . . . 2.849685 1.347792 1.815296 . . . 3.098625 1.236636 2.007369 6 . . . . . . 3.184158 1.917405 6 . . . . . . 3.410190 1.985942 7 O C C C O C C O 0 2.059225 7 C O O O O C O O 0 1.762670 Higggllaneg_§ Parameters Prior to using the model to predict program perfor- mance, the CPU rates (HU), paging indices (A), maximum storage (5) and critical memory point (H) must be estimated. For this purpose the experimental programs were executed in 8 runs of N=4, with each experimental program being executed with 3 copies of the control program. The measured and calculated results of these runs are given in appendix A, Table A11 and A12. 121 Table 6. Working Set Weight Analysis 999399 92.3 99933; 92 I E GRAND HEAN 9.77u 1 1499.218 .0001 NRC EFFECT 0.802 3 122.997 .0001 PRIORITY 0.590 3 90.430 .0001 INTERACTIONS 0.196 9 29.991 .0001 ERROR TERH 0.007 16 .993.3 333 3293 COEFF§ STE 3339. 9: §§. PRIORITY (j) -0.286627 0.301902 j*j .143958 0.035327 . j*N -0.060599 0.065588 . HPG EFFECT (N) .180209 1.019103 ~4 N*N -0.025178 0.090079 CPU Rate Estimation CPU rates for the control programs were estimated from the 4 runs of the control program with N = 4, 5, 6 and 7. The parameter MD was found to vary considerably, even within a single run. This effect is thought to be related to the method used to account for program time and the processing which takes place following an interrupt and before the value of the CPU clock is stored. This has the effect of making CPU service intervals for low priority programs appear larger and therefore the CPU rates (HU) would appear smaller. The rates computed from each of the runs in appendix A, Table A9 and A10 are weighted averages of the rates computed from each program with I/O access rates used as weights. The expansion factors (CYT) were used to increase these rates (decrease service times) by applying the 122 equations in section 2.5, approximating the instantaneous I/O rates by LAH = 1/ET*(1-CP). The rates from the 4 runs were then averaged together and used in subsequent calcula- tions and in the predictive model. The CPU rates for the experimental programs were then estimated from the 8 runs in appendix A, Tables A11 and A12 The same procedures as before were used to estimate CYT and thereby remove some of the effects of "cycle stealing" due to I/O. Virtual Storage Estimation This turned out to be trivial since the programs were constructed to have the same maximum sizes (162K). This number was taken from the job listings which gives a measurement of the maximum virtual storage used. Storage and Paging Index Estimation These parameters were estimated in a sequential fashion. First a "quasi" paging index A' was computed with the ratio of the sum of critical storage to real storage equal to 1. We have A'=V(3)*PI(j)/CP(j)*K*(5(j)‘9(j))- (9-1-2) A linear model was then fitted to this data using ln(A') as the dependent variable and ln(N) as the independent varia- ble. Symbolically we have A = A'*(N*H/R)**-q or A = 123 A'*((H/R)**-q)*N**-q. The choice for H = 56 was justified on the basis of what follows. If the paging rate for the entire system is less than 2 pages a second it may be concluded that memory constraints are 395 active. Depending on the length of a program run, an apparent system paging rate of 2 pages/second can be caused by the initial flurry of activity when the programs begin execution and open files. The fact that the constraint is not active at N = 4 may be inferred from the size of the working sets of the low priority programs compared to the higher priority programs. The memory constraint appears to be just barely active at N = 5 so an appropriate value for H is H = R/S = 280/5 = 56K. Using H = 56K and equation 4.1.2, A is estimated for 4 different priorities of the CNTL program. From Table 7 it may be observed that the values of A vary from a low of 0.000504336 to a high of 0.21720546. This wide variation in the values of A are again indicative of the paging submodel's failure to accurately predict paging rates. To determine what is a reasonable value for A, dimen- sional analysis is applied to the defining equation for paging. The dimensions of interest are those of K and A. In the estimation process, a value of K = 490 was used for the 8/370 model 148 (meaning that the 148's average speed is 490,000 instructions per second). This makes the units of K instructions per thousanths of a second and therefore the Table 7. 124 Paging Index Calculations A' VALUES FOR H = 56K PRIORITY LEVELS 399 LIL (3) 1 2 3 9 4 0.001343 0.000504 0.001292 0.008048 5 0.002593 0.002228 0.002144 0.004502 6 0.028177 0.029452 0.044289 0.070589 7 0.051738 0.045349 0.070277 0.119278 CORR 0.962743 0.977337 0.942172 0.849545 b -16.743895 T19.677474 -18.106903 -13.270179 k 7.110351 8.672056 7.998743 5.655085 A 0.004301 0.002735 0.004414 0.013748 NORHALIZE 1.572647 1.000000 1.613795 5.026601 NORHAIIZATION FACTORS PRIORITIES (j) 929 §E111 (A) 1 2 .3. 9 2 .9 1 4 2.663 1.000 2.562 15.957 . . . . . . . . . 5 1.164 1.000 0.962 2.021 2.345 . . . . . . 6 0.957 1.000 1.504 2.397 3.591 7.375 . . . 7 1.141 1.000 1.550 2.630 4.707 4.556 4.425 units of A are in memory references per instruction divided by 1000. Expressing A in terms of memory references per instruction for the A~values from Table 7, A = 4.3, 2.7, 4.4 and 13.7 respectively. System /370 instructions reference either 0, 1, 2 or 3 locations in main memory. Of these instruction, the ones that reference 0 storage locations are: supervisory call instructions, register instructions and those used for integer and floating point arithmetic. These are not frequently used in commercial programs and, even when they are used they must be used with instructions which reference 1 storage location to retrieve and store calculated data. 125 The instructions which reference 3 storage locations are also infrequently used. These are instructions which actually reference two starting addresses but involve movement of enough data that ocassionally the data overlaps multiple pages. An example of an instruction of this type is the long move instruction. There are a large number of instructions which refer- ence a single location and they are very frequently used. Instructions in this class include branching, register loading and storing and immediate instructions which contain a single byte of data in the instruction. Instructions which reference 2 storage locations are also very frequently used and these include storage to storage moves and compares. These instructions are particularly frequent in commericial applications where they are used extensively for logic and data formatting. Based on the previous discussion and observations of instruction profiles of many programs it is very likely that the A value for most programs would be in the rangeof 1 to 2 storage references per instruction. The smallest of the estimates for A in Table 7 (and the only one in the realm of possibility) indicates 2.73 storage references per instruc- tion. Although this is possible it is not very likely. However the above analysis does point out that the only estimates for paging index in the practical range are those developed from partition 2 of Table 7. In order to make A 126 values estimable using this format, it was decided to normalize A' estimates by the corresponding factors from the bottom of Table 7 by dividing by the appropriate factor. These factors were derived by dividing the A' calculated in each partition of the four runs of the CNTL program by the corresponding estimate for A' from partition 2 (column 2 in the table). To estimate the page indices for the experimental programs, 8 runs of 4 programs each were performed. These runs consisted of a copy of the experimental program and 3 copies of the CNTL program. The measured and calculated results of these runs (see appendix A Tables A11 and A12) and the conversions in Table 7 were applied to compute the estimates of A in Table 8. From the page index estimates of Table 8 it may be observed that the paging manipulation was not successful since the estimates for A are apparently correlated with low I/O variance rather than with the high paging indicators (1X1 in the program identification). Prom the program estimation runs in the appendix, Table A10 it may also be observed that higher CPU service times are apparently correlated with high priority execution. Along with this effect a slight increase in process time may be observed, however the increase in process time is not proportional to the increase in the number of I/O operations, yielding a higher CPU rate. Fewer I/O's are reported at higher priori- 127 ties because the lower priority jobs were started first in order to delay some of the effects of priority on job initiation. At the same time the lower priority jobs are delayed in starting to execute so it makes no sense to use observations during this initiation stage. Thus only those measurements which are taken while all programs are executing are includ~ ed. This procedure should yield results which are contrary to the "expansion effect" and also contrary to the argument that the lower priority programs should have smaller rates (large average service times) because they include "start- up" time and time for initially opening the files. Hhile this effect appears to be systematic the exact cause is unknown at this time. 33.2.1 9.2....r' 481139; “.lfian ' PHAAEAQAS To the extent that the paging manipulation does result in higher paging rates the manipulation was successful. However the previous discussion on page index estima- tion points out that the attempt to-manipulate page index was not successful in terms of the estimates of A since this index is correlated with the I/O variance manipulation and not with the paging manipulation. 128 Table 8. Experimental Program Estimates 2399339 A 3 .99 A ELI 231.99 000 6 1 126.584 0.002069 3 1 4 011 6 1 126.800 0.016921 3 2 3 101 6 1 125.073 0.020515 3 2 3 110 6 1 125.528 0.003124 3 1 4 001 6 4 128.534 0.002655 3 2 3 010 6 4 128.350 0.000941 3 1 4 100 6 4 128.764 0.001193 3 1 4 111 6 4 127.095 0.002889 3 2 3 001 7 1 128.534 0.002655 3 2 3 010 7 1 128.350 0.000941 3 1 4 100 7 1 128.764 0.001193 3 1 4 111 7 1 127.095 0.002889 3 2 3 000 7 4 126.584 0.002069 3 1 4 011 7 4 126.800 0.016921 3 2 3 101 7 4 125.073 0.020515 3 2 3 110 7 4 125.528 0.003124 3 1 4 CNTL 6 2 54.296 0.002735 2 3 4 CNTL 6 3 54.296 0.002735 2 5 6 CNTL 6 4 54.296 0.002735 2 7 8 CNTL 6 5 54.296 0.002735 4 1 2 CNTL 6 6 54.296 0.002735 4 3 4 CNTL 6 1 54.296 0.002735 2 1 2 CNTL 6 2 54.296 0.002735 2 3 4 CNTL 6 3 54.296 0.002735 2 5 6 CNTL 6 5 54.296 0.002735 4 1 2 CNTL 6 6 54.296 0.002735 4 3 4 CNTL 7 2 54.296 0.002735 2 3 4 CNTL 7 -3 54.296 0.002735 2 5 6 CNTL 7 4 54.296 0.002735 2 7 8 CNTL 7 5 54.296 0.002735 4 1 2 CNTL 7 6 54.296 0.002735 4 3 4 CNTL 7 7 54.296 0.002735 4 5 6 CNTL 7 1 54.296 0.002735 2 1 2 CNTL 7 2 54.296 0.002735 2 3 4 CNTL 7 3 54.296 0.002735 2 5 6 CNTL 7 5 54.296 0.002735 4 1 2 CNTL 7 6 54.296 0.002735 4 3 4 CNTL 7 7 54.296 0.002735 4 5 6 Note: Program definitions are given in Figure 9. 129 £29 LLLaACG .49....2111 “4163.15.22 From the closeness of all estimates for CPU service time (rate) it may be inferred that the manipulation for CPU service variance was successful although the measurement tools used in this research do not permit estimation of the absolute magnitude. The following formulas were used to construct the CPU variance manipulations: S=(1+X/K)/HU**2, V(S)=v*(v+1)/3*(K*HU)**2, (4.2.1) E(S)=1/HU, where K is an integer representing the number of times the compute "kernal" is to be executed and X is an integer in [-v,v] chosen with probability 1/(2*v+1), and v is an integer less than or equal to K. In all cases K = 2 is used. For the low variance programs (11X), v = 0 is used for zero variance. For the high variance programs, (XOX) v = 2 was used for a variance of 1/2*HU or approximately 0.000032746 seconds squared compared to a mean service time of 0.00784314 seconds. For the CNTL program v = 1 was chosen. This yields a variance of 1/6*HU or 0.00005653 seconds squared compared to a mean service time of approxi- mately 0.0184176 seconds. 130 H nipplation The I/O variance was again manipulated by construc- tion. The high I/o variance program files were set up with their centers 15 cylinders apart and file sizes of 9 cylin- ders. The low I/O variance programs were set up with their centers 3 cylinders apart and with file sizes of 3 cylinders each. The I/o variance for the low and high variance programs are approximately 93 H5 squared and 123 Hs squared respectively. However, the means are also different in this case, approximately 28 H5 for the high variance programs and 18 Hs for the low variance programs. :3 o. m H 1m 10 a. 0 1+ "-0 to 1m .1...‘ n...- Sixteen runs of the APL model were made to develop the predictions to be used in the analysis of variance. The results of these runs for the experimental programs is given in Table 9. In making these predictions the APL model was found to converge quite rapidly. 0f the 16 runs, 13 converged in 6 iterations, 2 converged in 10 iterations and one required 16 iterations. The convergence criteria was to stop iterating when the maximum absolute difference in I/O rates from one iteration to the next was less than 0.002. This results in an difference of less than 0.002 seconds in I/O service time for the very lowest priority programs. A diagram of the relationship between the various submodels and the H P CPU CYCLE PAGE I/O CHNL OVER NRKG 2399 L 3 NEIL TIRE 3313 3313 BILL 9939 §§I 000 6 1 0.21 0.034 3.5 2.9 26.65 0.033 0.23 53 000 7 4 .04 .114 5.6 4.1 4.60 0.004 .34 24 001 6 4 .05 .131 2.3 1.7 5.79 0.007 .21 30 001 7 1 .18 .032 9.6 8.3 22.90 0.018 .36 57 010 6 4 .05 .138 1.5 1.1 6.08 0.008 .20 19 010 7 1 .19 .034 6.5 5.5 23.47 0.019 .33 36 011 6 1 .22 .031 5.8 4.8 26.85 0.034 .26 111 011 7 4 .02 .097 10.4 7.6 2.65 0.002 .37 46 100 6 4 .05 .137 1.7 1.2 5.95 0.007 .20 22 100 7 1 .18 .034 7.3 6.3 22.65 0.018 .34 40 101 6 1 .22 .031 5.7 4.7 26.76 0.034 .27 119 101 7 4 .02 .096 10.9 8.0 2.40 0.002 .38 48 110 6 1 .21 .034 4.0 3.3 26.18 0.033 .24 60 110 7 4 .04 .111 6.3 4.6 4.30 0.003 .35 27 111 6 4 .05 .130 2-5 1.9 5.75 0.007 .21 32 111 7 1 0.18 0.032 10.1 8.7 22.50 0.017 0.36 60 Table 9. Hodel Predictions Note: Program definitions are given in Figure 9. convergence algorithm is given in Figure 10. Hgdel gglidatio The experimental programs were then run on the System /370 Hodel 148. Since each run was repeated there were a total of 32 runs. The measured and the calculated perfor~ mance variables from the experimental programs are given in Tables 10 and 11. 132 DE AND mfl< Hove: scauomumusn mcaaemnmommauasz .oa gunman damm.Q.m.<.M.m v 9m.mo Hmcosnmm Hm madman m m 2‘ mm... IKIIKII flaw mmu Hx Hmcosnmm moo Ham Hmcosnsm 3.8 9.8 mom O\H 9 .m. .0H .m. .3 532662 59 m.couzmz no 04 AND mm MB H0 133 Table 10. Experimental Runs - Heasured H P PAGES PAGES OTHER ELAPSED PROC OVER- HRKG 99.9 9 9 999 99 929 9999 9999 9999 999 000 6 1 99 303 8,720 342.09 69.75 69.62 50 000 6 1 68 203 8,720 330.70 69.56 65.29 52 000 7 4 1,241 7,413 8,720 1,859.87 72.39 442.99 42 000 7 4 821 5,409 8,462 1,511.70 69.30 362.51 42 001 6 4 234 1,277 8,718, 833.58 69.23 176.30 52 001 6 4 288 1,423 8,718 842.08 69.31 182.45 50 001 7 1 128 1,009 5,800 322.98 47.06 75.22 54 001 7 1 217 1,462 8,353 469.79 67.93 110.99 48 010 6 4 197 1,172 8,720 848.99 69.32 182.52 54 010 6 4 182 1,048 8,720 880.52 69.18 181.84 46 010 7 1 125 1,574 7,709 489.00 63.11 115.97 48 010 7 1 106 1,479 8,057 478.64 65.76 114.60 54 011 6 1 119 426 8,718 330.65 70.36 65.86 48 011 6 1 142 419 8,718 325.86 70.55 67.47 52 011 7 4 815 5,086 7,331 1,365.27 60.97 325.92 46 011 7 4 950 5,700 6,765 1,455.22 56.81 344.14 44 100 6 4 919 4,413 8,720 1,266.70 70.30 296.02 58 100 6 4 1,191 5,808 8,720 1,463.46 70.93 347.36 56 100 7 1 223 3,537 8,498 743.75 69.70 178.52 54 100 7 1 275 4,008 8,513 819.89 70.10 196.13 62 101 6 1 127 578 8,718 337.04 70.57 70.14 58 101 6 1 129 525 8,718 330.45 70.49 69.06 60 101 7 4 1,862 7,220 1,329 1,460.52 13.25 346.28 38 101 7 4 1,641 6,061 1,060 1,244.75 10.70 292.30 36 110 6 1 83 492 8,720 367.05 70.33 78.89 58 110 6 1 84 409 8,720 358.19 70.47 74.85 58 110 7 4 1,693 6,133 927 1,257.55 9.47 295.34 40 110 7 4 1,627 6,206 1,224 1,254.70 11.85 296.95 42 111 6 4 708 3,477 8,718 1,103.52 71.18 255.99 56 111 6 4 678 3,482 8,718 1,100.18 70.92_ 254.93 56 111 7 1 241 2,618 8,601 620.79 71.15 147.87 56 111 7 1 259 2,468 8,432 593.80 69.73 141.51 54 Note: Program definitions are given in Figure 9. 001 001 001 001 010 010 010 010 011 011 011 011 100 100 100 100 101 101 101 101 110 110 110 110 111 111 111 111 ~J~IO\O\ QQmCh #4010) 1H3 .=£=a.a 4.9;“: d-aSLc cue-Ad mam QQGG fik—I—I d-‘kk \lsJO‘O‘ QQO‘O‘ ~1qu QQO‘G CC-I—I 4.19:4: Table 11. CYCLE PAGE CPU HEEL 2295 0.20 0.038 .20 .037 .04 .115 .05 .109 .08 .083 .08 .083 .15 .047 .15 .048 .08 .086 .08 .090 .13 .053 .14 .050 .21 .036 .22 .036 .05 .110 .04 .117 .06 .096 .05 .101 .09 .062 .09 .065 .21 .036 .21 .036 .01 .171 .01 .175 .19 .040 .20 .039 .01 .178 .01 .169 .07 .090 .06 .090 .12 .055 0.12 0.054 O O I admh) o o o 6 £80.; 03300 d-IQU') 010010 “-1de o 0 m 0000.... o o o o 14.10130 ONUIOCD ww-A—s cum-1.: 0 ANN: m o o @Us’t‘h SC-I-i 40de NOADONQ \DCDOU'I \dew 0 O 10de 4:423:01 ww—I-I O... O‘O‘NN U1U'1C: l O 0 0 O O O ONO.) kt—I—I ca---: 0000 0000 sauna 001—5.4 mmmm MM‘O‘ kcww mono NNNN I/O 99!! 25.49 26.37 4.69 5.60 10.46 10.35 17.96 17.78 10.27 9.90 15.77 16.84 26.37 26.76 5.37 4.65 6.89 6.00 11.43 10.38 25.87 26.39 0.91 0.85 23.76 24.35 0.74 0.98 7.90 7.93 13.86 14.20 CHNL QTIL 0.021 0.021 0.015 0.014. 0.012 0.012 0.022 0.022 0.011 0.011 0.020 0.020 0.023 0.023 0.015 0.015 0.016 0.016 0.020 0.019 0.024 0.024 0.016 0.016 0.021 0.021 0.016 0.017 0.015 0.015 0.021 0.022 Experimental Runs - Calculated Note: Program definitions are given in Figure 9. OVER HRKG 9999 999 0.20 50 .20 52 .24 42 .24 42 .21 52 .22 50 .23 54 .24 48 .22 54 .21 46 .24 48 .20 48 .21 52 .24 46 .24 44 .24 56 .24 54 .24 62 .21 58 .21 60 .24 48 .24 48 .22 60 .21 60 .24 27 .24 27 .23 32 .23 32 .24 60 0.24 60 135 The predictions and these results were then merged and made inputs to an analysis of variance using Finn's HULTI- VARIANCE [22]. The criterion variables were transformed into a relative error by dividing the differences between the predictions and the outcomes by the outcomes. Results 13.. 999.1..819 f V 999.99 Hypgghg§_§ 1:6. Examination of the Hultivariate Analysis of Variance results (Table 12) indicates significance for all 10 interactions. This prevents tests for main effects because the main effects are gggfgggggg with the interac- tions. Based on the analysis of variance it could well be that there is not a fixed effect but this cannot be shown using these tests. This effectly bars statements about the fixed effects of the five factors other than to say that there is apparently significant errors in the predictive model. Thus these techniques are unable to help further in analyzing the model. Univariate techniques will now be explored for some partial answers. Table 12. pg 136 ANOVA Table LEAST SQUARES ESTIMATES OF .3236 -.0636 .9023 of Prediction Errors Q B .99 EFFECTS-GRAND MEAN .8109“ -02218 .3 149 -.0369 STD ERRORS OF LEAST SQUARE ESTIMATES - GRAND MEAN .0181 .0044 .0436 .1‘_E__P;_I E EGR GRAND HEAN PAGE INDEX CPU VARIANCE I/O VARIANCE MPG LEVEL PRIORITY PGNDX PGNDX PGNDX PGNDX CPVAR CPVAR CPVAR IOVAR IOVAR MPG X X X X CPVAR IOVAR MPG PRIORITY IOVAR MPG PRIORITY MPG PRIORITY PRIORITY .0486 .0 020 .0 F(7,10)= 2,268.646 F(7,10)= F(7,10)= F(7,10)= 154.835 4.626 128.399 F(7,10)= 3,296.047 F(7,10)= 6,596.086 F(7,10)= F(7,10)= F (7,10) = F (7,10) = F(7,10)= F(7,10)= F(7,10)= F(7,10)= F(7,10)= F(7,10)= 32.045 17.399 41.494 56.054 23.016 27.486 6.103 32.918 36.309 376.800 372 .0091 P<.0001 P<.0001 P<.0150 P<.0001 P<.0001 P<.0001 P<.0001 P<.0001 P<.0001 P<.0001 P<.0001 P<.0001 P<.0057 P<.0001 P<.0001 P<.0001 137 9999 99. 9 99999999 99.91999 Hypgthe s 1313. To help assess the effectiveness of the model, confidence interval analysis will now be employed. The null hypotheses will be supported if the confidence intervals calculated from estimates of the of the grand mean of the relative error lies in the interval [-.15,+.15]. The confidence intervals are of the form u1t*s/n**.5, where u is the estimate of the mean, and t is the value with 16 degrees of freedom at the .025 probability level (2.12), n is the number of observations (32), and s is the standard error of the estimate [22]. The confidence intervals for the dependent variables are as follows: PROGRAH CPU UTILIZATION (CP): [0.282928,0.364252] PROGRAM CYCLE TIHE (ET): [-.073485,-.053715] TOTAL PAGING RATE (PR): [0.740463,0.958317] PAGING INPUT RATE (PI): [0.804467,1.000070] CHANNEL UTILIZATION (RCU): [-.226226,-.217274] CPU OVERHEAD (CHT): [0.231618,0.398242] HORKING SET (w): [-.057384,0.016416] From this it can observed that hypotheses 8 and 13 are supported but hypotheses 7, 9, 10, 11, and 12 do not receive support. Thus from a statistical point of view, the model does not predict the performance variables of interest within the desired error margin of 15%. A possible source of error might have been the fact that the CPU submodel does not expligitly account for variance in I/O service time. The I/O service time depends on a random pattern of accesses by device, channel conten- tion, rotational position sensing, and even device errors and retry. Hinor sources of variation in CPU service times may be attributable to error routines and exception rout— ines. A potential source of error in the CPU submodel is the fact that the model does not address the portion of CPU overhead which is not igtggpuptiblg, nor does it handle the portion of I/O processing which g_p interrupt higher priori- ty programs. A major source of error is the scaling effect due to differences in small values of CPU utilization. For exam- ple, the difference between a predicted CPU utilization of '0.02 and an observed utilization of 0.01 is of little practical importance but the relative error of 100% serious- 1y biases the CPU utilization statistic. The major source of error in the model is the subsys- tem which deals with paging. Host previous authors have either considered the paging rate to be fixed or ignored it altogether. Some authors have assumed the rate of paging operations to be proportional to the instruction execution rate for the task itself (this approach ignores data refer- ences). Others predict paging rates but only for certain "safe" values (where the paging "function" is more or less linear). The choice of two factors, critical storage size and page index to represent a program's paging behavior is an oversimplification. A program's address space may consist of many distinct types of data, each of which may have a different reference pattern. For example, the executable instructions of a program may be composed of loops, jumps, or straight-line code and the instructions may or may not modify themselves. The data portion may be scanned sequen- tially or searched using a binary or indexed technique. Data areas may be separate from instructions or the data may be interspersed among the instructions, a situation referred to as 99999991 99‘999999999- The use of a program's full size as a parameter of page demand is somewhat unrealistic in view of the fact that a certain portion of many programs are composed of error routines, initialization and termination routines which 140 execute an insignificant amount of time. The proposed model includes some of the methods of previous authors, and therefore includes some of the shortv comings. In particular the paging coefficients estimated for the page-in and page-out operations may be more repre- sentative of the characteristics of the programs run on the system than they are of the operating system and memory size of the system. If this is indeed the case, the error in predicting the performance of a particular set of programs will vary according to their deviation from the "typical" programs in an installation. The assumption that CPU overhead processing is carried out in non-pageable memory is contrary to the known opera- tion of the Operating system used in these experiments. Some portions of the operating system used in this research compete with the user programs for the available real storage pages. A final source of error in the paging submodel is that the paging indices were estimated at a multiprogramming level of N=4. From previous comments about the lack of a real memory constraint at N=4, it is likely that more stable estimates for this parameter could be obtained at higher multiprogramming levels. 141 gggtvExperimentg; Paging Agglysig -9999999 999999 99940de1 The results of the program runs with 4, 5, 6, and 7 copies of the control program were fitted to a model of the form w(j) = Q*CP(j)**1/k. This model was found to account for 6%, 69%, 99%, and 97% of the variation in w(j) for N = 4, 5, 6, and 7 respectively. The constant of proportionali- ty (Q) was found to fit the expression R/sum[CP(l):l=1,..,N] very closely. This suggests a general expression for working set size of of the form "(3)=Q*(3(j)*CP(3))**1/ko (“-5-1) where Q = R/sum[(A(l)*CP(l))**1/k:l=1,..,N]. Working backward from this expression, the following expressions are suggested for the real-time paging rate: PI(j)=CP(j)*A(j)*K*EXP(N)/(k-1)*w(j)**(k-1), (“-5-2) EXP(N)=(sum[H(l):1=1,..,N]/R)**q. The value of g was estimated to be 7.14 and the value of k was estimated to be approximately 4. Subsequent runs of the predictive model with the revised paging submodel and new estimates of A for N=7 yielded the following mean errors: 142 SE 99 £11 2.1. 99.9 9.112 9'. 0.285 0.006 0.343 0.298 0.179 0.068 0.010 This significantly reduces the errors in paging and the size of the relative errors in CPU utilization (CP) and channel utilization (RCU) is believed to be due to the scaling effect mentioned earlier. 9292..ar9929 9....ith 9999999 9..od999 Fitting the control program runs to the Belady-Kuehner "lifetime function“ [6] accounts for 47S, 47S, 98%, and 96% of the variation in paging rates. Likewise the paging expansion factor EXP(N) was found to account for 96% of the variation in the system paging rate. The large exponent value (7.14) was unexpected in view of quadratic and linear segments in Bard's Paging Index [3]. Chamberlain, Fuller, and Liu's Half-life function [12] was found to account for 49%, 39%, 99%, and 96% of the variation in paging rates when fitted to the data from the control program runs. V. CONCLUSIONS AND RECOMMENDATIONS Research gonglu519g§ The knowledge gained by study of the model discussed herein is a step toward better predictive methods for computer performance, I/O configuration performance and program performance. The system overhead due to handling I/O requests was found to be proportional to the I/O rates and there was a small but significant positive effect due to program priori- ty. There was a small but significant negative effect due to multiprogramming level. The paging overhead was found be proportional to paging rates but the small effects due to priority and multiprogramming levels were not statistically significant. The increase in the ratio of page-writes to page-reads was not significant but there was a significant increase with increasing priority. The method of sucessive estimation and the convergence techniques used in this model were found to be very effec- tive. Convergence was usually obtained in less than 6 iterations. Improvements in the algorithm made after the 143 144 experiments had been performed reduced the number of itera- tions to the range of 4 to 5 iterations for about half of the model executions. While the original goals of this research have not been achieved, positive benefits have been gained from this effort. For one thing the least elaborate part of the entire model, the paging submodel, has turned out to be the source of much of the error. Part of the reason why this part of the model is so simple is because this area is not as well understood as the operation of the CPU and I/O devices. Failure of the model to accurately represent paging behavior has a more serious effect on some of the criterion variables than others. For example, the criterion variables that seem to have the largest errors are paging counts (as expected) but also CPU overhead. This is not so hard to understand in view of the demonstrated effect of paging rates on CPU overhead. The techniques employed with the variable paging manipulation of the synthetic programs can be used as a tool in controlling loads on other systems during performance measurement, systems tuning and benchmarks. The CPU and I/O submodels are well elaborated and it is possible that they can be of benefit to others who will study the kinds of systems studied in this research. The reasons why these models are so elaborate is that suitable models for this research were not generally available and the insights 145 gained from the model building process stimulated further elaboration. Failure to accurately predict paging behavior does not prevent the model in this thesis from explaining many other events such as CPU and I/O service intervals, initial wait and completion times since paging rates are a matter of measurement in existing systems. It is possible that the use of the revised paging submodel can make this method worthwhile for practical applications. The inherent errors in this submodel at unconstrained memory levels is not too serious a defect since working set estimates are usually unimportant at unconstrained memory levels and the paging rates in these cases is usually neglible. Another benefit of this effort is that it clearly points out the direction that further research in this area ought to take. gggommendatiggg g_£ Fgrthg; Research Clearly the direction for future research in this area is the development of better analytical models of paging behavior. A likely candidate for this research would be some variation of Belady and Kuehner's lifetime function approximation [6]. Another possibility is to validate the analytic paging model based on the relationship between the CPU utilization and working set size. Post-experimental analysis of the data obtained in this research revealed strong evidence for a relationship of the form w=Q*CP**1/k. 146 This could become the basis of more research into paging in priority interrupt driven systems. The page-write/page-read ratios were not central to this research, however questions posed here suggests further study in this area. There is no reason why the data for such a study cannot be drawn from the job accounting data collected from production systems rather than from experi- ments. Methods of analysis other than factorial analysis of variance is recommended since there is some indication that interactions will reduce the usefulness of this kind of analysis. One way to overcome this problem is to perform enough observations so that independent one-way analyses can be performed. A larger number of sample points can be obtained with fewer actual program runs by including 2;; programs from each run in the analysis. The size of the relative errors in predicting the cycle time (ET) and the system overhead (OHT) suggests that the large relative errors in the program CPU utilization (CP) and the channel utilization (RCU) are scaling effects. It is therefore recommended that these variables should be scaled so that the contribution to overall error of differ- ences in small values is proportional to their importance. It is also recommended that smaller compute kernals be used for synthetic programs. This would allow a greater 147 range and control of the paging rates and the CPU variance. For example compute kernals of about 20% the size of the ones used in this research would have been more appropriate and would have allowed a greater range of CPU variance. The needs expressed in the beginning of this thesis will continue to inspire the investigation necessary to the accomplishment of better ways to perform capacity manage- ment, resource scheduling, and data processing system tuning. GLOSSARY OF TERMS g_g§§§ 25g: The mechanism which is used to physically position the read/write heads of disk-type devices for I/O operations. gpplicaglgg pgggra : A program which serves an end-use function in a computing system rather than a utility function. Examples are: payroll, inventory, etc. bggghgggk: Execution of a prescribed set of programs on one or more different computer systems for the purpose of performance measurement or comparison. blgggg pggg allocatlon: A method of partitioning the pageable real memory of a virtual computer system which arbitrarily favors certain programs over others with respect to minimization of the system paging rate. bgffering: A method of I/O data management which reads or writes data from two or more areas of real memory, allowing CPU and I/O operations of the same program to be overlapped. ggggggl servgg: The node of a queuing network which represents the CPU in some program models. The other nodes of the network represent the I/O devices. ghgnnel: The data path between an I/O device or control unit and the main memory of a computer system. instant a program accesses the CPU until the program generates an I/O or paging operation. gggpute kggggl: A collection of machine (computer) instructions which is artificially constructed to represent a "typical" program or to serve as a basis for comparisons in benchmarks. 92g 99399: A situation which prevails when the CPU is the limiting resource in a computer system. In this thesis a program is considered to be CPU bound if its CPU utilization exceeds 50% when no other programs are active on the system. 148 149 gggflgggatlgg: A particular combination of memory, processors, I/O devices and channels in a computer system. gyllgggg: The collection of all records which can be read or written with one physical positioning of a disk access mechanism (arm). glgggl gggggg gevlce: An I/O device with the ability to read or write data at any location on the recording medium without regard to sequence. Examples of direct access devices are disks and drums. glgggg gggggg gethod: The method of data access to disks or drums which takes advantage of the direct access characteristics of the device. This usually implies that the data is organized for direct access and the physical address of the data is generated by the program based on a randomizing algorithm. glgk §ghgduling: The queue management and service discip- line for disk I/O requests. Examples of disk schedul- ing strategies are First-Come-First-Served (FCES), Priority, and SCAN. glgk: A type of direct access device which uses iron oxide coated platters or "disks" as a recording medium. Data is recorded in concentric circles about the axis of a shaft which rotates the surfaces under read/write heads. A number of disk surfaces may be mounted on a single shaft. glgpgtchigg: The process of storing the status (or state) of the program which has control of the CPU and restoring the status of a suspended program. The process of switching control of the CPU from one program to another. _l§§: Execute-Channel-Program. A request by an executing program to the Operating System to commence a specific I/O operation. FC §: A queue service policy, First-Come-First-Served (see FIFO). 9.1..- : A queue service policy, First-In-First-Out (see 5 The mean program execution interval between page 150 lLQ pagag: A situation which prevails when the I/O devices are the limiting resources in a computer system. A program is considered to be I/O bound if it spends greater than 75% of its elapsed execution time waiting for I/O when it is the only program active on a computer system. laaaagg aggagg: A method of disk access which requires that the location of a particular disk record be retrieved from an index before the data record itself can be read. The index is a disk file but it may be made core resident during program execution. laltlal £2953 The elapsed time from the completion of a program's I/O until the program has access to the CPU. This implies that another program is using the CPU when the program in question becomes ready to run. lglgggapt: An event which triggers a change in the program which is in control of the CPU. An interrupt may be voluntary, as in a request for I/O, or it may be involuntary, as is the case for a page exception or the completion of another program's I/O. In the latter case control will be ultimately given to the highest priority program that is ready to run. l§: A queue service discipline for a server with infinite capacity, or an Infinite Server. This differs from the PS server in that the arrival of new requests does not diminish the rate at which previous arrivals are served. and recording I/O counts, paging counts, storage allocations and elapsed time and cumulative CPU time for the processes or programs in a computer system. I" ._§§: Last-Come-First-Served queue service discipline. The most recently arrived request is served immediate- ly and the earliest arriving requests that are still unsatisfied are not served until all later arrivals are served. lifetiaa fgactiga: A relation between the amount of real memory allocated to a process (program) in a virtual system and the process time between page exceptions. RU: A paging algorithm which attempts to select the Least-Recently-Used page frame to satisfy a paging exception. A stack procedure which is used to imple- ment the LRU algorithm. 151 aaahgg a; falaa pgalplga: A method for accelerating convergence of a sequence by approximating the deriva- tive in Newton's Hethod with a divided difference of previously computed sequence values. agfigla: The collection of disk surfaces which rotate about a common shaft in a disk I/O device. It is also referred to as a spindle. mu l_lp_ggramming: The process of concurrent execution of tw wo or more programs in a single computer system. mult £pgggaaming gap: The collection of all the programs or tasks that are concurrently executing in a single computer system during the interval under considera- tion. 929229929299 agagay: The portion of high speed memory which is fixed and not allowed to be paged in a virtual system. Generally this portion of memory is used by the operating system for critical system tasks such as I/O or paging management. gapancy: The probability of executing the instructions in a particular page of a program's address space. gyggheag: The portion of elapsed time during which the operating system is performing functions on behalf of problem (user) programs. This time is not attributed to any specific program by job accounting. pagg aagapglga: An interrupt which occurs whenever reference is made to a page of virtual memory which is not in real memory, is. page fault. pagg faglt: A page exception. paga: The smallest unit of virtual memory that is managed by the paging mechanism of the virtual system. In this thesis a page is fixed and either 2048 or 4096 bytes. paglag: The process of selecting a page frame of real memory, possibly writing its contents onto auxiliary storage (disk or drum), and reading the missing virtual page into the selected page frame in virtual systems. paglpheral: An I/O unit or piece of data processing equipment which is locally attached to the central processor (CPU) but is not a part of it. 152 pglgglly aghgggllag: The operating system policy of giving control to the program which has the greatest importance (possibly on an arbitrary scale) and is ready to run. pglgglay;£a§aag garvice: The queue service discipline which services the highest priority arrival in the queue and resumes service to any interrupted arrival only after all higher priority arrivals have completed service. pggblaa 599553 The state of the CPU when user program instructions are being executed. In this state, certain CPU instructions which may compromise the integrity of the entire system are prohibited. This is in contradiction to system state, during which any instruction may be executed. pgggggnggra: The form of the solution for network queuing problems which expresses the probabilities for network and subnetwork states as the product of the probibilities for the individual nodes of the network. Such a problem is said to be separable. g§: A queue service discipline for a server which has Processor Sharing. In this case the server can service all arrivals simultaneously, but with the individual service rates identical and inversely proportional to the number of arrivals being served. This construct is the limiting case for round-robin scheduling. gaag_a aggaag: See direct access method. aagponse: In the case of terminals, usually the elapsed time from data input until the answer or acknowledge- ment is received at the terminal. In the case of program I/O, response time is the time from them request to begin an I/O operation until the operation is completed. _§§: Rotational Position Sensing. The technology which allows some disk or drum I/O units to disconnect from the channel during rotational delay and to reconnect to the channel when the desired records come under the read/write heads. This greatly increases the data carrying capacity of the channel. §lT_: An I/O scheduling method for disks and drums which services a queue of I/O requests so that requests with the shortest access time (from the current angular displacement of the recording medium or the current sector) are serviced first, Shortest-Access-Time-First. 153 §§_N: An I/O scheduling method for disks which moves the access mechanism back and forth over the disk surfac- es, stopping at cylinders which have outstanding requests to be serviced. agaedullag: In this thesis, scheduling is understood to mean the arrangement and sequencing of programs or jobs for production as performed by a person or a program which is not a part of the operating system. In other uses of the word "scheduling", the meaning is made clear by the presence of another word such as "disk", "I/O", etc. seek: A physical movement of a disk access mechanism or arm from one position or cylinder location to another. §aal;ga;kg§§ paocess: A stochastic process characterized by the fact that the probability distribution of the time between changes in system states depends only on the initial and terminal states and is otherwise independent of the previous history of the system. aapagable: A problem which can be formulated in such a way that the solution is expressible as a product of probabilities (see product form). gaaggg aagggy: Portions of virtual memory having subrout- ines which are executed by two or more independent programs. Such subroutines may not modify instruc- tions or data areas in the shared memory, and are said to be gazaaarant. spindla: See module or disk. §pggllag: A method of increasing overall computer system efficiency by freeing programs from direct interaction with low speed I/O devices. The input is read from data input devices by a "reader" program, blocked and written to disk. When the actual processing program is executed, its card and print output is blocked and written to disk. Actual punching and printing are accomplished asynchronously by "writer" programs. _§l§: A disk scheduling method which services I/O requests in the order of the shortest seek times from the current location of the access arm, Shortest-Seek-Time-First. 154 51555! 52595: The state of the CPU when operating system instructions are being executed. In this state, all instructions are allowed including special system (privileged) instructions. Host of the time in this state is not attributed to any specific program and may be thought of as overhead. alag;sharlag: The process of giving computer access to terminal users so that the user can interact with the computer with little regard to the absence or presence of other users. Time-sharing systems generally use the round-robin or time-slicing scheduling discipline. The service discipline rotates access to the CPU among the members of the queue in discrete increments of time. tgaaaactiga: A logical exchange of data between a custom- er or user and the data processing system. Examples of transactions are payments, orders, requests, requisitions and inquiries. algagal systaa: A computer system (including hardware and software) which is capable of executing programs of any size without regard to the size of the computer's real memory. aggklag gap: The collection of all pages of a program (or system) which has been referenced during some arbi- trary interval. APPENDIX A Table A1. lW'U GwN—b thaw»... O‘U’ISUJN.‘ PAGES OUT APPENDIX A Non-paging Runs Repl I - Heasured PAGES OTHER 991 929 11,961 12,002 6,334 11,900 5,571 2,811 11,712 6,718 3,736 2,151 11,662 6,507 3,820 2,332 985 11,448 6,673 3,942 2,469 1,131 528 156 9999 301.01 301.01 342.50 342.50 342.50 357.18 357.18 355.12 355.12 431.66 431.66 431.66 429.59 429.59 437.97 437.97 437.97 437.97 431.77 431.77 452.48 452.48 452.48 452.48 452.48 435.96 435.96 ELAPSED PROCESS 9999 97.87 34.07 96.98 117.44 53.16 95.62 102.71 52.45 59.24 93.67 123.99 68.68 40.35 73.81 92.54 119.86 70.04 43.94 18.97 76.10 90.86 122.83 72.24 46.40 21.70 9.97 77.49 URKG 999 58 1136 60 56 1136 60 58 54 1136 62 56 56 56 113 60 58 58 58 56 1136 62 58 58 60 56 56 1136 157 Table A1(Cont.). 1511'!) qmmcww... (DQONU'ISWN... PAGES PAGES OTHER ELAPSED PROCESS NRKG 999 99 929 9999 9999 999 . . . . . . 10,600 429.87 83.78 62 . . . . . . 6,343 429.87 116.74 60 . . . . . . 3,766 431.93 68.97 56 . . . . . . 2,370 431.93 44.36 58 . . . . . . 1,149 431.93 21.88 58 . . . . . . 547 431.93 10.11 58 . . . . . . 177 411.27 3.90 56 411.27 73.09 1136 . . . . . . 10,284 419.55 81.35 66 . . . . . . 6,115 419.55 112.55 58 . . . . . . 3,648 421.62 67.05 56 . . . . . . 2,317 421.62 43.35 58 . . . . . . 1,121 421.62 21.36 58 . . . . . . 557 421.62 10.43 56 . . . . . . 216 421.62 4.70 58 . . . . . . 75 351.39 1.57 54 351.39 62.32 1138 29.9 01101 OVRHED 01101 02011 OVRHED 01101 02011 03001 OVRHED 01101 02011 03001 04000 OVRHED 01101 02011 03001 04000 05110 OVRHED 01101 02011 03001 04000 05110 06101 OVRHED Table CPU 9999 0.33 .11 .28 .34 .16 .27 .29 .15 .17 .22 .29 .16 .09 .17 .21 .27 .16 .10 .04 .18 .20 .27 .16 .10 .05 .02 .18 A2. CYCLE 9999 0.025 .029 .054 .030 .064 .126 .037 .064 .116 .200 .038 .067 .115 .188 .438 .040 .068 .115 .183 .400 .824 TI/O RATE 35.05 18.50 33.32 15.60 7.92 27.14 15.57 8.66 5.01 26.63 14.86 8.72 5.33 2.28 25.30 14.75 8.71 5.46 2.50 1.21 158 CPU CPU 3929 9999999 122.22 0.00818 123.76 .00808 53.94 .01854 124.46 .00804 54.25 .01843 53.61 .01865 125.05 .00800 54.19 .01845 54.42 .01838 53.33 .01875 126.03 .00794 54.30 .01842 54.56 .01833 53.09 .01884 51.97 .01924 126.00 .00794 54.33 .01841 54.58 .01832 53.23 .01879 52.17 .01917 53.05 .01885 PAGEIN 9999 Non-paging Runs Repl I - Calculated I/O 9999 39.74 35.05 18.50 33.32 15.60 7.92 27.14 15.57 8.66 5.01 26.63 14.86 8.72 5.33 2.28 25.30 14.75 8.71 5.46 2.50 1.21 CYCLE EACTQR 1.000 0.997 0.992 0.996 0.992 0.987 0.995 0.992 0.989 0.985 0.995 0.992 0.989 0.986 0.983 0.995 0.992 0.989 0.986 0.983 0.982 OVRHED 01101 02011 03001 04000 05110 06101 07100 08000 OVRHED CPU 9.2;; .20 .27 .16 .10 .05 .02 .01 .18 .19 .27 .16 .10 .05 .03 .01 0.00 .18 CYCLE El!!! .041 .068 .115 .182 .376 .788 2.311 .041 .069 .116 .182 .376 .756 1.943 4.624 159 Table A2(Cont.). TI/O 5.5.2.3. 24.66 14.76 8.72 5.49 2.66 1.27 0.43 24.52 14.58 8.66 5.50 2.66 1.32 0.52 0.22 CPU CPU 3521:; 52an 126.53 .00790 54.34 .01840 54.62 .01831 53.44 .01871 52.57 .01902 54.22 .01844 45.63 .02192 126.43 .00791 54.34 .01840 54.43 .01837 53.47 .01870 52.53 .01904 53.48 .01870 46.20 .02165 48.44 0.02065 PAGEIN 139.23 I/O CYCLE BEE 24.66 14.76 8.72 5.49 2.66 1.27 0.43 24.52 14.58 8.66 5.50 2.66 1.32 0.52 0.22 245.193 0.995 0.992 0.989 0.987 0.985 0.983 0.982 0.995 0.992 0.989 0.986 0.984 0.982 0.981 0.981 Table A3. (113de SCUM—I (”N-d N-b .a O‘UlfiuJN-L gonnawm-n 0340\mfiwN—l PAGES 922 PAGES l! 160 OTHER £49 11,977 11,982 6,388 11,965 5,606 2,856 11,838 6,810 3,736 2,151 11,724 6,571 3,858 2,391 1,042 11,582 6,751 4,075 2,523 1,183 544 11,159 6,607 3,948 2,441 1,169 585 198 10,327 6,182 3,785 2,380 1,161 596 228 81 ELAPSED 2w: 301.03 301.03 342.51 344.58 344.51 359.20 361.27 357.14 357.14 431.71 431.71 433.77 431.71 431.79 444.15 444.15 444.15 444.15 435.89 435.89 460.81 460.81 460.81 460.81 460.81 448.41 448.41 450.57 450.57 450.57 450.57 450.57 450.57 429.92 429.92 427.90 427.90 427.90 427.90 427.90 427.90 425.83 354.72 354.72 Non-paging Runs Repl II - Heasured PROCESS 2132 97.70 33.94 96.78 117.73 53.45 96.29 103.00 52.76 59.86 94.14 124.76 68.35 40.10 74.73 93.30 120.78 70.55 44.88 19.99 77.16 91.96 123.82 74.41 47.29 22.64 10.13 79.96 88.46 121.88 72.14 45.90 22.16. 10.86 4.21 76.83 82.02 113.20 69.14 44.21 21.99 11.09 4.92 1.59 63.14 BBQE 01101 OVRHED 01101 02011 OVRHED 01101 02011 03001 OVRHED 01101 02011 03001 04000 OVRHED 01101 02011 03001 04000 05110 OVRHED 01101 02011 03001 04000 05110 06101 OVRHED 01101 02011 03001 04000 05110 06101 07100 OVRHED 01101 02011 03001 04000 05110 06101 07100 08000 OVRHED Table A4. CPU CYCLE 92;; an; 0.33 0.025 .11 .28 .029 .34 .054 .16 .27 .030 .29 .064 .15 .125 .17 .22 .036 .29 .063 .16 .116 .09 .201 .17 .21 .038 .27 .068 .16 .115 .10 .186 .05 .418 .18 .20 .040 .27 .068 .16 .113 .10 .183 .05 .389 .02 .823 .18 .20 .040 .27 .068 .16 .114 .10 .185 .05 .385 .02 .769 .01 2.160 .18 .19 .041 .27 .069 .16 .113 .10 .180 .05 .368 .03 .717 .01 1.860 0.00 4.326 TI/O 3.42.1; 39.79 34.99 18.54 33.31 15.52 8.00 27.42 15.78 8.62 4.99 26.40 14.80 8.69 5.39 2.39 25.14 14.65 8.85 5.48 2.57 1.22 24.77 14.67 8.77 5.42 2.60 1.30 0.46 24.14 14.45 8.85 5.56 2.72 1.40 0.54 0.23 161 Non-paging Buns CPU 3.9.23; 122.60 123.81 54.27 124.27 54.44 54.15 125.76 54.59 54.67 53.66 125.68 54.41 54.70 53.30 52.18 125.96 54.53 54.78 53.35 52.31 53.80 126.16 54.21 54.74 53.20 52.80 53.97 47.25 125.92 54.62 54.76 53.85 52.83 53.83 46.55 51.74 CPU QLANIEE 0.00816 .00808 .01843 .00805 .01837 .01847 .00795 .01832 .01829 .01864 .00796 .01838 .01828 .01876 .01916 .00794 .01834 .01826 .01874 .01912 .01859 .00793 .01845 .01827 .01880 .01894 .01853 .02117 .00794 .01831 .01826 .01857 .01893 .01858 .02149 0.01933 PAGEIN £223 Repl II - Calculated I/O RAT§ 39.79 34.99 18.54 33.31 15.52 8.00 27.42 15.78 8.62 4.99 26.40 14.80 8.69 5.39 2.39 25.14 14.65 8.85 5.48 2.57 1.22 24.77 14.67 8.77 5.42 2.60 1.30 0.46 24.14 14.45 8.85 5.56 2.72 1.40 0.54 0.23 CYCLE FACTOR 1.000 0.997 0.992 0.996 0.992 0.987 0.995 0.992 0.988 0.985 0.995 0.992 0.989 0.986 0.983 0.995 0.992 0.989 0.986 0.983 0.982 0.995 0.992 0.989 0.986 0.983 0.982 0.981 0.995 0.992 0.989 0.986 0.984 0.982 0.981 0.981 Table A5. worn 4:de OU'ISUJN—I mch-a ~301U14=WN4 mummawwd PAGES 992 41 14 2 2 10 95 39 28 34 73 26 150 132 66 141 448 569 97 197 253 347 458 826 732 590 237 1,392 2,187 3,225 2,930 2,691 3,206 922 162 Paging Runs Repl I - Measured PAGES ~H 160 68 3 (4 31 296 120 34 13 111 93 476 336 159 163 491 1,090 341 1,261 987 949 954 2,111 1,751 1,390 648 10,734 8,671 8,202 7,501 6,921 6,618 3,136 OTHER 29 12,002 7,015 3,909 2,331 12,002 6,830 2,593 1,081 12,002 6,983 4,348 2,763 1,149 261 12,002 7,288 4,062 2,581 429 178 100 11,921 10,564 525 460 367 866 ELAPSED PROCESS PIPE. 441.10 445.16 447.19 447.19 441.10 461.23 467.54 467.54 467.54 467.54 461.23 500.05 508.69 517.17 517.17 517.17 517.17 500.05 597.96 609.39 584.91 611.67 611.67 611.67 611.67 584.91 2,183.56 2,183.56 2,185.5 2,185.59 2,185.59 2,185.59 2,183.56 22.14;: 95.26 129.29 71.72 43.69 72.11 95.29 125.87 75.22 48.51 20.71 78.74 95.63 128.51 79.51 51.76 21.97 5.32 94.21 96.44 135.87 75.56 49.02 9.05 3.67 2.38 127.99 99.97 201.39 12.55 10.64 8.97 19.84 506.51 HRKG £31 110 54 42 52 120 66 50 44 46 60 62 54 46 48 46 48 36 34 58 46 46 44 40 26 22 34 50 44 26 24 22 32 32 01101 02011 03001 04000 OVRHED 01101 02011 03001 04000 05110 OVRHED 01101 02011 03001 04000 05110 06101 OVRHED 01101 02011 03001 04000 05110 06101 07100 OVRHED 01101 02011 03001 04000 05110 06101 07100 08000 OVRHED Table A6. CPU CYCLE 9.1;; 1.1.9}; 0.22 0.036 .29 .063 .16 .114 .10 .191 .16 .21 .038 .27 .067 .16 .113 .10 .179 .04 .392 .17 .19 .040 .25 .069 .15 .115 .10 .177 .04 .315 .01 .383 .19 .16 .045 .22 .074 .13 .117 .08 .173 .02 ..241 .01 .317 .00 .410 .22 .05 .096 .09 .114 .01 .250 .01 .275 .00 .300 .01 .292 163 Paging Buns Repl I - Calculated TI/O BATE 27.58 15.91 8.75 5.22 26.67 14.87 8.89 5.58 2.552 24.96 14.39 8.72 5.66 3.17 2.61 22.18 13.58 8.57 5.78 4.15 3.16 2.44 10.38 8.81 4.00 3.64 3.34 3.43 CPU 3.3.2.}; 999329.}! CPU 126.00 0.00783 54.26 54.52 53.37 125.97 54.27 54.78 53.47 52.25 125.52 54.35 54.70 53.40 52.35 49.26 124.46 53.65 53.77 52.68 47.50 48.80 42.44 119.26 52.46 41.93 43.32 41.01 43.69 .01825 .01833 .01871 .00775 .01811 .01811 .01861 .01736 .00766 .01756 .01764 .01769 .01339 .00393 .00727 .01642 .01508 .01386 .00356 .00190 .00160 .00441 .01047 .00144 .00134 .00123 0.00265 PAGEIN I/o CYCLE gggg FACTOR BATE. 0.4 .2 0 0 1 6 3 1 0 2 2 0 7 3 3 9 2.1 :4: ANNwd-I-IN .300wa noon-no 0000.... q 1:016:00 dmeIGGO‘d \I 040 27.21 15.76 8.74 5.22 26.02 14.61 8.81 5.55 2.31 24.00 13.73 8.41 5.35 2.22 0.51 20.07 11.96 6.95 4.22 0.70 0.29 0.17 4.92 4.84 0.24 0.21 0.17 OCuo 0.995‘ 0.992 0.988 0.985 0.994 0.995 0.992 0.988 0.985 0.982 0.993 0.994 0.991 0.988 0.985 0.983 0.981 0.992 0.991 0.989 0.986 0.984 0.982 0.980 0.978 0.989 Table A7. cum..- Iw'U 01013:de (”CUM-A QONU'l-FLUNA -PAGES OUT 45 16 0 21 18 116 63 24 37 81 36 150 97 31 112 388 531 74 234 298 402 521 951 807 629 255 PAGES 2! 166 85 3 7 52 302 170 34 20 86 95 459 252 86 88 378 1,067 282 1,556 1,212 1,156 1,151 2,336 1,954 1,556 755 164 OTHER I‘O 12,002 6,919 3,916 2,295 12,002 6,859 4,160 2,610 1,110 12,002 6,953 4,242 2,838 1,173 249 12,002 7,444 4,180 2,770 537 183 91 Elfifi 441.43 443.54 445.57 445.57 441.43 462.56 468.95 471.04 471.04 471.04 462.56 487.47 502.52 504.88 504.88 504.88 504.88 487.47 644.82 654.00 627.30 656.28 656.28 656.28 656.28 627.297 Paging Runs Repl II - Measured ELAPSED PROCESS QIHE 95.46 127.67 71.89 43.00 72.22 95.49 126.63 76.01 48.80 21.22 78.78 95.54 128.55 78.36 53.16 22.46 4.89 89.78 96.51 138.82 78.19 52.67 11.176 3.81 2.30 138.81 WRKG SP: 100 64 44 54 110 70 52 40 44 62 62 56 50 42 46 48 38 34 56 46 44 44 44 26 20 34 OVRHED 01101 02011 03001 04000 05110 OVRHED 01101 02011 03001 04000 05110 06101 OVRHED 01101 02011 03001 04000 05110 06101 07100 OVRHED Table A8. CPU CYCLE 9.2;; PISS 0.22 0.036 .29 .063 .16 .114 .10 .193 .16 .21 .038 .27 .067 .16 .112 .10 .179 .05 .394 .17 .20 .039 .26 .070 .16 .117 .11 .172 .04 .325 .01 .383 .18 .15 .048 .21 .076 .13 .118 .08 .167 .02 .228 .01 .307 .00 .398 0.22 Paging TI/O BALE. 27.57 15.79 8.80 5.17 26.60 14.99 8.91 5.59 2.54 25.57 14.34 8.57 5.80 3.07 2.61 21.03 13.24 8.51 5.98 4.38 3.26 2.51 165 CPU ELIE 92.3.1112! CPU 125.74 0.00784 54.20 54.48 53.40 125.71 54.17 54.75 53.51 52.37 125.63 54.10 54.15 53.41 52.28 51.17 124.38 53.63 53.47 52.62 48.14 48.29 40.00 .01823 .01834 .01867 .00776 .01801 .01812 .01855 .01772 .00796 .01784 .01810 .01816 .01447 .00371 .00712 .01604 .01465 .01343 .00389 .00178 0.00140 PAGEIN RATE 0.4 .2 .0 .0 .1 .7 .4 .1 .0 .2 .2 0.9 .5 .2 .2 .7 2.1 .6 dNUU-l-A-LN I. O O 0 l 0 0 Nkchmmk Runs Repl II - Calculated I/O CYCLE 342:3.- ..P 142.192. 27.19 15.60 8.79 5.15 25.95 14.63 8.83 5.54 2.36 24.62 13.84 8.40 5.62 2.33 0.50 18.61 11.38 6.67 4.22 0.82 0.28 0.14 0.995 0.992 0.988 0.985 0.994 0.995 0.992 0.988 0.985 0.982 0.993 0.994 0.991 0.988 0.985 0.983 0.981 0.992 0.991 0.989 0.987 0.984 0.983 0.981 0.979 0.989 "R _ 23.9. CNTL CNTL CNTL CNTL OVRHED CNTL CNTL CNTL CNTL CNTL OVRHED CNTL CNTL CNTL CNTL CNTL CNTL OVRHED CNTL CNTL CNTL CNTL CNTL CNTL CNTL OVRHED Table A9. cum-s GUISWN-B mbN-i downstate... PAGES 921 28 23 33 57 70 90 83 143 184 312 312 419 600 685 575 579 793 915 970 901 805 166 PAGES OTHER ;3 143 73 110 275 263 269 210 326 381 1,534 1,502 1,554 1,746 1,854 3,306 2,787 2,682 2,500 2,411 2,192 1,095 .119 12,144 9,123 6,223 3,409 8,101 6,679 5,353 3,904 3,975 3,785 2,712 1,848 1,144 538 3,647 3,161 1,693 711 333 194 156 2;!3 761.46 761.46 762.73 762.73 761.46 655.25 655.25 655.25 649.05 636.48 636.48 393.59 393.59 393.59 393.58 391.53 391.53 609.72 609.72 609.72 609.72 606.47 606.47 584.38 584.38 Page Allocation Runs - Heasured ELAPSED PROCESS ELLE 222.39 164.80 113.59 64.75 110.71 142.30 116.44 94.19 70.22 74.19 95.19 68.75 49.12 33.66 21.24 10.87 91.72 68.90 60.90 32.77 14.02 6.82 4.33 3.54 138.50 HRKG SP: 82 58 64 78 66 52 52 52 54 62 52 52 48 38 26 56 52 48 40 36 26 24 EBQE CNTL 1 CNTL 2 CNTL 3 CNTL 4 OVRHED CNTL 1 CNTL 2 CNTL 3 CNTL 4 CNTL 5 D Table A10. CPU CYCLE HEEL TEE; 0.29 0.062 .22 .083 .15 .120 .08 .207 .15 .22 .078 .18 .094 .14 .118 .11 .153 .12 .146 .15 .18 .074 .13 .093 .09 .116 .05 .136 .03 .164 .01 . . . .23 .11 .088 .10 .103 .05 .139 .02 .190 .01 .221 .01 .254 .01 0.286 0.24 167 Page Allocation Runs - Calculated TI/O BATE 16.17 12.08 8.30 4.83 12.77 10.60 8.49 6.52 6.85 13.52 10.71 8.65 7.35 6.11 4.20 11.40 9.76 7.18 5.27 4.52 3.93 3.50 CPU CPU SAPS QWUANTQE 54.61 0.01801 55.37 .01792 54.78 .01794 52.66 .01757 56.94 .01701 57.37 .01677 56.85 .01693 55.60 .01660 53.59 .01703 55.07 .01292 55.23 .01165 54.93 .00989 53.90 .00735 49.49 .00455 52.95 .00990 51.92 .00964 51.69 .00749 50.77 .00437 48.95 .00249 45.01 .00181 44.40 0.00171 PAGEIN BATE 0.2 .1 .1 .4 coon-o OQSOQNO waawww UURngU'I 0000000 NGOJkO‘S I/O RATE 15.95 11.98 8.16 4.47 12.37 10.20 8.17 6.02 6.25 9.62 6.89 4.70 2.91 1.37 0.31 5.98 5.19 2.78 1.17 0.55 0.32 0.26 CYCLE PAcggg 0.996 .994 .993 .992 .994 .993 .993 .992 .992 .989 .988 .987 .987 .986 .985 .997 .996 .996 .995 .995 .995 0.995 25.9 000 CTL CTL CTL OVRHED 011 CTL CTL CTL OVRHED 101 CTL CTL CTL OVRHED 110 CTL CTL CTL OVRHED CTL CTL CTL 001 OVRHED CTL CTL CTL 010 OVRHED CTL CTL CTL 100 OVRHED CTL CTL CTL 111 OVRHED Table A11. lw'U SWN-fl GWN-i sumo... #UJN—b 1:de BOON—I 1:de SWIG-l PAGES 992 1 9 23 51 12 23 0 28 57 5 22 0 6 59 15 5 11 25 60 15 24 29 39 70 16 26 17 34 52 21 38 27 36 59 25 31 26 35 65 19 168 PAGES OTHER 2! 44 21 47 76 32 144 (4 47 101 14 165 a 8 112 49 44 23 43 97 29 117 78 97 239 60 143 57 117 115 53 167 68 128 125 72 166 81 116 240 65 149 8,462 4,302 2,467 1,293 8,409 4,015 2,230 1,114 8,313 3,777 2,129 1,050 8,440 4,415 2,575 1,359 11,915 8,961 6,044 8,718 12,395 9,330 6,418 8,720 12,075 9,174 6,255 8,720 11,941 9,025 6,148 8,718 Iléfi 288.76 288.76 290.81 280.56 280.56 270.62 270.62 270.62 258.43 258.43 260.51 260.51 262.60 250.35 250.35 294.87 294.87 296.96 286.67 286.67 746.14 748.17 748.17 737.97 737.97 776.57 776.57 778.60 768.41 768.41 760.68 760.68 762.76 752.52 752.52 752.53 752.53 752.53 742.29 742.29 Program Estimation Runs - Measured ELAPSED PROCESS r;u§ 67.18 79.81 46.37 24.54 45.27 67.66 74.35 42.04 21.15 42.63 66.82 69.83 39.77 19.99 41.53 67.58 81.51 48.29 25.81 46.35 218.38 161.44 110.52 68.36 108.75 227.01 168.15 117.22 68.48 219.93 164.09 113.55 68.26 110.79 219.04 162.67 112.41 69.14 109.74 wPKG §§T 56 52 64 96 96 92 54 56 76 94 94 56 58 64 82 72 58 62 84 80 84 58 66 70 94 90 60 66 58 96 82 62 64 64 84 82 58 64 .74 88 169 Table A12. Program Estimation Runs - Calculated CPU CYCLE TI/O CPU PAGE PAGEIN EXP ADJ-CPU 2595 PEEL IlEE SAPS 2A2; BATE BATE EASIQB BATE 000 1 0.23 0.034 29.46 125.98 0.2 0.2 0.995 126.58 CNTL 2 .28 .067 14.93 54.92 0.1 0.1 0.991 54.38 CNTL 3 .16 .116 8.65 53.23 0.2 0.2 0.990 53.79 CNTL 4 .09 .205 4.88 52.73 0.5 0.3 0.989 53.32 OVRHED .16 0.2 0.1 011 1 .25 .032 31.08 126.20 0.6 0.5 0.995 126.80 CNTL 2 .28 .067 15.83 54.02 0.0 0.0 0.991 54.51 CNTL 3 .16 .119 8.42 53.07 0.3 0.2 0.989 53.65 CNTL 4 .08 .213 4.71 52.71 0.6 0.4 0.988 53.32 OVRHED .17 0.1 0.1 101 1 .26 .031 32.57 124.50 0.7 0.6 0.995 125.07 CNTL 2 .27 .069 14.52 54.10 0.0 0.0 0.993 54.48 CNTL 3 .15 .123 8.14 53.57 0.1 0.0 0.991 54.05 CNTL 4 .08 .215 4.65 52.58 0.7 0.4 0.990 53.09 OVRHED .17 ' 0.3 0.2 110 1 .23 .035 28.78 124.91 0.2 0.1 0.995 125.53 CNTL 2 .28 .066 15.05 54.18 0.1 0.1 0.992 54.64 CNTL 3 .16 .113 8.82 53.37 0.2 0.1 0.990 53.90 CNTL 4 .09 .197 5.08 52.55 0.5 0.3 0.989 53.14 OVRHED .16 0.2 0.1 CNTL 1 .29 .062 16.13 54.57 0.2 0.2 0.995 54.86 CNTL 2 .22 .083 12.08 55.52 0.1 0.1 0.993 55.91 CNTL 3 .15 .122 8.21 54.70 0.2 0.1 0.992 55.15 001 4 .09 .082 12.14 127.55 0.4 0.3 0.992 128.53 OVRHED .15 0.1 0.1 CNTL 1 .29 .062 16.15 54.61 0.2 0.2 0.995 54.90 CNTL 2 .22 .083 12.09 55.49 0.1 0.1 0.993 .55.88 CNTL 3 .15 .119 8.39 54.76 0.2 0.2 0.992 55.21 010 4 .09 .087 11.50 127.36 0.2 0.2 0.992 128.35 OVRHED .15 0.1 0.1 CNTL 1 .29 .062 16.09 54.91 0.3 0.2 0.995 55.20 CNTL 2 .22 .082 12.15 55.91 0.1 0.1 0.993 56.31 CNTL 3 .15 .119 8.37 55.10 0.2 0.2 0.992 55.55 100 4 .09 .085 11.76 127.77 0.2 0.2 0.992 128.76 OVRHED .15 0.1 0.1 CNTL 1 .29 .062 16.09 54.52 0.3 0.2 0.995 54.82 CNTL 2 .22 .083 12.10 55.49 0.1 0.1 0.993 55.88 CNTL 3 .15 .120 8.33 54.70 0.2 0.2 0.992 55.16 100 4 .09 0.083 12.07 126.12 0.4 0.3 0.992 127.10 OVRHED 0.15 0.1 0.1 SELECTED BIBLIOGRAPHY 1. SELECTED BIBLIOGRAPHY Bard, Y. "Characterization of Program Paging in a Time-Sharing Environment", Inn gonnnel e; Reseanen nne Devenepmenn, Vol. 17, No. 5, (Sept 1973), pp. 387-393. -- "Application of the Page Survival Index (PSI) to Virtual-Eemory System Performance", Inn Journa; e; Researen nnn nevelopnenn, Vol. 19, (Hay 1973f? pp. 212-220. _. "A Characterization of VH/370 Uorkloads", Inn gembnigge Scientific Centen, Technical Report Baskett, F., and Gomez, F.P. "Processor Sharing in a Central Server Queuing Hodel of Hultiprogramming with Applications", girth nnnne; Princenen gengen- enee en Infiormation Scieneee eng S stem , (1972), pp. 598-603. Bass, L.J. "0n Optimal Processor Scheduling for Hulti- programming", SIAM g. QOHPUT., Vol. 2, No. 4, (December 1973), pp. 273-80. Belady, L.A., and Kuehner, C.J. "Dynamic Space-Sharing in Computer Systems", Conmunicetions 92 ene neg, Vol. 12, No. 5, (Bay 1969), pp. 282-288. Boyd, D.L. "A Hultiple Resource Hodel for_A Batch-Pro- cessing Hultiprogralming System", Natione; gechniee; Infonnenien Senviee, U.S. Department of Commerce, Report an-722332, (narch 1971). Boyse, J.i., and Earn, D.P. "A Straightforward Hodel for Computer Performance Prediction", gelputing gnngeys, Vol. 7, No. 2, (June 1975), pp. 73-93. Brandwajn, A. "A Queuing Hodel of hultiprogrammed Computer Systems Under Pull Load Conditions", £23222i9é212P§ 2: Lbs ASA. (April 1977). PP- 222-240. 171 10. 11. 12. 13. 14. 15. 16. 17. 18. 19. 20. 21. 172 Buchholz, W. "A Synthetic Job for Heasuring System Performance", 183 §yetems Jounnel, Vol. 8, No. 4, Burge, 3.8., and Konheim, A.G. "An Accessing Hodel", gennnal en nne ngn, Vol. 18, No. 3, (July 1971), pp. Chamberlain, D.D., Fuller, 5.8., and Liu, L.Y. An Analysis of Page Allocation Strategies for Hultipro- gramming Systems with Virtual Hemory", IE! gennnel 2; §§§§§£EP 229 PesslPPPSPP. Vol- 17. NO- 5. (Sept 1973), pp. 404-412. Chang, H. "Single-Server Queuing Processes in Computing Systems", 22! Systems gournel, Vol. 9, No. 1, Courtois, P.J. "Decomposibility, Instabilities, and Saturation in multiprogramming Systems", genmuniee- £i29§ PE 322 LEE. Vol. 18. No. 7. (July 1975). PP- 371-376. Cox, D.R. "A Use of Complex Probabilities in the Theory of Stochastic Processes", Proeeednnge e; nne generidge gnilosophicn; §ogiet1, Vol. 51, (1955), pp. 313-319. ..--....- Planning PI §£P§£12§P$§o New York: John Wiley and Sons,Inc., 1958. Denning, P.J. "Effects of Scheduling on File Hemory Operations", Spying Join; ggnputen gengerenee, (1967), pp. 9-21. =. "The Working Set Hodel for Prograh Behav- ior", gommunicatione en nne ngn, Vol. 11, No. 5, (May 1968), pp. 323-33. Denning, P.J., and Schwartz, S.C. "Properties of the Working Set Hodel", Communicntiene en nne neg, Vol. 15, No. 1, (Harch 1972), pp. 191-198. Fenichel, R.R., and Grossman, A.J. "An Analytic Hodel of Unltiprogrammed Computing", Pnoceedinge e; nne Sarina 92mm 9.222....uter segsrsmcg. (1969). pp. Fernandez, E.B., and Lang, T. "Computation of Lower Bounds for Hultiprocessor Schedules", Inn gennnn; en Reseenen enn nenelepnenn, Vol. 19, No. 5, (September 1975), pp. 435-44. 22. 23. 24. 25. 26. 27. 28. 29. 30. 31. 32. 33. 34. 173 Finn. J... A £32222; Eggs; :2; PPlIiYSIISES APPIYSAS: New York: Holt, Rinehart and Winston, Inc., 1974. Forbes, K., and Goldsworthy, A.W. "A Prescheduling Algorithm -- Scheduling a Suitable Nix Prior to Processing", Eng gggggzg; gennnel, Vol. 20, No. 1, (1977), pp. 27-29. Franklin, N.A., and Gupta, R.K. "Computation of Page Fault Probability from Program Transition Diagrams", Semngaisasieaé 2E :23 ASE. Vol- 17. No. 4. (April 1974), pp. 186-191. Gaver, Jr., D.P. "A Waiting Line with Interrupted Service Including Priorities", gennne; e; nne Regen gnennenica; genieny, Series 13, No. 24, (1962), pp. 73-90. _____ . "Probability Bodels for Uultiprogramming Computer Systems", geurnn; 93 ene AC8, Vol. 14,No. Ghanem, 8.2. "Dynamic Partitioning of Rain Memory Using the Working Set Concept", Inn gennnn; en geeeenen eng nenenepment, Vol. 19, No. 5, (September 1975), pp. 445-50. Gordon, W.J., and Newell, G.F. "Closed Queuing Systems with Exponential Servers", Openenions Reseenen, Vol. 15, No. 2, (April 1967), pp. 254-265. IBM Corporation "Analysis of Some Queuing Models in Real-Time Systems",Form GF20-0007, Data Processing Division, White Plains, N.Y. 10604. Isaacson, 8., and Keller, 8.8. nnelysis en Nguenieel nennods, New York: John Wiley 8 Sons, Inc., 1966. Iverson, K.E. n Pnognnnning Lengnege, New York. John Wiley 8 Sons, Inc., 1962. Jackson, J.R. "Jobshop-Like Queuing Systems”, Ranage- ment §enence, Vol. 10, No. 1, (Oct 1963), pp. 131-142. Kimbleton, S.R. "Batch Computer Scheduling: A Heuristi- cally activated Approach", Offnee e; gene; Reseanen, Report AD-A007922 (September 1974). Lewis, P.A.W., and Schedler, 6.5. “A Cyclic Queue Hodel of Multiprogramming", Jounne; e; nne neg, Vol. 18, No. 2, (April 1971), pp. 199—220. 35. 36. 37. 38. 39. 40. 41. 42. 43. 44. 45. 46. 174 Biller, Jr., R.G. "Priority Queues", Tne nnnals en APPASPPSASPA SESSAPEIPS: Vol- 31. (1960). PP- 86-103. Muntz, R.L. "Poisson Departure Processes and Queuing Networks", Conferenee en Innennation Scieneee enn gyenene, (Barch 1973), pp. 435-440. National Bureau of Standards, Enactionel geotorie; 2§§292§ £2; §é§£2£§ 23 2 L§!§l§v Washington: U- 5- Department of Commerce, 1957. Oden, P.8., and Schedler, G.S. "A Model of Remory Contention in a Paging Rachine", Communications 9; nne ngn, Vol. 15, N0. 8, (August 1972), pp 761-771. Reiser, 8. "Interactive Bodeling of Computer Systems", IP! §1§£§2§ 922222;. Vol. 15. so. a. (1976). pp- 308-27. Saltzer, J.8. "A Simple Linear Hodel of Demand Paging Performance", gemmunieatione en the ngn, Vol. 17, No. 4, (April 1974), pp. 181-186. Seaman, P.8., Lind, R.A., and Wilson, T.L. ”0n Telep- rocessing System Design, Part IV: An Analysis of Auxiliary Storage Activity", Inn Systene gournel, Vol. 5, No. 3, (1968), pp. 158-170. Sreenivasan, K., and Kleinman, A.J. "0n the Construc- tion of a Representative Synthetic Workload", Communicenions e; nne ngn, Vol. 17, No. 3, (Raroh 1974), pp. 127-33. Teorey, T.J. "Properties of Disk Scheduling Policies on Hultiprogrammed Computer Systems", gel; geint gemnuten gonfenenee, (1972), pp. 1-11. Waters, S.W. "Estimating magnetic Disk Seeks", Ine gempunen gennnel, Vol. 18, No. 1, (1975), pp. 12-17. Wilhelm, N.C. "A General Hodel for the Performance of Disk Systems", gourne; e; nne ngn, Vol. 24, No. 1, (Jan 1977), pp. 14-31. Welch, P.D. "0n Pre-Emptive Resume Priority Queues", Tne nnnnle en nathemeniee; §§etistice, Vol. 35, (1964), pp. 1340-48. “1111111111114auguijnfigwflmm“ 3