‘Dmerﬁateon for the Degree of W D g p .5,“
MICHIGAN STATE UNIVERSITY
GEOFFREY WILLIAM GATES ‘
1;” ” 1973 ‘1"

 

Msis III

MICHIGAN S|Y3I

IIIIIII IIIII IIIWII

 

 

 

 

 

 

 

 

 

 

 

 

    

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

LIBRARY

Michigan State
University

 

This is to certify that the -.f

thesis entitled . w

A COMPUTER SYSTEM LOAD MODEL
BASED ON CLUSTER ANALYSIS

presented by
Geoffrey W. Gates

has been accepted towards fulfillment
of the requirements for

._Phn.D_._degree in £—

    

M 'or professor
Date WM .

0-7839

 

 

came-Ham

 

ABSTRACT

A COMPUTER SYSTEM LOAD MODEL BASED ON CLUSTER ANALYSIS
By

Geoffrey William Gates

The evaluation and comparison of computer systems is an increasingly
important endeavor. Such evaluation of the computer as a tool must
consider the programs or load which the computer is expected to perform.
The lack of a quantitative characterization of such a load has been a
hinderance to the general evaluation of computer systems. In this
dissertation a probabilistic model is suggested which provides a
characterization of the load on a computer system.

The model represents the behavior of the load, defined as the
utilization of the computer system resources as a function of time.

The model assumes the load is made up of some number of job steps.
These job steps are classified according to behavior using cluster
analysis so that a separate submodel may be constructed for each
distinct type of behavior.

The amount of detail required of any partiCular instance of the
model depends upon the application of the model. A hierarchy of
instances of the model is described which is determined by the resources
included in a particular instance of the model, the times at which the
behavior may be sampled and the goodness of fit to the real load.

This hierarchy affords guidance in improving any particular instance
of the model.

The techniques described are applied to construct a simple model

 

 

Geoffrey William Gates

of the load on the Michigan State University CDC 6500. DeSpite the
simplicity of this instance and the relatively small amount of data
used in computing the model parameters, the resource requirements

predicted by the model are comparable to the requirements observed

for an equivalent number of real jobs.

 

A COMPUTER SYSTEM LOAD MODEL BASED ON CLUSTER.ANALYSIS

BY

Geoffrey William.Gates

A DISSERTATION
Submitted to
Michigan State University
in partial fulfillment of the requirements
for the degree of
DOCTOR OF PHILOSOPHY

Department of Computer Science

1973

 

TO

MY PARENTS

ii

 

 

ACKNOWLEDGMENTS

I am grateful to Dr. Lewis H. Greenberg, my thesis adviser, for
his help in selecting the problem and guiding my research. I am also
grateful to Dr. Carl V. Page, the chairman of my guidance committee,
for his encouragement and direction given during the course of writing
this dissertation. My thanks also go to Dr. Harry Eick, Dr. Julian
Kateley and Dr. Morteza Rahimi for serving on my guidance committee
and for reviewing this work. My special thanks go to Dr. Bernhard
Weinberg and Dr. Richard Dubes for their assistance.

Finally, there is really no way I can adequately eXpress to my
wife Ginny the appreciation I feel for her patience, understanding

and encouragement.

iii

TABLE OF CONTENTS

LIST OF TABLES. . . . . . . . . . . . .

LIST OF FIGURES . . . . . . . . . . . .

Chapter

1.

2.

3.

BACKGROUND AND LITERATURE

Introduction . . . . . . .
The Environment. . . . . .
Literature Review. . . . .
Criteria for a Model . . .

Uses of a Load Model . . .

Organization of the Dissertation . . .

Mo DE L FMWORK I O O C O O O 0

Introduction . . . . . . .
Definitions. . . . . . . .
The Job Model. . . . . . .
The Sequencer. . . . . . .
An Example of a Sequencer.
Job Step Submodels . . . .
An Example of a JSSM . . .

Summary and Conclusions. .

COMPUTATION OF MODEL PARAMETERS

Introduction . . . . . . .
Resource Selection . . . .
Measurement and Sampling .
Determination of the Types

JSSM Structure . . . . . .

iv

of Behavior

Page
vi

viii

12
15
17

19

19
19
23
24
29
31
35
36

39

39
39
41
45
53

Chapter

TranSiti-on Probabilities O O O O O O O O O O O 0
Representation of JSSM Distribution Functions. .

811er O O O I O O O O O O O O O O O O O 0 O O O

4. A HIERARCHY 0F INSTANCES OF THE JOB MODEL . . . . . .

Introduction . . . . . . . . . . . . . . . . . .
Formal Basis for a Hierarchy . . . . . . . . . .
Approximation at the Job Step Level. . . . . . .
Modeling the Occurrence of Wait Conditions . . .
Detailed Resource Utilization. . . . . . . . . .
A Very High Level of Detail. . . . . . . . . . .

sma ry O O O O O O O O O O 0 O O O O O O O 0 O O

5. AN INSTANCE OF THE LOAD MODEL . . . . . . . . . . . .

Introduction . . . . . . . . . . . . . . . . . .
The Load to be Modeled . . . . . . . . . . . . .
Identification of the Types of Behavior. . . . .
Interpretation of the Types of Behavior. . . . .
Estimation and Interpretation of the Transition
Function . . . . . . . . . . . . . . . . . . . .
Specification of the JSSM's. . . . . . . . . . .
Implementation of this Instance of the Model . .
Comments on the Model Implementation . . . . . .
Improvement to the Model Instance. . . . . . . .

$11er 0 O O C O O O O 0 O O I O O O O 0 0 0 0 C

6. SUMMARY AND FURTHER.RESEARCH. . . . . . . . . . . . .

APPENDIX. . .

BIBLIOGRAPHY.

Summary of the Dissertation. . . . . . . . . . .
Conclusions. . . . . . . . . . . . . . . . . . .

Further Research . . . . . . . . . . . . . . . .

O O O O O O O O O O O O O O O O O O O O O O O O

Page

54
56
59

60

6O
60
67
69
75
77
80

81

81
81
84
89

95
100
101
103
105
107

108

108
109
110

113

116

Table

10

11

LIST OF TABLES

Relative frequencies of instruction types in percentages

for five standard instruction mixes. . . . . . . . . . .

Relative frequencies of instruction types for the Gibson

III inStmction mixo O O O O 0 O O O O O O O O O O I I 0

Instruction frequencies measured on a CDC 6400 for seven

computation bound programs . . . . . . . . . . . . . . .

Equilibrium probabilities for a nine state Markov model

of the EXEC 8 Operating system under four different

loads 0 O O O O O O O O O O O O O C O O O O O O 0 O O O 0

Sample sequence frequencies illustrating the effects

of using a Markov process for the sequencer. . . . . . .

Matrix of transition probabilities computed from the

sequences in Table 5 . . . . . . . . . . . . . . . . . .

Sequences assigned non-zero probabilities by the

transition matrix in Table 6 . . . . . . . . . . . . . .

Job steps provided by the operating system on the
IBM 1130 O O O O O O O O O O O O O O O O O O O O O O O 0

Summary of clustering exPeriments showing the number
of clusters obtained for several values of the minimum
number of job steps acceptable per cluster at a

threshold parameter value of 0.0 . . . . . . . . . . . .

Cluster point coordinates from 800 job steps run from

12 noon to 2:00 p.m. with a threshold parameter of 0.05.

Identification of each cluster by command or command

type occurring most frequently in the cluster. . . . . .

vi

Page

13

26

26

26

30

86

90

94

Table

12

13

14

Page

Distribution of user written program executions

among the different clusters . . . . . . . . . . . . . . . 96

Matrix of transition probabilities for the
sequencer determined from 800 job steps run from

12 noon to 2:00 p.m. on a class day. . . . . . . . . . . . 97

Summary of the results obtained from the

implementation of the CDC 6500 load model. . . . . . . . . 104

vii

 

Figure

10

LIST OF FIGURES

Structure of the total computer system showing
the possible access paths to resources available

to the user 0 O O O O O O O O O I O O O O O O O O O 0

State transition structure to produce the sequence

of job steps "FTN.LGO." with the correct frequencies.

The sequencer for the IBM 1130 with a simple

Operating system used in the example. . . . . . . . .

Condition function relating cards compiled to
lines printed showing the effect of printing

header and trailer information. . . . . . . . . . . .

Hypothetical distribution of measured job step

resource utilization in a two-dimension space . . . .

Hypothetical distribution of measured job step
resource utilization as in Figure 5 with decision

surfaces Shown. O O O O O O O O O O O O O I O O O O 0

Illustration of the effect of the shape parameter,

c, on the form of the Weibull distribution. . . . . .

Illustration of three different approximations to
a real Job, x1(t), x2(t) and x3(t), and the

corresponding consistency sets. . . . . . . . . . . .

Distribution of central processor seconds for job

steps of cluster one. . . . . . . . . . . . . . . . .

Distribution of the number of system requests issued

by job steps of cluster one . . . . . . . . . . . . .

viii

Page

28

32

37

46

46

58

65

91

91

Figure Page

11 Distribution of the number of record units

transferred by job steps of cluster one . . . . . . . . o 92

12 Distribution of the number of words of memory

required by job steps of cluster one. . . . . . . . . . . 92

ix

 

Chapter 1

Background and Literature Review

Introduction

 

The processing done by a computer system can conveniently be con-

sidered to be of one of two types:
1. That processing that results directly from a command
issued by a user.
2. That processing that results from some action of the
operating system.
The 1229 on a computer system consists only of the processing of the
first type, izg;_done at the specific request of a user.

While much effort has been expended on the instrumentation, mea-
surement and modeling of computer system performance, the lack of a
quantitative characterization of the load has been a hinderance. In
this dissertation such a model is proposed and some of its properties
are studied.

The Environment

For the purposes of this dissertation the load, iLngall user
specified actions, and the computer system performing these actions
are considered to be a closed system, the total computer system. All
stimuli presented to the computer system comes from the load. The
computer system comprises all resources available to perform the comp-

utations specified by the load. Any overhead required by the computer

2
system, ELg;_memory management or accounting, is part of the computer
system and not part of the load.

The resources used in the performance of the load are ultimately
embodied in the hardware of the computer which performs the processing
and are available to the user at several different levels of complexity.
Each level corresponds to an access path to the hardware. The relation-
ship between the user's tasks and the various access paths to the hard-
ware is illustrated in Figure 1.

At the simplest level is the direct access to the hardware, appear-
ing generally in the execution of the user's machine language instruc-
tions by the central processor. Ideally, all processing could be per-
formed in this manner, however, certain Operations have the potential
of adversely affecting other users of the system. For example, if a
user's program is using a tape drive, no other user's program should be
able to access the same drive. These sensitive operations are perform-
ed for the user by the operating system, thus protecting other users.

These operations are accessible by means of the system request, typi-

 

cally made by the performance of a special instruction such as an RA+1
request on the CDC 6500 or an SVC instruction on the IBM 370.

Other operations are used so frequently that efficiency is of
great importance, such as file OOpying, or are so complex, such as
language translation, that it would be impractical for each user to
supply them. These operations also offer a method to use the computer's

resources and are called utility programs. Although it is not common

 

usage to refer to compilers and assemblers as utility programs, the
term is applied in this manner for the purposes of this dissertation.

Both system requests and utility programs are part of the system

 

 

 

   

 

 

 

 

 

 

 

 

 

 

 

 

 

user /
specified
LOAD task
/
____...______._______/
system
COMPUTER request u::l::;s
SYSTEM Pg
ystem
machine request
language
instructions system
request
processor
machine
language
instructions
I
hardware

 

 

 

Figure 1. Structure of the total computer system showing the
possible access paths to resources available to the
user.

 

 

4
software, the difference being that utility programs are accessed by
means of the operating system control language (OSCL) whereas the system
requests are invoked from programs written directly by the user or from
utility programs.

Figure 1 then illustrates the environment in which the load, or a
model of the load, must function. The load is external to the computer
system and makes demands at several different levels to use the re-
sources provided by the computer system. These demands can go directly
to the hardware, ngL add the contents of two registers, or the demands
can be in the form of a system request, 2; a open a file, or the demands
may consist of using a utility program, ngL_compile a 500 statement
COBOL program. Demands at all three levels comprise the load and so
any model of the load must account for interactions between the user
and the computer at all three levels.

Literature Review

No results on a general model of a load appear in the literature
although such a model is implicit in all the work published on sim-
ulation models of computer systems. In [Chevy 1966, Fine 1966,

Gonter 1966, Oden 1972] very simple models of the load are used, charac-
terizing the entire load by several distribution functions. In

[Nielsen 1966] a different approach is taken, a pseudo-programming
language is used for the specification of the various programs at the
resource use level. In addition to being tedious to use, the method
lacks generality as the primitives of the language are chosen for a
particular machine operating in a special environment.

Another source of implicit load models is the literature on

instruction timings, benchmark programs and synthetic programs.

 

5

Instruction timing is a technique frequently used in the evaluation

 

of a computer system independent of any load information. In order to
simplify the comparison of two or more different computer systems it is
convenient to obtain a single figure of merit for each system. Two
parameters frequently used for comparison are the add time and the
memory cycle time of the computer. However, for computers with espe-
cially large or versatile instruction sets this measure is an over-
simplification.

In order to consider the entire instruction set in computing a
measure of performance, it is necessary to decide how many times each
particular instruction is used. Obviously, the relative frequencies of
the instructions depend on the load and so any set of frequencies is a
simple model for the load. Tables 1 and 2, taken from [Adams 1969],
give the relative frequencies of each instruction type used in several
standard mixes. Each mix is supposed to represent a particular type of
program, ng; the Knight Scientific Mix is to represent the frequencies
of instructions used in a typical scientific or computation bound
program.

The frequencies given in Tables 1 and 2 can be interpreted as the
parameters of a load model, raising the question of how they are com-
puted. The frequencies in Tables 1 and 2 were not computed, they repre-
sent the best guesses Of people familiar with the several types of pro-
grams. ﬂniller 1971] describes an experiment in which the frequencies
were actually measured for programs being executed. The frequencies,
given in Table 3, may also be interpreted as parameters for a load
model. The difference between the instruction frequencies for the

guessed standard instruction mixes and the measured frequencies should

 

Table 1. Relative frequencies of instruction types in
percentages for five standard instruction mixes.

 

 

 

 

Arbuckle Gibson Knight Kni ht General
Mix Mix Scientific Commercial 3ommercial
Mix Mix Mix
% Fixed + & - 0.0 6 10 25 1
% Fixed x 0.0 3 6 1 0
% Fixed / 0.0 1 2 0 0
% Floating
+ & - 9.5 O 10 0 0
% Floating x '5.6 O 0 0 O
% Floating / 2.0 O O 0 O
% Load & store 28.5 25 0 o h
% Indexing 22.5 0 O 0 41
% Conditional
branch 13.2 20 0 0 28
% Compare 0.0 24 O 0 26
% Branch on
character 0.0 10 0 0 0
% Edit 0.0 LI 0 o o
% Other 18.7 7 72 7h 0

100.0 100 100 100 100

 

Table 2. Relative frequencies of instruction types for the
Gibson III instruction mix.

Percentage
Instruction Occurrence

Move one word from memory to accumulator

 

Fixed point 5

Floating point 5
Store accumulator 5
Move 500 contiguous words memory to memory 1
Move 500 random words memory to memory 1
Conditional branch

Result zero 5

Result negative 5
Compare two words and set indicator

Fixed point 2

Floating point 2
Compare two digits and set indicator 1
Unconditional branch 8
Fixed point multiply, all in memory a
Fixed point divide, all in memory 1
Fixed point add, all in memory 5
Fixed point subtract, all in memory 5
Shift one character (six bits) 3
Logical AND or OR 1
Indexing ll
Indexing and indirect addressing 3
Indirect addressing 13
Floating point add, all in memory 0
Floating point subtract, all in memory a
Floating point multiply, all in memory 0
Floating point divide, all in memory 2

 

100

 

Table 3.
ATFUN

% Test or
branch 11.01

% Register
moves 0.06

% Floating
+ & - 3.88

% Floating
x & / 2.1a

% Integer
operations h5.lo
% Other 27.81
100.00

Instruction frequencies measured on a CDC 6&00
for seven computation bound programs.

FIREBALL RADEBR SACSRADAR PALLOC

10.28

n.05

 

10.28

b.50

2.68

 

 

21.11 16.56
0.30 0.078
0.18 0.837
0.13 0.805

50.19 u9.9o

28.92, 31.82

100.00

ROGEN RADSIM
29.12 26.28
0.276 0.007
0.0U6 0.08
0.038 0.093
“2.76 h9.h3
27.76 2U.11

100.00 100.00 100.00

 

9

be noted. This difference is more striking when it is known that the
programmers of ATFUN and FIREBALL considered the programs to make heavy
use of the computer’s floating arithmetic capability. In both cases all
floating point operations combined account for less than 10% of the
entire program. This discrepancy illustrates the importance of deter-
mining model parameters from actual measurements.

Whether the parameters are guessed or measured, instruction mixes
are inadequate load models because they represent interaction on only
one level, hardware. The logical extension of the instruction mix to

all three levels of interaction is the benchmark program, defined in

 

[Adams 1969] as:

l. A routine used to determine the speed performance of a
computer.

2. A routine to be run on several different computer config-
urations to obtain comparative throughput performance
figures regarding the abilities of the various config-
urations to handle the specific application.

3. A precisely defined problem that is coded and timed for
a number of computers in order to measure performance in
a meaningful and directly comparable manner.

The benchmark problem may be one of the user's own specific applications
or may be representative of a class of typical computer applications.
Although it is not stated explicitly, the choice of a particular
benchmark program constitutes specifying a model of a load since most
authors agree that it is of the greatest importance for the benchmark
program to accurately match the load which will be run on the system

being evaluated [Buckley 1969, Hillegass 1966, Joslin 1968, Joslin 1966,

 

10
Lucas 1971, Wood 1971]. One exception to this opinion appears in
[Bucholz 1969] in which a program is suggested to provide an absolute
measure of performance. But even here, the benchmark program provides
a model of a load. Bucholz makes no attempt to match any characteristic
of the benchmark load to a real load.

A benchmark can be one of three types: natural, artificial or syn—
thetic. A natural benchmark is a program selected from an operational
load to be translated and run on the computer to be evaluated. In
[Ward 1969] a very successful application of a natural benchmark is
described. The overriding factor which contributed to this success was
the fact that the load consisted of only one application and so, in
effect, the entire load was transferred to the various machines being
evaluated. Thus the model implied by the benchmark was an exact dupli-
cate of the load being modeled. In more common situations the load
being modeled is not homogeneous and so several benchmark programs must
selected.

One major drawback of natural benchmark programs is the frequent
need to recode the programs and transfer any required data sets to the
computer to be evaluated. Artificial benchmarks, programs which accom-
plish.a well specified action such as sorting or matrix inversion, may
be programmed in a machine independent manner, thus allowing easy
transfer from one computer to another. Perhaps the most frequently
used artificial benchmarks are those outlined in Standard EDP Reports
by the Auerbach Corporation:

1. General file processing.
2. Random access file processing.

3. Sorting.

 

ll

4. Matrix inversion.

5. General math problems, £2£h_polynomial evaluation.
Whatever artificial benchmark is selected, it is implicitly a model of
the load as a whole. For example, if a sort program is used as a bench-
mark, then the assumption is being made that the demands on the computer
system made by this program are similar to the demands made by the
entire load .

If a very versatile parameterized benchmark is desired, it is often
necessary to resort to a program which accomplishes no real purpose,
1:2; a synthetic program. Synthetic benchmarks are programs which are
constructed to exhibit a particular type of behavior. The distinction
between artificial and synthetic benchmarks is in the level at which
the desired behavior is specified. With an artificial benchmark, the
assumption is, for example, that sorting is typical of the load and so
a sort program is used. With a synthetic benchmark the assumption is
that a program typically performs x input or output Operations and does
y seconds of processing before each such operation where x and y are
parameters to be determined. A program is then written which will
exhibit this type of behavior. Synthetic benchmarks have the advan-
tage of providing flexible behavior [Thompson 1969] based on the para-
meters although the determination of the parameter values is difficult.

As with instruction mixes it is important to base the benchmark
program, and therefore the implied load model, on actual measurements.

A comparison of a load with supposedly representative benchmarks is
presented in [Feeley 1971]. A nine state Markov model of the EXEC 8
operating system for the Univac 1108 was used to describe the demands

a job places on the computer system both at the hardware level and at

 

 

12
the system request level. The transition probabilities between these
states were determined from measurements made on the running system and
in turn the equilibrium probabilities were computed. This was done for
a real-time load, arising from the use of the computer to monitor and
control laboratory experiments, and also for a standard batch processing
load. In addition, benchmark programs, supposed to be similar to the
two types of loads, were measured. The equilibrium probabilities ob-
tained are given in Table 4. The entries may be interpreted as the
percentage of time a program would spend in each state or operating
system module. Again the difference between the guessed load, $12;
the benchmarks, and the measured load points up the importance of
basing model parameters on measured data.

In addition to the work already cited, information on models of
computer systems and, indirectly, on models of a load may be found in
[Calingaert 1967, Joslin 1968, Mittwede 1970, SDC 1968]. A reasonably
complete annotated bibliography on modeling and other performance
evaluation techniques may be found in [Miller 1972].

Criteria for a_ﬂgdgl

Previous experience with models of a load suggests several desir-
able criteria for any new model to possess. The first of these criteria
is that the results be useful. The model must present information in
sufficient detail to answer the desired questions while at the same
time omitting unneeded detail. Selecting the proper level of detail is
important because this aids in understanding the results and keeps the
model economical to use. For example, the model of the EXEC 8 operating
system [Feeley 1971] can be used as an aid in deciding which operating

system modules, if made more efficient, would afford the largest overall

 

 

 

 

13

Table u. Equilibrium probabilities for a nine state Markov
model of the EXEC 8 operating system under four
different loads.

Real Time Standard

 

 

§£g£g 5221,2252 Benchmark ngg Benchmark
User program .019 .12 .09 .17
Interrupt handler .001 .002 .001 .0012
I/O handler .012 .04 .02 .056
Dispatcher .95 .70 .78 .59
User activity and output .009 i .11 .085 .16
Arbitrary device handler .008 .01 .Ol .01
COME module .001 .005 .Ooh .006
Dynamic allocator 0.0 .01 .007 .00h8
Drum handler 0.0 .003 .003 .002
1.000 1.000 1.000 1.000

 

 

 

 

14
improvement. The same information would be available in a trace of the
instructions actually executed. However, such a trace contains so much
unneeded information and would be so bulky that it is doubtful the trace
would help in making any decisions.

Very closely allied to the need for usefulness is the second cri-
terion, that the model should preserve the structure of the modeled
system. In the case of a load model, since the load interacts with the
system at three levels, iLgL_hardware, system request and utility pro-
gram, the model should also provide for interaction at all three levels.
Furthermore, since the load is made up of a number of independent pro-
grams submitted by different users, the model should preserve this
structure also since a considerable amount of overhead is devoted to
scheduling these independent jobs and to keeping them independent. This
preservation of structure allows information gained from the model to be
applied directly to the modeled system.

The third criterion is that the parameters of the model should be
defined in terms of directly measurable parameters of the real load.

For example, rather than having a parameter specify the number of iter—
ations of a loop of arbitrary instructions it should specify the amount
of processor time to be used. If this criterion is met the model will
provide direction in performing the measurements needed to determine

the parameters and the connection between the structure of the model and
the structure of the load will be strengthened.

The fourth criterion is again related to the usefulness of the
model. Since it is the load which is being modeled and since the build-
ing blocks of the load are relatively invariant from one computer to

another, it must be possible to implement the model in a machine

 

 

15
independent manner.

The final criterion is that the model should provide an acceptable
degree of accuracy. Since the degree of accuracy required of the model
depends upon its particular application, the model must be structured in
such a way as to allow easy verification of the degree of accuracy
achieved.

The total computer system is defined previously as consisting of a

load and a computer system. These two components are usually repre-

 

sented by a set of real production programs and the hardware and soft-
ware of an actual computer. However, this representation may take place
at different levels. For example, the computer system may be replaced
by a model of a computer system, called an emulator, while the load is
still represented by production programs. Alternatively, a model of
the programs, a benchmark program, may be run on a real computer system.
A model of a load may be combined in several different ways with
a representation of a computer system. The most obvious way is to use
the actual hardware and software of the computer system and implement
the model in such a way as to interact the same as the real programs,
312;.3 synthetic benchmark program. This program would consist of a
main procedure which would decide what demands for resources to make
on the computer system, and a set of procedures which would actually
use the resources. For example, one procedure would consist of a set
of instructions for which the execution time is known. The choice of
this set of instructions could be based on one of the instruction
mixes, taking into account the possible effects of one instruction on

its neighbors. For example, on the CDC 6600 there are multiple

 

16
arithmetic processing units. Consider a series of three multiply
instructions. The first one is initiated using one of two multipliers.
Assuming independent operands, the second one may be initiated before
the first terminates since there is still an available multiplier.
However, the third must wait for a multiplier to become available. The
time to perform a multiplication therefore depends on the neighboring
instructions. The main procedure of the synthetic benchmark would call
this timed procedure with a parameter indicating how much processor
time to use from which the procedure would compute the number of times
to repeat the set of instructions. The main procedure of the benchmark,
embodying the load model, could be written in a machine independent
manner allowing easy transfer of the model from one computer to another.

Such a benchmark program, based on a model of a real load, would
allow for corrections to data gathered by software measuring techniques.
Since the program collecting the data is part of the load on the com-
puter system, it is possible for the model to account for its behavior,
biasing the model parameters in the main procedure of the benchmark to
compensate for the demands on the computer system which will be added
by the software measurement. Thus, using a benchmark based on a load
model affords the ability to correct for the effect of the measuring
procedure and also allows the measurements to be based on a known, re-
producible load.

This second point is especially important in the process of tuning
the computer system to obtain improved performance. It is desirable to
use a load similar to the real load the computer system.will be sub-
jected to, while at the same time, it is necessary that the load be

reproducible. Both of these features are offered by a benchmark based

 

 

 

 

17
on a load model.

Another use of a load model is as a driver for a simulation model
of a computer system. The interface between the load model and the
computer system.model depends on the form of the simulation model. The
implementation of the load model must still specify what demands for
resources to make on the computer system model. However, rather than
invoking a procedure to actually use the resource, the load model simply
informs the computer system model. When the computer system.model
decides the demand has been fulfilled, it will inform the load model

which.will then make its next demand. A technique similar to this,

 

trace driven modeling [Cheng 1968], uses a record of actions performed
by a real load, called a trace, to drive a simulation model of a com-
puter system. No attempt is made to abstract the nature of the load
from the trace, however, and so the trace can only be used to drive
models of the computer system on which the trace was measured.

This use of a load model could be valuable in the design of new
computer systems or configurations. Frequently a simulation model of
the computer system is constructed so that the behavior of the system
may be studied. If, for example, the computer system is being designed
for radar signal processing, the parameters for the load model could be
computed from measurements taken on an existing computer operating under
an actual load. The load model with these parameters would serve as a
driver for a model of the computer system allowing the behavior of the
computer system to be studied under a realistic load.

Organization 2: the Dissertation

 

In Chapter 2 the form of a load model is specified and each of its

components is described. Chapter 3 contains a description of the

 

 

 

18
techniques used in computing the model parameters. The concept of a
hierarchy of instances of this model, based on the degree of resolution
provided by the model, is described in Chapter 4. Examples of several
different degrees of resolution are presented with a discussion of some
possible uses of the corresponding instances of the model. Chapter 5
then describes the application of these techniques to determine the para-
meters of a simple instance of the load model. This instance of the
model is implemented in such a way as to generate parameters for a syn-
thetic benchmark program. Finally, Chapter 6 summarizes the results of

the dissertation and suggests several topics for future investigation.

 

 

Chapter 2

Model Framework

Introduction

In this chapter the behavior and structure of a job are defined.
Based on these definitions a framework for constructing a general model
of a job is introduced. The properties of this model are discussed in
light of the criteria outlined in the previous chapter.

Definitions

 

The load on a computer system has been previously defined as the
collection of all tasks to be performed at a user's specific request
which utilize resources of the computer system. It is convenient for
modeling purposes to consider this load as a collection of independent
tasks, called jobs. This assumption of independence among jobs rules
out simulating certain situations which can arise, ng;_one job creates
a file, a subsequent job uses the file. If it is the computation of
the job, i:£L_the actual results obtained from the job, that is being
modeled, then it may be necessary to introduce some mechanism for
expressing dependency. If it is the behavior of the job, 212;.the
time pattern of the computer resource utilization, that is being mod-
eled, then independence is a valid assumption.

The resources of a computer system are the hardware and software
components to which a user has access. For example, a data channel,

a disk drive, a compiler or a sort program are all resources. Implicit

19

 

 

 

20

in the nature of any resource is an access method and a unit of measure.
The access method controls how a user may make use of a resource, 2:8;
a compiler can be executed, a disk track can be written to or read from.
The unit of measure is used to specify how much of a resource is used
and is not necessarily unique. For example, the utilization of a data
channel can be measured either in terms of the number of bits of infor-
mation transferred or the length of time the channel was reserved.

The gwng£_of a resource is an entity which is capable of manip-
ulating the resource. Each resource may have several owners, some of

which may be actively using the resource at any time. A dynamic owner

 

 

of a resource is a job which owns the resource. However, some types of
resources remain owned even when no job is currently using them. The

static owner of a resource is the owner when there is no dynamic owner.

 

Notice that a static owner cannot be a job but is either the system or
some user who is responsible for the resource. For example, most
systems provide a user with the facility to save a file for an extended
period of time, gzg;_permanent files on the CDC 6500. The user who
created the file, and presumably controls its deletion, is the static
owner. Any job which is accessing the file is a dynamic owner.

Resources which are limited as to the number of concurrent dynamic
owners they may possess are called limited resources. The central
processor is such a limited resource and may possess only a single
dynamic owner at a time. A 2 x 4 tape controller is another such re-
source which is limited to no more than two concurrent dynamic owners.
Limited resources merit special attention in modeling since they repre-
sent potential bottlenecks to any scheduling algorithm.

Based on the above definitions, the behavior of a job is defined

 

21
as being the utilization, as a function of time, which the job makes of
those resources for which it is a dynamic owner. In other words the
behavior is not the computation performed by the computer for a par-
ticular job, but the demands that the computation makes on the computer
system's resources.

While it is important for a model to preserve the behavior of the
modeled system, it is also important that the structure be preserved.
In [Ziegler 1972] structure is defined in terms of the states of the
system and the transitions between states. This microscopic view is
useful in demonstrating equivalences between various models and the

original system and can be applied to a model of a job. In particular,

 

the state variables represent the amount of each resource utilized, the
transitions represent changes in these amounts as the job progresses.

At a macroscopic level, structure can be defined relative to the
functional modules of the system. At this level the structure of a job
is readily apparent, i;£;.an initialization step, some number of inter-
mediate steps expressible in an operating system control language (OSCL)
and a termination step. This view of the structure of a job simplifies
relating events occurring in the model to events occurring in the
modeled job and y_i_c_g mg.

The following definitions formalize this view of the structure of
a job. A lgb.§£gp_is the smallest system initiated activity directly
expressible in the OSCL. Any command in the OSCL which can be repre-
sented as a series of other commands is not a job step. For example,

on the CDC 6500 the statement

FTN, I=COMPILE, R=2, 0PT=2.

 

22

specifies a single job step, a FORTRAN compilation. On the IBM 1130
the three statements

// FOR

*ONE WORD INTEGERS

*LIST ALL
specify one job step, again a FORTRAN compilation. The statement ”*ONE
WORD INTEGERS" is not a job step but is simply a parameter specification
for the compiler and would cause no action if it were to appear alone.

0n the Burroughs B6500 computer the statement

 

?COMPILE

commands the operating system to compile the following program and, if
there are no errors, load and execute the program. Since there is a
command which expresses each of these actions independently, "?COMPILE"
is not a job step but specifies three job steps.

A job step is said to be in execution when it is a dynamic owner
of a central processing unit. The term compilation is used when the job
step in execution corresponds to a compiler and the term loading is
used when the job step in execution corresponds to a loader program.

A.jgb_is a sequentially ordered set of job steps starting with a
sign-in or initialization job step and ending with a sign-out job step.
Each job step exhibits a certain type of behavior and the sequence of
these behaviors is the behavior of the job.

The assumption of a sequential ordering is not met by all OSCL's
as many possess a facility for specifying a string of job steps to be
performed independently of the parent job. By considering each inde-

pendent string of job steps as a new job introduced into the load, it

 

23
is only necessary to consider the question of independence of different
jobs as mentioned above.

The Job Model

 

The job model proposed in this section models the behavior of a
typical job in the following sense. N real jobs are run on a computer
and their utilization of resources is measured and used to compute the
parameters of the model. N simulated jobs are produced from the model
and run in some manner, either by a simulation model of the computer or
by the actual computer. Then the total utilization of each resource
simulated should approach the total utilization of each resource mea-
sured as N increases. It is not intended that any one simulated job
will have an exact counterpart in the set of real jobs. In fact, since
the job model includes no information about the scheduling algorithms,
it is impossible to predict anything about the period of real time over
which these resources would be used.

Just as the load on the computer system is considered to be the
collection of jobs run on the system, the load model is a collection of
copies of the job model. The load model must represent two character-
istics of the load, the behavior of the individual jobs‘and the number
of such jobs comprising the load. It is the intent of this dissertation
to study the properties of the job model which specifies the behavior
of the individual jobs. No attempt is made to specify the number of
c0pies of the job model required for a particular load model.

The organization of the job model is based directly on the struc-
ture of a job, as defined in the previous section. That is, one portion
of the model, the sequencer, supplies a sequence of job steps to be

performed. For each job step there is a job step submodel (JSSM) which

 

 

 

 

 

 

24
models the behavior of that one job step. The behavior of the job is
the sequence of behaviors of the individual job steps. This modular
construction has several desirable qualities among which are the cor-
respondence between the structure of the model and the structure of a
real job and the possibility of modeling different job steps with
differing degrees of detail. At the same time this structure increases
the difficulty of identifying the underlying state and transition
structure as in [Ziegler 1972].
The Sequencer

The problem of modeling the sequence of job steps can be expressed

 

in terms of a discrete stochastic process [Feller 1968]. Let N be the
number of distinct JSSM]s numbered from 1 to N. The state of the se-
quencer at step t is then a random variable Xt which assumes a value
from 1 to N with a certain probability. The sequence of job steps which

constitutes a job is

(X09X1’X23000,X)

f

where X0 and Xf must be the sign-in and sign-out JSSM's respectively.

At each step, Xi’ the joint probability

P ( xi , xi_1 , . . . , x

0)
is known. It is an aim.of this dissertation to determine the degree of
complexity required for the joint probability distribution.

A Markov process [Bharucha-Reid 1960, Parzen 1962, Feeley 1971]

is a special stochastic process satisfying the condition that if

( x0 , x1 , x2 , . . . , xn_1 , xn )

 

25
is a sequence of discrete valued random variables, the joint distribution
is defined in such a way that the conditional probability of Xn = x on

the hypotheses X X = x1, ..., Xn-l = xn_1 is identical Wlth the

0=x0’ 1
conditional probability of Xn = x on the single hypothesis Xn-l = x

In the context of the job model, the sequencer would be a Markov

n-l'

process if the present job step depended only on the previous job step.

This property leads to simplicity both in computing the model parameters
and in implementing a simulation of the resulting model since all prob-

abilities are determined based on sequences of length two.

Another property which a stochastic process may possess is sta-
tionarity, $12; the joint distribution function is not a function of
time. Again, in the context of the job model, this would require that
a job have the same pattern of behavior, as expressed in the joint
probabilities, at any time of day or any day of the week. If the model
is a stationary Markov process then it is necessary to estimate only
single numbers, the transition probabilities, from the measured data.
If the model is to be non-stationary, then time functions must be es—
timated.

The effect of using a stationary model is obvious, yi§L_the model
will be unable to express long term variations in the composition of
the load. However, the effect of using a Markov process is not as
obvious. The result of using such a first order model is that it is
possible for the model to assign a relatively high probability to a
sequence which never occurred in the measured data and perhaps can
never occur.

As an example, consider five events, [ A , B , C , D , E ], occur-

ring in sequences with the frequencies shown in Table 5. The transition

 

 

 

Table 5.

26

Sample sequence frequencies illustrating the effects of

using a Markov process for the sequencer.

SE UENCE

A C E E . . .

A B E E . . .

A C D E E . .

A B C E E . .

FRE UENCY

9

90

91

810

sequences in Table 5.

Table 6.
A
B
C
D
E
Table 7.

transition matrix in Table 6.

SE UENCE

A C D E E

A B C D E

A B E E .

A C E E .

A B C E E

A

PROBABILITY

 

.010
.081
.090
.090

.729

PROBABILITY
.009
.090
.091

.810

Matrix of transition probabilities computed from the

D E
0 0
0 .1
.1 .9
0 1
0 1

Sequences assigned non-zero probabilities by the

 

 

 

27
probabilities in the matrix in Table 6 are computed by dividing the
total number of times the event corresponding to the row occurs into
the number of times this event is followed by the event corresponding
to the column. For instance, event B occurs a total of 900 times in
sequences A B E E . . . and A B C E E . . .. Of these 900 occurrences,
90 are followed by event E and the remaining 810 are followed by event
C. Thus, the entry in row B column C is 810/900 or .9 and the entry
in row B column E is 90/900 or .1. Table 7 shows all sequences of
events assigned a non-zero probability by the transition matrix along
with the corresponding probability. A comparison of Tables 5 and 7
shows clearly that while every measured sequence is assigned a prob-
ability, the sequence A B C D E E . . ., which is not one of the orig-
inal sequences, is also assigned a non-zero probability. Furthermore,
the relative magnitude of the probabilities is not consistent.

Related to this problem is the inadequacy of a first order model
to handle the situation of frequently occurring groups of job steps.
For example, on the CDC 6500, the sequence "FTN.LGO.", to compile and
execute a FORTRAN program, appears very frequently. However, both
"FTN." and "LGO." can also appear in other sequences. Local correc-
tions can be made to the sequencer to make use of higher order prob-
abilities by relabeling "FTN." as "FTNl" and "FTN2" where "FTNl" can
occur in any context and "FTN2" can appear only in the context "LGO.".
Figure 2 illustrates the transition structure this introduces.

The results in Chapter 5 demonstrate that the limitations of a
first order Markov model as the model of a job do not seriously effect
the usefulness of the model.

Based on the assumption of a stationary, first order Markov process,

 

 

 

28

FTNl ,

arbi-
trary

 

Figure 2. State transition structure to produce the sequence of

job steps "FTN.LGO." with the correct frequencies.

 

29

the sequencer model may be represented as a six-tuple:
( S , sO , sf , T , J , A )

where S is a finite set of states,
3 e S is a unique initial state corresponding to the
sign-on procedure,
3 e S is a unique final state corresponding to the
sign-off procedure,
T:SxS * [0,1] is a transition function whiCh assigns a tran-

sition probability to each pair of states,

J is a finite set of job step submodels (JSSM),
and
A:S -*J is a function assigning one JSSM to each state.

Insuring that the sequencer is a Markov process requires:

Vs e S 23 T(s,t) = 1.
Vt€S

Furthermore, to satisfy an intuitive requirement for a job,
T(t,so) = 0 Vt e S

since a job can only be initiated once, and
T(sf,sf) = 1

since the final state corresponds to sign-off.
Ag Exgmple g; g Sequencer
The IBM 1130 has a rudimentary operating system which provides

four distinct job steps to the user as shown in Table 8. In addition

 

 

Table 8.

30

Job steps provided by the operating system on the
IBM 1130.

N§M§_ FUNCTION

JOB sign-on and initialization

FOR I FORTRAN compilation

ASM assembly language compilation
XEQ execution of previously compiled

program

 

31
to these there is an implicit sign-off job step. The compilers are such
that each subprogram must be preceded by a separate compilation direct-
ive. A typical job, therefore, will consist of a number of compilation
operations, either FORTRAN or assembly language, possibly followed by
an execution. The set of states is { s0 , s1 , $2 , s3 , sf ] with 30
being the initial state and s the final state. The assignment of

f
JSSM's to states by A is:

A(so) = JOB
A(sl) = FOR
A(sz) = ASM
A(33) = XEQ

A(sf) = sign-off.

Figure 3 shows the states and transitions pictorially.
Job Step Submodels

The sequencer serves to give the proper ordering to the JSSMis
which actually model the behavior. Since it is behavior that is being
modeled by a JSSM it is not necessary to have a one to one correspon-
dence between distinct types of job steps and JSSM's. If two job
steps have the same behavior the same JSSM will serve for both. If one
job step exhibits two or more distinguishable patterns of behavior there
will be a corresponding number of JSSM's. In the next chapter a tech-
nique is given for assessing which JSSM's are necessary.

The set of resources supplied by the computer system to be util-
ized by the JSSM's is denoted as R. The set of JSSM's, J, contains

individual models, M for each behavior exhibited by some job step.

1,

Each such model is defined as a triple:

 

32

 

 

 

 

Figure 3. The sequencer for the IBM 1130 with a simple operating

used in the example.

 

where

E.
1

33

is the set of resources for which Mi can be a
dynamic owner. The individual resources in
this set are enumerated Rij e E1, 1 S j 5 [E1].
Rij is used to denote the resource and the
range of values its utilization.may take on
interchangeably.

is a set of probability distribution functions
which determine the utilization of resources
uses. The functions

1

F.. e D. where 1 s j s [0.] are defined as
1] 1 1

in E1 the JSSM M

Fijzwij -*[O,l]

where'Wij E E1. In particular, each resource
may appear as an argument of only one distrib-
ution function.

is a set of condition functions which deter-
mine the relative rates of resource utiliza-

tion. The conditions are enumerated as

C.. e L. where 1 S j S lL.I and are defined as
13 1 1

—-o

Cijzvij Rik.
J
where Vij E E1. The use and form of the con-

dition functions is described in detail sub-

sequently.

This definition provides two methods for specifying the amount of

 

 

 

34
a resource to be utilized by the job step. For some resources, rfj,
the total amount of resource Rij utilized is determined by a distrib-
ution function. If no condition function is given for the resource,
the rate of utilization is uniform, as is assumed for the implicit
models discussed in the last chapter. In other words, if r¥j units of
the resource are to be used in t units of time, then r§jlt units of the

resource are used in each unit of time. If, however, the rate at which

R. is to be used is to depend upon some other resource, R.

dition function of the form

rij(t) = f(rik(t)) ° rgj / f(r§k)

is used where r..(t) is the utilization of resource R up to time t

13 11
j(t) might depend on several

other resources and f would be a multivariate function. As an example

in the simulation. In the general case ri

consider a job step which copies the contents of a tape to a disk. The
resources are R11, measured by the number of buffer units transferred
from tape to memory and R12, measured by the number of buffer units
transferred from memory to disk. The total amount of each resource
utilized is a random variable determined by the appropriate distribution
functions. However, the rate at which one resource is used must be
directly proportional to the rate of the other and is expressed by the

condition function
= . 3‘: it
r12(t) r11(t) r12 / ril'

This form for the condition function is based on the assumption that the
speed of the tape drive is the limiting factor and that the rate of

utilization of Ri is specified in terms of some other resource.

1

 

 

35
The second situation which can occur is that the total amount of
resource Rij is not a random variable but is simply a direct function

of the amount Of some other resource, R In this case a condition

ik'

function
rij(t) = f<r,k(t>>

is used. For example, if a compiler is considered to be a resource
R11, the utilization Of this resource, ril(t)’ is expressed in state-
ments compiled. The rate, k, at which this resource can be used is

expressed in statements per second. Then the processor time required,

r12(t), is uniquely determined by a condition function of the form

ri2(t) = ril(t) / k.

An Example gf'g_JSSM
The sequencer example presented above used five JSSM's of which

one was a FORTRAN compiler, FOR. This JSSM is expanded here as:

FOR = ( E1 , D1 , L1 )

E1 = {R11 (central processor) , R12 (disk drive) ,
R13 (card reader) , R14 (line printer) ,
R15 (elapsed time) }

D1 = {F11(R12’ R13: R14) 1

L1 = {r11(t) = min(r13(t) / k1 , r15(t)) ,

r12(t) = r13(t) r12 / r13 ’

r13(t) = min(r’i‘3 , r15(t) ° k2) ,
r14(t) = g(r13(t)) . ria / g(ri3) }’

E is the set Of resources which will determine the behavior of

 

36
the FOR job step. The distribution functions determine the joint dis-
tribution Of the number of cards to compile, the number of disk accesses
required and the number of lines printed as a result. The first con-
dition function Specifies that processor time is used as determined by

the rate Of compilation, k eXpressed in statements per second, and

1.
the Speed at which statements are presented to be compiled. However,
the processor time can be used no faster than the elapsed time. The
second condition function is similar to that for the tape to disk copy
given previously. The third condition function Specifies that no more
than the total number Of cards may be read and these may not be read
any faster than the card reader Operates, Specified in cards per

second by the rate k The final condition function relates the amount

2.
of printing to the number of cards read. Since the compiler prints a
header on the listing and compilation statistics at the end, the re-
lationship given by g is not linear but has the form approximated by

the curve in Figure 4.

Summary and Conclusions

 

In this chapter the terms structure and behavior of a job are
defined and a model for a job is introduced based on the definitions.
This model allows expression of some of the interdependencies between
the various resources available in a computer system. The structure
of the model preserves that of a real job and so allows for comparison
of the model behavior with the behavior Of the job being simulated.
For example, if the model has one JSSM which exhibits behavior which
is inefficient on the particular computer, it is a simple matter to
identify which real job step the JSSM corresponds to. This job step

can then be modified to produce more suitable behavior. The parameters

 

37

 

 

1007. '1'
%
lines
printed
0% %_
0% % cards compiled 100%

Figure 4. Condition function relating cards compiled to lines
printed showing the effect Of printing header and

trailer information.

 

37

 

 

 

1007. ‘4’
1
lines
printed
07. .1—
0% % cards compiled 100%

Figure 4. Condition function relating cards compiled to lines
printed showing the effect Of printing header and

trailer information.

38
of the model are expressed directly in terms of the resources which
determine the behavior while the use of a first order stationary Markov
process for the sequencer provides for simple computation of the para-
meters. Finally, the modular construction of the model allows added
detail in any portion without requiring any additional detail through-

out the entire model.

 

Chapter 3

Computation of Model Parameters

Introduction

 

In the previous chapter the load on a computer system was con-
sidered to be a set of independent jobs, with a first order model

prOposed for each such job. The construction of such a model is a

 

complex task requiring the precise Specification of structure within
the framework determined by considerations discussed in Chapter 2.
The computation Of the parameters required for such a model is dis-
cussed in detail in this chapter.
The procedure for Specifying the job model can be divided into
six steps:
1. Specification of '1, the set of resources of interest.
2. Measurement of a set of jobs to determine the utiliza-
tion of these resources.
3. Determination of the set of JSSM's required.
4. Specification of the structure of the required JSSM'S.
5. Computation of the transition probabilities for the
sequencer.
6. Computation of the distribution functions for the JSSM'S.
These steps are discussed in detail in the following sections.

Resource Selection

 

The resources which a computer system.makes available to the
users are of two types, those embodied directly in the hardware and

39

 

40
those embodied in the software. Examples of the hardware resources are
Obvious, ng: tape drives, card readers, or core memory. Since most
hardware resources are limited resources, and thus potential bottle-
necks, careful consideration should be given before deciding to exclude
one from R. Such a decision may be valid if it is known that it is
very unlikely that the demand for the resource to be omitted would
ever exceed the limit. For example, assume a computer with 100,000
words of memory can have at most four partitions. In addition, assume
it is known that 99% of all jobs to be run require 22,000 words or
less. In this case, the small error introduced by omitting memory as
a resource may be an acceptable trade for the simpler model.

As will be remembered from the definition Of a total computer
system, a job has access to the computer software at two levels. At
the first level are utility programs, such as compilers. A compiler
may be considered a resource for which the utilization is measured
by the number of statements compiled. In this case, the JSSM using
the resource would consist of a single distribution function specify-
ing a particular number Of statements. The software resources are
well defined at this level and the decision to include or omit any
one can be based on the usage which it is expected to receive. If the
usage is small enough, it might be worthwhile, for the sake Of concise-
ness, to lump the resource together with other infrequently used re-
sources into a miscellaneous resource.

It is at the level of the system request that software resources
are of use in modeling user written programs. The detail presented in
the model of the FORTRAN compiler in Chapter 2 is typical of what

might be expected in a model Of a general user's program. Here again,

 

41
which requests to consider resources and which to omit may be decided
based upon the usage each is expected to receive. Any requests which
are not specifically included can be lumped into a miscellaneous
resource.

Several lists of potential resources have appeared in the liter-
ature to aid in evaluating systems with respect to which features are
available. These lists are by no means complete but may be found in
[ANSI 1971, Comtre 1970, Ferrari 1972, Hunt 1971, Joslin 1965,
Mittwede 1970, Rosin 1965, Walter 1967].

Measurement and Sampling

 

 

Since the model parameters are to be determined to simulate a
real job, the utilization Of the resources decided upon must be mea-
sured in some way. Depending on the degree of detail decided upon for
the set Of resources, measurements must be made at a corresponding
level. A reasonable level of detail is commonly furnished by the
operating system log which is routinely provided by large scale
Operating systems. Information generally available from this log is
given at the job step level and consists Of such items as: processor
time, memory utilization, disk file utilization, unit record utili-
zation, and number of tape mounts. EXperiments have been conducted
by [Hunt 1971, Rosin 1965, Walter 1967] to determine the characteris-
tics of the load on a computer based on this information.

Data available from system logs is generally useless in analyzing
what occurs within a job step. For information from within job steps
software measurement techniques may be employed [Bemer 1968], con-
sisting of inserting diagnostic code at various points in the Operating

system to record the desired data. This technique is very flexible

 

42
since it may access any information available to the Operating system
and yet can be selective, producing output only under certain well
specified conditions. TWO disadvantages of this technique are that
the presence Of the diagnostic code influences the behavior of the
system in some manner which is unmeasurable except by more complicated
techniques, and modifications to a production Operating system are of
a sensitive nature due to the possibility of introducing unstable
behavior.

Hardware monitoring devices [Apple 1965, Miller 1971] are avail-
able which can supply information as detailed as the number Of times
a particular instruction is executed. Since these monitoring devices
are completely external to the computer system, they are not in com-
petition for resources with the jobs being measured. This technique
suffers from not being able to identify the monitored information
with the actual job step being performed. This is eSpecially impor-
tant in a multiprogramming computer system since portions of different
job steps may be interleaved.

The measurement procedure carried out to Obtain data from which
the model parameters may be computed is a statistical experiment.
Hence, an underlying distribution is assumed for the job steps from
which a sample is selected. Using this sample the parameters Of the
underlying distribution are estimated. The actual sampling is done
at two different times, one when the data is collected and the other
when it is analyzed.

In the first instance, data collection sampling, the motive is to
minimize the effects of measurement on the system being measured. If

the measurement consists of the system log, no sampling is required

 

43
since the overhead involved in producing the log is part of the every
day environment. With software or hardware monitoring techniques it is
sometimes difficult tO assess just what the affect is. The influence
of software monitoring can be analyzed in terms of the length Of the
measuring procedure and the number of times it is called, but deciding
the precise effect of this extra load in a multiprogramming system with
multiple jobs competing for resources is so difficult that a hardware
monitor may be required. With a hardware monitor it is possible that
the connection can cause timing problems Uniller 1971] which might
well go unnoticed while biasing the measurements.

If the data collection technique is to sample the load, the
sampling unit must be an entire job. This is necessary so that the
transition probabilities may be computed for the sequencer. The sam-
pling scheme must be constructed in such a way that the jobs select-
ed accurately reflect the make up of the total population of jobs.
Since almost any computer system load varies over time, proportional
sampling can be used to insure correspondence between the sample and
the real load [Cochran 1963, Sukhatme 1954]. Proportional sampling
requires that if n jobs are to be selected from a sample space con-

taining N jobs and Ni jobs occur in some time interval ti’ then

111 = N1 x n / N

jobs must be selected from the interval t1.
In deciding what time period, ti’ to consider for sampling,
consideration must be given to the size of the samples, n1, which

would result. It must be remembered that parameters are being estim-

ated for the entire population of size N, not for the individual

 

44
subpopulations Of size Ni' Thus, it is reasonable to pick ti small
enough so that n1 is one or two. In this way, small cycles are not as
likely to be overlooked.

If it is desirable to keep 111 large enough to estimate the para-
meters for each suprpulation, sampling theory indicates that relatively
small values Of ni would give acceptable accuracy DAfifi 1972]. For
example, assume x is a normally distributed random variable with stan-
dard deviation 20. Then for a sample Of size 15 with sample mean xm,
it can be said with 95% probability of being true that the population
mean lies in the interval [xm - lO , xmi+ 10]. By increasing the
sample size to 60 the interval is narrowed to [xm - 5 , xm + 5].

Sampling is also a consideration in analyzing the data. While
certain analysis techniques are sequential and make a single pass
through the data, others require the entire data set for iterated
calculations. In this case, memory constraints require the selection
of a subset Of the data for analysis. To avoid placing undue emphasis
on local conditions it is desirable to select the subset so that it
represents the widest possible range in time. For example, in a
university computer center environment, student jobs tend to be short
and are most likely to be submitted at a time corresponding to the
end of a class period. In contrast, research jobs will generally be
somewhat more complex and will be spread more uniformly throughout
the day. If a sample of jobs is selected immediately before the end
of a class, it would be biased towards research jobs. If the sample
is taken shortly after the end of a class, it would be biased towards
student jobs. By selecting the desired number of jobs Over a span of

time, local biases are smoothed out.

 

45

Determination of the Types g£_Behavior

 

As was pointed out in the last chapter, each JSSM models the
behavior Of a job step with the behavior characterized by the utili-
zation of the selected resources. Thus, a separate JSSM is required
for each type Of behavior. For example, considering the case of a
FORTRAN compiler in a university environment, many instances Of this
job step compile a small number of statements (student programs) while
the remaining instances involve compiling substantially more state-
ments (research programs). Furthermore, behavior of the first type
(student) probably appears in contexts different from those in which
behavior of the second type appears.

If each job step is considered to be a point in a multidimensional
sample space where the coordinates correspond to the various resource
utilizations, then the various types Of behavior can be identified by
a cluster analysis technique [Ball 1965, Edwards 1965, Rao 1971,
Ruspini 1969]. Such cluster analysis techniques detect regions Of
high point concentrations, called clusters, in the sample space. All
job steps appearing in a single cluster made similar utilization of
the various resources, and so exhibited similar behavior. Therefore,
each cluster represents a particular JSSM. Figure 5 illustrates the
result of plotting the possible utilization of two resources for a
collection Of job steps from the IBM 1130. While the data displayed
here has been constructed to demonstrate the clustering concept, real
data will not always be as well separated, thus requiring sophisticated
techniques for determining the clusters.

The goal of any clustering algorithm is to establish decision

surfaces in the sample space. These decision surfaces partition the

 

46

 

 

F F F AF
311 x x F AAF
x F
A AA
A A
A
x
XA A
x
J JS
J JSS
R12
J JOB x XEQ
F FOR 3 SIGN-OFF
A ASM

Figure 5. Hypothetical distribution of measured job step resource

utilization in a two-dimension space.

 

 

 

 

J JOB X XEQ
F FOR S SIGN-OFF
A .NSM

Figure 6. Hypothetical distribution of measured job step resource

utilization as in Figure 5 with decision surfaces shown.

 

47
Space into disjoint regions, each region corresponding to a single
cluster. Figure 6 illustrates the decision surfaces for the example
of Figure 5. A new data point is identified with a cluster, and thus
a type of behavior, according to which region it is contained in.

As can be seen in Figure 6, these decision surfaces can be very
complex, making them.difficult tO represent in a concise and efficient
manner. One method of simplification used in several clustering
algorithms [Ball 1965, Rao 1971, Ruspini 1969] is to represent each

cluster by a single cluster pgint. These cluster points, one for each

 

cluster, are used to specify the decision surfaces in the following
manner. Let x be a new data point to be identified with a cluster and
let d(',') be a distance metric defined on the sample space. Then if

the cluster points are

{pl 3 pz 3 ° ° ° ’ pm},

the point x is associated with cluster i if
d(X.pi)<d(X.pj) Vja‘i.

This representation Of the decision surfaces has the advantage Of being
compact and computationally straightforward. The clustering algorithms
which represent each cluster by a cluster point differ only in the
method of selecting this cluster point.

One appealing method of cluster point selection is described in
[Rao 1971] and depends on several assumptions:

1. There exists a probability distribution which generates
the multivariate Observations to be clustered.

2. The probability density function Of the distribution is

 

48
continuous and is characterized by one or more distinct
modes.
3. The sample covariance matrix for the set of N observations

has rank [RI (maximal rank).
Identifying the intuitive concept Of cluster point with the statistical
concept of mode is the basis for this algorithm. Statistically, the
9222 of a density function, f('), is defined as that value Of x such

that

f(X) 2 f(y) V Y 7‘ K

1:34 the value of the random variable which occurs most frequently. It
is possible for the density function to have several local maxima which,
for the purposes of clustering will be considered separately.

The choice of a technique for locating the mode of a density
function depends on the dimensionality Of the sample space. For one
dimension distributions a histogram can be constructed. The mode
corresponds to that bin of the histogram.with the largest count. For
two or more dimension data, an alternate method must be used. One
such method is imbedded in the follOwing clustering algorithm from
[Rao 1971].

1. Initially assign all data points to the first cluster.
The modes of this cluster will be located and will become
new cluster points. The data will be reassigned to the
nearest cluster point.

2. Pick a cluster and compute the sample covariance matrix
for all data points identified with that cluster.

3. Compute the eigenvalues and eigenvectors of the sample

 

49
covariance matrix and order by descending eigenvalue. The
eigenvector corresponding to the largest eigenvalue is
the direction of largest variance for the data, 352;.
Starting with the first eigenvector, project all the
data points onto the eigenvector.
Locate the local maxima of this one dimension projection
by constructing a histogram.
If two or more local maxima are located, a new cluster is
added for each. Reassign each data point to a new cluster
according to which maxima it is nearest. Return to step 2.
If a single maxima has been located, retain it as one
coordinate of the cluster point in the higher dimensional
space and, using the next eigenvector, repeat steps 4, 5
and 6. When all eigenvectors have been used proceed
to the next step.
The data points possess a single maxima on each projected
axis and so individual coordinates saved in the previous
step are the coordinates of the cluster point in the
higher dimensional Space.
If any clusters remain which haven't been analyzed,

repeat steps 2 through 9.

Applying this algorithm to a set of data produces two results. First,

each Of the data points is identified with a cluster. Second, the

coordinates of the cluster points are known allowing the identification

of new data points with the clusters. These cluster points are shown

in [Rao 1971] to be consistent estimates of the modes of the underlying

density function if the modes exist. Each cluster is taken to represent

50
one JSSM with those data points identified with each cluster being
instances of the correSponding job step.

There are two parameters which control the algorithm, the minimum
acceptable number of points in a cluster and a threshold parameter. Any
cluster found containing less than the minimum number Of points is
considered spurious and the points are reassigned to the nearest accept-
able cluster. The threshold parameter, P, assumes a value in the
interval [0 , l] which determines how large a difference is required
between two neighboring bins in the histogram for the larger bin to
be considered a local maxima. The actual threshold is computed using

the FORTRAN statement

THRESH = DIFMIN + P * ( DIFMAX - DIFMIN ),

where DIFMAX is the largest difference between two neighboring bins
and DIFMIN is the smallest difference. Thus, if P is 0, the threshold
is set at DIFMIN and any difference between neighboring bins will
define a local maxima. If P is l, the threshold is set at DIFMAX and
only the largest difference will define a local maxima.

The purpose of any clustering algorithm is to detect the under-
lying structure which the data possesses. The purpose Of the two
parameters controlling this particular algorithm is to determine how
.Strongly defined the structure must be to be detected. Thus, while
the algorithm will detect a valid structure for any parameter values,
the parameters should be adjusted to produce the most Strongly defined
structure. In general, the larger the parameter values, the more
strongly defined the structure must be. The best values Of the para-

nmters depends heavily on the data being analyzed and so must be

 

51
redetermined for each new set of data.

The number Of clusters Obtained is a function Of each of the
parameters, 2:24.33 P increases the number of clusters decreases and
similarly for the minimum number of points acceptable per cluster.

One Situation which has been observed experimentally is that this func-
tion has a cut-Off point. That is, for a certain range of parameter
values the number Of clusters varies directly with the parameter value.
For a parameter value larger than the cut-Off point, there is only a
single cluster detected. If the function behaves in this manner, the

cut-Off point is the value of the parameter to use. If the function

 

does not exhibit a well defined cut-Off point, then any value of the
parameter may be used, so the parameter value can be chosen to produce
some desired number of clusters.

Due to the nature of Rao's implementation of the algorithm, memory
constraints limit the product N x [Ii] to 6000. One Side effect of the
clustering algorithm is that it provides the necessary information to
eliminate unnecessary resources, thus increasing the number of data
points which the algorithm can process. Those resources for which the
utilization is eXpressible as a function Of one or more other resources
can be detected since their presence causes the sample covariance
matrix used in the clustering algorithm to be singular. These resources
can be eliminated from I? and reconstructed by a condition function in
a JSSM. Certain other resources are not Significant in locating the
clusters and can also be eliminated. These resources may be identified
by examining the rotation matrix used in the clustering algorithm.

Each row of this matrix contains a set of coefficients applied to the

utilization of a particular resource in determining the projections

52

onto the several eigenvectors. Thus, if the entries in one row are
several orders of magnitude smaller than the corresponding entries in
the other rows, the resource which corresponds to that row has a
negligible influence on the projected coordinates.

Once the dependent and insignificant resources are removed from
I? , it is necessary to obtain a relatively small sample for clustering
from the total set of job steps run. To date nothing appears in the
literature which characterizes the effect Of sampling and the relation

between the sample modes and the population modes. .According to

 

[Tata 1972] this is due to the definition of the mode and the fact that
it cannot be eXpressed in a form which lends itself to analysis. While
a sample size of 1000 is felt to be adequate for populations in excess
Of a million members, the following stratified sampling method affords
verification of the results.

1. Use a random sample of 1000 members and determine the
clusters.

2. For each cluster, draw a sample Of 1000 members from
the population which should belong to the cluster.

3. Determine the cluster points Of this new sample. If
more than one cluster is found, the initial random sample
was biased and step 1 must be repeated, drawing a new
initial sample.

This procedure may be used to check the consistency of the original
cluster points. The cluster points finally determined represent the
various JSSM'S and thus the states of the sequencer.

Identification of the types of behavior can serve another useful

purpose independent of modeling. Consider again the example of the

 

53
student programs and research programs at the beginning of this section.
The existence of two distinct clusters for the FORTRAN compiler suggests
that it might be profitable tO have two FORTRAN compilers. One compiler
can be designed to provide the best possible service for student
programs while the other is designed specifically for large research
programs.

JSSM Structure

 

Having identified the required JSSM'S, or more exactly the behavior
required, the exact structure needed to produce this behavior must be
decided upon. The structure given to the JSSM is based upon deciding:

l. which resources the job step uses,
2. which of these resources are random variables and which
are to be expressed as functions Of the others, and
3. what interrelations exist between the various rates of
utilization.
The problem of choosing which resources to include in the JSSM is
Similar to that of selecting the total set Of resources, '3. Additional
information is available from the clustering concerning the range Of
values the utilization of each resource may be expected to take,
suggesting that those resources which are used only Slightly might be
omitted.

Selecting those resources to be treated as random variables can
perhaps best be done in two steps. First select those resources which
are functions of other resources. This selection must be based on
personal judgement although once the selection is made there are tests
which will indicate whether or not the assumed relationship holds for

the measured data. Those resources which are selected as being functions

 

 

54

of other resources will be determined by a condition function of the
second type, rij(t) : f(rik(t)). Second, those resources not selected
must be random variables for which appropriate distribution functions
must be selected. Here again, the question of interdependencies of
variables must be considered to determine whether all random variables
are independently or jointly distributed. Statistical tests exist to

aid in answering this question.

Constructing the condition functions for a JSSM can require possible

additional data to be measured. If the data obtained previously is

 

quantized at the job step level it is sufficient to determine the
various required JSSM'S at the same level Of detail. However, if a
condition function is to be specified with a greater degree of detail
then data must be Obtained which represents the same degree Of detail.
Alternatively, a rate may be hypothesized based on some extra know-
ledge Of the job step. For example, if a two-pass FORTRAN compiler
reads all input statements in pass one and it is known that pass one
takes 70% Of the processor time, then the card read rate may be Specified
as a condition function based on the processor time in such a way that
when 70% of the processor time has been used, 100% of the statements
to be read has been used.

Transition Probabilities

 

Once the set Of JSSM'S has been determined, each measured job
step may be identified with a JSSM by the use Of a minimum distance
classifier, i;e;_the distance in the ‘lii-dimension Space is computed
from the job step to each of the cluster points. The job step is then
associated with the closest cluster. Thus for each job measured a

sequence Of JSSM'S, or states, is obtained.

55
The parameters to be computed for the sequencer are the estimates
Of the transition probabilities. This process can best be described
by considering the estimation of transition probabilities from one

state, 81. Altogether, there are
NS=]S]

states where S is the state set. There are 111 measured occurrences of
state Si. Each of these precedes some other state from S, say Sj.

The number of times Si is followed by sj is mij and

The set of transition probabilities from state Si,
1 .
{pij I s J S N3}.

is a typical multinomial distribution and the problem of computing the
estimates for pij is well defined in [Good 1965].

The maximum likelihood estimate is

Assuming N, the number Of job steps measured, and n1 and mij are
suitably large for all i and j, this is the best possible estimate.
Although almost limitless amounts Of data are available, for several
reasons mentioned previously, it is desirable to limit the sample Size.
[Good 1965] discusses estimation techniques based on "effectively small

samples." In this case "effectively small" means samples such that at

least one of the sample frequencies is very small, say mij = O or

56
m.. = 1.
1J
Quoting from [Good 1965],
"Many statisticians faced with this estimation problem would
make some ad hoc modification to the Observed frequencies.
That is, they would replace ni by some Slightly different

number, up, and use up / N as their estimate of pi ...”

In [Johnson 1932] the probability estimate

+ .
pij - (mij + k) / (ni + k NS)

is used where k is a flattening constant. In [Whittaker 1935] it is
Shown that, basing the computation of k on the Bayes postulate that

the prior distribution is uniform, the flattening constant k = l is

implied. Thus the estimate becomes

+
Pij = (mij + 1) / (ni + N3).

A somewhat improved estimate can be made as follows. Since N,
the total number of job steps measured, is much larger than any ni,
the prior distribution may be computed from the entire sample. The

revised estimate is

pi; = (mij + ni) / (ni + N)

which, in the limiting case of a uniform prior distribution, becomes

pij 0

Representation g£_JSSM Distribution Functions

 

 

The distribution functions, Fij’ which are a part Of each JSSM
are determined empirically from the measured data. To determine the

location of the various clusters, a histogram is constructed which,

57
in itself, is a representation of the density functions. For this type
of representation accuracy depends on the number of bins used in the
histogram. The assumption made when using a histogram in which each
bin corresponds to several values of the random variable is that the
distribution within a bin is uniform. Thus, if tOO few bins are used,
the representation loses accuracy. On the other hand, if too many
bins are used, the representation is accurate but unwieldy because of
the unneeded bins. Frequently, the number of bins required to obtain
an accurate representation is far too large to be useful in an imple-
mentation of the load model. A statistical model fitted to the data
resolves this difficulty.

Due to the procedure for JSSM determination, each cluster corre-
sponds to a single maxima in the multidimensional resource Space, and
each Fij will have only a single mode. Thus, any density function
with a single mode is a possible model. If the histogram is symmetric
the Gaussian distribution is an attractive model. The properties of the
distribution are well understood and there are only two parameters,
the mean and standard deviation, for which there are consistent unbiased
estimators [Keeping 1955]. Furthermore, the Gaussian distribution is
directly applicable to multivariate data.

A second distribution which may be used as a model is the Weibull
distribution [Weibull 1951] which has the form

1 - eXp[ - (x - b)c / a] x 2 b
F(x) =
0 x < b.
The parameter a is called the scale parameter and influences the form

of the distribution in much the same way as the standard deviation

58

F(x)

1.0 A
c=0

 

C>>0

 

 

 

Figure 7. Illustration of the effect Of the shape parameter, c

,

on the form of the Weibull distribution.

 

59

influences a Gaussian distribution. The parameter b, the location
parameter, is analogous to the mean for a Gaussian distribution. The
extra flexibility is afforded by c, the shape parameter. Figure 7
illustrates the effect of the shape parameter on the cumulative dis-
tribution function. For c=O, f(x) becomes an impulse function, 3:2;
highly skewed. As c increases, f(x) becomes more and more symmetric,
approaching a Gaussian distribution. Procedures for estimating a, b
and c are given in [Dubes 1970]. Unfortunately, the properties of
these estimates are not well known although simulation experiments
have verified their usefulness.
Summary

The computational procedures presented in this chapter are all
well defined and understood techniques. The basis for the determina-
tion of the model parameters is thus firmly grounded in statistical
theory. The structure Of the job model gives firm direction in the
performance of the statistical experiment by identifying those variables
(resources) which must be measured. And yet, in the specification Of
the condition functions, knowledge of the system not directly avail-

able from the measurements may be included.

   

Chapter 4

A Hierarchy of Instances of the Job Model

Introduction

 

In Chapter 2 a general framework for the construction Of models
of the load on a computer System based on a model for a job is presented.
Particular instances Of this framework may be used to model a load at
varying levels of detail. In this chapter a partial ordering on such
instances of the job model is described. For several instances Of the
model the data requirements for parameter computation and the uses of
the model instance are discussed.

Formal Basis for a_Hierarchy

 

The degree of detail provided by a particular instance of the job
model may be measured in several different ways. One such measure of
detail is the number of resources for which the utilization is modeled.
While the number and type of resources to be included in any instance
of the model depends on the computer on which the modeled load is to
be run and on the goal of the modeling, there are three general cri-
teria for including or excluding a particular resource. First, any
resource which is a limited resource for the computer on which the job
model is to be run, and therefore a potential scheduling problem,
Should be modeled. Second, no resource utilization which is eXpress-
ible as a function of one or more other resources should be included.

Finally, any resource which is not significant in determining the

60

61
types of behavior exhibited by the load Should be excluded. Techniques
for identifying those resources which meet the second and third criteria
are discussed in the previous chapter.

The application of these three criteria must be tempered by two
further considerations. First, those resources included in the model
must be consistent with the general level Of detail. For example, even
though the adder of the CDC 6600 is a limited resource, it should not
be included as a resource to be modeled if the desired level of detail
in measuring the use of the processor is satisfied by the number of
processor seconds used. Second, a resource should not be excluded if
it is essential to Satisfy the reason for modeling. For example, if
a load is being modeled to study the effect of memory utilization on a
computer, then memory must be included as a resource.

A second measure of the detail of an instance Of the load model
is the form of the distribution functions Specifying the utilization
of the various resources. That is, if the utilization of each resource
is statistically independent of the utilization of all other resources,
then all distribution functions are univariate. If certain resources
are statistically dependent on other resources, joint distribution
functions are required. Standard statistical tests are available to
identify the independent and jointly distributed resources.

The final measure of detail of an instance of the job model is
the form of the condition functions which serve to specify the relative
rates at which the resources modeled by each JSSM are utilized. These
condition functions allow the modeled resource utilizations, as functions
of time, to accurately model the measured resource utilizations.

By formalizing the idea of the level of detail provided by an

62
instance of the model, a partial ordering, SR’ of such instances of the
model can be derived. This partial ordering is defined such that, if
Ill and Iﬂz are two instances of the model for a particular load, and
Ill Sh Iﬂz, Iﬂz provides a better approximation to the measured re-
source utilizations than does Ill. The relation Ill sRIﬂz is read

"I62 refines Ill" and is based on the following definitions.

Let
R: {R1 , R2 , ,Rn]

be the set Of all resources of the computer system and let R1 Q R

and R2 (2 R be defined as

R1

ll
r-M.
7:1
71

and

For each resource, Rk’ the utilization is denoted by rk(t) and is

expressed in apprOpriate units. The job elapsed time, 'T, is the

 

ordered aggregate of all time intervals during which the job uses
any resource. The job elapsed time is measured from the beginning Of
the Sign-on job step to the completion Of the sign-Off job step.

A job is a vector valued function Of time defined to be
J(t) = r1(t) x r2(t) x ... x rn(t).

The restriction 2f_a_job to a resource set '31 is also a vector valued

 

function of time defined to be

 

63
J(t) [R1 = r1“ (t) x ri (t) x x ri (t).
l 2 m
The load on a computer system, |., is the set of all jobs run on the

computer system and the restriction c_)_f_ the load to R1, L] R1, is

 

the set of restrictions to '31 Of all jobs run on the computer system.
For a fixed value Of time, ¢ e'T , the values Of .J(T) for all jobs

in the load can be described by an n-dimension histogram. This histo-
gram represents the measured distribution Of all resource utilizations
which any instance of the model must approximate at time T.

An instance of the job model, M, over the resource set R
modeling load I. is a vector valued stationary stochastic process
[Papoulis 1965], x(t,z), where t e'T and z is an outcome Of an experi-
ment, namely the selection of appropriate total resource utilizations
according to the distribution functions Of the JSSM'S. This stochastic

process may be interpreted in two ways. For a fixed outcome, 2,

X(t . Z) = ri(t) X ré(t) X rr',(t)

where r](t) is the approximation to the utilization Of resource Ri
embodied in the condition functions. For a fixed value of time, T e 1',
x(T,z) is a random variable with a distribution of values Over all
possible outcomes, z, of the experiment.

When comparing two instances Of the job model to determine whether
one is a refinement of the other, the basis for comparison is how well
each model approximates the real jobs. While it is possible for a
model to approximate the measured jobs equally well for all values in
the interval T, such detail is difficult to Obtain and frequently not

needed. As will be illustrated by sebsequent examples, all that is

 

64

required to study many aspects of a computer system is that the model
of the job approximate the real jobs closely only at particular times.
The set of all such times at which the model must closely approximate
the measured jobs is called the consistency set of the model. Further-
more, the times which make up the consistency set are the only times
at which the modeled utilizations can be evaluated. The values Of the
modeled utilizations at all other times is immaterial. Figure 8
illustrates the utilization of one resource made by a real job and three
different models approximating this utilization. Each model has a
corresponding consistency set.

Intuitively, x2(t) is a refinement of x1(t) if x2(t) is identical
to the resource utilization made by the real jobs at least everywhere

that x1(t) is, i.e. c g c . Referring again to Figure 8, x2(t) is

1 2

a refinement of x1(t) as is x3(t). However, x2(t) and x3(t) are not

comparable Since c2 i c3 and c3 t c

2.
The actual definition of a consistency set is complicated by the

fact that, for each value in 'T, the real jobs and the instance of

the model represent not just a single value for the utilization of
each resource, rather they represent a distribution of possible values
for the utilization of the resources. The requirement to compare

two distributions is accounted for in the following definition of a
consistency set. Let ll be an instance Of the job model for a load

L over resource set 81 which Specifies the stochastic process x(t,z),

Then c g'T is a consistency set for In with error bound e if, for

 

every t' e c, the distribution of x(t',z) over all outcomes, 2,
approximates the histogram Specified by J(t') [R1 for all .145; L

with a mean squared error no greater than e. In other words, the

 

65

 

real job

(t)
r -- - 391(t)

consistency set

c1 e [t0,t2,t3,t5}

 

 

 

 

 

 

 

, I l
rm / ; x2e.)
I I
z u : consistency set
I | u ..
, a 2 c2 = it0,t1,t2,t3,t5}
[In I I I .
, ' , L I ,
r(t) /‘ I --— x (t)
a , 3
,3 a consistency set
,~ I/ s l
: : : c3=€t..t.e.t..t.i
/ ' ' |
” ' ' ' l 1
to t1 t2 t3 tn t5

job elapsed time

Figure 8. Illustration of three different approximations to a
real job, x1(t), x2(t) and x3(t), and the corre-

sponding consistency sets.

66
modeled behavior must approximate the real behavior to some specified
degree at the points defining the consistency set. For example, if
an instance Of the model models all resource utilizations at the end

of each job step with an error Of 5% or less, then e=.05 and
c = {t l a job step ends at time t].

If Ill is an instance of the model over '31 with consistency set

 

c1 and error bound el, and Similarly for Iﬂz, F12, c2 and e2, then
M1 18 a time refinement of M2, written M2 S'I‘Ml’ if and only if
c2 <2 c1, e1 g e2 and R1 = R2. M1 is a refinement of M2, written

 

1 5 e2 and c2 ; c1. Furthermore,

the relations SR and ST define partial orderings on the set of all

M2 SRMl’ if and only if R2 gal, e

instances Of the job model for a particular load.

To illustrate the definition of refinement, consider two instances
of the model, “"1 and Iﬂz. Let Ill model processor time and memory
utilization at the end of each job with no more than 5% error. Let
Iﬂz model processor time, memory utilization and data channel utili-
zation at the end of each job step with no more than 4% error. Then
““1 sRIﬂz Since I02 models more resources with a smaller error and a
larger consistency set.

When actually constructing an instance of the job model, the
degree of detail required will be determined by the use which is to
be made of the model. Once the degree Of detail is determined, a
suitable resource set, consistency set and error bound can be chosen.
If the results obtained with this instance of the model are not suit-
able, then there are three possible ways to refine the model, gig;

enlarge the resource set, enlarge the consistency set or lower the

67
error bound. In the remainder of this chapter examples of three levels
Of detail are presented. Each such example is a refinement Of the
preceding example.

Approximation at the Job Step Level

 

 

A reasonable degree of detail, based on the model framework estab-
lished in Chapter 2, is to require consistency between the model Of
the load and the real load at the end of each job step. To Obtain
this degree of detail, each JSSM must include probability distribution
functions specifying the total utilization of each resource. For
resources such as a central processor, for which the utilization is
measured in seconds, a condition function is required which relates
this utilization to real elapsed time. For example, if R1 is a central

processor with utilization, r1(t), measured in seconds, then the rate

of utilization is specified as

t C - *
9‘ - i‘
r1 t t > r1

where t is the real elapsed time, t is the real elapsed time at the

0
start of the job step and r* is the total utilization of R Specified

1 l
by the apprOpriate distribution function. It should be noted that
this condition function makes no allowances for the influence of a
scheduling algorithm on the rate at which the resource is utilized
since such a Scheduling algorithm is part of the computer system and
not part of the load.
A variety of different forms of condition functions can be used

to achieve the desired utilization for those resources not measured in

units of time and which therefore need not be synchronized to elapsed

 

68

time. For example, let R be a disk drive for which the utilization,

2

r2(t), is measured in record units transferred with r3, the total

utilization, Specified by an apprOpriate distribution function. Then

any one of the following forms Of the condition function would be

satisfactory:
O t . t
r (t) = O
2 r” t > t
2 ‘ 0’
0 r1(t) / r“
r2(t) =
r3 r1(t) = r*
or
O t - t0 < O
r2(t) =
7‘.- ° 7': 7":
r2 r1(t) / r1 0 s r1(t) 3 r1.

Of course, if additional information is available concerning the pattern
of utilization of such a resource, one form of the condition function
might be preferred Over the others although all the above forms of

the function satisfy the required consistency set equally well. Regard-
less Of the form Of the condition functions used, the data required

from the load being modeled is a record Specifying for each job step
executed the utilization of each resource.

An instance of the job model at this level Of detail may be
implemented in any Of the ways described in Chapter 1, Sig; as a
synthetic benchmark program, part of a simulation.model or part Of
an analytical model. Such an implementation could be used to study

the effect Of changes to the scheduling algorithm or the effect of

 

69
changing the hardware configuration on the total throughput of the
system. Similarly, such an implementation of an instance of the model
could be used to study the effect of scheduling disk accesses on the
average elapsed time a job requires to be processed. However, there
is insufficient detail to allow any meaningful conclusions to be reached
on problems such as the effect Of input buffer size on the time spent
waiting for input. Such problems require the availability of informa-
tion concerning the relative rates Of resource utilization within
each job step. While this information is Specified by a JSSM using
any of the above Specified forms Of the condition functions, the
consistency set requires only that the resource utilizations be consis-
tent with the real load at the end of each job step.

Modeling the Occurrence gnyait Conditions

 

 

One design criterion embodied in any multiprogramming computer
system is that a job which cannot use a resource for some reason
should relinquish ownership of that resource, allowing other jobs
access. In particular, this criterion_applies when a job step is
waiting for some external condition to Occur before the job step may

continue processing. Examples of such wait conditions are the need for

 

a tape to be mounted, a requirement for more memory for the job step

or the availability of a data channel for input or output. Two proper-
ties Of these wait conditions are of importance in modeling a load.
First, the occurrence of a wait condition depends only on the structure
of the one job step in which it occurs. Second, the length of time to
satisfy a wait condition generally depends on the number and type of
jobs running concurrently and on the scheduling algorithms used by

the computer system.

 

70

The occurrence of each wait condition represents a necessary inter-
action between the job step and the computer causing the job step to
relinquish ownership of all resources possible. When the wait condition
has been removed, the job step is eligible to continue execution once
it has regained Ownership Of all necessary resources. The Operating
system includes scheduling algorithms to decide which of the job steps
eligible for execution Should be given the needed resources and allowed
to continue. For example, if a job step must wait for a tape to be
mounted, it will release the central processor for some other job.
When the tape is mounted, the job step must wait for the scheduling
algorithm.to restore ownership of the central processor before it can
resume execution.

The effect of these scheduling algorithms can most readily be
observed in the elapsed time required for completion Of a job step.
For an interactive computer system this elapsed time is called the

response time and is an important measure of the performance of any

 

such system. In order for an instance Of the load model to be useful
in evaluating alternative scheduling algorithms, the instance of the
model must provide sufficient detail to model the occurrence Of wait
conditions. Thus, one possible refinement of the previous example is
to expand the consistency set to include the occurrence Of every wait
condition and the end of each job step. To achieve this degree of
detail, a record Of the load being modeled is required which Specifies
for each wait condition the reason for the condition and the utilization
of each resource.

The condition functions required to achieve consistency with a

real load at every wait condition are more complex than those required

 

71

to achieve consistency only at the end of each job step. The form and
parameters of such condition functions depend on the particular wait
condition and can best be described by several examples. Consider a
job step generating output. SO that the high computation rate can be
matched effectively to a lower output rate, an output buffer is used,
i;e;_every write operation in the program Simply moves the data to be
written to an area in memory designated as a buffer. Only when there
is no more room in this buffer is the data actually transferred to
the output device. Once the buffer is filled, the job step must wait
for the data in the buffer to be transferred before it can resume
computation.

The two resources to be interrelated by the condition function

are the processor, R and the output unit, R3. The utilization of the

1,
processor, r1(t), is measured in seconds and the utilization Of the

output unit, r3(t), is measured by the number of times the buffer has

been output. Then the condition function has the form

[0 r1(t)<T1
1 1’1 S r1(t) < T1 + T2
r(t)=<2 Tl+Tzsr1(t)<Tl+T2+rr-3
3
7' = 5!“
1 ri r1(t) r1

 

where T1, T2, T3, ... are values Of the random variable t distributed
according to a distribution function F(t), the distribution of process-
ing time required to generate one full buffer. The condition function

Specifies that after T1 seconds of processing the buffer will be full

 

72
and the job step will have to wait. The condition function does not
Specify the actual time needed to transfer the buffer because this
depends on the computer system. However, once the first buffer has

been output, seconds of processing are required to fill the buffer

T2
a second time, and so on.

An instance Of the model with this consistency set is not suit-
able to study such problems as the effect of decreasing buffer sizes.
For example, assume that in the measured load the wait condition occur-
red once every ten seconds of processing. If the buffer size is halved
in the model the wait condition will occur twice as many times.
However, without halving the buffer size in the real load, there is no
way to know how much processing is required to fill the shorter buffer.
It might take five seconds of processing or it might take one second
half the time and nine seconds the other half. Thus, for certain
problems a more extensive consistency set will be required.

A second example of the condition functions required to Obtain
this degree of detail comes from observing various memory organizations.
Three different organizations are currently prevalent: fixed length
memory partitions, variable length partitions and paged memory. On
a system which uses fixed length partitions a job step must reside in
a partition large enough tO satisfy its entire memory requirement. No
change Of memory length is possible and so no condition function is
required.

If a system allows variable length partitions, then it is possible
for the memory requirements of a job step to change as execution pro-
ceeds. If the length of a job step increases, a wait condition will

occur if there is not sufficient unallocated memory available, i.e.

 

73
the job step requiring the increased memory length will have to wait
until another job finishes or relinquishes the memory it is using. If
R4 is the resource, memory, for which the utilization, r4(t), is

measured in number of words, then one possible condition function

specifying memory utilization is

 

ml r1(t) (- T1
r4(t) = ( m2 T1 3 r1(t) < 72
A
[m3 T2 5 r1(t) < r1.

In this equation m1, m2, m3, T1 and T2 are parameters of the condition
function for which values must be determined from measurements made of
the real load. The condition function Specifies that m words Of

1

memory are required for the first 71 seconds of processing, then m2
words of memory are required for the next TZ'Tl seconds and so on. If

m <Im2 then there is a possibility of a wait condition occurring after

1

T1 seconds Of processing.
If a system uses a paged memory allocation scheme, the job step
must wait each time a reference is made to a page not in high Speed
memory, called a page exception. Which pages are in high Speed memory
depends not only on the structure of the job step but also on the
amount of memory available, the number and type of concurrent jobs
and the page replacement algorithm. Therefore the number of page
exceptions cannot be considered part Of the load. However, all re-
ferences to a word of memory not in the present page are made through
a page table, a resource of the computer system. This table Specifies

for each page whether it is currently in high Speed memory or in

auxiliary memory. Thus, the number of references to the page table.

 

74
depends only on the structure of the job step and so can be included
as part of the load model. If the page table is resource R5 with
utilization, r5(t) measured by the number Of references to the table,

then the utilization can be specified by a condition function of the

form

0 r1(t) 4’ 1'1
1 1'1 S r1(t)< T1 + T2
- 2 71+Tzsrl(t)<'r1+1'2+1'3
r (t) —
5
r; r1(t) = r1

where T1’ 12, T3, ... are values Of the random variable t distributed
according to a distribution function F(t), the distribution of pro-
cessing time between page table references. For each such reference,
the computer system model must determine the wait time. If the page
referenced is in high Speed memory, there is no wait time, otherwise
the job step must wait for the page to be brought into high Speed
memory.

One final example of a condition function used to model the occur-
rence of a wait condition arises from the process of interactive input.
When a job step requires input from an interactive user, the wait time
is due to that user and is not influenced either by other jobs being
run concurrently or by the computer system. Therefore, this wait
time is part of the load and, whereas the previous examples of condition
functions modeled only the occurrence of a wait condition, the condition

function for interactive input must model both the occurrence and the

 

75
duration Of the wait condition. However, the time that elapses from
the satisfaction Of the wait condition, 1:3; the user has entered the
required input, until the time that execution of the job step resumes
is still dependent on the computer system and not part of the load.
In the case of interactive input the resource being utilized is

a communication port, R and the utilization, r6(t), is measured in

6’

seconds Of connect time. This utilization may be specified by a

condition function of the form

 

r1(t) r1(t) < 1'1
0'1 + r1(t) 1'1 5 r1(t) < T1 + 72
lol+02+rl(t) 71+Tzsrl(t)<rr1+1'2+1'3
r6(t) = ( .
7': _-_—. 7
[r6 r1(t) ri'

In this condition function, 01’ 02, ... are values Of the random var-
iable S, user response time, distributed according to a distribution
function, F(s), and T1, 72, 73, ... are values Of the random variable
t, processor seconds between interactive input, distributed according
to a distribution function, G(t). This condition function specifies
that the connect time is the same as the processor time until the first
wait for interactive input, which took the user 01 seconds to satisfy,
23.2».-

Detailed Resource Utilization

 

As is discussed in the example of the wait condition caused by a
full buffer, an instance Of the model with a consistency set defined

by the occurrence Of all wait conditions and the end of each job step

76
is not useful in studying certain system parameters such as buffer size.
In particular, no meaningful inferences can be made about a system
parameter, A, using an instance of the load model with this consistency
set if two conditions hold.
1. There is a resource, R, on which the occurrence of a
wait condition depends and every unit utilization of R
does not cause a wait condition.
2. Parameter A controls the incremental utilization of R
which causes the wait condition.
Applying these two conditions to the example of the buffered output,
the resource, R, is the output device for which the utilization is
measured in record units. The parameter, A, is the number of record
units which the buffer can hold.

Based on these Observations, one possible refinement of the pre-
vious example is to expand the consistency set so that the modeled
resource utilization is consistent with the measured resource utili-
zation for every unit utilization Of each resource. That is, at
every time from the beginning to the end of the job step, all the
modeled resource utilizations will be consistent with those utiliza-
tions which are measured. Computing the parameters for a load model
with this consistency set requires a record of the load being modeled
which contains an entry each time one or more units of any resource is
used.

The form Of the condition functions used in such an instance of
the load model would be Similar to those condition functions used in
the previous examples, the difference being in the units for measuring

the utilization. Returning to the example Of the job step which

 

77
generates data to be output, the resource being modeled is still the
output unit, R3, but the unit of measure for utilization, r3(t), is
now the number of record units generated. Otherwise, the form of the
condition function is the same.

An instance of the job model providing this degree of detail can
be used to evaluate the effect of changing various parameters on the
overall system performance. At the same time, such an instance of
the model can be used to evaluate certain changes to the hardware
architecture of the computer system.

A Very High Level of Detail

 

If the utilization of the central processor is measured in number
of instructions performed rather than processor time, then this level
of detail can be applied to the central processor in a direct way,
i;§;_the consistency set is eXpanded to include the time at the end
of each individual instruction execution. Ordinarily, this level of
detail would be applied only to those resources for which the utili-
zation is incremented less frequently, such as input or output devices
or certain utility programs. The accuracy obtained from a JSSM by
forcing compliance with the string of instructions executed is not
necessary in the study of most operating system parameters. However,
an instance of the job model which incorporates the level of detail
available from a record of the instructions actually executed by the
real load [Winder 1973] can be used in studying the effect of certain
alternative organizations for the central processor.

For example, one feature found in the central processor of several
large computer systems is a very high Speed memory buffer in which the

current instruction and several following instructions are kept. As

78

long as the next instruction to be executed is in this buffer, no
memory reference is required to fetch the instruction thus speeding
the instruction execution. In order to keep the logic for loading
this buffer simple, the assumption is made that each instruction will
be followed in execution by the instruction in the next consecutive
location in memory, i;g;_no branch instructions exist. When such a
branch instruction is executed the central processor must wait for
the buffer to be refilled with instructions starting at the address
branched to.

The resource to be modeled in this example is the instruction

buffer, R with its utilization, r7(t), measured by the number of times

7,
it must be refilled due to a branch. The utilization of this resource
can be related to the utilization of the central processor, r1(t),

measured in the number of instructions executed. One possible form

of the condition function is

r
O r1(t) / n1
1 n1 3 r1(t) < n1 + n2
2 n + n s r (t) < n + n + n
l 2 3
r7(t) = < l 2 l
s = *
Lr7 r1(t) r1

 

where n1, n2, n3, ... are values of the random variable n, the number
of instructions executed between branches, distributed according to a
distribution function F(n). A JSSM including such a condition function
can be used to study the effect of varying the size of the instruction

buffer on the instruction execution rate of the processor.

79

A second example of an application of an instance of the job model
based on individual instruction executions is the study of central
processors with multiple functional units. In such a computer system,
the central processor is divided into independent functional units,
ghgh_an adder, a multiplier or a memory fetch unit. In designing such
a processor the power of each functional unit must be decided, 3:5;
should there be separate units for each arithmetic Operation or one
arithmetic unit and a separate unit for memory references. Once the
types of functional units are decided upon, the number of each type
of unit must be set. Each type of functional unit is a resource for
which the utilization is measured in the number of times it is used.
An instance of the job model with condition functions Specifying the
relative utilization of each functional unit can be used in studying
various possible configurations.

One final example of a use of the job model which requires a
sufficient level of detail to obtain consistency after each instruction
execution is in the design of a computer system using a paged memory
organization. The system parameter to be studied is the size of a
page and its effect on the number of page exceptions. As will be
recalled from the previous example of paging, the resource is the page
table with utilization measured in number of references to the table.
Whether or not a reference to the page table causes a page exception
depends on the computer system and the number and type of jobs running
concurrently. However, the page size does control the occurrence of
references to the page table and therefore, indirectly, the occurrence
of page exceptions. Thus, if page size is to be studied using the load

model, sufficient detail is necessary in the model to trace all memory

 

80

references.
Summary

In this chapter a hierarchy of instances of the load model is
presented. This hierarchy depends on the goodness of fit between the
modeled resource utilizations and the measured utilizations. Examples
of several degrees of detail are presented. One important prOperty
of the load model, mentioned in Chapter 2, is that it is not necessary
for all JSSM'S to embody the same degree of detail. From the definition
of a refinement of a model, it is clear that by simply refining one
JSSM, a refinement of the instance of the model is obtained. This
allows an instance of the load model to be efficient while providing

any degree of detail desired in certain JSSM'S.

Chapter 5

An Instance of the Load Model

Introduction

 

In this chapter the steps gone through in constructing an instance
Of the load model are described. This particular instance models the
load on the Michigan State University CDC 6500. The assumptions made
in constructing this instance of the model are discussed and an inter-
pretation of the parameters of the model is presented together with
the results of using an implementation of the model.

_Tl_}_e_ £039 52 kg Modeled

The Michigan State University CDC 6500 provides batch processing
and time-sharing computing services for research and class use. These
services are provided under the control of the SCOPE Operating system
and are invoked by the user through the SCOPE Operating system control
language (OSCL). A record of system activity is available from the
system log or DAYFILE, including a record of the commands issued by
the users and the resource utilization for each command.

This instance Of the model is designed to behave in a manner that
is consistent with the modeled load at the termination of each job step.
The data required to achieve this level of detail is a record of the
utilization of each resource at the end of each job step. The OSCL
is structured such that, in general, each command is a job step. Each
job begins with an explicit authorization job step. This job step
identifies the user to the system and determines any limits of resource

81

 

 

82

utilization to be placed on the job. Default limits are maintained
in the authorization file and may be overridden by parameters which
the user enters with the authorization job step. Only time-sharing
jobs are terminated by an explicit command, however all jobs are ter-
minated by a job step which performs the final accounting and inserts
a message in the DAYFILE. Therefore, the exact extent of each job
can be readily identified.

For each command the DAYFILE contains a record of the utilization
Of five resources. These are central processor time, peripheral
processor time, number Of system requests issued, number of record
units transferred and the number Of words of memory used. The central
processor time is a measure of the amount Of computing done to perform
the user's command. The peripheral processors are used in the per-
formance of overhead tasks done for the user by the Operating system,
so peripheral processor time is a measure of the amount Of computing
done for the user by the Operating system. The third resource, the
number of system requests issued by the program processing the command,
as defined in Chapter 1, is another measure of the amount of processing
the Operating system does for the user. The fourth resource, the
number of record units transferred, measures the amount of input and
output done while honoring the command. Since all input and output
operations are done for the user by the Operating system, this para-
meter is another measure of the amount of processing done by the
Operating system. Finally, the number of words of memory used is the
amount of high speed memory required by the program which honors the
command.

The resources chosen for modeling are the central processor,

 

83
system requests, record units and memory. These resources are measured
in seconds of processor time, number of requests issued, number of
record units transferred and number of words respectively. Peripheral
processor time is omitted for two reasons. First, peripheral processor
time would be a somewhat redundant measurement since the peripheral
processors are used only for input and output Operations or to satisfy
system requests. Second, whereas the number of system requests issued
or the number Of record units transferred by a job step depends only
on the job step and associated data, the amount of peripheral processor
time required depends on the job step using the time and on the
other jobs running concurrently. This is due to the fact that, while
a peripheral processor requires a fixed amount of time to transfer one
record unit, it might have to wait for a data channel which is being
used by another job.

One compromise in the structure of the model is required by the
form Of the OSCL. As was mentioned above, in general each command is
one job step. The major exception to this is the command to run a
user written program. For example, let LGO be a file containing the
compiled version of a user's program. To run this program two jOb
steps are required, one to load the program into the computer's mem-
ory and the other to initiate execution. These may be expressed as
"LOAD,LGO.EXECUTE." or as "LGO.". Since all information in the DAYFILE
appears at the command level, for the second form of the command there
is no way to determine the portion Of each resource used in loading
the program and the portion used in actually executing the program,
making it necessary to treat the loading and execution of the program

as a single job step. As long as this instance of the model is

 

84
implemented on a computer for which the behavior of the loader, as
a function of the length of the program loaded, is similar to the
behavior of the loader on the CDC 6500, this lumping of two job steps
into a single JSSM will cause no difficulty. If, on the other hand,
the model is implemented on a computer for which the behavior Of the
loader is significantly different, then the ”LGO." form of the command
will have to be split into two job steps and the total resources
utilized for the command will have to be divided between these two
job steps.

Identification of the Types of Behavior

 

V. A. Rao's implementation of the clustering algorithm described
in [Rao 1971] was used to determine the different types of behavior
exhibited by the job steps with reSpect to the four resources modeled.
As will be recalled from Chapter 3, this algorithm is based on three
assumptions. Since only a finite number of job steps will be run on
the CDC 6500, the aggregate Of all these job steps forms the distribu-
tion necessary to satisfy the first assumption. Although this dis-
tribution is not continuous due to the units for measuring system
requests, record units and memory, a continuous distribution can be
fitted in a standard manner. As a result, a cluster point may have
non-integral values as the coordinates for these resources. However,
since the cluster points, used only in conjunction with the distance
metric for representing the decision surfaces, have no other inter—
pretation placed on them, these fractional coordinates are legitimate.
The third assumption, that the sample covariance matrix have maximal
rank, was found to be true for all the data sets considered.

As is discussed in Chapter 3, two parameters control the clustering

85
algorithm, the minimum number of points acceptable in a cluster and a
threshold parameter. Several different runs were required to deter-
mine the parameter values which produced the most consistent results
for the data from the DAYFILE. Initially the threshold was set at 0.0,
allowing any difference between neighboring bins in the histogram to
define a local maxima. The influence of changing this parameter value
is described subsequently. The number Of job steps was 800 and the
minimum number allowed per cluster was varied from 10 to 30. When as
few as 10 job steps were allowed per cluster, 25 clusters were ob-
tained. When the minimum number acceptable was increased to 15, the
number Of clusters drOpped to 14. A further increase of the minimum
number of job steps acceptable to 30 produced 13 clusters.

This process was repeated on a second sample containing 1500 job
steps. Since approximately twice as many data points were available,
it was eXpected that there would be twice as many Spurious job steps
requiring that the minimum number per cluster be doubled to obtain
the same number of clusters. As was expected, with a minimum of 15
job steps per cluster, 27 clusters were Obtained. When the minimum
number of job steps acceptable per cluster was increased to 30 and
then to 60, 16 clusters were obtained.

A value prOportional to 30 for 800 job steps was found to be
a cut-off point. To avoid unstable behavior right at the cut-off
point, the minimum number of job steps acceptable per cluster was
set to be prOportional to 15 for 800 job steps. Table 9, which sum-
marizes the results of these runs, shows a value of 15 produced results
similar to those Obtained for a value of 30.

With a minimum of 15 job steps per cluster and 800 job steps, the

 

 

86

 

Table 9. Summary Of clustering eXperiments showing the number
of clusters obtained for several values Of the minimum
number of job steps acceptable per cluster at a threshold
parameter value Of 0.0.

minimum number Of clusters
number 800 1500
acceptable (12 noon - 2:00 p.m.) (1:00 a.m. - 5:00 a.m.)
10 25 NA
15 14 27
30 13 16
60 NA 16

 

 

87

threshold parameter was next varied from 0.0 to 1.0 to determine its
effect on the clusters located. A parameter value of 0.0 produced the
14 clusters of cluster set I. Increasing the parameter to 0.05 again
produced 14 clusters, cluster set II, but the cluster points had
shifted slightly. When the parameter was increased to 0.06, all job
steps were groupd into a single cluster. Thus 0.05 is a cut-Off point.

As is pointed out in Chapter 3, increasing the value of the thresh-
old parameter requires any maxima to stand out more strongly in order
to be located as a cluster point. Therefore, the cluster point coor-
dinates in cluster set II must be more strongly identified than those
in cluster set I. However, if the clustering algorithm is applicable
to this data, cluster set II should present an improvement to cluster
set I, not totally different clusters. To test for this the distance
from each cluster point in set I to every cluster point in set II
was computed. Using these distances each cluster in set I was iden-
tified with one cluster in set II, 3:2; cluster A in set I was con-
sidered the same as cluster B in set II if the cluster point repre-
senting cluster A was closest to the cluster point representing cluster
B. Based on this identification of the clusters, it was Observed
that 97% of the 800 job steps were identified with the same cluster
in both cluster sets I and II.

One stipulation made in Chapter 2 is that the model be stationary,
iaﬁa the parameters of the model are not functions Of time. In a
real sense, the coordinates Of the cluster points are parameters of
the model and so must be tested for stationarity. Cluster set II
was determined from job steps run from noon to 2:00 p.m. on a class

day, a peak period. Another set of 16 clusters, cluster set III, was

 

88
Obtained from 1500 job steps run from 1:00 to 5:00 a.m.. As was done
when comparing the cluster sets obtained for different parameter
values, the distance from each cluster point in set III to every
cluster point in set II was computed, allowing each cluster in set II
to be identified with one cluster in set III. Each Of the 1500 job
steps from which cluster set III was determined were then assigned to
a cluster on the basis of the cluster points in cluster set II. As
a result of this, 88% of all job steps remained in the same cluster.
This encourages the application of the assumption of stationarity to
this data.

One final consideration in determining the clusters was whether
to normalize the data before applying the clustering technique. While
memory utilization ranged from 200 to 45,000, the number of record
units transferred and the number of system requests rarely exceeded
10,000 and the central processor time never exceeded 400. It was
felt that since memory entered into each of the rotated coordinates
to a significant degree, the magnitude of the memory utilization was
overwhelming the other three resources. To see the effect, the same
800 job steps were clustered twice, once normalized and once unnor-
malized. The same number of clusters was found both times with more
than 99% of all job steps remaining in the same cluster whether or
not the data was normalized. The reason normalization has no signi-
ficant effect on the decision surface locations is that, as should be
expected, the clustering algorithm is sensitive to the distribution of
the points over the range of values, not to the range itself. This
distribution is not changed by normalizing the data and so the heavy

influence of memory is a real property of the load, not artificially

 

 

89
induced by the differing scales.

The cluster point coordinates used for the remainder of this chap-
ter were obtained from 800 job steps with normalization allowing no
fewer than 15 job steps per cluster and with a threshold parameter Of
0.05. The coordinates appear in Table 10 and are discussed in the next
section.

Interpretation of the Types g£_Behavior

 

Recalling that the cluster points themselves have no interpretation
other than their use to define the decision surfaces, the interpretation
of each cluster depends on the aggregate of all job steps assigned to
the cluster. For example, cluster one is characterized by the histo-
grams in Figures 9 to 12 which show the distributions Of utilization
of the four resources made by the job steps in cluster one. It is
these distributions which are used in the JSSM'S to characterize the
behavior.

With the cluster point coordinates known, each job step can be
identified as belonging to one cluster. These identified clusters
can be examined for any strong identification between a cluster and a
particular command or type of command. For example, 60% of all job
steps in cluster four were FORTRAN compilations. When this is related
to the resource utilization distributions for cluster four, it is

seen that memory utilization was constant at about 45000 words, the

8
length of the FORTRAN compiler.. Furthermore, central processor time
and the number Of record units transferred were distributed approx-
imately according to a negative exponential distribution, i.e. the

longer the compilation the less likely it was to occur. In a similar

manner, cluster seven was predominantly file action commands, such as

 

90

Table 10. Cluster point coordinates from 800 job steps run from

12 noon to 2:00 p.m. with a threshold parameter of 0.05.

 

cluster central system record memory
processor requests units in words
seconds transferred (octal)
l -.094 2.93 -.79 1305
2 .048 3.00 0.0 25000
3 .388 23.31 126.92 40000
4 .192 21.73 17.60 44763
5 .095 12.00 .01 10002
6 .037 21.00 0.0 12200
7 0.0 1.00 0.0 300
8 .012 7.00 316.99 24000
9 .072 0.0 0.0 25000
10 .205 35.87 142.23 45007
11 .108 0.0 -.01 45000
12 .012 8.00 158.99 24000
13 .250 7.83 97.16 27772
14 .066 14.75 320.09 23726

 

 

91

100 %95

80 e

job 60 ”
steps 40 _
20 -

 

 

1. J I 1 J, L 1 1 1 I 1
.3 .6 .9 1.5 2.1 2.7 3.3 9
1.2 1.8 2.4 3.0 3.6 4.2

central processor seconds

 

We.
0
.p
O

\n

Figure 9. Distribution of central processor seconds for job

steps of cluster one.

 

 

 

 

V46
50 ~
40 -
job
steps30 I 24
20 e
10 6 12 8
, a 2 2 2

 

 

 

 

 

 

 

 

 

l I I I l I
1 *2 3 a 3* 6 71‘8' 9'10 11 12 13 1a 15 ‘"
system requests

Figure 10. Distribution Of the number of system requests issued

by job steps of cluster one.

 

 

 

 

 

92

 

 

 

 

 

 

 

100 ~98
80 —

30b _

steps 60
40 14
20 e
,4 2 2
1 | a 1 1 n 4 a F'—r'—j a I W
2 4 6 8 10 12 14 16 18 20 22 24 26 28
record units
Figure 11. Distribution of the number of record units trans-
ferred by job steps of cluster one.

100 - 4
80 —

job _

steps 60
no .-
20 —
2 L l [2"] l l PAL] L I‘JL‘l __L V
00 700 900 1100 1300 1500 1700 1900
words of memory
Figure 12. Distribution Of the number Of words of memory

required by job steps of cluster one.

 

 

93
rewind or return. Clusters eight and twelve corresponded to the inter-
active text editor and cluster thirteen was 80% BASIC compilations.
Thus, as would be expected, particular commands exhibited homogeneous
behavior and so multiple instances Of the command were cast into the
same cluster. The number of times each command occurred in each cluster
is given in the Appendix. This data is summarized in Table 11 by
identifying each cluster with the command which occurs most frequently
if the command accounts for 40% or more of the job steps assigned to
the cluster.

Another way Of looking at the classified job steps is tO consider
all job steps representing a single activity, such as compiling, and
Observe how they were Split among the various clusters. For example,
executions Of two different compilers appeared in the total set Of job
steps in a significant number. Cluster four contained 122 compilations,
cluster ten contained 154 and cluster thirteen contained 101. The
remaining eleven clusters contained a total Of 31 compilations and
were ignored. Assuming that different compilers behave differently,
the Split into three clusters should be eXplainable in terms of the
different compilers. Cluster thirteen contained all BASIC compilations.
The FORTRAN compilations, however, were split between clusters four
and ten, thus showing two types of behavior for the FORTRAN compiler.
Referring to the classified job steps there was a simple explanation
for this Split. FORTRAN compilations done under the time-sharing
system used no Optimization but extensive tracing. These compilations
appeared in cluster four. The default for FORTRAN compilations done
under the batch system was a limited amount of optimization and no

tracing. These instances appeared in cluster ten. Thus, the split

 

94

Table 11. Identification of each cluster by command or command

type occurring most frequently in the cluster.

 

cluster identification

1 no strong identification
2 no strong identification
3 user program execution

4 FORTRAN (time-sharing)

5 job authorization

6 job authorization

7 file action commands

8 interactive editor

9 job authorization
10 FORTRAN (batch processing)
11 job authorization
12 interactive editor
13 BASIC
14 no strong identification

 

 

95
did represent a real difference in the behavior of the compiler.

When all instances of commands which cause the execution of a
user written program were Observed to see how they split among the
clusters, no homogeneous behavior was observed. Table 12 shows the
frequency with which these commands were assigned to each cluster. As
would be expected, while clusters two, three and ten represent a
substantial number of such job steps, the job steps were generally
more Spread out among the clusters than were those job steps corres-
ponding to system commands.

Estimation and Interpretation pf the Transition Function

 

To estimate the transition function for the sequencer, the set
of states must be known. In order to keep the structure of the model
as simple as possible, each cluster, and therefore each JSSM, was

assigned one state. In addition, an initial state, and a final

SO,

state, were added, giving a total of sixteen states. Of the

915’

fourteen states corresponding to the clusters, 3 corresponds to cluster

1
one, 52 to cluster two, 332;, In the matrix of transition probabilities,
appearing in Table 13, the probability of a transition from state i
to state j is found at the intersection of row i with column j. Since
each cluster occurred an adequate number of times, the maximum like-
lihood estimates Of the probabilities are used.

The interpretation of the transition probabilities is the standard
one, E;E;.the probability of the pair (Si’sj) is the proportion of the
number of times 51 appears that it is followed by sj. TO interpret
the transition probabilities in a less abstract manner it is instruc-

tive to examine the probabilities in view Of the identification of the

clusters made in the previous section. For example, state 513,

96

Table 12. Distribution of user written program executions

among the different clusters.

 

 

cluster frequency
1 4
2 80
3 79
4 34
5 ll
6 44
7 1
8 7
9 3
10 113
ll 0
12 4
13 6

14 18

 

97

 

Table 13. Matrix of transition probabilities for the sequencer
determined from 800 Job steps run from 12 noon to
2:00 p.m. on a class day.

So 81 S2 S3 34 35 36 S7 S8 39 S10 S11 S12 813 314 S15
so .00 .00 .02 .02 .03 .34 .00 .00 .00 .13 .16 .27 .00 .01 .02 .00
SI .00 .53 .12 .01 .00 .03 .00 .06 .02 .01 .12 .00 .00 .00 .02 .08
$2 .00 .00 .30 .15 .00 .00 .00 .05 .05 .05 .00 .00 .00 .00 .00 .40
$3 .00 .01 .01 .28 .02 .03 .04 .11 .26 .00 .12 .00 .01 .00 .04 .07
S4 .00 .01 .16 .01 .24 .16 .01 .02 .00 .00 .08 .04 .00 .10 .03 .14
s5 .00 .02 .00 .06 .02 .21 .00 .06 .12 .00 .11 .00 .25 .00 .03 .12
S6 .00 .OO .00 .00 .00 .01 .Ol .04 .02 .00 .00 .00 .00 .00 .03 .89
s7 .00 .01 .01 .07 .05 .22 .02 .30 .02 .00 .21 .00 .00 .00 .02 .07
38 .00 .00 .00 .01 .10 .01 .30 .02 .33 .00 .01 .01 .00 .17 .03 .01
$9 .00 .17 .68 .01 .00 .01 .00 .02 .02 .01 .00 .00 .00 .00 .00 .08
310 .00 .02 .02 .02 .06 .12 .09 .09 .02 .02 .30 .02 .01 .02 .02 .17
311 .00 .01 .00 .05 .20 .07 .10 .02 .01 .00 .35 .18 .00 .00 .00 .01
$12 .00 .Ol .00 .03 .00 .08 .04 .02 .68 .00 .00 .00 .01 .00 .10 .03
$13 .00 .00 .00 .10 .00 .00 .04 .00 .73 .00 .01 .00 .00 .02 .06 .04
811+ .00 e02 001 000 002 008 009 011 027 000 002 001 000 005 015 017

.00 .00 .00 .00 .00 .00 .00 .00 .00 .00 .00 .00 .00 .00 .00 1.0

315

 

98
corresponding to the BASIC compiler, was followed 73% of the time by
38, the interactive editor. This close interaction between the compiler
and the editor is explained by two facts. First, the majority of the
use of BASIC was through the time-sharing system. Second, the compiler
is structured such that it will initiate execution of the compiled
program.with no intervening command as the default case. Thus, a
student could compile and execute his program using the command BASIC
and then use the editor to correct any errors.

As a second example, consider the high probability of a transition
from state 39, job authorization, to state 32. While cluster two was
not as strongly identified with any one command as were other clusters,
about 30% of the job steps in the cluster were ATTACH commands. This
command gives a job access to a permanent file, a mechanism used by
programmers to avoid repeated card reading for frequently used program
and data files. Thus, a typical job might start with an authorization
step followed by an ATTACH to retrieve some saved file. Since a user
might have to retrieve several data files the probability of two or
more successive occurrences Of cluster two should be reasonably high.
As can be seen from the (82,82) entry, the probability is .30, in-
dicating that frequently users dO several ATTACH commands in a row.

The transition probabilities must be interpreted with some care
since the identifications of the clusters with commands is only
approximate. For example, the large probability of going from $2 to
315 could be taken to indicate that users frequently ATTACH several
files and then terminate the job without using the files. This was

not the case and the large probability can be explained in at least

two ways. One possibility is that the last ATTACH contained an error

 

 

9 9
which forced an abnormal termination of the job. The other, more
plausible explanation stems from the fact that cluster two also con-
tained a large number, about 40%, of job steps which corresponded to
the execution of a user written program. Thus several transitions
through 3

followed by s can be interpreted as a series of ATTACH

2 15
commands followed by an execution Of one of the retrieved programs
followed by termination.

One final example illustrates the effect of assuming a stationary

Markov process. The probability of remaining in 3 file actions, was

7,
reasonably high indicating a series of such actions. This is reason-
able Since frequently several files would be rewound or returned at a
time. State 87 was then frequently followed by state 55 for which

35% of the commands were tape library updates. This is a sensible
series of events since the files were possibly being rewound in prep-
aration for being added to the tape library. With a stationary Markov
process, the chance of rewinding another file is the same whether 0

or 100 files have previously been rewound. Since each job has a max-
imum number of files it may use, this is not realistic. Therefore,

it would be preferable if the (37,37) probability decreased and the

(37,35) probability increased as the job proceeded.

The final form of the sequencer can be expressed as:
(S 3 SO ) 815 3 T 9 J ) A)
where

s = {31 | 0 s i s 15},
T is the transition function assigning probabilities as

given in Table 13,

 

 

100
J={Mi|oSis15}and
A is the JSSM assignment function defined as

A(Si) = Mi'

Specification 2f the JSSM'S

 

 

Due to the nature of the available data and the level of detail
required, each JSSM has one of two possible forms. The total set of

resources is taken to be

R = [central processor seconds,

system requests,

 

record units transferred,

words of memory required}.

Two JSSM'S, MO and M15, require no resources Since they Simply mark
the beginning and end of the job. The remaining fourteen JSSM'S are
similar in that they each require all four resources, all use the same
type Of distribution functions although with different parameters and
all have condition functions which express a direct linear relation
between the utilization of the resources. All that is required of the
condition functions for the desired consistency set is that processor
time be consistent with the elapsed time and that the utilization of
the other resources be consistent with the modeled load at the end of
each job step.

The only difference between the JSSM'S, then, is the actual para-
meters Of the distribution functions. Some experimentation with the
form of the distributions and the method of parameter estimation was

required before the model would produce reasonable results. This

eXperimentation was integral to the implementation of the model and is

 

101
discussed in the next section.

Implementation g£_this Instance 9f the Model

 

 

It was decided to implement the model in such a way as to produce
benchmark jobs which would exhibit behavior similar to that of the
originally measured jobs. This would be done by writing a simple
program for the computer to be tested. This program would read four
parameters, the amount of processor time to use, the number of system
requests to issue, the number of record units to transfer and the amount
of memory to occupy while using these other resources. This program
would be performed once for each job step required using parameters
determined by the distribution functions for the particular JSSM.

The model was then implemented to generate the data used by the
benchmark program. In this implementation, there was one procedure
for each JSSM which included the prOper distribution functions. Upon
being invoked, each such procedure generated four random numbers
according to these distributions. There was also a main procedure
which invoked the JSSM procedures in an order determined by the tran-
sition probabilities of the sequencer.

To test this implementation, the transition probabilities and
distribution functions were estimated from the 800 job steps for which
the clusters were determined. Based on these parameters 575 jobs,
comprising some 4000 job steps, were generated. These generated jobs
were compared with 575 real jobs with reSpect to the average utilization
of each resource per job. In addition, the average number of job steps
per job was compared. Based on the form Of the various distributions,
central processor time and the number of record units transferred were

the most difficult to fit and so were used as the basis for comparison.

 

 

102

For the 575 real jobs there were an average Of 6.91 job steps,
9.13 processor seconds and 1393.38 record units transferred per job.
For the first trial, a joint Gaussian distribution was used for the
JSSM'S with the mean vectors and covariance matrices as the parameters.
This produced a set Of jobs with an average of 6.70 job steps, 22.66
processor seconds and 2367.38 record units transferred per job. This
large error was due to the fact that the majority of the cluster
distributions were skewed so that the mode and mean were separated
while the Gaussian distribution fitted for the JSSM'S superimposes the
mode and mean.

Since there were only four resources and fourteen JSSM'S each
distribution was next represented by its empirically determined histo-
gram. The number of bins was initially set to 25, producing a set of
jobs with an average of 6.70 job steps, 16.32 processor seconds and
1443.81 record units transferred. These results were closer to those
obtained for the real jobs, but it was obvious that the central proces-
sor time utilization histogram was too coarse, iLngthe bin size was
too large. The difficulty was that the distribution determined which
bin or range of values a random number should be generated from. Within
this bin it was assumed that the distribution was uniform. If the bin
size was too large, this assumption was invalid. As an example, the
bin size for cluster two processor time was 4 seconds. Of all job
steps from cluster two, 98% were in the first bin, 4 seconds or less.
Therefore, the average processor time generated for these job steps
would be 2 seconds. However, if the histogram had used a bin size of
0.5 seconds, it would have been seen that 96% of all job steps used

0.5 seconds or less. Using this histogram, the average processor time

 

 

103

would have been about .25 seconds.

In the final model, the histograms were constructed with a bin
size which produced a uniform distribution within each bin. For central
processor seconds 100 bins were used, 50 bins were used for system
requests, and 25 bins were used fOr record units transferred and memory
required. The final set of jobs used an average of 6.70 job steps,
9.46 processor seconds and 1443.81 record units per job. These results
are summarized in Table 14.

Comments pp_the Model Implementation

 

It is important to consider whether or not the implementation of
the model can be validly transferred to a computer other than the
CDC 6500. Even if this transfer is not possible, the model implemen-
tation provides a useful tool for evaluating different configurations
Of the CDC 6500. Two considerations which must enter into answering
this question are, first, what is the relation between resource util-
ization units on the CDC 6500 and those on another computer, and second,
does the unusual architecture of the CDC 6500 influence the load in
such a way as to make the transfer impossible.

The scaling of resources which is necessary depends on which
resource and computer are being considered. For example, system re-
quests would not have to be scaled unless the computer which the model
implementation was being transferred to, the host computer, allowed
direct input and output, ipg;_not through a system request. If the
host computer did allow such direct access to input and output devices,
then those system requests which correspond to input and output,
possibly 90% or more, would have to be disregarded. Central processor

time would have to be scaled to account for any differences in speed

 

 

 

104

Table 14. Summary of the results Obtained from the imple-
mentation of the CDC 6500 load model.

 

 

approximation job steps epu utilization record units
performed

Measured load 6.91 9.13 1393.38

Gaussian 6.70 22.66 . 3367.38

25 bins 6.70 16.32 1443.81

Variable bins 6.70 9.46 1443.81

 

105
between the CDC 6500 and the host computer. Since this scaling need
only account for the hardware instruction set, an instruction mix can
be used to determine the apprOpriate scaling factor. Memory require-
ments might also require scaling depending on the nature of the instruc—
tions available. Again, instruction mixes Offer a technique for deter-
mining the needed scale factor. Finally, the record unit on the
CDC 6500 is 64 words or 3840 bits. This unit would have to be Scaled
apprOpriately depending on the size of the record unit on the host

computer.

 

The second consideration, the effect of the CDC 6500 architecture

on the nature Of the load, depends upon the realization that the model

“lb—-

cannot predict the effect of a new computer architecture on the pro-
gramming style of the users. The model represents only the results of
transferring the users' present programs, embodying the users' current
style, to the host computer. Once the users' style has been influenced
by the new computer, the characteristics of the load have changed and

a new instance of the load model would be required.

Improvement pf the Model Instance

 

 

There are several changes which can be made in the structure of
this instance Of the model which could lead to a more useful model.
The first of these improvements concerns the state set of the sequencer.
If each cluster is divided among different states for each different
command which occurs in the cluster, the sequences of job steps produced
by the sequencer could be more meaningfully interpreted. Each of these
several states would have the same behavior and would thus invoke the
same JSSM. For example, cluster five consisted predominantly of

authorization job steps, with about 30% library update job steps and

   

106
10% other job steps. With a single state representing all three possible
occurrences of cluster five, it could appear that cluster seven, file
actions, was followed 22% of the time by job authorization. With clus-
ter five split into three states it would be Obvious that cluster
seven was followed by an occurrence of cluster five arising from a
library update.

The second improvement is similar to the first in that it also
requires an eXpansion of the sequencer state set. Since a job can be
expected to behave differently after an error has occurred in a job
step, it might be useful to add one or more JSSM'S for error behavior.
For example, this error job step would occur after an error in compiling
a program in one job step since any subsequent job step which attempts
the execution of the erroneous program will immediately terminate the
job.

The third improvement is to provide two separate instances of the
model, one for those jobs run in the time-sharing mode, the other for
those jobs run in the batch processing mode. This would allow exper-
imentation with the relative proportions of the two types of jobs. In
addition, this separation would simplify the implementation since the
method of submitting the batch processing and time-sharing jobs differs.

The final improvement is to construct a refinement of this instance
of the job model. This refinement can be accomplished either by ex-
panding the resource set, expanding the consistency set, or improving
the approximation to the real jobs provided by the histograms for
resource utilizations. A.special augmented DAYFILE is available which
contains an entry for every occurrence of a wait condition. This would

provide the necessary data to allow an expansion of the consistency set

 

 

107

of the instance of the model. Further data gathering techniques would
have to be develOped to achieve any other type Of refinement.
Summary

In this chapter, an instance of the job model is presented together
with examples of the computation steps needed to estimate the para-
meters. These parameters are related in an intuitive manner to the
structure Of the computer system and load. An implementation Of this
instance of the model which generates parameters for a synthetic
benchmark program.is discussed. Despite the simplicity of the model
instance the resource requirements predicted by the model implementa-
tion are comparable tO the requirements observed for an equivalent

number of real jobs.

 

 

Chapter 6

Summary and Further Research

Summary pf the Dissertation

 

In Chapter 1 the current literature on the modeling of computer
systems is reviewed with the distinction being made between that part
Of each model which reproduces the behavior Of the computer and that
part of each model which reproduces the behavior of the programs
running on the computer. This ability to distinguish between the
computer and the load on the computer is the key point on which the
remainder of the dissertation is based.

The distinction between the computer and the load on the computer
is stated more precisely in Chapter 2. The load is considered to
comprise a number of independent jobs and therefore the model Of the
load is made up of a suitable number of models of individual jobs. A
model of such a job is a stochastic process, 122;.a random function
of time, eXpressing the utilization of the various computer provided
resources. A framework for a statistical model of a job is presented
based on the structure Of typical operating system control languages.
The Similarity of the model structure to the structure Of a real job
simplifies the computation of parameters for the model and enhances the
usefulness of the model in studying specific portions of the load.
Chapter 3 then describes the computational techniques needed to compute

the parameters for a model constructed according to the framework.

108

 

109

For a particular load many different instances Of the model may be
constructed within the framework described in Chapter 2. Each instance
may emphasize a different aspect of the load or may simply represent
the load with a different level of detail. In Chapter 4 a relation
between instances of the load model is introduced. This relation
induces a partial ordering on the set of all instances of the model
of a particular load.

Chapter 5 then demonstrates the application of the techniques of
Chapter 3 to construct an instance of the model for a real load. This
instance Of the model is implemented to generate parameters for a
synthetic benchmark program. The results obtained with this implemen-
tation compare favorably with the real load.

Conclusions

 

This dissertation presents a unified approach to the problem of
modeling the load on a computer system. The model, based on a restrict-
ed form of a stochastic process, provides a large degree of flexibility
for eXpressing the interdependencies of the various resources used by
a job. At the same time, the form of the model emphasizes the cor-
respondence between the model and a real job, thus simplifying the
computation of parameters.

Modeling is normally an iterative process. That is, a model is
constructed and used in some manner. The results Obtained from this
use suggest certain improvements to the model and so the process may
be repeated. The refinement relation and correSponding hierarchy of
instances of the model provide guidance in determining how to improve
an instance of the model. The examples Of various instances of the

model presenting different degrees of detail suggest a wide

 

 

110
applicability of the model framework.

Further Research

 

The investigation Of the effect of relaxing some of the assump-
tions made in formulating the load model is one area for further
research. For instance, one important assumption is that the load
consists Of a number of independent jobs. The effect of this assump-
tion is that the only influence one job has on another is in the sched-
uling of access to the resources, i;§;_one job cannot affect the outcome
of another job, only the length Of time required to reach the outcome.
Examples Of situations where this assumption does not hold are nu-
merous in real loads but perhaps the most interesting example is that
of multitasking. In such a system one job can create another job and
these two jobs can be run concurrently. The two jobs are not indepen-
dent, however, since one job can determine the status or wait for the
completion of the other. Modifying the job model to simulate such
behavior would allow more accurate load models for multitasking computer
systems.

The model framework introduced in Chapter 2 allows the Specifi-
cation of a stochastic process as a model of a job. One effect of
maintaining the correspondence between the model structure and the
job structure is that the stochastic process which may be used is
restricted. For example, only first order probabilities are used in
determining the sequence Of job steps which makes up a job. A poten-
tially fruitful topic for further research is to investigate the class
of stochastic processes which can be eXpressed within the framework
for the load model. One result of such an investigation could be a

revised structure for the model which does not restrict the stochastic

 

 

 

 

lll
stochastic process and at the same time maintains the structural
correspondence to a real job.

Related to this topic is the fact that the composition of the
load is not stationary, iii; the characteristics of the jobs varies
over the course of a day. A general solution to this problem is to
use a non-stationary stochastic process as a model. An alternative
approach which is computationally simpler is to use the clustering
algorithm described in Chapter 3 to determine how many different types
of jobs are present in the load. The dimensions used in determining
the type of a job would include which job steps were utilized in the
job and some measure of the sequence of the job steps. A separate
instance of the model would be constructed for each type Of job and a
global sequencer would be used to determine the order and interval
at which different types of jobs are submitted.

The model framework can also be applied to the analytical
modeling of a total computer system. For instance, one desired result
from an analytical model might be the distribution of the turn around
time for a job. Consider a simple system for which the turn around
time is the sum of the processor time used, the time spent waiting for
a disk head to be positioned and the time spent transferring data.

The processor is not shared by the disk drive is. In this case the
time which elapses between subsequent disk references is a factor in
determining the time required to position the head. That is, two
successive references from the same job probably reference consecutive
areas of the disk. If sufficient time elapses between the two refer-
ences for another job to reference the disk, the disk heads will have

to be repositioned. The distribution of turn around time for a job

 

 

 

 

112
then depends on three things: the distribution of processor time between
disk references specified by a condition function, the distribution
of elapsed time between all disk references and the time required to
transfer the data. The method for combining these items to obtain a
single distribution for turn around time is not straightforward.
Furthermore, in the model of a more complex system, the turn around
time would depend on competition for several resources and the ensuing
distribution of wait times.

[Ziegler 1972] describes a theory Of modeling which equates the
state of a model with the value of various state variables and a tran-
sition with a change of one or more of these variables. The state
variables of the job model are the resource utilizations and the changes
of these variables are determined by the condition functions. Further
investigations of how the job model fits into this theory of modeling
might lead to a better understanding of the formal properties of the
job model.

One further tOpic Of research is the study of the refinement
relation defined in Chapter 4 and the partial ordering it induces on
the set of all instances of the model for a particular load. If a
suitable operator on models is defined a semigroup might be obtained
and the refinement relation might be a congruence relation. Such
added structure would be useful in determining the effect of using
different levels Of detail to model different types of jobs or job

steps.

 

 

 

APPENDIX

 

APPENDIX

The following table which shows the number of occurrences of each

command in each cluster is the basis of Table 11.

command cluster number
1 2 3 4 5 6 7 8 9 10 11 12 13 14

 

 

AFENTRY 0 0 0 0 l 0 0 0 0 1 0 0 0 30
AFLIMITS 0 0 0 0 0 0 0 0 0 0 0 0 0 1
ALTERAF 0 0 0 0 0 14 0 0 0 0 0 0 0 2
APLIB 0 0 0 0 111 O 0 0 0 0 O 0 0 0
ASSETS 0 0 0 0 2 0 0 0 0 0 0 0 0 0
ATTACH 38 61 3 3 10 0 0 0 5 6 15 0 l 0
AUDIT 0 0 3 0 0 0 0 0 0 l 0 0 0 0
AUDITTY 0 0 1 0 0 0 0 0 0 0 0 0 0 0
AUTORFL 0 0 4 0 l 0 3 O 0 10 9 0 0 0
BASIC 0 O 0 0 0 0 0 0 0 0 0 0 101 0
BKSP 1 0 0 0 0 0 l 0 0 0 0 0 O 0
CATALOG ll 0 0 0 5 0 5 0 0 6 0 0 0 0
COBOL 0 0 0 0 0 0 0 0 O 5 0 0 0 0
COMPASS 0 0 5 1 0 0 0 0 0 22 l 0 0 0
COPY 0 O 0 0 6 0 0 0 0 0 0 0 0 O
COPYBF 0 0 O 0 8 0 0 0 0 0 0 0 0 0
COPYBR 0 0 0 0 20 0 0 0 0 0 0 0 0 0
COPYCF 0 0 0 0 5 0 0 0 0 0 0 0 0 0

113

   

114

command cluster number
1 2 3 4 5 6 7 8 9 10 11 12 l3 l4

 

 

COPYCR 0 0 0 0 l6 0 0 O 0 O 0 0 0 0
COPYL 0 0 0 O 7 0 0 0 0 O 0 0 0 0
COPYN 0 O 0 0 12 0 0 0 0 0 0 0 0 0
COPYSBF O O 0 0 7 0 0 0 0 0 0 0 0 0
DAYFILE 0 0 0 0 0 4 0 0 0 0 0 0 0 0
DEBUG 2 0 l 0 0 0 3 0 l 4 l 0 0 0
DISPOSE 22 0 0 0 2 0 0 0 2 5 5 0 1 0
DMP 0 0 0 0 0 0 8 0 0 5 l 0 0 O
Ilﬂﬂi O 0 0 O 0 0 0 0 0 3 0 0 O 0
ERRS 0 34 0 0 0 0 0 0 0 0 0 0 26 0
EXEC O 0 0 0 0 0 0 O 0 l 0 0 0 0
EXIT O 0 0 0 0 0 l6 0 0 0 0 0 0 0
FILES 0 0 0 0 5 O 0 0 0 0 0 O 0 0
FTN 1 0 3 121 l 1 0 0 0 118 6 0 0 0
GETPL 0 0 0 0 7 0 O 0 0 0 0 0 O 0
HAL 0 0 8 0 0 0 0 0 0 0 0 0 0 0
HELP 0 0 0 O 9 0 0 0 0 0 0 O 0 0
LGO 4 80 79 34 ll 44 l 7 3 113 0 4 6 l8
LISTAPE 0 0 0 0 1 0 0 0 0 0 0 0 0 O
LOAD 0 0 0 40 0 0 0 0 0 34 3 0 0 1
LOCK 1 0 0 0 0 0 O 0 0 0 0 O 0 0
LOGIN 0 8 12 19 197 0 0 0 75 94 154 O 5 11
LOGOUT O 0 0 0 0 173 0 0 0 0 0 0 0 0
MAP 0 0 l 0 ll 0 0 0 0 8 12 0 O 0

MESSAGE 0 0 0 0 l 0 0 0 0 0 0 0 0 0

 

 

115

command cluster number
1 2 3 4 5 6 7 8 9 10 ll 12 13 14

 

 

 

MODAF 0 0 0 0 0 0 0 0 0 O 0 0 0 10
MODE 0 0 0 0 0 0 0 0 0 l 1 0 0 0
NEWPASS 0 0 0 0 0 21 0 0 0 0 0 0 0 9
NEWPN 0 1 0 0 0 0 0 0 0 0 0 0 3. 0
NEWPL 0 0 0 0 0 0 0 0 0 2 0 0 0 0
NOGO 7 0 0 3 0 0 0 0 0 9 0 0 0 0
PCFILES 0 0 0 0 3 0 0 0 0 0 0 0 0 0
PEANUTS O 0 O 0 l 2 0 0 0 0 0 0 0 0
PLANIT 0 0 9 O 0 0 0 O 0 0 0 0 0 0
PURGE 2 0 0 0 4 0 0 0 0 l 0 0 0 0
REDUCE 0 O l 0 2 0 l 0 0 0 0 0 0 0
REQUEST 0 O 0 4 12 0 9 0 0 9 l 0 0 0
RETURN 4 0 0 0 O 0 49 0 0 0 0 0 0 0
REWIND 7 O 0 O 0 0 114 0 O 2 0 0 0 0
RFL 5 0 2 0 4 0 5 O 1 l 2 0 1 0
RTL l O 0 0 0 0 0 0 0 0 l 0 0 0
RUN O 0 7 0 0 6 0 0 0 4 0 0 0 0
SCAN 0 0 0 0 0 5 0 0 0 0 0 0 0 0
SELECT 0 0 0 0 0 0 0 0 0 0 0 0 0 3
SEND O 0 0 0 4 0 0 0 0 0 0 0 0 0
SETCORE 0 0 0 l 0 0 0 O O 9 0 0 0 0
SETUP 0 0 2 0 0 O 0 537 0 0 0 126 0 33
SITUATE 0 0 0 0 10 0 0 O 0 0 0 0 0 0
SKIPF O 0 0 0 0 0 9 0 0 0 O 0 0 0
SORTMRG 0 0 0 O 0 0 0 0 0 l 0 0 0 0

UPDATE 0 0 39 0 0 0 0 0 0 0 0 0 l 0

 

 

 

BIBLIOGRAPHY

 

 

BIBLIOGRAPHY

Adams, R. T., Caldwell, W. H. G. Jr., and Carlton, D. E., 1969,
"Evaluation of Time-Sharing Systems -- Benchmarks," GITIS-69-22,
Georgia Institute of Technology, Atlanta, Georgia.

Afifi, A. A., and Azen, P. A-, 1972, Statistical Analysis: A_Computer
Approach, Academic Press, New York, New York.

 

ANSI, 1971, Report_ to SPA.RC from Ad Hoc Committee on Operating System
Control Languages, American National Standards Institute, New York,
New York.

 

 

 

 

 

Apple, C. T., 1965, "The Program Monitor -- A device for Program
Performance Measurement," Proceedings pf the ACM 20th National
Conference, 66-75.

 

 

Ball, G. H., and Hall, D. J., 1965, "ISODATA, A Novel Method of Data
Analysis and Pattern Classification," Technical Report, Stanford
Research Institute, Menlo Park, California.

Bharucha-Reid, A. T., 1960, Elements pf the Theory pf Markov Processes
and Their Applications, McGraw-Hill Book Company, Inc., New York
New York.

 

 

 

Bemer, R. W., and Ellison, A- L., 1968, "Software Instrumentation

Systems for Optimum Performance," Proceedings pf the IFIP Congress,

 

 

Buckley, F. J., 1969, "Estimating the Timing of Workload on ADP
Systems -- An Evaluation of Methods Used," Computers and Auto-

 

Bucholz, W., 1969, "A Synthetic Job for Measuring System Performance,"
IBM Systems Journal, 4:309-318.

 

Calingaert, P., 1967, "System Performance Evaluation: Survey and
Appraisal," Communications pf the ACM, 1:12-18.

 

Chavy, P. P., and MacDougall, M. H., 1966, Simulation Of an ECS- Based
Op_ratingp§ystem, Control Data Corporation, ‘Minneapolis,
Minnesota.

 

 

Cheng, P. S., 1968, "Trace-Driven System Modeling,” IBM Systems
Journal, 4:280-289.

 

116 1

   

117

Cochran, W. G., 1963, Samplinngechniques, John Wiley and Sons, Inc.,
New York, New York.

 

Comtre, 1970, Analysis of Major Operating Systems, ESD-TR-70-377, The
Comtre Corporation, Coral Gables, Florida.

Dubes, R. C.. 1970, Data Reduction with Grouping and Weibull Models,
Interim Scientific Report No. 7, Division of Engineering Research,
Michigan State University, East Lansing, Michigan.

 

 

Edwards, A. W. F., and Cavalli-Sforza, L. T., 1965, "A Method for
Cluster Analysis," Biometrics, 21:362-375.

 

Feeley, J., 1971, A Markov Analysis of the Naval Undersea Research and
Develppment Center 1108 Exec 8 System Using Data Measured py_ the
Computer Performance‘Monitori:g_Device, Naval Undersea Research
and Development, Pasadena, California.

 

 

 

 

Feller, W., 1968, An Introduction to Probabili_y Theory and Its
Application, John Wiley and Sons, Inc., New York, New York.

 

 

 

Ferrari, D., 1972, "Workload Characterization and Selection in Computer
Performance Measurement," COMPUTER, 4:18-24.

Fine, G. H., and McIsaac, P. V., 1966, "Simulation of a Time-Sharing
System," Management Science, 6:8180-Bl94.

 

Gonter, R. H., 1966, A Basic 6600 Throughput Simulation Model, Control
Data Corporation, Minneapolis, Minnesota.

 

Good, I. J., 1965, The Estimation of Probabilities: An.E ssay_ on Modern
Bayesian.Methods, M. I. T. Research Monograph No. 30, M. I. T.
Press, Cambridge, Massachussetts.

 

 

 

Greenbaum, H. J., 1968, ASimulator of Multiple Interactive Users to
Drive a Time-SharedA _Computer System, Technical Report TR- 58,
M. I. T. Project MAC, Cambridge, 'Massachussetts.

 

 

Hillegass, J. R., 1966, "Standardized Benchmark Problems Measure
Computer Performance," Computers and Automation, 1:16-19.

 

Hunt, E., Diehr, G., and Garnatz, D., 1971, "Who are the Users?
An Analysis of Computer Use in a University Computer Center,"
Proceedings pf the AFIPS SJCC, 231-238.

 

 

Johnson, W. E., 1932, "Probability: Deductive and Inductive Problems,”
Mind, 41:421-423.

Joslin, E. 0., 1965, "Application Benchmarks: The Key to Meaning-
ful Evaluations," Proceedipgs pf the 20th ACM National Conference,
27-370

 

 

 

 

 

118

Joslin, E. 0., 1968, Computer Selection, Addison-Wesley Publishing CO.
Reading, Massachussetts.

 

Joslin, E. 0., and Aiken, J. J., 1966, "The Validity of Basing

Computer Selection on Benchmark Results," Computers and Automation,
1:22-23.

 

Keeping, E. 8., 1962, Introduction to Statistical Inference, D. Van
Nostrand Company, Inc., Princeton, New Jersey.

 

 

Lucas, H. C. Jr., 1971, "Performance Evaluation and Monitoring,”
Computing Surveys, 3:79-91.

 

MacDougall, M. H., 1970, "Computer System Simulation: An Introduction,"
Computing Surveys, 3:191-209.

 

Miller, E. F. Jr., 1972, "Bibliography on Techniques of Computer
Performance Analysis," COMPUTER, 5:39-47.

‘Miller, E. F. Jr., 1971, An __periment in Hardware Monitori_g, RM- 1517,
General Research Corporation, Santa Barbara, California.

 

 

Mittwede, W. C., 1970, Cm mputer Operating Capabilities: A
Source Selection and Analysis.Aid, ESD-TR- 71- 74, The Comtre
Corporation, Coral Gables, Florida.

 

 

Nielsen, N. R., 1966, The Analysis_ of General Purpose Time Sharing
Systems, Document NO. 40-10-1, Computation Center, Stanford
University, Palo Alto, California.

 

Oden, P. H., and Shedlin, G. S., 1972, "A Model of Memory Contention
in a Paging Machine," Communications pg the ACM, 8:761-771.

 

Papoulis, A., 1965, ProbabilityJ Random Variables and Stochastic
Processes, McGraw-Hill Book Company, Inc., New York, New York.

 

Parzen, E., 1962, Stochastic Processes, Holden-Day, Inc., San Francisco,
California.

 

Rao, V. A., 1971,A‘Mu1timodal Distribution Based C1ustering_A1gorithm,
Ph. D. Dissertation, Department of Computer Science, Michigan
State University, East Lansing, Michigan.

 

Rosin, R. F., 1965, "Determining A Computer Center Environment,"
Communications 2f the ACM, 7:463-468.

 

Ruspini, E. H., 1969, "A New Approach to Clustering," Information and
Control, 15:22-32.

 

SDC, 1968, A Guide to Computer System‘Measurement, System Development
Corporation, Santa'Monica, California.

 

 

 

119

Sukhatme, P. V., 1954, Sampling Theory_g£_$urveys with Applications,
Iowa State College Press, Ames, Iowa.

 

 

Tata, M., 1972, Personal communication.

Thompson, K. L., 1969, Time Sharing Services Comparison, Computer
Laboratory, Michigan State University, East Lansing, Michigan.

 

Walter, E. S., and Wallace, V. L., 1967, "Further Analysis Of a
Computing Center Environment," Communications of the ACM, 5:266-272.

 

ward, M. R., 1969, Time Sharing Studies, GMR-868, General Motors
Corporation Research Laboratories, Warren, Michigan.

 

Weibull, W., 1951, "A Statistical Distribution Function of Wide
Applicability," Journal gqupplied Mechanics, 18:293,297.

Whittaker, E. T., and Watson, G. N., 1935, A_Course of Modern Analysis,
The University Press, Cambridge, England.

Winder, R. 0., 1973, "A Data Base for Computer Performance Evaluation,"
COMPUTER, 3:25-29.

Wood, D. C., and Forman, E. H., 1971, "Throughput Measurement Using
a Synthetic Job Stream," Proceedings of the AFIPS Conference,
51-56.

Ziegler, B. P., 1972, "Toward a Formal Theory of Modeling and
Simulation: Structure Preserving Morphisms," Journal of the ACM,
19:742-764.