ECONOMIES OF SCALE IN COMPUTER CENTERS

Thesis for the Degree of Ph.D.
Michigan State University
Gerald L. Musgrave
1973

This is to certify that the thesis entitled "Economies of Scale in Computer Centers" presented by Gerald L. Musgrave has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.

ABSTRACT

ECONOMIES OF SCALE IN COMPUTER CENTERS

By Gerald L. Musgrave

The objective of this study is to determine whether economies of scale exist in the production of computer output. A model of the production process was constructed on the assumption that computer centers attempt to minimize the cost of producing an exogenously determined expected level of output. Stochastic disturbances are assumed to be present in the production process because of the unpredictable nature of computer hardware and software failure. The factor demand equations are also assumed to be non-deterministic, because of randomness in factor availability and in the time required to adjust actual input levels to desired levels.

[...]

For Knight's cost function ln C = a_0 + a_1 ln Q_p + a_2 (ln Q_p)^2, the behavior of average cost as capacity output Q_p increases falls into four cases:

i. If a_1, a_2 > 0 and (a_1 + a_2 ln Q_p - 1) < 0, then d(C/Q_p)/dQ_p < 0: average cost decreases as capacity output increases.

ii. If a_1, a_2 > 0 and (a_1 + a_2 ln Q_p - 1) > 0, then d(C/Q_p)/dQ_p > 0: average cost increases as capacity output increases.

iii. If a_1 < 1 and a_2 = 0, then (a_1 + a_2 ln Q_p - 1) < 0 and we have case i above.

iv. If a_1 < 1 and a_2 > 0, we have the U-shaped average cost function.

Since technical change was the concern of the paper, Knight hypothesized that any change in technology which occurred would cause a shift in the cost function. His single equation became

(a)  ln C = a_0 + a_1 ln Q_p + a_2 (ln Q_p)^2 + Σ_{j=51}^{62} B_j w_j

where the w_j terms are binary variables (1 if the machine was first produced in year j, 0 otherwise), j = 51, ..., 62. Knight ran a second equation omitting the (ln Q_p)^2 term,

(b)  ln C = a_0 + a_1 ln Q_p + Σ_{j=51}^{62} B_j w_j .

He found the R^2's of equations (a) and (b) to be "close" and thus "very little additional explanatory power was gained by allowing for a U-shaped average cost curve."[17] In addition, the original regression results were not presented, but the second regression was presented after some of the observations had been excluded. The sample points were deleted if "actual cost exceeded that predicted by more than one-half the standard deviation (error) of predicted cost."[18] This process of eliminating observations which do not fit is more prevalent in engineering than in other fields. It seems the objective was to eliminate "overpriced" systems, which means ones from a different population. Once this process of elimination was completed, it was assumed the equation would hold exactly except for possible measurement errors. Thus the influence of explanatory variables other than Q_p was supposedly held constant, as in a laboratory experiment. The reader interested in the econometric approach to such issues is referred to Kmenta [1971, Chapter 10 and the section on omission of relevant explanatory variables]. Knight estimated a_1 at 0.519 for scientific computation and at 0.459 for commercial data processing. These results suggest economies of scale exist for computers produced between 1950 and 1962.

[17] K. Knight, "A Study of Technological Innovation--The Evaluation of Digital Computers" (unpublished doctoral dissertation, Carnegie Institute of Technology, 1963).

[18] Ibid.

The only paper which dealt directly with the issue of economies of scale is that by Solomon [1966], who considered International Business Machines System/360 models 30, 40, 50, 65 and 75.
The author assumed that no techni- cal differences existed between these machines and believed that any differences in production rates could be explained by differences in the size of computers. This paper was concerned with what was described as "cost vs. performance" and since the paper was engineering in nature, no behavioral 19 model was constructed. The regression equation was 19 . M. Solomon, "Econom1es of Scale and the I.B.M. System/360," Communications of the ACM, Vol. 5 (June, 1966), pp. 435-40. 1n(Ci) = a + b 1n (Ti), 1 = l, ..., 5 where C is monthly rental and T is the time to compute a set of computer instructions. Note that T is the inverse of output, the quantity of instructions processed in a unit of time, e.g., (T = é), and we have 1n(Ci) = a - b ln(Qi), i = l, ..., 5. Output was defined in four ways: matrix multiplication, floating point square root, field scan, and a scientific instruction mix suggested by Arbuckle [1966]. Using ordin- ary least squares, he found 2 Instruction Type 5? -0.494 Matrix Multiplication ~ .989 -0.478 Floating Point Square Root .999 -0.632 Arbuckle's Scientific Mix .977 -0.682 Field Scan .969 He concludes, from the sample of five, that economies of scale exist for the 360 series. 4, Summary The available evidence suggests that the parameters of the production function can be estimated from the cost function. Two studies have been reviewed which use this reduced form cost function technique on regulated indus- tries with some success. 20 Evidence concerning economies of scale of computer centers is less convincing. Since the work thus far is in the area of engineering curve fitting,the construction of an economic production process may be interesting and is the subject of the next chapter. CHAPTER III THEORETICAL MODEL 1; Introduction In this chapter, the problem of specifying a model of the cost minimizing computer center operating with a .Cobb-Douglas production function is analyzed. It is assumed that the productive process is stochastic and that the mana- gers of the computer center include this fact in their cost minimizing actions. What follows is an analysis and justi- fication of the assumption of the nondeterministic nature of the productive process of computer output. In Sections 3 and 4, output and factor inputs are discussed. Section 5 deals with the unit prices of the factors of production and Section 6 is a presentation of the behavioral constraints which influence the computer center management. In Sec- tion 7, the specification of the cost minimization model is presented. The final section is a summary of the work in this chapter. 3; Stochastic Nature of the 1 rOEuctiOn Proces§_ The specification of computer production developed in this chapter is based on the assumption that the produc- tion function is of the Cobb-Douglas type. This assumption 21 22 seems at least partially justified on the basis of the Nerlove [1965] and Merewitz [1971] studies on regulated industries as indicated in the previous chapter. Thus, for each computer center j we have where Qj is the output of computer center j and Xij repre- sents the input of factor 1 to the process of the jth center. U . is a normally distributed random variable, 03 with assumed expectation E(Uoj) = 0 and constant variance 02, indicating unpredictable variations in computer output.' The disturbance is the result of unpredictable hardware, software and Operator error. It is also assumed that E(Uo Uok) = 0 for all j = k. 
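A small numerical sketch may help fix ideas. The code below simulates the Cobb-Douglas relation just described, Q_j = A Π_i X_ij^{a_i} e^{U_0j}, for a few hypothetical centers; the scale constant, elasticities, input bundles, and disturbance variance are all illustrative assumptions rather than values taken from the study.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative (assumed) parameters: scale constant, output elasticities,
# and the standard deviation of the production disturbance U_0j.
A = 1.0
a = np.array([0.4, 0.3, 0.5])   # elasticities of the three inputs
sigma = 0.10                    # sd of U_0j, with E(U_0j) = 0

def cobb_douglas_output(X, rng):
    """Stochastic Cobb-Douglas output: Q_j = A * prod_i X_ij**a_i * exp(U_0j)."""
    U0 = rng.normal(0.0, sigma)  # unpredictable hardware/software/operator error
    return A * np.prod(X ** a) * np.exp(U0)

# Hypothetical input bundles (X_1j, X_2j, X_3j) for three computer centers.
centers = np.array([[10.0, 5.0, 8.0],
                    [20.0, 9.0, 15.0],
                    [40.0, 20.0, 30.0]])

for j, X in enumerate(centers, start=1):
    print(f"center {j}: Q = {cobb_douglas_output(X, rng):.2f}")
```

Because the disturbance enters multiplicatively, taking logarithms yields a relation that is linear in the logs of the inputs, which is the form exploited in the remainder of the chapter.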
If the assumption of homogeneous non-human capital 3 is drOpped and we accept the idea that technical change is embodied in successive vintages of computers which become more efficient over time, the inclusion of Solow-neutral technical change is possible. While the first delivery date of the type of computer is known, the actual installa- tion date is not and numerous field modifications occur which make the actual vintage of the machine uncertain. Since all the machines in the sample are second generation computers, they are assumed to be of the same vintage. 23 3;_Output from the Production Process It is assumed that the computer produces a single homogeneous product. This product will be measured in physical units which will be called "units of machine compu- tation." These physical units can be thought of as the amount of "computation" a representative computer would complete in a Specified unit of time. Alternatively, it is possible to conceive of this unit of output as a flow of standardized tasks performed in a specified unit of time. This study evaluates the "through-put" of the com- puter. The through-put is the amount of work the computer performs during a given period of time. Various defini- tions have been prOposed to measure computer through-put. These methods include actual job comparisons, benchmarks, program kernels,and instruction mixes. Actual job comparisons comprise the most funda- mental method of defining the output of a computer. This method involves using particular tasks on which the computer center Operates. These tasks, in the form of actual pro- grams and the data which are Operated upon, are processed on the computers which are under investigation. Each pro- gram is run on the computer and the amount of time required to complete the task is determined. This method is often used in business data proces- sing applications such as payroll. Since tasks of this form 24 are often coded or programmed in a standard language-- COBOL--it is possible to compare directly the number of actual payroll runs which could be completed in a unit of time. It is also possible to do some scientific tasks such as matrix inversion in this same fashion, if the program were written in the FORTRAN or ALGOL programming language. However, even these standard high level languages have inconsistencies. (For example, some instructions are more efficient on one machine than another, and in some cases certain instructions are available on particular computers only.) These differences stem from variations in design characteristics and thus the standard language programs operate with various degrees of efficiency, or in some cases do not Operate at all on various computers. If one were to use this method of actual job com- parison, one would be faced with three basic problems.’ The most important problem would be the cost of actually run- ning the programs on the several computer systems. A second problem is to model the influence of the particular set of instructions which comprise the program in the output measure; for example, some machines allow the use of buffer- ing, which is the ability of the computer to read input information and store it while operating upon data previ- ously stored in the computer. If a test program were written using this instruction, the machine which does not contain the feature in its instruction set could not in 25 general execute any of the program's instructions. The out- put would be zero. To attempt Optimal programming for all machines would be costly. 
The third difficulty is to deter- mine the generality or applicability of a particular job to the whole range of jobs processed on different computers. Since the specified job may not be representative of the whole class of jobs performed, the measure of output may be suspect. In any event, further analysis of the generality of the particular program would be necessary. For these reasons, the actual job comparison technique was not used in the study. A benchmark is a carefully defined problem which is coded and then timed on various machines. This benchmark program becomes the numéraire for computer output. The num- ber of these benchmarks which a computer can process in a unit of time is the output of the computer system. Appendix A contains basic examples of five separate compo- nents of a benchmark. These could be combined or weighted by the relative frequency of occurrence of each component. These frequencies could be determined via professional judgment or statistical analysis of past jobs on a given computer system. This benchmark method involves the speci- fication, in minute detail, of the actual task to be performed, the data to be processed, programming of the task, and its execution on the various computers. This method is therefore costly to employ on an ad-hoc basis. One firm, 26 Auerbach Information, Inc., publishes data on a set of benchmarks which the company has specially prepared.1 Unfortunately, the Auerbach firm does not have benchmarks for the majority of computers in the sample. Program kernels are programs or parts of programs such as: comparison of detail transactions with master file and sequence-checking of both files, table look-ups, and block data transfers. As a practical matter, in defining either benchmarks or program kernels, one can be as precise about the methods used to complete the task one chooses, but as more constraints are placed on such methods, the specialized features of various computers have less impor- tance. Thus, in both kernels and benchmarks, the design of the output measure includes the subjective valuation of the analyst as to the importance of special machine capabili- ties and specialization. In addition, both methods not only measure the computer's production but also the ability of the programmer to produce efficient code. As with benchmark programs, no standard kernel programs are avilable which have been processed on machines of the type included in this study. The cost of such an undertaking was considered pro- hibitive. Instruction mixes are combinations of individual instructions, such as: add, subtract, transfer or branch. lStandard EDP Reports (New York: Auerbach Informa- tion, Inc., 1970). 27 Each instruction performs a Specific task in a predictable way. Each of these discrete tasks can be timed (usually by the manufacturer of the computer or the times are available in the professional literature) and the Speed of particular instructions is available. Each of these instruction times can be weighted by their relative frequency of occurrence. MD Such as S = WiI. where S is the speed of the instruction i=1 mix, IIMU Wi = 1, and each Wi = the relative fequency of l i occurrence of instruction 1, n is the number of instruc- tions under consideration, and Ii is the time to complete the i22 instruction. This method would yield a measure of the Speed of the individual computer under consideration. To find a measure of output of the computer, one would use Q = l/S, which yields the number of such mixes per unit of time. 
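As a small illustration of the instruction-mix arithmetic just described, the following sketch computes the mix speed S = Σ_i W_i I_i and the corresponding output rate Q = 1/S. The instruction times and weights are invented for the example and are not taken from any published mix.

```python
# Hypothetical instruction times I_i (microseconds) and relative
# frequencies W_i (the W_i must sum to one).
instruction_times = {"add": 2.0, "subtract": 2.0, "multiply": 8.0, "branch": 1.5}
weights = {"add": 0.45, "subtract": 0.15, "multiply": 0.10, "branch": 0.30}

assert abs(sum(weights.values()) - 1.0) < 1e-9

# Speed of the mix: S = sum_i W_i * I_i (microseconds per weighted instruction).
S = sum(weights[op] * instruction_times[op] for op in instruction_times)

# Output: Q = 1/S, the number of such mixes processed per microsecond.
Q = 1.0 / S

print(f"mix time S = {S:.3f} microseconds, output Q = {Q:.3f} mixes per microsecond")
```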
If the instruction times were in units of micro- seconds, the output would be the number of instruction mixes performed in one microsecond. It Should be noted that the instruction mix tech- nique is somewhat analogous to the previous methods of actual jobs, kernels,and benchmarks, since these are in effect collections of instructions. The primary difference between the instruction mix technique and actual jobs or kernels or benchmarks is that the instruction mix does not have a specified purpose. That is, the mix, if actually 28 performed on a computer would not produce an inverted matrix or a transfer of a block of data. One could examine a benchmark or kernel or job and determine the relative fre- quency, Wi’ of each instruction or class of instructions. Then, by using the weight Wi multiplied by Ii’ one would have an instruction mix. One advantage of constructing an instruction mix is that one need not actually run the mix on the computer. It is possible to Obtain the times, as indicated earlier, from engineering data. Given the timings of instructions and the weights or relative frequencies of each of the instruc- tions, the speed and output per unit of time can be deter- mined for any individual machine. This method is less costly than the previous methods because of the saving in computer time. This method is easy to apply to a wide range of machines which may be unavailable for physical test, such as machines no longer in general use or those in the develOpment stage and not available to the public. In these cases, the engineering data would be the only avail- ahde information the economist has at his disposal. Another axivantage of the instruction mix technique is that it elimi- nates the confounding influence of quality differentials in Programming. Since computer programming skill varies, even a Specific task can be done with various instructions being executed or the problem solved in a Shorter time period. This confounding influence of confusing computer output and 29 programming efficiency is suSpect in either the kernel or benchmark, or actual job method. Since the instruction mix is not designed to perform a Specific task, the programming ability of the analyst does not influence the output measure. Some well-known difficulties do exist with the use of instruction mixes. The first difficulty lies in the fact that some of the actual execution times for various instructions, e.g., floating-point instruction is not con- stant, in practice an arithmetic mean of many sample times is typically used as the time for the instruction. A sec- ond difficulty is that some instructions are specialized in. nature and not directly comparable across machines. Thus, one finds himself using a "generalized" or"representative" set of instructions. Because the weights used for each instruction are commonly based upon some form of dynamic trace of instructions on an individual machine, the appro- priateness of the set of weights, W, for alternative Computers is assumed to be relatively unimportant up to the C31aSS of jobs categorized as business data processing or SScientific computation. A fourth problem arises Since machines are designed in quite different ways. That is, different number of registers, word sizes, fixed and Variable length words, and Single- and multiple-address Ihoqic, exist across machines.2 Even though these problems ‘ 2R. A. Arbuckle, "Computer Analysis and Thruput. EValuation," Computers and Automation (January, 1966), pp. 12-19. 
3O exist, in this study a method analogous to the instruction mix technique has been chosen because this method uses data on machine performance which are available and the other methods require operational data which are not available. The method selected includes the use of the fixed point addition instruction time, storage-retrieval cycle time, and word size. The complete add time is the time required to acquire from memory and execute one fixed point add instruction using all features such as overlapping Inemory banks, instruction look-ahead and parallel execu- tLion. The add is either from one full work in memory to a rwegister, or from memory to memory; but not from register ‘tc> register. Thus, the add time is the number of micro- seeconds normally required to perform one addition, of the b + c, upon fixed point operands at least 5 typea== (or an equivalent number of bits) in length. de cimal digits ZXJ.1. the execution times include the time required to access bC>tfln operands from working storage and store the result in working storage. This insures valid comparisons between CCDDHEJuters with one address, and multiple-address formats.3 Storage cycle time in the context of this study is related to internal core storage only. For core storage, Cycle time is the total time to read and restore one storage \ 3Auerbach Computer Technology Reports (New York: 2. Ame1Tk>ach Information, Inc., 1969): Chapter 11' P' 31 It is the minimum time interval between the starts This measure word. of successive accesses to a storage location. must not be confused with access time. Access time is the interval of time between the instant when the computer or control unit calls for a transfer of data to or from a storage device and the instant when this operation is com- pleted. Thus, access time is the sum of the time interval when the computer or control unit calls for 1. transfer of data and the beginning of trans- mission, and G 2. the time it actually takes to transfer the data. (:ycle time is composed, in part, of access time. In addi- trion, it includes the time to restore the original data rwead. Cycle time in general will be longer than acceSs trifle by the amount of time needed to rewrite the work just read before another read operation can be initiated.4 The length of each computer word will be defined as ‘tllee word Size. Word size is expressed in terms of the maximum number of binary digits or decimal digits or alpha- numeric characters the computer word can accommodate. It 18 similar to the maximum number of numbers or letters which can be typed on a Single line on a typewriter. Just as some typewriters have longer or shorter carriages, so M: 4Computer Characteristics Review (watertown, asSachusetts: Key Data Corporation, 1969) . 32 computers have different Size words. In some computers the word size is not fixed but iS variable. For variable word length computers, data is usually presented in the form of the number of bits, digits, or characters which comprise a byte. The timing of these variable length instructions is on the basis of bytes. 
The method used to determine computer output can be presented by: OUTM = {(CTle + ATle 1*wsTl}*HRs- ij 1 l 1 2 1 j where OUTMij is the output of the iEE-type of computer at the jEE-installation where CTi is the cycle time in micro- seconds for the i22 computer, ATi is the add time in microseconds for the iEE computer and WSi is the word size in bits for the 1E5 computer, and HRSj is the number of hours the jEE installation Operates its computer per month. The term CT;1 or AT;l indicates the number of cycles or additions the computer could process in one millionth of a second. The number of cycles in one hour would be CT;q*(3.6*109), the elimination of the constant term (3.6*109) in the output measure does not harm our results Sirmce it is simply a scaling of the output variable. As stated earlier, the constant term in the production function includes this constant scaling factor. The output measure is a combination of three influé ences in the design of computers. We are concentrating 33 on the output of the central processing unit, CPU, of the computer. Since much of the total system (CPU, storage modules, peripheral units such as printers, types, drums, and disks) is dependent upon the output of the central processing unit, we are justified in considering this some- what restricted view of the computer system. The method used produces numerical values which are similar to measures produced by the alternative techniques of actual jobs, kernels,and benchmarks. For example, a group of computer specialists using a method of actual job comparisons found the relative output of two IBM systems (370/155 and 360/75) to be in the ratio of 1.60 to 1. That is, the IBM 370/155 would do 1.6 units of work in the time it would take to 5 Using produce 1 unit of work on the IBM 360/75 system. the adopted measure of output, the IBM 370/155 would pro- duce 1.345 units of computation and the IBM 360/75 would produce .75 units in one microsecond. Thus the relative tratio of Speeds would be 1.79 to 1. Since the adopted method was derived for second generation computers and the systems in question are third generation, a problem of cxnnparability might exist but the instruction mix method appears to be acceptable on the grounds of producing meas- ures which are consistent with other methods in current use . ..k 5M. Sewald, M. Rauch, L. Rodick, and L. Wertz, "A P_I_‘agmatic Approach to Systems Measurement," Computer Deci- E12555 (July, 1971), p. 39. 34 The three influences of add-time, cycle time, and word size are used to measure computer output. Add-time measures the ability of the machine to process a unit of data. This processing of data is what a computer accom— plishes. To the extent that a Single instruction does not represent a whole set of possible processing functions, the output measure will not reflect what work is actually done; this is the basic objection to the use of any instruction mix as stated earlier. It is assumed that the add-time is representative of the set of instructions and that the divergence from a "representative" mix is small. Cycle time is included to account for transfer, acquisition, and distribution of data internal to the com- puter. It is not enough to process the instruction (addition); the machine must also be able to acquire the data to operate upon and return or transfer the data prior to future Operations. The cycle time measures this func- ‘tion and is more than a single instruction. 
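As a worked sketch of the output measure defined above, the code below evaluates OUTM_ij = {[CT_i^{-1} W_1 + AT_i^{-1} W_2] * WS_i^{-1}} * HRS_j, following the formula as written in the text. The cycle times, add times, word sizes, weights, and monthly hours are stand-in values chosen only to show the mechanics; they are not the characteristics of the IBM systems mentioned above, so the figures quoted there will not be reproduced.

```python
def outm_per_microsecond(cycle_time, add_time, word_size, w1=0.5, w2=0.5):
    """Units of computation per microsecond, as in the text:
    OUTM_i = (W1/CT_i + W2/AT_i) / WS_i
    cycle_time and add_time are in microseconds, word_size in bits."""
    return (w1 / cycle_time + w2 / add_time) / word_size

def outm_per_month(cycle_time, add_time, word_size, hours, w1=0.5, w2=0.5):
    """Monthly output OUTM_ij = OUTM_i * HRS_j (the 3.6e9 microseconds-per-hour
    scaling is dropped, as in the text, since it only rescales the measure)."""
    return outm_per_microsecond(cycle_time, add_time, word_size, w1, w2) * hours

# Stand-in machine characteristics (microseconds, microseconds, bits) and
# monthly operating hours -- illustrative values, not data from the sample.
machines = {
    "machine A": dict(cycle_time=2.00, add_time=4.0, word_size=32, hours=400),
    "machine B": dict(cycle_time=0.75, add_time=1.5, word_size=32, hours=550),
}

for name, m in machines.items():
    per_us = outm_per_microsecond(m["cycle_time"], m["add_time"], m["word_size"])
    monthly = outm_per_month(**m)
    print(f"{name}: {per_us:.4f} units per microsecond, {monthly:.1f} units per month")
```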
The cycle time indicates what limits are placed upon the speed of the ImaChine to process instructions, since the data are usually Imanipulated during the other processing Operations. WOrd size enables us to compare machines which have cttfferent word lengths. Since add time and cycle time are 1J1 units of microseconds per word, comparisons of machines With different word lengths is unacceptable. The unaccepta- bilgity arises because computer words which are longer 35 contain more information and bigger words require more time, other things equal.6 The word Size is measured in bits and the result of the multiplication Of the weighted sum of add time and cycle time by W8;l yields the speed in terms of bits per microsecond. In this way, we have removed the objection to the use of add or cycle time in computers with nonstandard word Sizes. The terms W1 and W2 are included to weight the importance of cycle time and add-time. Because computer engineers have not produced a scientific measure of out- put, it was considered interesting to see how sensitive the model would be to various sets of weights. In the empirical chapter, results are presented assuming various weights for add and cycle time. Thus far we have determined the unit of output for a particular computer, OUTMi ='{[c'r;1wl +CAT11W2]*W5;1}. This indicates the units of output which would be produced :h1 one-millionth of a second; the number of units produced 131 one hour would be OUTMi*(3.6 x 109). For a particular cunnputer center, say, the jEE.center, the number of units 0f (output would be the output unit measure, OUTMi, E 6Gordon Raisbuk, Information Theory (Cambridge: Massachusetts Institute of Technology Press, 1966), p. 8. 36 multiplied by the number of hours the machine is operating, HRSj. The number of units of output produced by the j35 center during the sample period of one month is given as l 1 - —1 W1 + ATi w ]*wsi }. OUTMij = {[CTi 3 4; Factors o£_Production The basic input factors are assumed to be labor and capital. These factors of production are combined to produce "computer output," which was defined in the previous section. Labor input is probably the most important factor of production under management control which can alter the level of costs. This is the factor which management can adjust upon relatively short notice in order to meet the operational objectives of cost minimization. The labor. service required in the production process is clearly differentiated into three groups. The first group is the administrative or management group. This group is responsible for the long-term plan- Irtng, day-to-day management of the facility, and the achievement of productive objectives. This first group Generally is comprised of a director and several others th> are responsible for functional areas such as machine (Kmerations, business matters, programming, clerical duties, telJeprocessing and on-line data acquisition and control, etc. 37 The second group is the programming staff. This group includes those personnel who are responsible for the genera— tion of general purpose or application programs. General purpose programs are used numerous times, usually with different sets of data and control cards. Examples of these are statistical packages, mathematical programming, circuit analysis,and numerical analysis algorithms. Also in this group are systems programmers. Systems programmers are responsible for generation and maintenance of the set of programs which control the operation of the computer. 
These programs control the flow of work to the computer, assign various hardware components of the computer system to specific tasks, call standard routines for programs, and maintain accounting records of the computer's Operation. Systems programmers also are typically responsible for the various language processors (compilers/assemblers) which convert user written programs to a form the computer can "understand." Because the maintenance of these programs requires a knowledge of the Operating system and the machine language used to produce parts of the compilers, the system ;programmers are in charge of this task. Programmers also act.as consultants to those who are writing their own PrOgramS and need assistance. The third category includes the operations staff. C)Perations staff include such tasks as key-punching, gerueral tab-room card preparation, clerical work (both 38 administrative and those tasks related to physical handling of card inputs to and outputs from the facility), computer machine Operators, and maintenance personnel. These three groups of employees--administration, programming, and operations--are the components of the labor service input. They are directly controllable and their use rate can be altered to meet the center's Objective. Capital usage includes the service of the elec- tronic digital computer and the associated equipment such as the tape disk and drum devices, the on- and off- line input/output units, and communication modules. In addition, the capital usage includes the physical building, air conditioning, Office space, tab-room and keypunch equipment. Since computers are often rented or leased under agreements of an original 24 month lease and open termina- tion on 6 months to 30 days' notice, facilities often have the opportunity of changing machine models or manufac- turers within a Short planning horizon. Although it is possible to adjustythe use of capital equipment by switch- ing to alternative manufacturers, such a procedure is less likely to occur in practice. A locked-in effect results after a particular manufacturer is first selected. This locked-in effect occurs because programs are special- izeéi factors and cannot easily be changed to function on an Etlternative manufacturer's machine. Further, specialized 39 program packages usually take advantage of special features of one machine which are not available on other machines. This locked-in effect is not directly considered in the model and it is assumed that such influences do not drastically influence management planning. 5; Output and Input Price The previous discussion centered on the output and inputs of the stochastic production process; this section is about the unit price of these variables. Prices of the factors of production are included in this study but the Ixrice of computer output is not. In the context of university computer centers, it .153 apprOpriate to assume that factor prices are determined <3J1E labor service, administration, programming and opera- 1::ions are determined in the local geographic area where 1:11e center is located. These prices vary from center to <3€3nter because of geographic immobility of these employees. The next price to consider is the cost Of capital f Or the university computer center. We are concerned with the unit cost of capital or the unit price of capital. The most probable assumption seems to be that cost is 4O determined on a national level in terms of a general inter- est rate structure, given the risk of default. 
This price of capital is assumed constant for all universities and colleges in the sample. If any differences exist between universities, it would be due to special tax treatment of the bonds issued in the states, market imperfections, or the risk of the issuing agent and not systematically related to the Operation of the computer centers in ques- tion. Computer centers in institutions of higher educa- ‘tion do not directly charge individual users for computer cautput. The center could use a scheduling algorithm or £>riority system which gives better service, in terms of ifaster production, to various users. By adjusting the Jyength of time between submission of execution of a E>rogram, the center can encourage or discourage computer visage. This is equivalent to charging an implicit price Of the output to the user Since waiting time is a cost. 'dIhis implicit price influence is assumed to be small with Irespect to total demand for computer output and such SCheduling systems only alter the relative number of pro- grams in each job category. Such a change in mixture of jObS is assumed to be independent of the total production <>f5 the facility. Under these assumptions, output price is not controlled by the computer center. In this respect, the installation is in the same position as the firm in a 41 regulated industry where the rates are determined by the regulator's administrative fiat. In the next section, we will discuss the effect that the lack of output price has on the construction of a model for the production process. It will be Shown that it is possible to model the process without the use of output price information. 6. Behavioral Constraint The computer center is regarded as a firm with the objective of minimizing the cost of producing an expected level of output. The assumed objective of cost minimization is preferable to profit maximization because the computer center does not charge users for computer output and thus does not produce revenues. Cost minimiza- tion is also preferred to the alternative assumption of output maximization for a given level of cost. The st minimization model and one Should note that both the fjxrst and second order minimization conditions are not irifluenced by a change in product price if the level of expected output is unchanged. If the expected level of 011 tput changes, only the absolute level of resource use is altered and input ratios are constant for homogeneous PI‘Oduction functions . 43 Z;_Specification of the Cost Minimizatibn Model This section contains a model of the technical relation between the inputs and output and the behavioral constraints presented in previous sections. It should be remembered that the objective of the computer center is to minimize cost subject to a given level of expected output. The basis of the cost model can be seen from the following analysis of the production system. Given Rt : Q = f(Xi) i = l, ... n inputs, where the operator f is a set of rules to obtain the maxi- mum output from various inputs. If we assume pure competition in the factor markets, we have Pi i=2, 3, 0.. n R. : -'= g (X.,X.) i > j; e 3' f 3 1 j = 1, 2, n-l. Re is the economic rule which indicates how to obtain the lowest cost for a given level of output. The gf Operator, ‘which restricts our use of inputs to those which minimize lcost for a given level of output, has the subscript f to :note that the cost rule must incorporate the production rule. These n - 1 equations determine factor demands. 
The rule D_c is the definition of total cost and can be written as

C = Σ_{i=1}^{n} P_i X_i .

The cost equation and the notation used above allow us to demonstrate the relationship between cost, predetermined output, and prices:[7]

C = f(Q, P_1, P_2, ..., P_n).

The cost function includes the definition of cost, the technical information from the production function, and the cost minimization rule.

The next step is to define the operations in specific functional forms. Beginning with the Cobb-Douglas production function which was introduced in Section 2, we have

Q_j = A Π_{i=1}^{n} X_{ij}^{a_i} e^{U_0j}

where the subscript j refers to the j-th computer center and U_0j is a random variable normally distributed with zero mean and variance σ², with E(U_0j U_0r) = 0 for j ≠ r.[8] The remaining symbols have been defined earlier. The rule for obtaining the lowest cost for a given level of expected output can be obtained from the Lagrange technique as follows. Form the Lagrangian function

L = Σ_{i=1}^{n} P_i X_i - λ[E(Q) - Q_0] .

Since output is stochastic, as explained in Section 2, the mathematical expectation of output is entered in the constraint as E(Q). The first-order conditions for constrained cost minimization are

∂L/∂X_i = P_i - λ ∂E(Q)/∂X_i = 0,   i = 1, ..., n.

Recall that the inputs have a perfectly elastic supply for each center and the prices are known with certainty, as stated in Section 5. We will assume for the moment that the marginal productivity conditions are deterministic and that adjustments to optimal levels of inputs are instantaneous. After eliminating λ, we obtain (n-1) independent relations

P_i/P_k = (a_i/a_k)(X_k/X_i),   i ≠ k.

The whole system then consists of n + 1 equations

P_i/P_k = (a_i/a_k)(X_k/X_i),   i ≠ k, k = 1, ..., (n-1)

C = Σ_{i=1}^{n} P_i X_i

Q = A Π_{i=1}^{n} X_i^{a_i} e^{U_0}

with unknowns X_i, i = 1, ..., n, and C. We have completed R_e, the first-order condition for a relative extremum, assuming that this is also the global extremum in the constrained feasible region. Now we proceed to the second-order conditions. To economize on exposition, only the two-factor case will be considered in detail, but the generalization to n inputs is also discussed.

[7] The following procedure leads to the Shephard-Uzawa duality theorem discussed in Chapter II. (1) Rewrite R_e for each input r in terms of one of the remaining inputs as X_r = g_r*(P_i, X_i), which is the constant-output factor demand schedule for X_r. (2) Rewrite R_t as Q = f*(g_r*); the f* maps the factor demand schedules to output. (3) Solve the cost definition D_c in terms of output and factor prices. Transposing the previous relation and solving for g_r*, we have g_r* = h_r(Q) for all inputs. Using the definition of D_c, we have C = H[h_r(Q), P_1, P_2, ..., P_n] or C = f(Q, P).

[8] The assumption of normality of the production function disturbance may create some estimation problems. The multiplicative log-normal disturbance used in the transformation of the production relation may cause attention to be shifted to the conditional median rather than the conditional mean, which is of interest here. Goldberger states that in practice the minimum variance unbiased estimators which account for this fact may not differ detectably from those which do not, and we assume the simpler specification in this case. For more information, see A. Goldberger, "Interpretation and Estimation of Cobb-Douglas Functions," Econometrica, Vol. 36 (July-October, 1968), pp. 464-72; and A. Zellner, J. Kmenta, and J. Drèze, "Specification and Estimation of Cobb-Douglas Production Function Models," Econometrica, Vol. 34 (July-October, 1966), pp. 784-95.
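The first-order conditions above have a closed-form solution in the Cobb-Douglas case, and a short numerical sketch makes the resulting constant-output factor demands and minimum cost concrete. All parameter values below (prices, elasticities, the scale constant, and the required expected output) are illustrative assumptions.

```python
import numpy as np

def conditional_factor_demands(Q0, P, a, A=1.0):
    """Cost-minimizing inputs for A * prod_i X_i**a_i = Q0 at prices P.
    From the first-order conditions P_i/P_k = (a_i/a_k)(X_k/X_i):
      X_i = (Q0/A)**(1/r) * prod_j (a_i P_j / (a_j P_i))**(a_j/r),  r = sum(a)."""
    P, a = np.asarray(P, float), np.asarray(a, float)
    r = a.sum()
    X = np.empty_like(P)
    for i in range(len(P)):
        X[i] = (Q0 / A) ** (1 / r) * np.prod((a[i] * P / (a * P[i])) ** (a / r))
    return X

# Illustrative (assumed) prices, elasticities, and expected output level.
P = np.array([5.0, 8.0, 3.0])    # unit factor prices
a = np.array([0.4, 0.3, 0.5])    # output elasticities, r = 1.2
Q0 = 100.0                       # required expected output

X = conditional_factor_demands(Q0, P, a)
C = float(P @ X)
print("inputs:", np.round(X, 2), " total cost:", round(C, 2))
print("output check:", round(float(np.prod(X ** a)), 2))   # equals Q0 since A = 1

# The same minimum cost from the closed-form cost function
# C = r * (A * prod_i a_i**a_i)**(-1/r) * Q0**(1/r) * prod_i P_i**(a_i/r):
r = a.sum()
C_closed = r * (np.prod(a ** a)) ** (-1 / r) * Q0 ** (1 / r) * np.prod(P ** (a / r))
print("closed-form cost:", round(float(C_closed), 2))
```

The equality of the two cost figures illustrates the duality argument of footnote 7: once the factor demands are expressed in terms of output and prices, total cost can be written directly as a function of Q and the P_i.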
Defining f as Q = f(x_1, x_2), the second-order condition for a constrained minimum is that

| f_11  f_12  -P_1 |
| f_21  f_22  -P_2 |  > 0
| -P_1  -P_2    0  |

where f_i = ∂f/∂x_i and f_ij = ∂²f/∂x_i ∂x_j. Since P_i = λf_i, we can multiply column 3 by 1/λ, row 3 by 1/λ, and the determinant by λ², and we preserve the initial value of the determinant, which is

λ² | f_11  f_12  -f_1 |
   | f_21  f_22  -f_2 |  > 0.
   | -f_1  -f_2    0  |

We need only examine the value of the bordered Hessian determinant, since λ² > 0. Thus,

2 f_12 f_1 f_2 - f_11 f_2^2 - f_22 f_1^2 > 0.

Given positive marginal products from the first-order conditions, f_ii < 0 (decreasing marginal products) is a sufficient condition for second-order stability, but not a necessary condition. What is necessary is that the isoquant is concave from above in all directions, which is equivalent to establishing a diminishing marginal rate of technical substitution. This is demonstrated next.

Define MRTS = f_1/f_2. To demonstrate that ∂MRTS/∂x_1 < 0 is equivalent to the second-order condition, we write

∂(f_1/f_2)/∂x_1 = { f_2 [f_11 + f_12 (∂x_2/∂x_1)] - f_1 [f_21 + f_22 (∂x_2/∂x_1)] } / f_2^2 < 0.

Since ∂x_2/∂x_1 = -f_1/f_2, i.e., the slope of the isoquant,

∂(f_1/f_2)/∂x_1 = (1/f_2)[f_11 - f_12 (f_1/f_2)] - (f_1/f_2^2)[f_21 - f_22 (f_1/f_2)] < 0.

Collecting terms over the common denominator f_2^3, we have

∂MRTS/∂x_1 = (1/f_2^3)[f_2^2 f_11 - 2 f_12 f_1 f_2 + f_22 f_1^2] < 0.

With f_2^3 > 0, the bracketed term is the negative of our second-order condition and we have demonstrated our objective. The necessary condition for second-order stability is therefore

2 f_12 f_1 f_2 - f_11 f_2^2 - f_22 f_1^2 > 0.

The second-order condition is satisfied for the Cobb-Douglas production function if a_1 a_2 > 0. No restriction is placed on the sum of the output elasticities, as compared with profit maximization, where the sum is restricted.

It should be noted that no allowance has been made for institutional or other restraints on the cost-minimizing activity of the computer center. This seems justified since little if any union activity or other administrative intervention in the management of the centers existed. This clearly is not the case for other regulated industries and in fact is the basis of the Averch-Johnson (1962) model of the firm under rate of return regulation.

Examining the cost-minimizing rule more closely, we note that we have implicitly assumed the desired amounts of inputs to be those which are actually used. The rule R_e becomes more realistic and complicated when the desired and actual quantities of inputs are not identical. We could expand the nature of the g_f operator, but g_f would then lose some of its economic interpretation as a deterministic procedure for management to follow. A much more interesting approach is to represent the cost-minimizing conditions in terms of desired levels of inputs.

A partial adjustment relation is assumed to exist between the actual and the desired levels of inputs. The desired values are denoted X_i* and are not directly observable, but the actual values X_i are presumably being adjusted toward X_i*. The explanation of the incomplete adjustment of X_i is that adjusting inputs has a cost and it takes time to adjust factors [see Griliches (1967)]. It is possible to write the partial adjustment model as

X_{i,t} / X_{i,t-1} = (X*_{i,t} / X_{i,t-1})^{γ_i} e^{U_{i,t}}

where U_{i,t} is a random influence in adjusting the actual to the desired levels of input. It is due to randomness in the availability of extra or overtime labor and to unknown variability in the transaction time required to sell or acquire inputs.
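The partial adjustment relation is easy to simulate. The short sketch below tracks one input's path toward a fixed desired level under an assumed adjustment rate and disturbance variance (both values are hypothetical) and shows the geometric closing of the gap in logarithms.

```python
import numpy as np

rng = np.random.default_rng(1)

gamma = 0.6       # assumed adjustment rate, 0 < gamma <= 1
sigma_u = 0.05    # assumed sd of the adjustment disturbance U_{i,t}
X_star = 50.0     # desired (cost-minimizing) input level
X = 20.0          # initial actual input level

for t in range(1, 9):
    U = rng.normal(0.0, sigma_u)
    # X_t = X_{t-1} * (X*_t / X_{t-1})**gamma * exp(U_t)
    X = X * (X_star / X) ** gamma * np.exp(U)
    print(f"t = {t}: X = {X:6.2f}   (log gap to X*: {np.log(X_star / X):+.3f})")
```

In logarithms the relation is ln X_t - ln X_{t-1} = γ(ln X*_t - ln X_{t-1}) + U_{i,t}, so with γ = 1 the actual input moves to the desired level within one period, apart from the disturbance; this is the instantaneous adjustment case used below.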
The disturbance U_{i,t} is normally distributed with zero mean and constant variance σ_i², and E(U_{i,t} U_{j,t}) = 0 for i ≠ j. The U_{i,t} terms are assumed to be statistically independent of U_{0,t}, the production function disturbance. This independence may be justified on the grounds that the production function disturbance originates from computer hardware and operator failure and is technical in nature, while the U_{i,t} terms are the result of the stochastic nature of factor availability and transaction time. The adjustment coefficients are defined in the bounded region 0 < γ_i ≤ 1, and γ_i is the rate of adjustment of X_i to X_i*. If we solve the equation for X_i*, we have

X*_{i,t} = X_{i,t}^{1/γ_i} X_{i,t-1}^{(γ_i - 1)/γ_i} e^{-U_{i,t}/γ_i}

which expresses the unobservable X*_{i,t} in terms of current and lagged observable X_i's. The marginal productivity conditions in terms of observable quantities then become

P_i/P_j = (a_i/a_j) [X_{j,t}^{1/γ_j} X_{j,t-1}^{(γ_j - 1)/γ_j} e^{-U_{j,t}/γ_j}] / [X_{i,t}^{1/γ_i} X_{i,t-1}^{(γ_i - 1)/γ_i} e^{-U_{i,t}/γ_i}],   i ≠ j.

The inputs are three types of labor, and it is reasonable to assume that the costs of adjustment are equal. This assumption will result in γ's which are similar. Consider the case where γ_i = γ_j = γ; the marginal productivity condition (1') becomes

(1')  P_i/P_j = (a_i/a_j) [X_{j,t}^{1/γ} X_{j,t-1}^{(γ-1)/γ}] / [X_{i,t}^{1/γ} X_{i,t-1}^{(γ-1)/γ}] e^{(U_{i,t} - U_{j,t})/γ} .

Let us return to equation (1) and note that the marginal productivity conditions are deterministic in desired inputs. If managerial error, inertia, resistance to change and the like are present, we find that the equation determining the desired factor input has an additional disturbance, U′_{i,t} ~ N(0, σ′_i²), for the (n-1) marginal productivity equations. Equation (1') becomes

(1'')  P_i/P_j = (a_i/a_j) [X_{j,t}^{1/γ} X_{j,t-1}^{(γ-1)/γ}] / [X_{i,t}^{1/γ} X_{i,t-1}^{(γ-1)/γ}] e^{(U_{i,t} - U_{j,t} + γU′_{i,t})/γ}

which is the general case. The special case of instantaneous adjustments (γ = 1) yields

(2)  P_i/P_j = (a_i/a_j) (X_{j,t}/X_{i,t}) e^{(U_{i,t} - U_{j,t} + U′_{i,t})} .

With instantaneous adjustments the equations remain stochastic. Since we are concerned with the possible bias introduced via the stock adjustment of inputs, the cost functions will be derived using both equations (1'') and (2). In the data analysis chapter the possible bias introduced by the omission of relevant lagged explanatory variables will be considered.

First, considering the instantaneous adjustment case, the marginal productivity condition can be rewritten as

X_{i,t} = (P_j/P_i)(a_i/a_j) X_{j,t} e^{(U*_{i,t} - U_{j,t})}

where U*_{i,t} = U_{i,t} + U′_{i,t}. Deriving the fixed-output demand function for X_i from the production function, we have

X_{i,t} = (Q_t/A)^{1/r} Π_{j=1}^{n} [(a_i P_j)/(a_j P_i)]^{a_j/r} e^{-U_0/r} e^{(1/r) Σ_{j=1}^{n} a_j (U*_{i,t} - U_{j,t})}

where r = Σ_{i=1}^{n} a_i. This equation is homogeneous of degree zero in prices, which is the desired result.

The final procedure is to solve for total cost in terms of input prices and output. For the instantaneous adjustment case, total cost C_t = Σ_i P_i X_{i,t} is obtained by substituting these factor demands. Taking logarithms of both sides of the equation and expanding the combined stochastic disturbance term around the mean of each disturbance, via a Taylor series, results in

(3)  ln C_t = ln [r (A Π_{i=1}^{n} a_i^{a_i})^{-1/r}] + (1/r) ln Q_t + Σ_{i=1}^{n} (a_i/r) ln P_i + { Σ_{i=1}^{n} U*_i - U_0/r + ln(2) } .

For the non-instantaneous adjustment, where 0 < γ < 1, the constant-output factor demand function follows from combining the desired-input demand above with the partial adjustment relation:

X_{i,t} = { (Q_t/A)^{1/r} Π_{j=1}^{n} [(a_i P_j)/(a_j P_i)]^{a_j/r} }^{γ} X_{i,t-1}^{(1-γ)} e^{W_{i,t}}

where W_{i,t} collects the production-function and adjustment disturbances. Substituting these demands into the cost definition gives

C_t = r^γ (A Π_{i=1}^{n} a_i^{a_i})^{-γ/r} Q_t^{1/r} Q_{t-1}^{(γ-1)/r} [Π_{i=1}^{n} P_i^{a_i/r}] [Σ_{i=1}^{n} X_{i,t-1}]^{(1-γ)} [Σ_{i=1}^{n} P_i]^{(1-γ)} e^{W_t}

where W_t again collects the disturbance terms. Taking logarithms of both sides of this cost function, handling the error term as in (3), approximating the lagged input and price summation terms via a Taylor expansion around γ = 1, and dropping terms of higher order in γ, we have

(4)  ln C_t = {[(n+1) - ln(2)] + ln [r^γ (A Π_{i=1}^{n} a_i^{a_i})^{-γ/r}]} + (1/r) ln Q_t + ((γ-1)/r) ln Q_{t-1} + Σ_{i=1}^{n} [(a_i/r + (1-γ)/2) ln P_i] + Σ_{i=1}^{n} [((1-γ)/2) ln X_{i,t-1}] + { Σ_{i=1}^{n} U*_i - U_0/r } .

Equation (3) is the single-period cross-section model and equation (4) includes observations on the previous period's inputs and output. Of course, if we have an equilibrium situation where γ = 1, equation (3) would be equivalent to (4). The apparent difference in the intercept is influenced by the remainder in the Taylor series, which is not zero. The effect of the omission of possibly relevant explanatory lagged variables is examined in the next chapter on data and estimation.

8. Summary

This chapter considered the problem of specifying a model of the production of computer output. The maintained hypothesis of cost minimization subject to the Cobb-Douglas production function was presented. Stochastic disturbances were assumed in the production function, the factor demand schedules, and the stock adjustment process for desired versus actual levels of factor employment. Various inputs to the production process were discussed in terms of their relevance to the process. In addition, a method of measuring the output from the computer center has been developed. This method appears to be satisfactory in comparison with the high-cost alternatives.

It has been demonstrated that a reduced form cost function can be derived from the production function and the marginal productivity conditions which result from the cost minimization objective. This function is linear in the logarithms of cost and the explanatory variables of output and factor prices.

It was also shown that this cost function is consistent with the Shephard-Uzawa duality principle and thus contains all the information from the production function. Thus we can test the hypothesis of economies of scale via empirical estimation of the parameters of the model. This estimation is the subject of the next chapter.

CHAPTER IV

DATA AND EMPIRICAL RESULTS

1. Introduction

This chapter deals with the availability of data and the procedures used to relate the model to the data. In light of the need for data on physical output, dollar costs, and factor prices, it is shown that such data exist for a cross section of computer centers. Section 2 contains a discussion of the sources of data. A description of the methods by which these data were obtained is in Section 3, since these methods may influence the interpretation of the empirical results. Section 4 contains a discussion of the cost function and the statistical procedures employed to estimate the parameters of the model and test the hypothesis of economies of scale. The fifth section is an analysis of the statistical bias which is introduced by the omission of relevant lagged explanatory variables. The empirical results of the estimation procedure and hypothesis test are presented in the sixth section. Section 7 is a summary of the chapter.

2. Sources of Data

In the first quarter of 1966, the National Science Foundation contracted with the Southern Regional Education
This project concerned the very rapid expansion of the computer facilities of institutions of higher education. Some government officials became aware of the increased demand for trained computer scientists and technicians as complementary factors of production in the defense and Space industries. The Egg entailed the develOpment and testing of a questionnaire "which could be used to provide the kind of information needed for future planning of relevant Government agencies."1 The study was to deter- xnine the sources and uses of funds for instructional aactivities and research in colleges and universities in tzhe United States. An inventory of computers was also Eprepared. The fiscal year 1965 was used as the sample Ipoint for all actual expenditures which are the concern of 'this study. The data available are the 1964-1965 expenditures fkor equipment rentals, rental or amortized cost for building Space to house computer activities, and maintenance costs rust included in the previous two categories. These three (Lategories are summed for each installation to form the txbtal capital cost of Operating the center. Added to this Elire the salaries and wages of all personnel, other direct CNDstS of materials and supplies. Unit wage rates for three \ 11. Hamblin, Comppters ip_Higher Education (Zthanta: Southern Regional Educational Board, 1967), p. 21. 60 categories of employees--administrative and other profes- sions, systems and utility programmers, and all others (keypunch, machine Operators, clerical, and technicianS)-- are available for each installation in the fiscal year 1964-1965. Each institution reported the number of hours the facility was operated in a typical month. Since it is assumed that the cost of capital for all computer centers is constant, the true cost function can be written as 1 Y-1 = * _ ___. (4.0) 1n Ct A + (r) In Qt + ( r ) ln Qt-l 3 a. + Z [(_i.+ l_£) In P. ] ._ r 2 1ft 1—1 4 1:_1 + .5 I< 2 > In xi..-11 + Vt 1—1 3 U0 Where V = { E U? - —} and t -_ 1 r 1-1 an _ 4 Oi 3% Z\* = (_E.1n Pn) + 2.76 + [1n rY(A H Oi ) i=1 th “finere capital is the n factor and Ph is the constant Clost of capital. . Technical engineering data on the machine charac- t6Bristics of various computers are available. The source (”if these data includes the trade journals for data on new 61 machines.2 Specialized publications contain detail perform- ance data for a wide range of computers.3 The manufacturer of the computer publishes performance data.4 It is possible to perform experimentation on computer systems and measure performance on a case-by-case basis.5 Fortunately, all the computers in the sample were listed in the specialized publications. Data on word size, add time, and cycle time for each machine were obtained from Computer Characteristics Review. With the questionnaire data on the number of hours the computer was in Operation, it was possible to construct an output measure for all computer centers using the method presented in Chapter III. 3;_Method pf Data Collection Since the data on cost, hours of operation, and factor price were collected via questionnaire, it seems appropriate to discuss the methods used. In early 1966, 2Such as Electronics News (New York: Fairchild Publishing CO., 1972); or Com uter Components Review (Nor- 1969) wood: Commander Publishing, ; or Data Products News (New York: Data News, Inc., 1970). 3 Auerbach Computer Reviews and Ke Data (New York: Auerbach Information, Inc., January, 1968). 
4Such as IBM Technical Ppblications (White Plains: Technical Documentation Center, International Business Machines, 1970). 5An example of a study of this nature is the Mobile 911 Study prepared by Sewald, pp. 21., pp. cit., comparing various IBM computers in the Mobile Oil Corporation. ' 62 the National Science Foundation let contract NSF-C465 which required the Southern Educational Board to finalize the questionnaire, disseminate it to the institutions, process the returns, and summarize the results. This questionnaire had already been developed by the Mathe- matical Sciences Section of NSF. The Bureau of the Budget recommended that the National Center for Education Statis- tics of the Office of Education draw a "stratified system- atic" random sample of approximately 700 of the 2,219 institutions of higher education which existed in fiscal year 1965. The questionnaires were mailed in mid-July 1966 and follow-up reminders were sent in September, December, and the end of January, 1967. Eventually, 669 institutions responded. Since this was the first survey of sources and uses of funds for computers operated by inStitutions of higher education, it posed problems for‘ those administrators who had to complete the queStionnaire. Hamblin states, ". . . though the temptation to use a random number generator might have been strong at times, a high percentage of institutions made an honest effort to obtain and report accurate figures."6 His team edited, cross-checked, and in some cases phoned the various respondents to check accuracy. He stated that machine ¥ 6I. Hamblin, "Expenditures, Sources of Funds, and Utilization Of Digital Computers for Research and Instruc- tion in Higher Education, 1964-1965 with Projections for 1968-1969," Communications of the ACM, Vol. 7 (April, 1968), PP. 257-262. _—'——_'_—_ 63 rental and salaries were within the range of plus or minus 10 per cent of the true values. Unfortunately, we have no information on the accuracy of this estimate. Even though Hamblin checked the returns carefully of the 669 institutions responding, only 133 at most were available for this study. The other responses had to be dropped because they lacked at least one value for type of computer, Operation time, salaries, or cost. In a number of cases, the center was operating first generation com- puters and these centers were also excluded from the study. The institutions which did not fully respond to ‘the questionnaire may systematically cause an unknown bias ‘to enter the estimation of the parameters. We have no way (of determining if the institutions which were unwilling ‘to submit salary, cost.or utilization data are systemati- <:ally related to each other or to those who gave full .information. Of course, these data are of the question- :naire variety and all the customary caveats should be observed . 4. Cost Function The cost function (4.0) can be written as 64 _ 1 -1 (4.1) 1n cj - 21* + (f) In Q3. + (I—r ) 1n 03* 3 a. _1_ Y-1 + i [( r + 2 ) 1n Pi,j] 4 + 2 [(1%) 1n x335] +vj . lfiere the double asterisks represent lagged values of the \rariables for the j computer center. The remaining symbols aare the same as before. Some points about the methodology atre in order. First, recall that the maintained hypothesis iricludes the Cobb-Douglas production function and its fc>rmulation via cost minimization into the cost function (4 .0) and it also includes the assumption with regard to ‘tlie disturbance term.7 (4 .2) Vj is normally distributed (4.3) E(Vj) = 0 (4.4) E(VJ?) 
= 02 (4.5) E(Vij) o (i 76 k) C 4 .6) Non-stochastic explanatory variables 7It is assumed that the logarithm of the disturbance 1:erm V is normally distributed with zero mean and o2 , which <>f coufse implies that the distribution of (er ) is log- Iiormal in the multiplicative cost model. In some applications the influence of this positive Sikewed disturbance might be undesirable, but not in this <=ase. It is assumed that the median of the distribution is 65 (4.7) No exact linear relation exists between any of the explanatory variables and of course the number of observations is greater than the number of coeffi- cients to be estimated. Under these conditions, it is well known that the OLS esti- mators of A*, (%) and (oi/r) have the classical desirable properties, while the estimator Of A* and r obtains all those asymptotic prOperties of the estimator, while the small sample prOperties (such as unbiasedness) do not "carry over."8 Since the data were collected via a questionnaire, measurement errors may be present. If errors are present only in the dependent variable, cost we would have observed 3, the true value would be Cj' and they would be related as . C! = . + t (4 8) 3 CJ v3 where v. is the error in measurement of cost. If 2 vj m N(0,ov) and (4.9) E(v§,v§) = 0 (j ¢ k) less than the mean because of the influence of some exogenous demands which place heavy or peak demands upon computer sys- tems. For any individual system, a few peak periods such as end of terms or quarters will be reflected in skewed output distributions. These two influences reinforce the accepta- bility of the maintained hypothesis of positive skewedness of the output distribution in the multiplicative cost model. 8J. Kmenta, Elements pf Econometrics (New York: Macmillan, 1971), p. 458. 66 (4.10) E(V,.v?) = 0 3 3 we would have . _ 1 11 (4.11) 1n Cj - A* + (f) In Qj + ( r ) 1n Q3* 3 a- 1 1-1 + Z [(—;'+ 2 ) 1n Pi .] i=1 '3 4 -1 + z [(17) 1n xix] +n. i=1 '3 3 where .=V +v‘? with .’\IN002 ad 2=oz+oz. ”3 j 3' nJ ( . N) n on v* Thus (4.11) would be equivalent to (4.0), estimators having the desirable properties as given, with the note that the disturbance (nj) contains disturbances which influence the. economic relations in the cost function (Vj) and those due to measurement errors (v3). Because the measurement of total cost includes many components, some error is likely to occur; it is assumed that no error in measure- ment of the independent variables exists. It seems justifiable to assume that the independent variables are free of measurement error on two grounds. First, output has been defined in an engineering or sci- entific way. This methodology of "Operationism" is accepted in engineering.9 Of course, the answer to the 9P. W. Bridgeman, The Logic p£_Modern Ph Sics (New 'YOrk: Macmillan, 1928); and The Nature of PhysicaI Theory (New Jersey: Princeton University Press 1 67 question of the "correct" definition of computer output is unknown. Second, the data on salaries are quite good because of the institutional requirements such as federal tax and internal auditing requirements. If the independ- ent variables included significant measurement errors, OLS estimation would lead to inconsistent estimators of the parameters. We could use the method of instrumental vari- ables to Obtain consistent estimators, but no instruments are available. It seems best to assume no errors in i measurement or errors only in the cost data. “H—' .. 
The concern of the next two sections is to present an analysis of omitted variables and the empirical results of the statistical estimation of the output elasticities, and to test the hypothesis of economies of scale. It should be noted before we present these results that the maintained hypothesis includes all the assumptions which have been made but not subjected to test, and the results are conditioned on these assumptions. The appropriateness of using indirect least squares estimation and the properties of the estimators which result are also dependent on the maintained hypothesis. The results of Section 6 present the multiplicative inverse of the sum of the output elasticities (1/r) and the individual output elasticities. In addition, the confidence interval for the sum of the output elasticities is also presented. The hypothesis of economies of scale is tested using a t-test as follows. The null hypothesis is that constant returns to scale exist, which occurs when the inverse of the sum of the output elasticities (1/Σα_i) equals unity. The alternative hypothesis is that economies of scale exist; that is, the sum of the output elasticities is greater than one. Only large values of Σα_i [that is, (1/Σα_i) < 1] would constitute evidence against the null hypothesis.

5. Omission of Relevant Explanatory Variables

Using the logarithmic version of the cost function (where the asterisk represents natural logarithms of the variables), we have

(1)  C* = X*β + Z*γ + ε* .

Here C* is the (N x 1) vector of observations on total cost, X* is the (N x K) matrix of observations on the included independent variables, β is the (K x 1) vector of coefficients of the included variables, Z* is the (N x L) matrix of observations on the excluded variables, γ is the (L x 1) vector of coefficients of the excluded independent variables, and ε* is the stochastic disturbance discussed in the previous chapter.

The cost model would be the classical normal linear multiple regression model if the Z* variables were included, and the estimators of β would be unbiased; but because the Z* variables are omitted, specification error is present in the cost equation, resulting from the omission of relevant explanatory variables. The error is due to the fact that no observations are available on the omitted variables, and because of the specification error the possibility of bias must be considered. When

(2)  C* = X*β + ε**

is estimated but the true model is equation (1), we have

(3)  E(β̂) = (X*'X*)^(-1) X*'(X*β + Z*γ) = β + (X*'X*)^(-1) X*'(Z*γ) .

The Z* matrix can be partitioned as [Z*_1, Z*_2, ..., Z*_L] for the L omitted variables, which results in

(4)  E(β̂) = β + Σ_{j=1}^{L} (X*'X*)^(-1) X*'Z*_j γ_j ,

and the coefficient of the i-th included variable is

(5)  E(β̂_i) = β_i + Σ_{j=1}^{L} d_{ji} γ_j ,

where

(6)  Z*_j = Σ_{i=1}^{K} d_{ji} X*_i + R_j ,    j = 1, ..., L,

and R_j is the residual in the least squares regression (6); since the X's are non-stochastic, the equation is purely descriptive. The bias of β̂_i is equal to Σ_{j=1}^{L} d_{ji} γ_j, and its sign and magnitude depend on d_{ji} [the empirical relation between the included variable X*_i and the excluded variables Z*_j in equations (6)] and γ_j [the relation between the j-th excluded variable and the dependent cost variable in equation (1)]. A sufficient condition for establishing the direction of the bias is that the signs of the (d_{ji}, γ_j) pairs all be the same for a positive bias, or all opposite for a negative bias.
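The bias expression in equation (3) can be checked numerically. The sketch below is not from the study: it builds a small synthetic data set with hypothetical coefficients in which one relevant regressor is omitted, and compares the average estimate from the short regression with β + (X*'X*)^(-1)X*'Z*γ computed from the same fixed regressors.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 115                                   # hypothetical sample size
beta = np.array([1.0, 0.5])               # coefficients of included variables (illustrative)
gamma = 0.8                               # coefficient of the omitted variable (illustrative)

x = rng.normal(size=n)
z = 0.6 * x + rng.normal(scale=0.5, size=n)   # omitted variable, correlated with x
X = np.column_stack([np.ones(n), x])          # included regressors (held fixed)

# Bias predicted by E(beta_hat) = beta + (X'X)^(-1) X'Z gamma
predicted = beta + np.linalg.solve(X.T @ X, X.T @ z) * gamma

draws = []
for _ in range(5000):
    eps = rng.normal(scale=0.4, size=n)
    y = X @ beta + gamma * z + eps                       # true model includes z
    draws.append(np.linalg.lstsq(X, y, rcond=None)[0])   # short regression omits z

print("average short-regression estimate :", np.mean(draws, axis=0).round(3))
print("beta + (X'X)^-1 X'Z gamma          :", predicted.round(3))
```

Because the auxiliary coefficient of Z* on X* and the omitted coefficient γ are both positive in this construction, the slope estimate is pushed upward, which is exactly the sufficient sign condition stated above.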
Consider the cost model developed in the previous chapter, which can be written

(7)  C*_t = b*_0 + b_1 Q*_t + b_2 P*_1,t + b_3 P*_2,t + b_4 P*_3,t + b_5 P*_4,t
            + γ_1 Q*_{t-1} + γ_2 X*_1,t-1 + γ_3 X*_2,t-1 + γ_4 X*_3,t-1 + γ_5 X*_4,t-1 + ε*_t .

Since P_4 is assumed to be the same for all computer centers, we have b*'_0 = b*_0 + b_5 P*_4, and equation (7) can be rewritten as

(8)  C*_t = b*'_0 + b_1 Q*_t + b_2 P*_1,t + b_3 P*_2,t + b_4 P*_3,t
            + γ_1 Q*_{t-1} + γ_2 X*_1,t-1 + γ_3 X*_2,t-1 + γ_4 X*_3,t-1 + γ_5 X*_4,t-1 + ε*_t .

We estimate

(9)  C*_t = b*'_0 + b_1 Q*_t + b_2 P*_1,t + b_3 P*_2,t + b_4 P*_3,t + ε**_t .

Since we are concerned with economies of scale and b_1 = (1/r), where r is the sum of the output elasticities, the b_1 term is examined first:

(10)  E(b̂_1) = b_1 + d_11 γ_1 + d_21 γ_2 + d_31 γ_3 + d_41 γ_4 + d_51 γ_5 ,

where

(11)  Q*_{t-1} = d_11 Q*_t + d_12 P*_1,t + d_13 P*_2,t + d_14 P*_3,t + R_1,t

(12)  X*_ℓ,t-1 = d_{ℓ+1,1} Q*_t + d_{ℓ+1,2} P*_1,t + d_{ℓ+1,3} P*_2,t + d_{ℓ+1,4} P*_3,t + R_{ℓ+1,t} ,   ℓ = 1, ..., 4.

The signs of the γ coefficients are presumed to be positive: centers with large past-period output and inputs have large current costs. Centers which are large are consistently large, and those that are relatively small in the current period were small in the previous period. The signs of the d_{ℓ+1,1} terms are also assumed to be positive. This conjecture is reasonable since there is evidence that larger computer centers pay higher wages than smaller ones.^10

^10 "Annual Salary Survey," Computers and Automation (January, 1972), p. 43.

A clarifying digression may be important before the empirical results are presented. The assumption of a positive relation between current prices and both lagged output and lagged inputs need not be a functional relation. The auxiliary equations (11) and (12) are descriptive; all the variables in those equations are non-stochastic. Also, it should be remembered that each facility has its own geographically insulated labor market. Since it has been postulated that centers are relatively consistent in their output, it is possible to have a direct relation between observed current input prices and lagged input quantities and still retain downward-sloping factor demand schedules. If we consider the traditional factor demand schedule in factor price and quantity space, we find the schedule shifts to the right for centers with larger expected output. With a positively sloped supply schedule, we trace out a positively sloped locus in factor price and quantity space. This is what one concludes from the model and the data.

We see that E(b̂_1) > b_1 when b_1 is estimated from (9), and the expectation of the estimator of the sum of the output elasticities may have a negative bias because b_1 = (1/r).^11 The expectation of the estimator of the returns-to-scale parameter is thus possibly less than would be obtained if the model were correctly specified. It should be noted that the estimators of the price variables have a positive bias. Following the same method as used for current output, we have

(13)  E(b̂_k) = b_k + Σ_{j=1}^{5} d_{jk} γ_j ,    k = 2, ..., 4,

where the d's were defined in equations (11) and (12). The bias is positive since both the d_{jk} and γ_j terms are presumed to remain positive, as developed in the output case.^12

^11 It should be noted that the small-sample properties do not "carry over," so that 1/E(b̂_1) ≠ E(1/b̂_1).

^12 It is also known that the estimated standard error of the estimator contains an upward bias. The customary test of significance will tend to reject the null hypothesis less frequently than is appropriate at the given level of significance. See Kmenta, Elements of Econometrics, op. cit., p. 314.
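Equations (10) through (13) amount to regressing each omitted lagged variable on the included regressors and weighting the resulting d coefficients by the γ's. The following sketch uses synthetic data and assumed positive γ values, none of it drawn from the study; it only shows how that bookkeeping produces an upward bias in the coefficient of ln Q, and hence a downward bias in the implied returns-to-scale estimate 1/b̂_1.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 115                                        # hypothetical cross section

# Included regressors: constant, ln Q_t, ln P_1t, ln P_2t, ln P_3t (synthetic draws)
lnQ = rng.normal(size=n)
lnP = rng.normal(size=(n, 3)) + 0.3 * lnQ[:, None]   # prices mildly related to size
X = np.column_stack([np.ones(n), lnQ, lnP])

# Omitted lagged variables, built to be positively related to current size,
# with assumed positive coefficients (the gammas)
gammas = np.array([0.5, 0.2, 0.2, 0.2, 0.2])
lagged = np.column_stack([0.8 * lnQ + rng.normal(scale=0.3, size=n) for _ in range(5)])

# d coefficients: auxiliary regressions of each lagged variable on the included X
D = np.linalg.lstsq(X, lagged, rcond=None)[0]     # rows: included regressors, cols: omitted

bias = D @ gammas                                  # bias added to each included coefficient
labels = ["const", "ln Q", "ln P1", "ln P2", "ln P3"]
for name, b in zip(labels, bias):
    print(f"omitted-variable bias in coefficient of {name}: {b:+.3f}")
# A positive bias in the ln Q coefficient (b1 = 1/r) means the implied r = 1/b1
# is biased downward.
```

Because the lagged variables were constructed to move with current output, the computed bias term for the ln Q coefficient comes out positive, mirroring the argument that the estimated economies of scale are, if anything, understated.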
6. Empirical Results

The regression results are presented in the following form:

  Ĉ* = b̂_0 + b̂_1 Q* + b̂_2 P*_1 + b̂_3 P*_2 + b̂_4 P*_3        R² = · ,  n = ·
              (s_b̂_1)   (s_b̂_2)    (s_b̂_3)    (s_b̂_4)

where the asterisk represents the natural logarithm of the variables, n is the number of observations, the "hat" indicates a parameter estimate (on C it is a reminder that the equation holds for the fitted values of the variables), R² is the coefficient of determination, and s_b̂ is the estimated standard error of the coefficient appearing above it.

It is clear that the coefficients of the production function structural equation can be determined from the estimates of the reduced form cost function. We have exact identification of all the structural equations; thus the estimators of the production function parameters obtained via indirect least squares on the cost function are maximum likelihood estimators.

The regression used output as defined in Chapter III, with the weights for add time and cycle time being equal (w_1 = w_2 = .5). This resulted in the estimated relation for the cost function

  Ĉ*_j = 2.174 + 0.428 Q*_j + 0.676 P*_1,j + 0.308 P*_2,j + 0.289 P*_3,j
                 (0.044)      (0.178)       (0.161)       (0.143)

  R² = .608 ,  n = 115

where input 1 is systems programmers, input 2 is administration, and input 3 is operational personnel, as previously defined.

TABLE I.--Test Statistics with WA = WC = .5

                               t-ratio    Significance Level
  Output Q                      9.637          1%
  Systems Programmers P_1       3.807          2%
  Administration P_2            1.911         10%
  Operations P_3                2.014          5%

It is clear from the computed t-ratios that we can reject, at the 10% significance level, the simple null hypothesis that each coefficient is equal to zero against the alternative hypothesis that the coefficients are not equal to zero. The size of the coefficient of determination and the F-statistic of 43.672 reinforce the maintained hypothesis that the functional form of the production function is satisfactory, though of course this is not a test of specification error.

It is possible to test the hypothesis of linearity of the logarithmic cost equation with either the Durbin-Watson test^13 or the Yule-Kendall^14 normality test. The Durbin-Watson test statistic is

  d = Σ_{j=2}^{115} (e_j - e_{j-1})² / Σ_{j=1}^{115} e_j² ,

which was originally designed to test for autoregression, i.e., disturbances uncorrelated over time against the alternative hypothesis of one-period autoregression. If we order the residuals (e_j) with respect to increasing values of the dependent variable, we can test whether the residuals of the regression are random, as would be true for the disturbances if the population regression equation is linear. The calculated value of d is 1.69 and the upper bound of the critical region at the 1% level is 1.63. We cannot reject the null hypothesis of linearity.

^13 J. Durbin and G. S. Watson, "Testing for Serial Correlation in Least Squares Regression I," Biometrica (June, 1950), pp. 409-28.

^14 U. Yule and M. Kendall, An Introduction to the Theory of Statistics (London: Charles Griffin, 1950).
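A minimal sketch of this linearity check is given below. It is not the study's computation: the 115 observations are synthetic and the coefficients are illustrative. The substantive point is only that the statistic is the ordinary Durbin-Watson formula applied to residuals re-sorted by the dependent variable rather than by time.

```python
import numpy as np

def durbin_watson(e):
    """d = sum over j of (e_j - e_{j-1})^2 divided by sum of e_j^2, on ordered residuals."""
    return np.sum(np.diff(e) ** 2) / np.sum(e ** 2)

rng = np.random.default_rng(3)
n = 115                                         # same sample size as the study; data synthetic
X = np.column_stack([np.ones(n), rng.normal(size=(n, 4))])
beta = np.array([2.2, 0.43, 0.68, 0.31, 0.29])  # illustrative values only
y = X @ beta + rng.normal(scale=0.4, size=n)

b_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ b_hat

order = np.argsort(y)                           # order residuals by the dependent variable
d = durbin_watson(resid[order])
print(f"Durbin-Watson statistic on ordered residuals: {d:.2f}")
# Values near 2, above the tabulated upper bound, are consistent with a linear
# log-log specification; systematic curvature would drive d well below the bound.
```

With residuals carrying no systematic pattern the statistic falls near 2, consistent with the 1.69 reported above; curvature missed by the log-linear form would show up as long runs of same-signed ordered residuals and a much smaller d.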
A less restrictive test is the normality ("turns") test, which involves counting the number of turning points P, defined by

  e_{j-1} < e_j > e_{j+1}   or   e_{j-1} > e_j < e_{j+1} ,

where the residuals are again in order of increasing output levels. Under the null hypothesis of randomness, the mean and variance of P are

  E(P) = 2(n - 2)/3   and   Var(P) = (16n - 29)/90 .

The alternative hypothesis of non-linearity is H_1: E(P) ≠ 2(n - 2)/3, and the test statistic is

  [P - 2(n - 2)/3] / sqrt[(16n - 29)/90]  ~  N(0, 1) .

Here P = 70 and the test statistic equals 2.20. Using a two-tailed test at the 1% level of significance, we cannot reject the null hypothesis of randomness and thus of linearity. These tests are not specification-error tests of the functional form of the production function, but they tend to justify the maintained hypothesis.

The parameter estimates are used to investigate the economies of scale question. The null hypothesis is that the sum of the output elasticities is equal to one, or (1/Σα_i) equals one. We will only accept values of Σα_i > 1 as evidence against the null hypothesis, and thus the alternative hypothesis is that Σα_i > 1, or (1/Σα_i) < 1. Using a one-tailed t-test, we have

  H_0: (1/r) = 1
  H_1: (1/r) < 1

  t = [(1/r)-hat - 1] / s = (0.428 - 1.00)/0.044 = -13.0 ,

and the critical region begins at 2.326 (in absolute value) at the 1% significance level. I have no idea about the relative costs of Type I and Type II errors; the 1% level in this and the previous linearity test was adopted and can only be justified on "popularity" grounds. We reject the hypothesis of constant returns to scale. The sum of the output elasticities (sometimes called the production function coefficient)^15 is estimated to be 2.34, which indicates substantial economies of scale.

^15 C. E. Ferguson, The Neoclassical Theory of Production and Distribution (London: Cambridge University Press, 1969), pp. 158-63.

Since (1/Σα_i) was estimated rather than Σα_i, the confidence interval for Σα_i must be approximated. An approximation developed by Klein (1953) was used to obtain the large-sample variance of the sum of the output elasticities. The general form of the approximation of the variance of θ̂ = f(β̂_1, β̂_2, ..., β̂_k) is

  var(θ̂) ≈ Σ_{i=1}^{k} (∂f/∂β_i)² var(β̂_i) + 2 Σ Σ_{i<j} (∂f/∂β_i)(∂f/∂β_j) cov(β̂_i, β̂_j) ,

and the resulting standard error, together with the t_{n-k, .025} value, was used to form the confidence interval for the sum of the output elasticities. The output elasticities for the individual factors are given in Table II.

TABLE II.--Output Elasticities

  Systems Programmers    ∂ln(Q)/∂ln(X_1)    1.581
  Administration         ∂ln(Q)/∂ln(X_2)    0.720
  Operations             ∂ln(Q)/∂ln(X_3)    0.675

The possible bias introduced by the omission of relevant lagged explanatory variables is a possible explanation for the calculated sum of the individual output elasticities being larger than the inverse of the coefficient of the output variable. Appendix C contains a discussion of the influence of the relative weights of cycle time and add time. All of the estimated coefficients are positive, as expected, and the rather large output elasticity for systems programmers is consistent with economic theory when a factor such as systems programming is used in relatively small amounts and the production function exhibits economies of scale.^16

7. Summary

In this chapter we outlined the method of data collection and indicated some of the procedures used in this process. The sources of empirical data were described. The cost function and its logarithmic transformation were discussed and some of the important consequences were noted. The empirical results indicate that these data reject the hypothesis that the production function for computer output exhibits constant returns to scale when confronted with the alternative of economies of scale. We have no evidence to support the hypothesis that the economies-of-scale coefficient is changed by the weights used in the measure of computer output.
The measure of output proposed in this work does not produce significantly dif- ferent empirical results than more expensive procedures used to measure output of computers. 16Milton Friedman, Price Theory (Chicago: Adline Publishing Co., 1962). CHAPTER V CONCLUDING REMARKS The purpose of the study was to examine the hypothesis of economies of scale in the production of computer output. The model which was developed using the Shephard-Uzawa duality principle contains the assumption that the objective of the computer center is to minimize the cost of producing an expected level of output subject to a Cobb-Douglas production function. Engineering studies, cited in the literature review, have attempted to discover a relation between cost and output. These studies did not systematically develOp models or test hypotheses but were restricted to analytic curve fitting. In contrast to the engineering studies, the model developed in this study specifically accounts for the optimization behavior of the facility and the technical relation between inputs and output from the production process. The model has five equations (three marginal pro- ductivity equations, the definition of cost, and the production function), and five unknowns (systems programming, administration, operations service, capital, and total cost). The parameters of these structural equations can be estimated by indirect least squares regression on the 81 82 reduced form cost function. The reduced form cost function is linear in the logarithms of cost and the explanatory variables output and prices of the factors of production. In deriving the model, the stochastic nature of the produc- tion process was noted and the disturbance due to unpredictable machine malfunction and operator error were explicitly included. Also, the factor demand equations were assumed to be stochastic rather than deterministic since it seems reasonable to assume that the computer center management can make mistakes in factor employment. The final stochastic element of the model is the adjustment process of the desired versus actual level of factor usage. The optimizing conditions only determine the desired level of factor usage. Since the level of factor usage cannot be instantaneously altered, a stock adjustment model represents the relation between actual and desired levels of the inputs. The introduction of the stock adjustment process creates a difficulty. If we had instantaneous adjustment, we would only need data on current prices, cost, and output. But with non-instantaneous adjustment we need information on lagged output and inputs. The cross section of data available from the National Science Foundation contains only 1965 data. Thus we have a specification error result- ing from omission of relevant lagged explanatory variables. The possible bias which is introduced because of the 83 specification was considered, and its influence on the parameter estimates was discussed. The most important finding was that the bias was positive with respect to the multiplicative inverse of the sum of the output elas- ticities. The estimates of the sum of the output elasticities has a negative bias. Under the specification of the model, we have exact identification, and the indirect least squares esti- mates are equivalent to maximum likelihood estimates. Indirect least squares estimation was performed on the cost function and the results are discussed next. 
The null hypothesis of constant returns to scale was rejected in favor of the alternative hypothesis of increasing returns to scale. The sum of the output elas- ticities was estimated to be 2.33. This seemingly high result was demonstrated to be consistent with economic theory in the cost minimization case. In addition, the output elasticities of systems programming is greater than one. It was noted that this is consistent with economic theory when increasing returns to scale exist and the factor is used in relatively small quantities. This is likely to be the case in our study. Because the elasticity was large and on general principles, we considered testing the hypothesis of log-linearity of the cost function. Using the Durbin-Watson and a generalized runs test, we 84 were not able to reject the hypothesis of log—linearity of the reduced form cost function. Comparing the results of this study to previous results, we find they are similar. The Knight and Solomon studies and now Musgrave find increasing returns to scale at the firm level. It should be remembered that we are concerned with the production of computer output rather than the computer manufacturing industry. Other studies on the firm level, which also use engineering measures of output, have found analogous results. Engineers use the "Six-Tenths Rule" in estimating cost as capacity output increases. Symbolically, two plants are related as .6 2 1 X1 where Ci is cost and Xi is capacity output of plant i. In terms of neoclassical production theory, the exponent is the inverse of the sum of the output elasticities. Both Moore [1959]1 and Alpert [1959]2 find similar results for the mineral and chemical industries. The computer engineers would have the ".43 rule" as a result of this study. But care should be taken not to extend these results too far. The sample data do not provide any information about the lF. Moore, "Economies of Scale: Some Statistical Evidence," Quarterly Journal of Economics, Vol. 70 (January, 1962), PP. 138-50. 28. B. Alpert, "Economies of Scale in the Metal Removal Industry," Journal of Industrial Economics, Vol. 17 (November, 1959), pp. 175-81. 85 production function outside the interval covered by the observed variables. One should not expect average unit cost to approach zero if he built one large computer to do the world's computation. Some implications from this study might be drawn. The first is that the method chosen to measure computer output seems satisfactory for our purpose. This implica- tion is especially strong when the cost of the alternative formulations are considered. The maintained hypothesis of cost minimization is clearly acceptable for university computer centers. The assumption may be correct for the majority of computer facilities in Operation today. Rela- tively few centers are service bureaus-~selling their output. Most computer centers are c00perating factors in firms and government organizations; with decentralized management, the organization of the computer facility can be thought of as a cost center which is consistent with our model of computer centers. This fact reinforces the generality of the results since the maintained hypothesis is close to reality. One would expect to find increasing pressures to centralize computational activity to a single computer center. I believe a study of such activity would confirm this hypothesis. Also, one would expect the increased output from larger computers to be in various forms. 
One form would simply be more pages of output or new applications 86 where lower unit costs make some applications feasible where previously they were too costly. An interesting form would be the improvement in the quality of the output. One would expect the accuracy of computational algorithms to improve and the presentation of results to be closer to what the human wants rather than what the computer dictates. One final caution is in order. This study should not be interpreted as a complete justification for centrali- zation of computer activities. Much more analysis would be needed to obtain a first approximation to the answer of centralization versus decentralization. The computer itself is only one aspect of the total computer center. A whole constellation of management issues need to be analyzed prior to centralizing any computer activities. The defini- tion of output in time-sharing systems and the influence of software availability are major issues that need to be included in any centralization versus decentralization discussion. The whole organizational structure of the center as a component of a total firm or university should be studied on a cost benefit or effectiveness basis. None of these issues detract from the findings of this study but mention of these issues may prevent the unwary from making unwarranted conclusions. The hypothesis tested in this study, like all hypotheses, is subject to further test. New data on 87 mini-computers or time-sharing computers (where the user employs a small terminal to interactively communicate with the machine) may alter the results. An alternative func- tional form of the production function could alter the findings. Until further study of these issues is made, the results stand. BIBLIOGRAPHY 88 BIBLIOGRAPHY Alpert, S. B. "Economies of Scale in the Metal Removal Industry." Journal of Industrial Economics, Vol. 17 (November, 1959), pp. 175-81. Arbuckle, R. "Computer Analysis and Thruput Evaluation." Computers and Automation (January, 1966), pp. 12-19. Arrow, K.; Chenery,}L;.Minhas,EL; and Solow, R. "Capital- Labor Substitution and Economic Efficiency." Review of Economics and Statistics (August, 1961), pp. 225-50. Auerbach Computer Reviews and Key Data. New York: Auerbach Information, Inc., January, I968. Auerbach Computer Technology Reports. New York: Auerbach InformatiOn, Inc., 1969. Averch, H. and Johnson, L. "Behavior of the Firm under Regulatory Constraint." American Economic Review, Vol. 52 (December, 1962), pp. 1,053-69. Billings, H., and Hogan, R. "A Study of the Computer Manu— facturing Industry in the United States." Unpub- lished Master's thesis, U.S. Naval Postgraduate School, Monterey, California, 1970. Bridgeman, P. W. The Logic of Modern Physics. New York: Macmillan, 1928. The Nature 9£_Physical Theory. New Jersey: Princeton University Press, 1936. Calingaert, P. "Systems Performance Evaluation: Survey and Appraisal." Communications of the ACM (January, 1967), pp. 13-16. Cobb, C. and Douglas, P. "A Theory of Production." American Economic Review, Vol. 18 (March, 1928), pp. 139-65. Computer Characteristics Review. Watertown, Mass.: Key Data Corporation, 1969. 89 90 Computer Components Review. Norwood: Commander Publishing, 1969. "Annual Salary Survey." Computers and Automation. January, 1972, p. 43. Data Products News. New York: Data News, Inc., 1970. Douglas, P. The Theory_g§ Wages. New York: Macmillan, 1934. . "Are There Laws of Production?" Presidential Address, American Economic Review, Vol. 38 (March, 1948). pp. 1-41. 
Durbin, J. and Watson, G. S. "Testing for Serial Correla- tion in Least Squares Regression I." Biometrica (June, 1950), pp. 409-28. Electronics News. New York: Fairchild Publishing Co., 1972. Ferguson, C. The Neoclassical Theory of Production and . Distribution. London: Cambridge University, 1969. Friedman, M. Price Theory. Chicago: Adline Publishing Co., 1962. Goldberger, J. "Interpretation and Estimation of Cobb- Douglas Functions." Econometrica, Vol. 36 (July- Griliches, Z. "Distributed Lags: A Survey." Econometrica, Vol. 35 (November-March, 1967), pp. 31-18. Hamblin, I. Computers in_Higher Education. Atlanta: Southern Regional Educational Board, 1967. . "Expenditures, Sources of Funds, and Utiliza- tion of Digital Computers for Research and Instruction in Higher Education, 1964-1965 with Projections for 1968-1969." Communications of PBS ACM, Vol. 7 (April, 1968). PP. 257-62. —— Huesmann L. and Goldberg, R. "Evaluating Computer Systems through Simulation." Computer Journal (August, 1967), Pp. 148-52. Henderson, J. and Quandt, R. Microeconomic Theory. New York: McGraw-Hill, 1971. 91 IBM Technical Publications. White Plains: Technical Docu- mentation Center, International Business Machines, 1970. Ihrer, I. "Computer Performance Projected through Simula- tion." Computers and Automation. April, 1967, pp. 21-27. Johnston, J. Statistical Cost AnaIysis. New York: McGraw-Hill, I960. Kmenta, J. "On Estimation of the C.E.S. Production Func- tion." International Economic Review, Vol. 8 (June, 1967), pp. 180-89. . Elements of Econometrics. New York: Macmillan, 1971. Knight, K. "A Study of Technological Innovation--The Evaluation of Digital Computers." Unpublished Doctoral dissertation, Carnegie Institute of Technology, 1963. Malinvaud, E. Statistical Methods of Econometrics. Amster- dam: North Holland Publishing Co., 1968. Marschak, J. and Andrews, W. "Random Simultaneous Equa- tions and Theory of Production." Econometrica, Vol. 12 (July-October, 1944), pp. 143-205. McFadden, D. An Econometric Approach to Production Theory. Forthcoming. Merewitz, L. The Production Function in the Public Sector: Production of Postal Services IS the U.S. Post Office. Berkeley: Center for—Planning and'Develop- ment, 1969. Moore, F. "Economies of Scale: Some Statistical Evidence." Quarterly_Journal of Economics, Vol. 70 (January, 1962): PP. 138-50. Mundlak, Y. Review of Production Functions. Forthcoming. Nerlove, M. "Returns to Scale in Electricity Supply." Measurement in Economics: Studies in Mathematical Economics and—EconometEICS in Memory—of Yehuda Grunfeld. Edited by C. Christ. Stanford: Stanford University Press, 1963. . Estimation and Identification of_Cobb-Douglas Production Functions. Chicago: Rand McNally, 1965. 92 Raisbuk, G. Information Theory. Cambridge: Massachu- setts Institute of Technology Press, 1966. Ramsey, J. and Zarembka, P. "Specification Error Tests and Alternative Functional Forms of the Aggregate Production Function." Journal of the American Statistical Association, Vol. 66'(September, 1971), pp. 471-77. Schneidewind, N. "Analytic Model for the Design and Selec- tion of Electronic Digital Computers-" Unpublished Doctoral dissertation, University of Southern California, 1966. Schwab, B. "Economic Evaluation and Selection of Elec- tronic Data Processing Systems." Unpublished D.B.A. dissertation, University of California at Los Angeles, 1967. Sewald, M.; Rauch, M.; Rodick, L.; and Wertz, L. "A Prag- matic Approach to Systems Measurement." Computer Decisions (July, 1971), pp. 38-40. Sharpe, W. 
The Economics pf Computers. New York: Columbia University Press, 1969. Shephard, R. Cost and Production Functions. Princeton: Princeton University Press, 1953. Solomon, M. "Economies of Scale and the I.B.M. System/360." Communications pf the ACM, Vol. 5 (June, 1966), pp. 435-40. Standard EDP Reports. New York: Auerbach Information, Inc., 1970. Uzawa, H. "Duality Principles in the Theory of Cost and Production." International Economic Review, Vol. 5 (May, 1964): pp. 216-20. Walters, A. "Production and Cost Functions: An Econo- metric Survey." Econometrica, Vol. 31 (January- April, 1963), pp. 1-66. Yule, U., and Kendall M. Ag Introduction to the Theory pf Statistics. London: Charles Griffin, 1950. Zellner, A.; Kmenta, J.; and Dréze, J. "Specification and Estimation of Cobb-Douglas Production Function Models." Econometrica, Vol. 34 (July-October, 1966), PP.I784-95} APPENDICES 93 APPENDIX A l. Generalized File Processing Problem A The essence of most business data processing appli- cations is the updating of files to reflect the effects of various types of transactions. This benchmark problem is a file processing run in which transaction data in a detail file is used to update a master file, and a record of each transaction is written in a report file or journal (Figure A-l). This type of run forms the bulk of the work load for many computer systems, in diverse applications such as billing, payroll, and inventory control. The listed "activity" factors of 0.0, 0.1, and 1.0 refer to cases in which an average of O, 0.1, and 1.0 trans- action record, respectively, must be processed for each record in the master file. Low activities are character— istic of applications such as inventory control, whereas a payroll run might well have an activity factor of 1.0. All calculated processing times are reported in terms of the number of minutes required to process 10,000 master- file records. Figure A-2 is a general flow chart that summarizes the computational process. Both the master file and detail file are sequentially arranged, and conventional batch 94 95 a Ewfinoum mcwmmmooum mflflm UmNflHmuwch How Emummflo somII.HI« mmanm mHHm pnomom mawm moumme omuwpms soc Emummm HmpsmEoo oaflm umumme UHO mafim “sowuommcmuuv Hflmumo 96 A Look for next record from old master file Is Input next new block Yes block from required? old master file No Unpack master ¢ record and form ‘— control totals Is Update master this master record to ecord an active reflect detail {-——1 one? _‘ transaction Get next detail transaction Format and ‘write report ¢ record Repack master record and move to output area affecting this . ster recor Output block to updated master file 4 J FIGURE A-2.--Genera1 Flowchart for Generalized File Processing Problem A 97 processing techniques are employed. Record lengths are 108 characters for the master file, 80 characters (1 card) for the detail file, and 120 characters (1 line) for the report file. Record layouts are fixed for the detail and report files, but are left flexible for the master file in order to take advantage of the specific capabilities of each computer system. Card reading and printing are performed on-line in all standard configurations except paired configurations VIIB and VIIIB, in which card-to-tape and tape-to-printer transcriptions are performed off-line, usually by a separate small-scale computer. The master file is on magnetic tape in all standard configurations except Configuration I, where it is on punched cards. 2. 
Random Access File Processing Problem This benchmark problem represents a wide range of real-time computer applications in which an on-line master file is accessed to answer inquiries and/or updated to reflect various types of transactions. Figure A-3 shows the basic run diagram. Examples of this type of processing include real-time inventory control, credit checking, airline and hotel reservations, and on-line savings systems. In contrast to Generalized File Processing Problem A, described in Item (1), this problem uses random access storage to hold the entire master file on-line, and processes all transactions as they occur, without prior sorting. A11 98 mHHm unommm Emanoum mcflmmmooum maflm mmoood Eoccmm How Emummao cam .mn¢ mmeHm soap IOMmCMHu. Hflmumo EwumMm kuomeou mawm Hmummz 99 calculated times are reported in terms of the time in milliseconds required to process each transaction and the total time in minutes required to process 10,000 trans- actions. This problem is evaluated for one or more of the three random access standard configurations (IIIR, IVR, and VIIIR). Where there are two or more random access devices that could satisfy the specified capacity require- ments, our choice is based upon considerations of economy, system throughput, software support, and reliability. Therefore, disc files will normally be chosen in prefer- ence to drums (which are relatively expensive) or magnetic strip devices (which tend to be relatively slow and less reliable). Figure A-4 is a general flow chart that summarizes the computational process. The master file is sequentially arranged in random access storage, and a two-stage indexing procedure is used to determine the location of each master- file record that needs to be accessed. Record lengths are 108 characters for the master file, 80 characters (1 card) for the detail transactions, and 120 characters (1 line) for the report file. Record layouts are fixed for the detail and report files, but are left flexible for the master file so that the specific features of each computer system can be advantageously utilized. 100 Use TW-stage indexing procedure to obtain master record index 4 Fetch master record 4 Update master record to reflect detail transaction Fetch next detail transaction 0 Store master record in its original location J Format and ‘F——————~ write report record FIGURE A-4.--Genera1 Flow Chart for Random Access File Processing Problem 101 The detail transactions (e.g., inquiries, orders, or deposits) are assumed to be arriving in a random sequence and at a continuous rate that is high enough to ensure that one or more transactions are always waiting to be processed. Therefore, it makes no difference whether the transactions enter the system via an on-line reader, a simple remote inquire terminal, or a multiterminal data communications network. This assumption means that the Random Access File Processing Problem does not attempt the highly complex and variable task of measuring the effi- ciency of real-time data communications networks; it simply measures the central computer system's ability to locate and update randomly addressed master-file records. The report file is written on either magnetic tape or a random access device, presumably for printing at some later time. Each report record is also made available for optional transmission back to the remote terminal that initiated the transaction (though the processor time required to effect this transaction is not included in the published timing figures). 3. 
Sorting

Because conventional data processing techniques usually require all records to be arranged in a particular sequence, sorting operations are an important and time-consuming part of the work load in most business computer installations. This benchmark problem requires that a file consisting of 10,000 records, each 80 characters in length, be arranged sequentially according to an 8-digit key, such as an account number.

The "Standard Estimate" column lists the estimated sorting times calculated by our analysts for sorting operations that use straightforward magnetic tape merging techniques. Two-way tape merging is used in the four-tape Standard Configuration II and three-way merging in all of the larger systems.

Whenever timing data is available for a standard, manufacturer-supplied sort routine, the time required to perform the same 10,000-record sort is listed in the "Available Routines" column. Because most manufacturer-supplied sort routines now use internal sorting and merging techniques which are more sophisticated than those used to prepare our estimates, the "Available Routines" sort time will often be substantially less than the "Standard Estimate" time for a given configuration. Nevertheless, the Standard Estimates provide useful, directly comparable indications of each computer system's basic capability to perform magnetic tape input-output operations.

4. Matrix Inversion

In many scientific and operations research applications, such as multiple regression, linear programming, and the solution of simultaneous equations, the bulk of the central processor's time is spent in inverting large matrices. This benchmark problem involves the inversion of 10-by-10 and 40-by-40 matrices. It measures the speed of the central processor on floating-point calculations; no input or output operations are involved. All matrix elements are held within the system's main storage unit in floating-point form with a precision equivalent to at least eight decimal digits.

The "Standard Estimate" columns list the matrix inversion times calculated by our analysts through a simple estimating procedure that uses the system's floating-point arithmetic speeds. Whenever timing data is available for a standard, manufacturer-supplied matrix inversion routine, it is reported in the "Available Routines" columns.

5. Generalized Mathematical Problem A

Another frequently encountered scientific problem involves the evaluation of polynomial equations of the type Y = A + Bx + Cx² + Dx³ + Ex⁴ + Fx⁵. This benchmark problem includes the following basic steps:

1. Read in an input record consisting of 10 eight-digit numbers.
2. Perform a floating-point calculation that consists of evaluating a 5th-order polynomial, executing five division operations, and evaluating one square root.
3. For every 10 input records, form and print one output record consisting of 10 eight-digit numbers.

The "Computation Factors" of 1, 10, and 100 mean that the standard calculation described above is performed 1, 10, or 100 times, respectively, for each input record, to show the effects of varying ratios of computation to input/output volume. Processing times are listed in terms of milliseconds per input record (a sketch of this calculation appears below).

These examples are drawn from Auerbach Information, Inc.
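For concreteness, the structure of Generalized Mathematical Problem A can be expressed as a short program. The sketch below is a modern paraphrase, not the original Auerbach benchmark code: the record contents are random eight-digit numbers, the polynomial coefficients are arbitrary, and timing on a present-day machine is only meant to show how the computation factor scales the work per input record.

```python
import math
import random
import time

COEFF = (1.0, 0.9, 0.8, 0.7, 0.6, 0.5)      # A..F, arbitrary illustrative values

def standard_calculation(record):
    """Evaluate a 5th-order polynomial, perform five divisions, and one square root."""
    total = 0.0
    for x in record:
        y = sum(c * x ** k for k, c in enumerate(COEFF))    # A + Bx + ... + Fx^5
        for d in (3.0, 5.0, 7.0, 11.0, 13.0):               # five division operations
            y /= d
        total += math.sqrt(abs(y))                          # one square root
    return total

def run_benchmark(n_records=200, computation_factor=1):
    records = [[random.uniform(0, 1e8) for _ in range(10)] for _ in range(n_records)]
    output = []
    start = time.perf_counter()
    for i, rec in enumerate(records, 1):
        for _ in range(computation_factor):                 # repeat the standard calculation
            standard_calculation(rec)
        if i % 10 == 0:                                      # one output record per 10 inputs
            output.append([random.uniform(0, 1e8) for _ in range(10)])
    elapsed = time.perf_counter() - start
    return 1000.0 * elapsed / n_records                      # milliseconds per input record

for factor in (1, 10, 100):
    print(f"computation factor {factor:3d}: "
          f"{run_benchmark(computation_factor=factor):8.3f} ms per input record")
```

The per-record times reported by such a sketch obviously say nothing about 1960s hardware; it only mirrors the read / compute / write proportions that the Computation Factors are meant to vary.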
TABLE A-1.--The Data Identification

[The multi-page listing of raw data identification codes reproduced at this point is illegible in the source copy and is omitted.]
TABLE C-I.--Test Statistic with WA = .9, WC = o l V l t- t' Significance a ue ra 1o Level Y 11.466 1% P1 3.958 1% P2 1.489 20% P3 1.330 20% 114 115 The high R2 and F value equaling 57.225 can be interpreted as justification for the functional form of the production function where most of the variation in cost measure is attributed to variations in output and systems programming. The weight for add-time was changed to .1, cycle time to .9, and regression resulted in C? = 1.903 + 0.367Y6 + 0.722p*. + 0.323p*. + O.338P*. J J 13 23 33 (0.043) (0.188) (0.171) (0.152) R2 = .566 with the elasticities all positive and significantly dif- ferent from zero at the 10% level (see Table C-II) with high R2 and F = 35.716. TABLE C—II.--Test Statistic with WA = .1, WC = .9 Value t-ratio Significance Level Y 8.377 1% Pl 3.838 1% P2 1.891 10% The evidence does not allow us to state that the estimates of economies of scale are altered with the choice of weights in the output measure. 116 The study by Knight was discussed in the litera- ture review and it was considered interesting to see what would result if we used Knight's output measure rather than OUTM. Unfortunately, Knight does not have all the machines in our sample but, using only those machines for which Knight provides data, we found the following results for n = 100. C? = -2.l98 + 0.394Q? + 0.691P*. + 0.342P*, + 0.428P*, 3 J 13 2] 33 (0.056) (0.190) (0.172) (0.150) R2 = .591 where the symbols are the same except Y'* is output as defined by Knight. The elasticities are positive as expected; the negative intercept is questionable but nothing will be said about its interception. As a comparison, the data were run using the OUTM measure which resulted in $ = . + . # + . . + . *. + . *. CJ 2 518 0 406QJ 0 646P13 0 285P23 0 260P33 (0.043) (0.163) (0.145) (0.131) R2 = .679 If we look at the sum of the output elasticities again, we find no evidence to suspect the coefficients are 117 different. The general fit of the equation using OUTM is better than that obtained by using Knight's measure. Because of the difficulty of obtaining measures like Knight's, OUTM appears to be the low cost method of obtaining output measure. 3 0329 111)) IIHIIUHII