TESTING OF AND ESTIMATION SUBJECT TO INEQUALITY RESTRICTIONS USING A MINIMUM DISTANCE ESTIMATOR

A Dissertation for the Degree of Ph. D.

MICHIGAN STATE UNIVERSITY

Lawrence Capron Marsh

1976

This is to certify that the thesis entitled TESTING OF AND ESTIMATION SUBJECT TO INEQUALITY RESTRICTIONS USING A MINIMUM DISTANCE ESTIMATOR presented by Lawrence Capron Marsh has been accepted towards fulfillment of the requirements for the Ph. D. degree in Economics.

Major professor: James B. Ramsey

ABSTRACT

TESTING OF AND ESTIMATION SUBJECT TO INEQUALITY RESTRICTIONS USING A MINIMUM DISTANCE ESTIMATOR

By Lawrence Capron Marsh

Economic theory and empirical evidence from previous research may suggest that certain inequality restrictions may be appropriate for particular parameters in an econometric model. This dissertation develops a test for the appropriateness of restrictions that constrain parameters to finite intervals. In addition, it considers the use of a minimum distance estimator that incorporates inequality restrictions into the estimation procedure itself.

By definition the minimum distance estimator selects that point in the constrained space which is nearest to the unconstrained maximum likelihood estimate. The distributional properties of the minimum distance estimator are compared to those of the unconstrained maximum likelihood estimator and the constrained maximum likelihood estimator in extensive Monte Carlo sampling experiments. It is shown that the minimum distance estimator is greatly superior to the unconstrained maximum likelihood estimator in terms of various definitions of mean squared error. The equivalence of the constrained maximum likelihood estimator and the minimum distance estimator in a number of circumstances is demonstrated by noting that in general there is not a substantial difference between the estimated distributional properties of the minimum distance estimator and those of the constrained maximum likelihood estimator.

A single sample approximation of the variance, absolute bias, and mean squared error of the minimum distance estimator is developed, and many examples of applying the minimum distance estimator in regression analysis are presented indicating the effect of increasingly stringent inequality restrictions as imposed by the minimum distance estimator. Thus, in general, the usefulness and effectiveness of the minimum distance estimator are demonstrated.

TESTING OF AND ESTIMATION SUBJECT TO INEQUALITY RESTRICTIONS USING A MINIMUM DISTANCE ESTIMATOR

By Lawrence Capron Marsh

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY, Department of Economics, 1976

Copyright by LAWRENCE CAPRON MARSH, 1976

ACKNOWLEDGMENTS

Professor James B. Ramsey provided the inspiration and guidance for this dissertation. Although the computer programming and testing procedures are my own work, the basic idea of the minimum distance estimator as defined herein and the alternative conceptions of mean squared error in the multivariate case belong to Dr. Ramsey. Professor Ramsey sacrificed a great deal of his own research time to read and reread the many variations of this manuscript including several parts that are not included in this final version.
Professor Ramsey was exceptionally prompt in reviewing and returning chapters and providing detailed suggestions for revising both content and presentation.

Professor Robert H. Rasche, director of graduate studies, and Professors William J. Haley, Robert L. Gustafson and Lester V. Manderscheid assisted me in the course of writing this dissertation and in learning the fundamentals of econometrics. Great appreciation in this regard is also reserved for Professor Jan Kmenta whose teaching served as the foundation upon which this dissertation is built. The advice of Dr. C. Shapiro in regard to hypothesis testing was also appreciated.

My greatest debt by far is to Janet K. Marsh for the countably infinite hours she spent typing and retyping the rough drafts and proofreading everything from computer output to the final document.

I retain credit for myself for any errors.

TABLE OF CONTENTS

LIST OF TABLES . . . v
LIST OF FIGURES . . . viii

Chapter
I. INTRODUCTION . . . 1
  1.1 Statement of the Problem . . . 1
  1.2 Outline of the Thesis
  1.3 Review of the Literature
II. TESTING FOR INEQUALITY RESTRICTIONS . . . 13
III. THE MINIMUM DISTANCE ESTIMATOR AND ITS DISTRIBUTION . . . 29
  3.1 Distributional Characteristics . . . 31
IV. MONTE CARLO EXPERIMENTS . . . 38
  4.1 Criteria for Comparing the UML, CML, and MD Estimators . . . 38
  4.2 Three Monte Carlo Models . . . 46
  4.3 Discrete Points Model . . . 52
  4.4 Square Model . . . 76
  4.5 Elliptical Model . . . 96
  4.6 Conclusion . . . 119
V. APPLYING THE MINIMUM DISTANCE ESTIMATOR . . . 120
LIST OF REFERENCES . . . 135

LIST OF TABLES

Table . . . Page
1 Variance and Covariance Specifications . . . 51
2 Average Ranks of Smaller Absolute Estimated Bias for Points Model . . . 55
3 Average Values of Estimated Bias for Points Model . . . 56
4 Average Ranks of Smaller Estimated Variance for Points Model . . . 59
5 Average Values of Estimated Variance for Points Model . . . 60
6 Average Ranks of Smaller Estimated MSE A for Points Model . . . 63
7 Average Values of Estimated MSE A for Points Model . . . 64
8 Average Ranks of Smaller Estimated MSE B for Points Model . . . 66
9 Average Values of Estimated MSE B for Points Model . . . 67
10 Average Ranks of Smaller Estimated MSE D for Points Model . . . 69
11 Average Values of Estimated MSE D for Points Model . . . 70
12 Average Ranks of Smaller Estimated MSE E for Points Model . . . 72
13 Average Values of Estimated MSE E for Points Model . . . 73
14 Average Ranks of Smaller Absolute Estimated Bias for Square Model . . . 78
15 Average Values of Estimated Bias for Square Model . . . 79
16 Average Ranks of Smaller Estimated Variance for Square Model . . . 81
17 Average Values of Estimated Variance for Square Model . . . 82
18 Average Ranks of Smaller Estimated MSE A for Square Model . . . 84
19 Average Values of Estimated MSE A for Square Model . . . 85
20 Average Ranks of Smaller Estimated MSE B for Square Model . . . 86
21 Average Values of Estimated MSE B for Square Model . . . 87
22 Average Ranks of Smaller Estimated MSE D for Square Model . . . 89
23 Average Values of Estimated MSE D for Square Model . . . 90
24 Average Ranks of Smaller Estimated MSE E for Square Model . . . 91
25 Average Values of Estimated MSE E for Square Model . . . 92
26 Average Ranks of Smaller Estimated Absolute Bias for Elliptical Model . . . 98
27 Average Values of Estimated Bias for Elliptical Model . . . 100
28 Average Ranks of Smaller Estimated Variance for Elliptical Model . . . 102
29 Average Values of Estimated Variance for Elliptical Model . . . 104
30 Average Ranks of Smaller Estimated MSE A for Elliptical Model . . . 106
31 Average Values of Estimated MSE A for Elliptical Model . . . 108
32 Average Ranks of Smaller Estimated MSE B for Elliptical Model . . . 109
33 Average Values of Estimated MSE B for Elliptical Model . . . 110
34 Average Ranks of Smaller Estimated MSE D for Elliptical Model . . . 111
35 Average Values of Estimated MSE D for Elliptical Model . . . 112
36 Average Ranks of Smaller Estimated MSE E for Elliptical Model . . . 114
37 Average Values of Estimated MSE E for Elliptical Model . . . 115
38 Estimated Variance, Absolute Bias and Mean Squared Error of the MDE for the Kendrick Regressions . . . 123
39 Estimated Variance, Absolute Bias and Mean Squared Error of the MDE for the Ramsey-Zarembka Regressions . . . 126
40 Estimated Variance, Absolute Bias and Mean Squared Error of the MDE for the Ferguson Regressions . . . 129

LIST OF FIGURES

Figure . . . Page
1 Normal Distribution with Mean Restricted to Finite Interval . . . 14
2 Power Curve Under Null Hypothesis . . . 16
3 Power Function . . . 19
4 Population Mean at 0 . . . 20
5 Population Mean at .5 . . . 20
6 Likelihood Ellipse Contours and the Constrained Space . . . 30
7 p.d.f. of MDE . . . 34
8 Discrete Points Model . . . 46
9 Square Model . . . 47
10 Elliptical Model . . . 47

CHAPTER I

INTRODUCTION

1.1 Statement of the Problem

In order to understand fully how to utilize the behavioral and predictive capabilities of an econometric model, all available information, both a priori and empirical, must be used effectively in the estimation process. Such information could be in the form of inequality restrictions. Failure to use all available information results in models that tend to be less reliable and tend to have larger variances than necessary and are therefore inefficient. On the other hand, imposing restrictions that are invalid leads to estimators which may not only be biased but may be inconsistent as well. The need for testing for the appropriateness of restrictions is clear, especially in the case of inequality restrictions since they have been ignored most often both theoretically and practically. In particular, a test procedure for restrictions which constrain regression coefficients to finite intervals needs to be developed.

Research economists occasionally wish to restrict the values of one or more regression coefficients by inequality constraints. A coefficient might be desired which is positive, negative, or restricted to lie within a particular range such as zero to one.
For example, the slope of a supply curve could be required to be positive while that of a demand curve could be required to be negative. A coefficient representing the marginal propensity to consume could be required to lie between zero and one.

A common ad hoc approach to this problem is first to run an unrestricted regression. If all the coefficients of interest have the right signs or lie within the right range, then the initial regression results are accepted. However, if any of the coefficients have the wrong sign or lie outside the restricted range, then a new combination of explanatory variables is tried or the form of the regression equation is altered. This procedure is continued until the desired results are obtained.

The above ad hoc procedure does not take into account the effect of such a procedure on the properties of the resulting estimators themselves. A more straightforward approach is to impose the desired restrictions on the estimation procedure itself. Such a one-step procedure simplifies the derivation of the properties of the resulting estimators.

Inequality restrictions may be incorporated into the estimation procedure by use of a minimum distance estimator such as that proposed by Ramsey (30, p.8). Ramsey's minimum distance estimator, $\hat{\theta}_m$, selects that point in the constrained set which is nearest to the unconstrained maximum likelihood estimator, $\hat{\theta}_u$. Ramsey defines his minimum distance estimator, $\hat{\theta}_m$, by the expression

$$\|\hat{\theta}_u - \hat{\theta}_m\|^2 = \min_{\theta \in C} \|\hat{\theta}_u - \theta\|^2$$

which indicates that $\hat{\theta}_m$ is defined as the point in the constrained space, $C$, which gives the minimum value for the square of the Euclidean norm of the vector representing the difference between the unconstrained point, $\hat{\theta}_u$, and every possible point, $\theta \in C$, in the constrained space.

Obviously, by definition, the minimum distance estimator (MDE) will be at least as close to the unconstrained maximum likelihood estimator (UMLE) as the constrained maximum likelihood estimator (CMLE) will be. In some cases it may be possible for the MDE to be closer to the UMLE than the CMLE is. As will be demonstrated in a later chapter, this will often be the case for multivariate situations.

1.2 Outline of the Thesis

The next section of this chapter will briefly review some of the literature dealing with inequality restrictions which is appropriate for regression analysis. Chapter II will develop and explain a test procedure for deciding if inequality restrictions are appropriate for a particular situation where such restrictions limit the parameter of interest to a finite interval. Chapter III will discuss the minimum distance estimator as a particular method of dealing with inequality restrictions with reference to some of the work of Ramsey and Penneck. Chapter IV will determine estimates of small sample properties of the minimum distance estimator as compared to the constrained and unconstrained maximum likelihood estimators by means of Monte Carlo sampling experiments and present a method of obtaining an estimate of the variance of the minimum distance estimator given a particular sample.
Finally, Chapter V will provide some examples of using the minimum distance estimator and will demonstrate under what circumstances and to what extent in these examples use of the minimum distance estimator can result in a reduction in the estimated variance of the regression coefficients and how this reduction in estimated variance is not substantially offset by an increase in estimated bias in the calculation of estimated mean squared error.

1.3 Review of the Literature

Methods of testing for and incorporating exact linear restrictions in regression analysis have been well developed by Chow (6, p.591), Chipman and Rao (5, p.198), Toro and Wallace (37, p.558), Fisher (10, p.361), Wallace and Anderson (39, p.1), Wallace (38, p.689), and others. However, these techniques cannot be directly applied in the case of inequality restrictions. Dealing with inequality restrictions has proven to be a more difficult problem. Consequently, the literature on inequality restrictions has been much more limited and less productive.

Theil and Goldberger (35, p.65) proposed a method for approximating inequalities on coefficients so as to incorporate indirectly inequality restrictions into the estimation procedure. They combine a priori and sample information in the equation:

$$\begin{bmatrix} y \\ r \end{bmatrix} = \begin{bmatrix} X \\ R \end{bmatrix} b + \begin{bmatrix} u \\ v \end{bmatrix}$$

where the a priori information is given by the expression $r = Rb + v$; $b$ is the vector of coefficients; $R$ is a matrix of weights; $r$ is the resulting vector of values set by the restrictions except for a disturbance term vector, $v$, which is assumed to have a zero mean and a nonsingular variance-covariance matrix. This a priori information tends to restrict the posterior coefficients to be within desired ranges, where the a priori coefficients, $b$, represent the midpoints of the desired ranges and the variance-covariance matrix of $v$ is used to restrict the end points of the ranges to be a set number of standard deviations away from the midpoint so that the probability of a sample observation falling outside of a desired range is very small. The sample information is represented by the usual regression equation $y = Xb + u$, where $y$ is a vector of observations on the dependent variable, $X$ is a matrix of observations on the explanatory variables, and $u$ is a disturbance term vector with zero mean and nonsingular variance-covariance matrix. The restricted parameter estimates can be represented by

$$\hat{b} = (X^{*\prime} U^{*-1} X^{*})^{-1} X^{*\prime} U^{*-1} y^{*}$$

where $y^{*} = \begin{bmatrix} y \\ r \end{bmatrix}$, $X^{*} = \begin{bmatrix} X \\ R \end{bmatrix}$, and $U^{*}$ is the block-diagonal variance-covariance matrix of the stacked disturbance vector $(u', v')'$, $U^{*} = \begin{bmatrix} \Sigma_u & 0 \\ 0 & \Sigma_v \end{bmatrix}$, with $\Sigma_u$ and $\Sigma_v$ the variance-covariance matrices of $u$ and $v$.

However, their Bayesian method of mixing a priori and sample information does not guarantee that the inequality restrictions will hold. By adding a point estimate from near the midpoint of the restricted range to the original sample data before estimating the coefficients and by setting a small standard deviation for the selected point estimate, a coefficient can be derived which has a high probability of being within the restricted range. Thus, it becomes highly likely using this method that an income elasticity will lie between zero and one. However, it cannot guarantee an estimate between zero and one. The smaller the standard deviation, the more concentrated the probability becomes around one point.
In reducing the probability of getting an estimate outside the restricted range, the probability of getting an estimate near either of the end points of the range is substantially reduced. Furthermore, the larger the sample size, the more weight is given to the sample observations and the harder it is to guarantee an estimate within the restricted range. Also, the method is only relevant for interval inequality restrictions and is generally not very useful in cases of single one-sided inequality restrictions unless they are arbitrarily truncated at some point. Consequently, their approach leaves the way open for a technique that would constrain a coefficient by the desired inequality restrictions and at the same time would not result in excessive concentration of probability about a single point.

Judge and Takayama (19, p.166) combine prior and sample information in a regression model such that inequality restraints are placed on individual coefficients or combinations of coefficients by minimizing the quadratic form given by the sum of squared residuals subject to the inequality constraints. The analysis is based on the standard regression model

$$y = Xb + u, \qquad Eu = 0, \qquad Euu' = \sigma^2 I$$

where $y$ is a vector of observations on the dependent variable, $X$ is the regressor matrix, $b$ is the vector of regression coefficients, and $u$ is the disturbance vector which has zero mean and a diagonal covariance matrix equal to the constant variance $\sigma^2$ times the identity matrix $I$. The estimator $\hat{b}^{*}$ is defined by that value of $b$ which minimizes $u'u = (y - Xb)'(y - Xb)$ subject to inequality restrictions such as

$$\begin{aligned}
1)\quad & 0 \le r_1^l \le b_1 \le r_1^u \\
2)\quad & r_2^l \le b_2 \le r_2^u \le 0 \\
3)\quad & r_3^l \le b_3 \le r_3^u \ \text{ and } \ r_3^l \le 0 \le r_3^u \\
4)\quad & r_4^l \le s\, b_4 \le r_4^u
\end{aligned}$$

where $r_i^u$ and $r_i^l$ for $i = 1, \ldots, 4$ are known vectors of upper and lower bound constraints for the unknown coefficients in the $i$th set $b_i$, and $s$ is a row vector of weights with one $s_j$ for each $b_j$.

This non-linear programming problem is solved by use of the Kuhn-Tucker equivalence theorem. Their discussion of the sampling properties of the restricted estimators refers to Zellner (41, p.1). Zellner assumed a multivariate normally distributed disturbance term with zero means and known homogeneous variances. The inequality restricted estimators were indicated to be distributed as truncated normal. Properties such as bias, variance, and mean squared error can be determined for the single explanatory variable situation, but become quite difficult to evaluate for models with two or more explanatory variables. The latter situation must be dealt with by numerical integration procedures or simulation techniques. Zellner's work suggests that the mean squared error of the inequality restricted estimator is smaller than the mean squared error of the corresponding unrestricted estimator. However, the work in this area is incomplete since small sample properties of the inequality restricted estimator have not been comprehensively evaluated for a multiparameter situation. Moreover, neither Zellner nor Judge and Takayama have dealt with the problem of hypothesis testing.

Lovell and Prescott (25, p.913) presented an analysis of a related hypothesis testing problem with inequality constraints for a two-step procedure which is quite similar to a special case of the minimum distance estimator. Their two-step procedure involves imposing a single one-sided inequality constraint on one coefficient in a regression equation.
In the example, the coefficient of the first explanatory variable is restricted to be non-negative. If in the initial regression that coefficient is non-negative, then the initial results are accepted. Otherwise, that coefficient is set equal to zero and the regression is rerun without the first explanatory variable. Minimum distance estimation would also set the first coefficient equal to zero in this event, but would accept the other coefficients as initially estimated without rerunning the regression. Lovell and Prescott indicate that the two-step procedure results in parameter estimates that are biased and inefficient. The main thrust of the article, however, is to show that the final student-t statistics are upward biased in absolute value and in effect exaggerate the significance levels of the corresponding t-tests. This result follows from the work of Bancroft (1, p.190) and others. Lovell and Prescott do not, however, deal with the problem of testing for the inequality constraint in the beginning to decide if it is a valid restriction. Nor is the problem of testing within the restriction dealt with, because the concern is with testing the unrestricted coefficients after the first coefficient has been restricted (i.e., dropped from the equation in this case) and the regression has been rerun on the remaining variables. However, Lovell and Prescott demonstrate that the two-step procedure results in parameter estimates with smaller mean squared error than the initial parameter estimates if the disturbance term is normally distributed. It is noted that both the exaggeration of significance levels and the reduction in mean squared error are directly related to the absolute value of the degree of correlation between the restricted parameter variable and the other explanatory variables in the model. Thus, the Lovell and Prescott analysis is most appropriate for highly ill-conditioned data as is often found in economic time series.

Since Zellner's (41, p.1) analysis was performed for only one explanatory variable and one inequality constraint under the classical normal linear regression model assumptions, Hussain (17, p.1) extended his analysis to simultaneous equation models and in particular to two-stage least squares constrained and unconstrained estimators by means of Monte Carlo simulated sampling. Hussain used mean absolute error and root mean squared error to compare the estimators. His results indicated that constrained two-stage least squares was superior to unrestricted two-stage least squares by these criteria. Where appropriate, two-stage least squares was preferable to constrained ordinary least squares, and constrained ordinary least squares was preferable to unrestricted ordinary least squares.

A more general approach which deals with finitely bounded compact sets, of which an inequality constrained parameter space may be viewed as a special case, has been developed by Ramsey (30, p.1), who suggested his minimum distance estimator as an alternative to the maximum likelihood approach. Ramsey has shown the consistency of the minimum distance estimator and its superiority over the unconstrained maximum likelihood estimator in terms of mean squared error
analytically and in limited sampling experiments under various circumstances. In particular, he has shown that the MDE has a mean squared error smaller than, or at most equal to, the mean squared error of the UMLE in the single parameter case when the constrained parameter space is either a convex set or consists of a finite number of points. Also in the single parameter case Ramsey has shown that the CMLE is equivalent to the MDE when the rate of change of the logarithm of the likelihood function with respect to the parameter, $\theta$, is a nowhere increasing and somewhere decreasing function of $\theta$. This result implies that $\hat{\theta}_c = \hat{\theta}_m$ and that the mean squared errors of the CMLE and the MDE are equal and less than or equal to that of the UMLE:

$$\text{MSE } \hat{\theta}_c = \text{MSE } \hat{\theta}_m \le \text{MSE } \hat{\theta}_u.$$

Under more general conditions specified by Ramsey for the single parameter case, he has shown that the above results hold at least asymptotically. In the multiparameter case, $\boldsymbol{\theta}$, Ramsey found that the MDE has a mean squared error at least as small as that of the UMLE in the sense that

$$E(\hat{\boldsymbol{\theta}}_m - \boldsymbol{\theta}_0)'(\hat{\boldsymbol{\theta}}_m - \boldsymbol{\theta}_0) \le E(\hat{\boldsymbol{\theta}}_u - \boldsymbol{\theta}_0)'(\hat{\boldsymbol{\theta}}_u - \boldsymbol{\theta}_0)$$

under the assumption that the constrained parameter space consists of a finite number of points or contains a non-countably infinite number of points and is a convex set.

In addition to offering a substantial reduction in mean squared error, the MDE is particularly useful because it is considerably easier to calculate than the CMLE for regression coefficients restricted by inequality constraints. Since the distribution of the MDE can easily be derived from the distribution of the UMLE, the analytical form of the finite sample distribution of the MDE is at least known whenever the probability density function of the UMLE is known. Furthermore, although the CMLE has the same asymptotic distribution as the MDE, the derivation of the CMLE from the UMLE, as well as the proof of CMLE induced reduction in mean squared error over the UMLE, requires a number of conditions that may be more or less demanding and difficult to satisfy as indicated by Ramsey (31, p.8). Consequently, these advantages of the MDE suggest that it may be preferred to the CMLE.

The literature as reviewed in this chapter has led to the conclusion that a more extensive examination of the MDE and its distributional properties as compared with the CMLE and the UMLE would be useful and interesting for the purpose of deciding how best to incorporate inequality restrictions into estimation procedures.

However, before proceeding with a more detailed analysis of the minimum distance estimator, testing procedures need to be developed to determine if inequality restrictions are appropriate or not in a given situation in the first place. Such procedures will be developed in Chapter II.

CHAPTER II

TESTING FOR INEQUALITY RESTRICTIONS

Although test procedures for equality restrictions are well-known and extensively used, such procedures for inequality restrictions that limit a parameter to a finite interval have not been adequately developed, in particular where the variance is unknown (20, p.213), except in very general abstract terms for the exponential family (24, p.315).
This chapter will develop the needed test procedures related to the normal distribution before proceeding to an analysis of the incorporation of such inequality restrictions into the estimation process by means of the MDE in Chapter III. In particular, this chapter will deal with a composite null hypothesis where the mean, $\theta$, of a normally distributed random variable, $x$, is restricted to values in a closed interval. At first it will be assumed that the variance of $x$, $\sigma^2$, is a known constant. Later this assumption will be changed in order to deal with the case where the variance is unknown but takes on the same constant value under both the null and alternative hypotheses. This analysis will then be applied to a regression problem under the classical normal linear regression model assumptions.

The null hypothesis consists of an interval in that any point in the interval is an acceptable value for the true population mean under the null hypothesis. Chernoff and Moses (4, p.257) have referred to such an interval in hypothesis testing as an indifference zone. The null hypothesis may be specified as:

$$H_0: \theta_1 \le \theta \le \theta_2 \qquad (1)$$

The corresponding alternative hypothesis is:

$$H_a: \theta < \theta_1 \ \text{ or } \ \theta > \theta_2 \qquad (2)$$

as shown in Figure 1.

[Figure 1. Normal Distribution with Mean Restricted to Finite Interval. Graph of a normal distribution with mean between or at the points $\theta_1$ and $\theta_2$. $H_0: \theta_1 \le \theta \le \theta_2$; $H_a: \theta < \theta_1$ or $\theta > \theta_2$. $\theta$: mean of the normally distributed random variable $X$; $f(X)$: probability density function of $X$; $c_1$: critical point for the left-hand critical region; $c_2$: critical point for the right-hand critical region.]

Before proceeding, it is desirable to establish agreement on the following selected definitions. A critical region, $C$, will be defined as a subset of the sample space, $S$, such that if an observed value of a statistic, $t$, falls in the critical region $C$ the null hypothesis will be rejected. If the observed statistic does not fall in the critical region, the alternative hypothesis will be rejected in favor of the null hypothesis.

A statistic is a real-valued function of the observed sample values alone and is not dependent on any unknown parameter values. In this initial case, the mean of the observed sample values will serve as the statistic.

The power of a test is the probability for a given parameter value that the sample point will fall in the critical region of the test, i.e., $\Pr\{t \in C \mid \theta\}$. The power function expresses power as a function of the admissible parameter values, which have been defined by Cramer (7, p.528) as all parametric points which are regarded as a priori possible under an admissible hypothesis. The parameter space is denoted by $\Omega$ where $\theta \in \Omega$, while the acceptance region is denoted $\Omega_0$ and the rejection region is denoted $\Omega_1$ such that $\Omega = \Omega_0 \cup \Omega_1$.

The significance level of a test is the supremum of the power function of the test when the null hypothesis is true, i.e., $\alpha = \sup_{\theta \in \Omega_0} \Pr\{t \in C \mid \theta\}$ (see Hogg and Craig (16, p.269) or Rao (33, p.375)). As Rao indicates, although under a simple null hypothesis the power function yields a single value for the significance level, under a composite null hypothesis the power function yields a set of values and the significance level is taken to be the supremum of that set. In the case of an interval null hypothesis with a single completely specified nuisance parameter, the supremum of the power function under the null hypothesis is the maximum value of the power function under the null hypothesis. A typical power curve drawn for values of the power function under the null hypothesis is shown in Figure 2.

[Figure 2. Power Curve Under Null Hypothesis. $H_0: 0 \le \theta \le 1$, $\alpha = .05$. The curve attains the value .050 at the end points $\theta = 0$ and $\theta = 1$ and dips to its minimum at the midpoint $\theta = .5$.]

Ferguson (9, p.224) defines an unbiased test as a test which has a power function which takes on values less than or equal to the significance level under the null hypothesis and values greater than or equal to the significance level under the alternative hypothesis. Given a particular level of significance, $\alpha$, a uniformly most powerful unbiased test of size $\alpha$ is defined by Ferguson to be a test of size $\alpha$ which has power equal to or greater than the power of any other unbiased test of size $\alpha$ for all admissible parameter values under the alternative hypothesis. In other words, it is an unbiased test which has maximum power under the alternative hypothesis.

The following testing procedure may be used for testing the interval hypothesis (1) against the alternative hypothesis (2) when the parameter of interest is the mean of a normally distributed random variable with a known constant variance. The proposed test statistic is the mean of the observed sample values, which is distributed as $N(\theta, \sigma^2/n)$, where $n$ is the sample size. Select two critical points $c_1$ and $c_2$, where $c_1 < \theta_1$ and $c_2 > \theta_2$, such that the significance level is equal to some desired value, $\alpha$, that is, $\alpha = \sup \Pr\{\bar{x} \in C\}$, where $C = (-\infty, c_1) \cup (c_2, \infty)$. For reasons shortly to become apparent, the critical points must be located symmetrically relative to the interval hypothesized in (1) in order to have an unbiased test. This symmetry may be expressed by the condition:

$$\theta_1 - c_1 = c_2 - \theta_2 \qquad (3)$$

This condition may be rewritten as:

$$c_1 + c_2 = \theta_1 + \theta_2 \qquad (4)$$

Equivalently, $c_1$ and $c_2$ may be thought of as symmetrical about the midpoint of the interval, $\frac{\theta_1 + \theta_2}{2}$. The power function of this test in terms of standardized units is:

$$\text{Power} = \Phi\!\left(\frac{c_1 - \theta}{\sigma/\sqrt{n}}\right) + \left(1 - \Phi\!\left(\frac{c_2 - \theta}{\sigma/\sqrt{n}}\right)\right) \qquad (5)$$

where $\Phi(\cdot)$ refers to the standard normal cumulative distribution function, and $n$ is the number of observed sample values. If $\bar{x}$ is greater than $c_2$ or less than $c_1$, reject the null hypothesis (1). If $\bar{x}$ is contained in the interval $[c_1, c_2]$, reject the alternative hypothesis (2) in favor of the null hypothesis (1).

It is important to keep in mind that although all of the points in the indifference zone are acceptable under the null hypothesis, only one point is actually the true value of the population mean $\theta_0$. For example, if the parameter of interest is the mean of the distribution generating the observations, only one point in the indifference zone is actually the true mean. However, the interval hypothesis may be accepted regardless of which particular point in the indifference zone is the mean of the generating distribution.

For example, suppose that $\theta_1 = 0$, $\theta_2 = 1$, $\sigma^2 = 10$, $n = 10$ and the desired significance level is .05. The appropriate critical points are $c_1 = -1.68$ and $c_2 = 2.68$. The power function for this example is:

$$\text{Power} = \Phi(-1.68 - \theta) + (1 - \Phi(2.68 - \theta)) \qquad (6)$$
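These critical points can be checked numerically. The following sketch is my illustration, not part of the dissertation; it assumes SciPy is available. It uses the symmetry condition (4) to reduce the choice of critical points to a single unknown and then solves for the size-.05 points by root finding on the power function (5) evaluated at the end point $\theta_1$, where the power is maximized under the null hypothesis.

```python
# Critical points for the known-variance interval test; a minimal
# numerical check of the worked example theta in [0, 1], sigma^2 = 10,
# n = 10, alpha = .05.
from math import sqrt
from scipy.stats import norm
from scipy.optimize import brentq

theta1, theta2 = 0.0, 1.0
sigma2, n, alpha = 10.0, 10, 0.05
se = sqrt(sigma2 / n)                        # sigma/sqrt(n) = 1 here

def power(theta, c1, c2):
    """Equation (5): probability of rejecting when the true mean is theta."""
    return norm.cdf((c1 - theta) / se) + 1.0 - norm.cdf((c2 - theta) / se)

def size_minus_alpha(c2):
    # Symmetry condition (4) ties c1 to c2; under H0 the power is
    # maximized at the end points, so set power(theta1) equal to alpha.
    c1 = theta1 + theta2 - c2
    return power(theta1, c1, c2) - alpha

c2 = brentq(size_minus_alpha, theta2, theta2 + 10.0)
c1 = theta1 + theta2 - c2
print(c1, c2)                    # approximately -1.68 and 2.68
print(power(0.0, c1, c2))        # .05 at either end point
print(power(0.5, c1, c2))        # about .028 at the midpoint (Figure 5)
```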
Power 1.00 6>l a) c O N - 8 .50 c m H m Q4 M -a *8 c -a .05 0 -7 0 l 8 Figure 3 Power Function In order to illustrate that these critical points cor- respond to a significance level of .05, it is necessary to determine the supremum of the power function under the null hypothesis. As already noted, the significance level of a test is equal to the supremum of the power function under the null hypothesis, and in this case the supremum of the power function under the null hypothesis is the maximum of the power function under the null hypothesis. Figure 4 shows what happens if the true value of the pa- rameter of interest is assumed to be at the lower end, i.e., 00 = 0, of the hypothesized interval. The value of the power function at this end point is .046 + .004 = .05. This value has been plotted in Figure 2 which showed the values of the 20 f(e) ‘.o46 \J :[llme -1.68 o .5 1 2.68 e O P(0 : -l.68 00:0) = .046 P(e Z 2.68 00:0) = .004 Figure 4 - Population Mean at 01 16(8) .014 .014 I'lml: ' 8 -l.68 0 .5 1 2.68 e 0 8(0 : -l.68 00=.5) = .014 P(0 : 2.68 00=.5) = .014 Figure 5 - Population Mean at .51 1. Graph of normal distribution with mean between points 0 and l inclusive. Ho: 0 i 0 i 1. Ha: 6 < 0 or 0 > 1. c1 = -l.68 and c2 = 2.68 .1: ‘3 lat." . 'n 0" '5" . Iv- . “(‘0‘ 6 vuv v _‘. _.-..v. ~-“‘ 6 ..- y ‘ e ‘ ’4.“ X g... . F F 'H» n... .3318 f e a. _"“ hp 'nc‘xl s,» . ‘. . 5 . ”a 5“» .. “a ; HI. n 0. . s n ‘- AV~I "V‘P‘ 5- ‘ n . ..‘ g \ ~ 21 power function under the null hypothesis. Figure 5 indicates the situation under the assumption that the true parameter value lies at the midpoint of the range, i.e., 60 = .05. In this case, the value of the power function is .014 + .014 = .028. As seen in Figure 2, this point turns out to be the minimum point of the power function. Reversed, Figure 4 would show the end point solution 00 = l yielding a value for the power function of .004 + .046 = .05. Consequently, the supremum of the power function under the null hypothesis, which in this case is the maximum of the power function under the null hypothesis, occurs at either end point, 0 or 1, and is equal to .05. Since .05 is smaller than any of the values that the power function takes on under the alternative hypothesis, this test is an unbiased test as defined by Ferguson (9, p.224). By examining Figure 3 it becomes intuitively clear that the symmetry of the critical points about the hypothesized interval is essential in order for the test to be unbiased. If the critical points were not placed symmetrically about the interval and, therefore, one side of the interval was favored over the other, the power curve would enter the in- difference zone in Figure 3 at a lower power value on one side than on the other. Such a situation would result in the power function having at least one point under the al- ternative hypothesis with a smaller value than at least one point under the null hypothesis. Consequently, by defini- tion such a test would be biased. a 9“. ...< . ‘Vu‘ v.- Ll: O {I} 5 .‘UA. I I (I) 8; § ‘8 22 In addition to designating a test statistic, a testing procedure must indicate how the appropriate critical points are derived given a particular desired significance level. The four equations numbered (8) through (11) which will be discussed below can be solved simultaneously to obtain the required critical points, c1 and c2, given a desired signi- ficance level, a. A The likelihood function for the set of n observations, x1, . . 
., Xn’ given the mean, 0, of a normal distribution with variance, 02, is: -n male) = (2662)? exp (xi-0)2 (7) :3]! I QIA "red Define m1 as the probability of being in the left—hand critical region of the test under the normal distribution with mean, 01. Since the test is symmetrical and the normal distribution is also symmetrical, this is equivalent to the probability of being in the right-hand critical region when the true mean is 02. c1 m 1: f L<§I91)d§ or f L(§_I92)d§ (8) CI Next define «2 as the probability of being in the right- hand critical region of the test under the normal distribu- tion with mean 61. Again, since the test is symmetrical, this is equivalent to the probability of being in the left- hand critical region when the true mean is 02. ‘.V V‘,. \ 23 00 C1 0:2 = f L(x|01)dx or f L()_<_|02)d§ (9) C2 “m Recall the symmetry condition: C1 + C2 = 61 + 62 (10) Since under the null hypothesis the power of this symmetrical test is maximized at 01, or, equivalently, 02, the signifi- cance level, a, of this test may be expressed as: a = «1 + a2 (11) Given a particular desired significance level a, the above four equations (8) to (11) may be solved for the four unknowns: «1, «2, c1 and c2. The above test of size a for a normal distribution with unknown mean, 00, but known variance, 02, which has the test statistic, i, and critical points c1 and c2 is a special case of the one-parameter exponential family and provides a test ‘of the hypothesis HO: 01 i 0 3.52 against the alternative Ha: 0 < 01 or 6 > 02 and is therefore a uniformly most power- ful unbiased test as indicated by Lehmann (24, p.126) and Fisz (11, p.581). Next consider the case where the variance, 02, is un- known. The usual student-t statistic may be used in this case where the normally distributed statistic, x, is centered [u 24 about either of the end points of the interval (01 or 02) 1, and divided by the estimated standard deviation of i, s/n2 where n is the sample size and 1 n - S = 621' i (XI'XI- Thus, The density function of the student-t distribution is n f(t) = 1(4) 1 (it -00 < t < 00 an - 2 . VIn-lin F(E§l) (1+ —§TI 1'1 a-l where F(a) = f y e"y dy for a > 0 is the well-known gamma 0 function. Define a; as the probability of being in the left— hand critical region of the test under the student-t distri— bution with n-l degrees of freedom. This may be written as 01' “i = f f(t) dt (12) where CI is the left-hand critical point. Define «5 as the probability of being in the right-hand critical region of the test under the same student-t distribution. Thus, .5 = f f(t) dt ‘ (13) CE The symmetry condition for the critical points of the student-t distribution test may be derived as follows: 6 -6 U I c2 - 2 1 — - c1 where c2 s/né 9 '6 "2 c+c= / 1 2 s/ni The level of significance, 25 (14) , is (15) Given a particular level of significance a', the above four equations (12) to (15) may be solved for the four un- knowns: a' a' I l 1, 2, c1, and oz. The student-t test described above for the case of un- known variance, 02, is not a uniformly most powerful unbiased test, but instead is a uniformly most powerful unbiased scale invariant test as follows from general results for the ex- ponential family by Hodges and Lehmann (14, p.261). The testing procedures described above may be applied to a linear regression problem when the dependent variable is normally distributed. A interest, b k' in the vector of coefficients (k is 0, . . . An estimated regression coefficient of I or K) g- = 80, £31, . . 
., BK defined by 1§_= (x'xrlx'x for the regression model y = Xb_+ E.is normally distributed when the dependent variable, y, is normally distributed given a fixed regressor matrix, X, and an error term, E: . ...‘a,‘+ 'woubnti r x‘ 26 Given the usual regression situation where the variance of the dependent variable 02 is unknown, the restriction 61k 3. k‘: 62k may be tested by solving simultaneously the four equations CI «ik = flkf(tgk) dtgk (16) “5k = { f(tgk) dtfik (17) 2k 92k'91k I I = Clk + C2k “"""‘"‘sb (18) k “' = “ik + “5k (19) where f(-) is the probability density function of the student-t distribution which for a linear regression with K-l explanatory variables will have n-K degrees of freedom. The statistic tgk is equal to (Bk-elk)/sgk where sgk is the kth diagonal element of the variance-covariance matrix 52(X'X)‘1 and $2 = efg/(n-K) based on the vector of residuals, e. Having thus determined Clk and Cék for the significance level a', the null hypothesis HO: elk < b < 6 is rejected — k — 2k ' A ' I I if tbk is less than c1k or greater than C2k’ To illustrate this finite interval test consider a con- sumption function of the form C. = b +b Y.+e. where C. is 1 o l 1 l i aggregate consumption of goods and services, Yi is aggregate personal disposable income, and Si is the disturbance term. A test of the hypothesis that the marginal propensity to 27 consume lies between zero and one may be formulated as a test of the null hypothesis HO: 0 < b1 i 1. If 581 = 1, ~ A b1 = 1.1, and 20 annual observations provide 18 degrees of freedom, the critical points obtained by solving equations (16) through (19) at the 5% level of significance are -l.802 and 2.802. In this case the test statistic tgl is equal to 1.1 and the null hypothesis is not rejected. In the multiparameter case, the hypothesis QJ 5.9.:_92 may be tested by obtaining the solution to the following equations: B'x'xa /(K-l) * _ _L ,_l 61 _ e'e/(n-K) (20) 65X'X62/(K-l) 62 = — e'e7(n-K) (21) * * c1 - 9T = 93 ' c2 (22) C? «T = g h(Fg)dF8 (23) a3 = f* h(F6)ng (24) C _. _. 2 1 + 2 (25) where h(-) is the probability density function of the Snedecor's F distribution with (K-l) and (n-K) degrees of b'X'Xb/(K-l) freedom. The statistic Fb is equal to — — _ e'e/(n-K) CT and c; obtained for significance level a*, reject the . With null hypothesis H0: ‘21 2.2.1.92 if F8 is less than cf or greater than c*. To demonstrate this test procedure the following standard linear multiple regression equation is used: Yi = b0 + blxli + bZXZi + Si where Yi IS the dependent varia— ble, X and x2i are the explanatory variables, and 8i is the 1i error term. The coefficients b1 and b2 might be restricted to non-negative values such that b1 and b2 are no greater than one. These restrictions and an X'X matrix based on 20 hypothetical observations result in a two-sided F-test with critical values of = .05 and c3 = 62.37 at the 5% level of significance. The estimated regression coefficients bl = .89 and 82 = .44 yield a test statistic FB = 50.38 which lies within the critical values and, therefore, indicates that the null hypothesis of restricting the coefficients to the finite intervals should not be rejected. This chapter has developed a test procedure for de— ciding if inequality restrictions are appropriate for a particular situation where such restrictions limit the parameter of interest to a finite interval where the parame- ter may be a regression coefficient. 
CHAPTER III

THE MINIMUM DISTANCE ESTIMATOR AND ITS DISTRIBUTION

Once an inequality restriction has been deemed appropriate by the testing procedures developed in Chapter II, alternative means of dealing with the inequality restriction in estimation need to be considered. In order to compare such estimators in terms of their distributional properties, it is first of all necessary to have some idea of the nature of their distributional characteristics. Since the minimum distance estimator is a new estimator in relation to the well-known constrained maximum likelihood and unconstrained maximum likelihood estimators, the distribution of the minimum distance estimator will be discussed in this chapter relative to those of the maximum likelihood estimators, to serve as a basis for understanding the comparison of the estimated distributional properties by means of Monte Carlo experiment in Chapter IV.

As already noted, the minimum distance estimator is defined in terms of the unconstrained maximum likelihood estimator. Intuitively, this relationship is best expressed graphically. Figure 6 gives an example of a constrained space with an unconstrained maximum likelihood estimate $\hat{\boldsymbol{\theta}}_u$ lying outside of the constrained region.

[Figure 6. Likelihood Ellipse Contours and the Constrained Space.]

The point labeled $\hat{\boldsymbol{\theta}}_c$ is the constrained maximum likelihood point, which represents the point in the constrained set with the greatest likelihood. When the constrained space is defined by inequality restrictions on $\boldsymbol{\theta}$, $\hat{\boldsymbol{\theta}}_c$ can be obtained from the constrained likelihood function by use of a Lagrangian function subject to the Kuhn-Tucker conditions, which require the solution to lie in the constrained space.

The point in the constrained space that is closest to the unconstrained maximum likelihood estimate is the minimum distance estimate, which is labeled $\hat{\boldsymbol{\theta}}_m$ in Figure 6. Figure 6 emphasizes the importance of requiring that the constrained parameter space be convex in order to obtain unique minimum distance estimates $\hat{\boldsymbol{\theta}}_m$, since multiple solutions might occur for estimates in non-convex regions. In addition to the convexity assumption, and in order to relate the discussion more easily to classical normal linear regression analysis, the underlying random variable $y$ is assumed to be normally distributed with mean $\boldsymbol{\theta}_0$ and variance-covariance
. 0' . . With mean 60 and variance H—u This may be summarized as follows: A 02 6u m N(eo'H—) . (27) f A _% 02 _% -n(5u—60)2 (9n) — (2“) (3-0 EXP 2029* (28) 32 A The constrained maximum likelihood estimators 6c and 3:, are derived by maximizing the likelihood function, L(6|x), subject to the general inequality constraints, 9(6), where gj(8) :_O for j = 1,2, . . ., r. The natural logarithm of the likelihood function l(8|x) = ln L(8|x) may be substituted into the Lagrangian expression: L = 1(9|§) - X 9(9) (29) subject to the following Kuhn-Tucker conditions: BL/BB - A 3g(8)/36 = 0 (30) (BL/36 - A 8g(6)/88) 9 = O (31) 6 3 O (32) 9(6) f_ 0 (33) A 9(6) = 0 (34) A _>_ 0 (35) Note that solving the Kuhn—Tucker conditions for the desired estimates is a considerably more difficult problem than solving the usual Lagrangian partial derivatives based on equality restrictions. -. #IQA . _“ \Anp.y ““3 -- ‘ 2- «A N‘N—L“ I". ‘12)- 33 The minimum distance estimator, gm! is specified in the single parameter case by finding the 6 that minimizes the squared distances such that min A -A 2 = (eu em) 96C (éu-e)2 (36) Given a lower boundary Constraint 61 and an upper boundary A constraint 62, am is restricted to take on values as follows: em = 92, if an 3 62 8n, otherwise (i.e., 61 < éu < 62) (37) Thus, a regression coefficient that falls outside of a re- stricted interval is set at the nearest boundary end-point of the restricted interval. The probability density function of the minimum dis- tance estimator is of the mixed discrete and continuous type: p1 for 6m = 81 9(6m) = P2 for 6m = 62 A A 38 f(Bu) for 61 < am < 62 ( ) where p1 = A f f(e )de and p2 = A f f(é )dé . 6 <6 u u 6 >6 u u u 1 u 2 34 The probability density function of the minimum distance estimator can be graphed as shown in Figure 7. CD) 91 6 62 Figure 7 p.d.f. of MDE Ramsey (30, p.11) shows that the minimum distance esti- mator is a consistent estimator of 60. Referring to Wald's 1949 proof of the consistency of the unconstrained maximum likelihood estimator, 8U, Ramsey points out that the defini- tion of the minimum distance estimator given in equation (36) implies that: (Bu 6m) : (eu Go) (39) The consistency of the minimum distance estimator then fol- lows from the consistency of the maximum likelihood estima- A tor, since the consistency of Bu implies that: lim Pr (Idu-e = 0) e 1 (40) Ol Thus, 6m is also a consistent estimator of 60. nus a; .lk- \n bnfi‘. ‘~. bang .‘ 15 3.“. 3v;- ‘ ,_. «A n- b— » I» ”a “.I-v— b- :1 Ili— ‘ a." - ‘4 .u . “ A“. - 'LV.. \- In 'IA' \ l ‘V' 5 NI‘: n 35 The expected value of the minimum distance estimator in this case can be expressed as follows: 62 = , . A . “ . “ 41 E61“ p161+ pz 62 + £1 Bu f(Gu) deu ( ) ’ . A Although the unconstrained maximum likelihood estimator,6u, is an unbiased estimator of 60, the minimum distance estima- tor, 8m, is biased except in the Special case where 61 and 62 are equi-distant from 80 to form a symmetrical distribu- tion as was shown by Penneck (29, p.15). However, 61 and 62 are determined by economic theory, and in general, 60 cannot be expected to fall midway between them. In the case of single one-sided inequality constraints, the minimum distance estimator will generally be biased. In that event the direc- tion of the bias will be in the unconstrained direction. Since 6m is generally a biased estimator of 60, it would not be apprOpriate to compare its efficiency relative to 8n in terms of variance alone. Mean squared error offers a better basis for comparison in this case. 
The mean squared error of the minimum distance estimator may be written in this case as: MSE 5 = E (6 -e )2 (42) m m o . This expression may be restated algebraically as the sum of the variance plus the bias squared: 36 A = A-A 2 A- 2 43 MSE em E (em Eem) + (Eem do) ( ) Similarly, the mean squared error of the unconstrained maxi- mum likelihood estimator is: MSE 6 = E (6 -e )2 (44) u u 0 Since in this case the unconstrained maximum likelihood es- timator is unbiased, its mean squared error is equal to its variance: MSE d = e (6 -Ed )2 = o ' (45) u u u Clearly, when calculating the mean squared error under the assumption that the inequality restrictions are valid, i.e., that 61 5_eo : 62, the squared distances will be less for the minimum distance estimator than for the uncon- strained maximum likelihood estimator for those points that fall outside the restricted range. For points within the re- stricted range, the squared distances will be the same. Con- sequently, the minimum distance estimator will have a mean squared error as small or smaller than that of the uncon- strained maximum likelihood estimator when the inequality re- strictions hold true. Il‘ '__ an. iv VV‘ 37 Having discussed the distributional characteristics of the MDE, the distributional properties of the MDE will now be compared by means of a Monte Carlo experiment in Chapter IV to those of the UMLE and the CMLE. CHAPTER IV MONTE CARLO EXPERIMENTS In this chapter the bias, variance, and mean squared error of the UMLE, CMLE, and MDE will be compared for small samples to determine which estimator is preferable for fi- nitely bounded compact sets of which inequality restrictions serve as a special case. In his recent paper, Ramsey (30, p.36) has shown that under certain assumptions the constrained maximum likeli- hood estimator has the same asymptotic distribution as the minimum distance estimator. His preliminary sampling ex- periments have indicated that both estimators lead to a con- siderable reduction in mean squared error. However, a more extensive empirical analysis is needed to compare the small sample characteristics of the uncon- strained maximum likelihood (UML) estimator, the constrained maximum likelihood (CML) estimator, and the minimum distance (MD) estimator. 4.1 Criteria for Comparing the UML, CML, and MD Estimators The following criteria will be used for comparing and evaluating the UML, CML, and MD estimators. Estimated Expected Value The estimated expected value of the jEE-parameter may be expressed by: (46) CD)| II L“|I—' II M t" CD) i 38 39 where L is the number of replications of the Monte Carlo sampling experiment. Estimated Variance The estimated variance can be calculated for the jEE parameter by the estimated expected value of the square of an estimator minus the square of its estimated expected value: Est. Var e. = 6. - (9.)2 (47) where Although this estimator of the variance is appropriate for Monte Carlo sampling experiments, it cannot be used to estimate the variance of the minimum distance estimator in the usual single sample situation since one sample only yields one minimum distance estimate and thus the variation of the MDE cannot be observed in this manner when only one sample is available. However, an estimate of the variance of the minimum distance estimator can be obtained for the one sample case by first considering the true population variance of the minimum distance estimator. 
The true population variance of the minimum distance estimator may be expressed as: A Var 6m — E(6m Eem) E8 (E8 ) (48) 2 m m where 40 62A A A E6 = p1-61 + pz'ez + f eu-f(6u|60) °d6u m 61 and A2 _ 92 62 $282 f 8 I8 8 Eem - P1 1 + p2 2 + 61 u ( u 0) d u Therefore, Var em = Pl‘el + P2'92 + 12 ‘ (P1'91+Pz'92+11)2 where a (49) 62" A A 11 = f 6 -f(e |e )oda 91 u u o u and 62A2 A A I2 = f 6 -f(6 l6 )°d6 61 u u 0 u p1 and p2 may be calculated from the cumulative distribution A function of an as: 6 -8 P1 = F( 1 O) u and e -e e -e p2 = F( 00 2) = 1 - F( 2 O) u Cu 2 A where Cu is the variance of an and F(-) is the cumulative distribution function of a standardized normally distributed random variable with mean zero and variance one. I1 is then determined as in Penneck (29, p.14) by: (e -60)2 462-90)2 0' .. I: = (l-pi-p2)-60+ ———3r [exp { 1 2 } -exp { (IZTTY2 ZOu 20$ }] 41 Penneck (29, p.19) also derives 12 as: I = (oz-62)-(1-p -p ) + 2-e -I 2 u 0 1 2 o 1 For a one—sided left-hand constraint 8m 3.61 the above two-sided expressions reduce to: E§m = p161 + I? 36; = plef + I; Var 6m = 916% + I; ‘ (9191+I*)2 where , If = I 6 -f(é le )-d6 = (1-p1)8 + .32_r [exp{‘(91':o)2}] 81 u u o u o (2n)é 20u and 7 8 * _ A2. A . A = 2- 2 - . * I2 — f eu f(euleo) deu (cu 60)(l p1)+ 260 11 61 For application to classical regression analysis, the appropriate diagonal element of 52(X'X)“1 serves as a con- sistent estimator of the variance, 03, of an ordinary least squares coefficient as indicated by Kendall and Stuart (20, p.82). In addition the unconstrained maximum likelihood estimator, §u' seryes as a consistent estimator of 60 as follows frOm Theil (34, p.392) in obtaining’an estimator of the variance of the minimum distance estimator by substitu- 2 u and 60 in the above equations. ting in for o ”4“? " ‘ ."‘ ".'U- " . -.q a . A - C non-v ‘ "~v. an . 4 "O -‘on‘ 1 (p (D (n u ~o~ A,- .‘ ogu‘d. --‘.~~q :3 5....“ a Au 5. vi-.. 42 In support of these results ten Monte Carlo sampling experiments based on a sample size of ten and an experiment size of 500 replications compared the true population variance, 0;, of the minimum distance estimator to its sam- ple estimator, 3;. The unconstrained population variance was set at 1.0 while the corresponding minimum distance popu- lation variance was .341 based on the constraint that the estimates be positive with population mean at zero. The re- sults indicate that the estimator, 3;, of the variance of the minimum distance estimator did almost as well in esti- mating the true population variance, 0; = .341, of the mini- mum distance estimator as the standard unbiased estimator, 2 u' of the variance of the unconstrained estimator did in 8 2 estimating its true pOpulation value, cu = 1.0. These re- sults appear to support the use of a; as a reasonably good . . 2 approx1mation of cm. Estimated Bias The estimated bias of each of the three estimators for the jEE parameter can be calculated as the estimated ex- pected value minus the true population parameter value: Estimated Bias of (5. = 3.4) . 50 J . J 03 ( ) where eoj is the true population parameter value. In the one sample situation the bias of the minimum distance estimator can be expressed for a single constraint 43 as Glpl + I? - 60 where the maximum bias occurs when the true population parameter is at the boundary of the constrained region such that 80 = 01. With 61 substituted for 00 the bias becomes elp1 + If - 91 where p1 = mil-59$) = F(0) = .5 so that the . . f u 0u 0u maXimum bias is 61(.5) + (l-.5)61 + ————r - 01 = ———~;-. 
Estimated Mean Squared Error

Although estimated bias and estimated variance are of interest separately, their joint importance might be considered by different possible measures of estimated mean squared error. Each different definition of mean squared error reflects a loss function for the trade-off between the larger bias but smaller variance of the minimum distance estimator. Although a multitude of arbitrary definitions may be conceived of to reflect the trade-off between bias and variance, the following definitions are multiparameter extensions of the usual univariate definition of mean squared error and, therefore, seem more likely to be generally acceptable than many possible alternatives.

Estimated MSE A

The standard definition of mean squared error is equivalent to variance plus squared bias for each parameter separately in the multiparameter situation. The estimated mean squared error (MSE A) for the jth parameter could then be calculated as the estimated variance plus the squared estimated bias:

$$\mathrm{Est.\ MSE\ A\ of}\ \hat{\theta}_j \equiv \frac{1}{L}\sum_{\ell=1}^{L}(\hat{\theta}_{j\ell} - \theta_{0j})^2 = \mathrm{Est.\ Var}\ \hat{\theta}_j + (\mathrm{Est.\ Bias\ of}\ \hat{\theta}_j)^2 \qquad (51)$$

for the jth parameter, where j = 1, . . ., J for the J parameters.

Estimated MSE B

A summary measure of mean squared error (MSE B) in the multiparameter situation might be obtained by summing MSE A over all parameters for each estimator and dividing by the number of parameters. Estimated MSE B summarizes the importance of estimated variance and estimated bias for all parameters simultaneously just as estimated MSE A does for each parameter individually. Since MSE B is defined by Ramsey (30, p.20) by $E(\hat{\underline{\theta}}-\underline{\theta}_0)'(\hat{\underline{\theta}}-\underline{\theta}_0)$, estimated MSE B may be defined by:

$$\mathrm{Est.\ MSE\ B\ of}\ \hat{\underline{\theta}} = \frac{1}{J}\sum_{j=1}^{J}\frac{1}{L}\sum_{\ell=1}^{L}(\hat{\theta}_{j\ell} - \theta_{0j})^2 \qquad (52)$$

Estimated MSE D

Another formulation of mean squared error (MSE D) considers the mean cross-product of errors between parameters. (Since MSE C is defined by Ramsey (30, p.20) for all positive semi-definite matrices A in the expression $E(\hat{\underline{\theta}}-\underline{\theta}_0)'A(\hat{\underline{\theta}}-\underline{\theta}_0)$, nothing conclusive can be shown by considering only one such matrix, or even a limited number of such matrices, in a sampling experiment. Moreover, Ramsey has indicated that MSE C and MSE D are equivalent (30, p.20).)

This cross-product concept is similar to that of a covariance in a variance-covariance matrix. The estimated mean cross-product is calculated as the average value of the product of the difference of one estimator from its true population parameter value times the difference of another estimator from its true population parameter value. The diagonal elements in the matrix of estimated mean error squares and cross-products are the estimated mean squared error (MSE A) terms themselves for each parameter. The off-diagonal elements in this matrix are the estimated mean cross-product errors between parameters. Ramsey (30, p.20) expresses MSE D algebraically by $E(\hat{\underline{\theta}}-\underline{\theta}_0)(\hat{\underline{\theta}}-\underline{\theta}_0)'$.

$$\mathrm{Est.\ MSE\ D\ of}\ \hat{\underline{\theta}} = \frac{1}{L}\sum_{\ell=1}^{L}(\hat{\underline{\theta}}_{\ell} - \underline{\theta}_0)(\hat{\underline{\theta}}_{\ell} - \underline{\theta}_0)' \qquad (53)$$

A matrix of estimated mean error squares and cross-products is then calculated for each estimator.
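All of the estimated measures defined in equations (51) through (53), together with the determinant criterion introduced as equation (54) below, can be computed from the same L by J array of replication estimates. The following is an illustrative sketch only (the function and array names are arbitrary):

    import numpy as np

    def mse_measures(estimates, theta0):
        """estimates: (L, J) array of replication estimates for one
        estimator; theta0: length-J array of true parameter values."""
        errors = estimates - theta0                  # deviations from the truth
        mse_a = (errors ** 2).mean(axis=0)           # eq. (51): one value per parameter
        mse_b = (errors ** 2).mean()                 # eq. (52): averaged over j and l
        mse_d = errors.T @ errors / errors.shape[0]  # eq. (53): J x J error matrix
        mse_e = np.linalg.det(mse_d)                 # determinant criterion, eq. (54) below
        return mse_a, mse_b, mse_d, mse_e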
If the estimated mean error matrix of one estimator is subtracted from the estimated mean error matrix of another estimator and the difference is a positive semi-definite matrix, then the first estimator is said to have a smaller estimated mean squared error than the second estimator in the sense of this definition of estimated MSE D.

Estimated MSE E

A final definition of estimated mean squared error (MSE E) might be considered which also uses the estimated mean error matrix described above. Ramsey (30, p.20) defines MSE E algebraically by $|E(\hat{\underline{\theta}}-\underline{\theta}_0)(\hat{\underline{\theta}}-\underline{\theta}_0)'|$, where $|M|$ represents the determinant of any square matrix M. Estimated MSE E may be expressed:

$$\mathrm{Est.\ MSE\ E\ of}\ \hat{\underline{\theta}} = \left|\frac{1}{L}\sum_{\ell=1}^{L}(\hat{\underline{\theta}}_{\ell} - \underline{\theta}_0)(\hat{\underline{\theta}}_{\ell} - \underline{\theta}_0)'\right| \qquad (54)$$

Under this definition of estimated MSE E, if the determinant of the estimated mean error matrix of one estimator is smaller than the determinant of the estimated mean error matrix of another estimator, then the first estimator is said to have a smaller mean squared error.

4.2 Three Monte Carlo Models

Monte Carlo experiments will be performed on the following three models: (1) a discrete points model where the constrained space consists of a set of nine discrete points, as shown in Figure 8; (2) a square model where the constrained space is a square, as shown in Figure 9; and (3) an elliptical model where the constrained space is an ellipse with center at (5,5), as shown in Figure 10.

[Figure 8: Discrete Points Model]

[Figure 9: Square Model]

[Figure 10: Elliptical Model]

Samples of size 30, 60, and 100 will be used for each of the three models. Unconstrained maximum likelihood (UML), minimum distance (MD), and constrained maximum likelihood (CML) estimates will be calculated for each sample, and a Monte Carlo sampling experiment size of 1,000 replications will be used.

UML Estimator

Since the three models are the same except for the nature of their constrained space, the unconstrained maximum likelihood estimator is calculated in the same manner for each model. Various specifications of the variance-covariance matrix are tried in order to examine the effect of altering the shape and direction of the major axis of the set of ellipses representing the likelihood function. In the sampling experiments, this set of likelihood ellipses is centered at the unconstrained maximum likelihood estimate point $(\bar{x}, \bar{y})$, where for the bivariate pair of random variables (X, Y) the sample means are defined by

$$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i \quad\text{and}\quad \bar{y} = \frac{1}{n}\sum_{i=1}^{n} y_i$$

where $(x_i, y_i)$ represents an individual sample observation and n is the number of observations in the sample. Thus, the unconstrained maximum likelihood estimator is given as $\hat{\underline{\theta}}_u' = (\hat{\theta}_{u1}, \hat{\theta}_{u2}) = (\bar{x}, \bar{y})$ for the three models. Since calculation of the MD and CML estimates depends upon the nature of the constrained space, their calculations will be discussed separately as each model is examined.

Population Mean Specification

The mean of the bivariate normal distribution will initially be set at the point (5,5), which represents the center point of the constrained space for all three models. Penneck (29, p.15) has indicated that the minimum distance estimator is an unbiased estimator of the population mean in the special case where the population mean happens to lie at the center of the constrained space.
In order to compare this situation with the other extreme possibility of having the mean as far away from the center point as possible, the entire set of sampling experiments will be repeated with the mean at the corner point of the constrained space, which is at the point (4,4) for both the discrete points model and the square model. Penneck (29, p.17) indicates that the largest values for bias for the MD and CML estimators occur when the true population mean is as far away from the center point as possible. Consequently, the sampling experiment for the elliptical model will be rerun with the mean at the bottom point of the constrained space ellipse, which happens to be the point $(5, 5-\sqrt{3})$ for the particular ellipse chosen for this experiment and is as far away from the center point as possible.

Before proceeding with the analysis of the individual models, it might be worth noting at this point that in order to get a different estimate for constrained maximum likelihood than for the minimum distance approach, it is necessary that the two variances, $\sigma_x^2$ and $\sigma_y^2$, on the diagonal of the variance-covariance matrix be significantly different from one another. If the variances are the same, the likelihood "ellipse" will approximate a circle and the CML and MD estimates will tend to be the same.

In addition, in the case of the square, the covariance, $\sigma_{xy}$, should be nonzero or the CML and MD estimators will tend to be the same even when the variances, $\sigma_x^2$ and $\sigma_y^2$, differ significantly from one another. This tendency occurs because the square was constructed with sides parallel to the horizontal and vertical axes. Since zero covariance implies that the major axis of the likelihood ellipse is also parallel to one of these axes, the major or minor axis of the likelihood ellipse will tend to form a 90° angle with the side of the square and thus result in equal MD and CML estimates. Also note that the correlation coefficient, $\rho$, for the likelihood function is defined by $\rho = \sigma_{xy}/(\sigma_x\sigma_y)$, in other words, the covariance divided by the product of the square roots of the variances. Maximum difference between the MD and CML estimates might be expected when, in addition to unequal variances, the correlation coefficient is set at .5 or -.5. This corresponds to Ramsey's contention (30, p.8) that "in two-dimensional space the values taken by the two estimators will differ most markedly when the angle between the major 'axes' of the likelihood contour and of the parameter space is $\pi/4$."

Variance-Covariance Specifications

The square roots of the variances and covariance of the bivariate normal random number generator will be designated V1, V2, and CV, respectively, and the correlation coefficient will be expressed as $\rho$. The experimental design for the sampling experiments for each model and each of the three sample sizes consists of four different specifications for the variances and covariances, each of which was run at four different settings with the results combined. The variance-covariance specifications are given in Table 1.

TABLE 1
VARIANCE AND COVARIANCE SPECIFICATIONS

a. Unequal Variances and Nonzero Covariance
      V1    V2      CV      rho
       5    20    -7.07     -.5
      10    15    +8.66     +.5
      15    10    -8.66     -.5
      20     5    +7.07     +.5

b. Equal Variances and Nonzero Covariance
      V1    V2      CV      rho
       5     5    -3.54     -.5
      10    10    +7.07     +.5
      15    15   -10.61     -.5
      20    20   +14.14     +.5

c. Unequal Variances and Zero Covariance
      V1    V2      CV      rho
       5    20     0.00      .0
      10    15     0.00      .0
      15    10     0.00      .0
      20     5     0.00      .0

d. Equal Variances and Zero Covariance
      V1    V2      CV      rho
       5     5     0.00      .0
      10    10     0.00      .0
      15    15     0.00      .0
      20    20     0.00      .0
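In modern notation, a single setting of one of these specifications might drive the random number generator as sketched below (illustrative only; this is not the program used in the study). One assumption made here is that the covariance is recovered as the signed square of CV, an interpretation consistent with the tabulated correlation coefficients (for example, V1 = 5, V2 = 20, CV = -7.07 give $\sigma_{xy}$ = -50 and $\rho$ = -.5):

    import numpy as np

    rng = np.random.default_rng(12345)   # arbitrary seed, for illustration

    def draw_sample(mean, v1, v2, cv, n):
        """One sample of size n for a (V1, V2, CV) row of Table 1;
        mean is the population mean, e.g. (5, 5) or (4, 4)."""
        cov_xy = np.sign(cv) * cv ** 2   # signed square of CV (assumption)
        cov = np.array([[v1 ** 2, cov_xy],
                        [cov_xy,  v2 ** 2]])
        return rng.multivariate_normal(mean, cov, size=n)

    # Specification a, first setting, centered mean, sample size 30:
    sample = draw_sample((5.0, 5.0), 5.0, 20.0, -7.07, n=30)
    uml = sample.mean(axis=0)            # UML estimate (x-bar, y-bar)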
In order to summarize the resulting estimates of bias, variance, and the various definitions of mean squared error, two sets of tables are presented. One set of tables is designed to indicate how often one estimator is better than another in terms of smaller bias, variance, and mean squared error. The three estimators were ranked one, two, or three, with a rank of one corresponding to the smallest value for bias, variance, or mean squared error. These ranks were then averaged for each of the variance-covariance specifications. A second set of tables shows the average values of bias, variance, and mean squared error for each of the three estimators under the various variance-covariance specifications. This set of tables is designed to give some idea as to the size of the bias, variance, and mean squared error of each of the estimators.

4.3 Discrete Points Model

The first model to be considered will be the discrete points model, in which the constrained space consists of a set of discrete points. The rationale behind this model is that it is designed to fill the gap cited by Ramsey (30, p.25) for multiparameter situations: "with respect to the situation in which $\Theta$ contains only a finite number of points, no finite sample size results have yet been obtained on mean squared error properties." This model will consider not only the mean squared errors, but also the other characteristics of the estimators which were discussed in detail above.

More specifically, the following nine points have been chosen to represent the constrained space for the discrete points model: (4,4), (4,5), (4,6), (5,4), (5,5), (5,6), (6,4), (6,5), and (6,6). The midpoint (5,5) serves as the true population mean of the bivariate normal distribution.

MD Estimator

The minimum distance estimator, $\hat{\underline{\theta}}_m' = (\hat{\theta}_{m1}, \hat{\theta}_{m2})$, can be calculated for the discrete points model by substituting the nine possible values for $\hat{\underline{\theta}}_m' = (\hat{\theta}_{m1}, \hat{\theta}_{m2})$ into the expression

$$D^2 = (\hat{\theta}_{m1} - \bar{x})^2 + (\hat{\theta}_{m2} - \bar{y})^2$$

where $D^2$ represents the square of the distance between the UML estimate $\hat{\underline{\theta}}_u$ and the MD estimate $\hat{\underline{\theta}}_m$, and by choosing the value for $\hat{\underline{\theta}}_m' = (\hat{\theta}_{m1}, \hat{\theta}_{m2})$ that corresponds to the smallest value of $D^2$.

CML Estimator

The constrained maximum likelihood estimator, $\hat{\underline{\theta}}_c' = (\hat{\theta}_{c1}, \hat{\theta}_{c2})$, is calculated as follows. The paired random variables (X, Y) have the joint bivariate normal probability density function given by

$$f(x,y) = \frac{1}{2\pi\sigma_x\sigma_y(1-\rho^2)^{1/2}}\exp\left\{\frac{-1}{2(1-\rho^2)}\left[\left(\frac{x-\theta_{01}}{\sigma_x}\right)^2 - 2\rho\left(\frac{x-\theta_{01}}{\sigma_x}\right)\left(\frac{y-\theta_{02}}{\sigma_y}\right) + \left(\frac{y-\theta_{02}}{\sigma_y}\right)^2\right]\right\}$$

where $\rho$, $\sigma_x^2$, and $\sigma_y^2$ are the known correlation coefficient and variances as described above, and where $\underline{\theta}_0' = (\theta_{01}, \theta_{02})$ is the unknown mean of the bivariate normal distribution, which will be estimated in the constrained case by the constrained maximum likelihood estimator $\hat{\underline{\theta}}_c' = (\hat{\theta}_{c1}, \hat{\theta}_{c2})$.

The product of n values of the above probability density function, where (x, y) is replaced by each of the n sample values $(x_i, y_i)$ and where $\underline{\theta}_0' = (\theta_{01}, \theta_{02})$ is replaced by each of the nine possible values, $(\theta_{c1}, \theta_{c2})$, for the constrained maximum likelihood estimator, forms the appropriate likelihood function, which may be maximized by minimizing its exponent as follows. Minimize

$$C = \sum_{i=1}^{n}\left[\left(\frac{x_i-\theta_{c1}}{\sigma_x}\right)^2 - 2\rho\left(\frac{x_i-\theta_{c1}}{\sigma_x}\right)\left(\frac{y_i-\theta_{c2}}{\sigma_y}\right) + \left(\frac{y_i-\theta_{c2}}{\sigma_y}\right)^2\right]$$

by trying each of the nine possible values, $(\theta_{c1}, \theta_{c2})$, and thus picking the CML value, $\hat{\underline{\theta}}_c = (\hat{\theta}_{c1}, \hat{\theta}_{c2})$, which corresponds to the smallest C value for the nine points. The point thus chosen will be the constrained maximum likelihood estimate for the discrete points case.
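For the discrete points model both constrained estimators therefore reduce to a direct enumeration over the nine points. The following sketch (illustrative only) selects the MD estimate by minimizing $D^2$ and the CML estimate by minimizing the exponent C defined above, with $\sigma_x$, $\sigma_y$, and $\rho$ taken as the known population values:

    import numpy as np

    POINTS = [(x, y) for x in (4, 5, 6) for y in (4, 5, 6)]  # the nine points

    def md_estimate(uml):
        """MD estimate: the point among the nine nearest to (x-bar, y-bar)."""
        return min(POINTS, key=lambda p: (p[0] - uml[0]) ** 2
                                         + (p[1] - uml[1]) ** 2)

    def cml_estimate(sample, sigma_x, sigma_y, rho):
        """CML estimate: the point among the nine minimizing the exponent C."""
        def c_value(p):
            zx = (sample[:, 0] - p[0]) / sigma_x
            zy = (sample[:, 1] - p[1]) / sigma_y
            return np.sum(zx ** 2 - 2.0 * rho * zx * zy + zy ** 2)
        return min(POINTS, key=c_value)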
Sampling Results

Estimated Bias

The sampling results corresponding to the three sample sizes, two means, and various general specifications of the variances and covariance are summarized in Table 2 and Table 3. The notations a, b, c, and d refer to the variance-covariance specifications previously given in Table 1.

TABLE 2
AVERAGE RANKS OF SMALLER ABSOLUTE ESTIMATED BIAS FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     2.75   1.63   1.63      1.00   2.50   2.50
          b.     2.75   1.63   1.63      1.00   2.50   2.50
          c.     2.63   1.75   1.63      1.00   2.50   2.50
          d.     2.75   1.81   1.44      1.00   2.63   2.38
          Total  2.72   1.70   1.58      1.00   2.53   2.47
N=60      a.     2.63   1.50   1.88      1.00   2.38   2.63
          b.     2.50   1.81   1.69      1.00   2.50   2.50
          c.     2.75   1.63   1.63      1.00   2.81   2.19
          d.     2.75   1.63   1.63      1.00   2.75   2.25
          Total  2.65   1.70   1.71      1.00   2.61   2.39
N=100     a.     2.75   1.50   1.75      1.00   2.38   2.63
          b.     2.13   2.25   1.63      1.00   2.50   2.50
          c.     2.63   2.00   1.38      1.00   2.25   2.75
          d.     2.88   1.75   1.38      1.00   2.25   2.75
          Total  2.60   1.87   1.53      1.00   2.34   2.66
TOTAL            2.66   1.74   1.60      1.00   2.49   2.51

TABLE 3
AVERAGE VALUES OF ESTIMATED BIAS FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.    -.006  -.008   .001     -.006   .283   .340
          b.    -.031  -.021   .000     -.031   .568   .740
          c.    -.187  -.057  -.043     -.187   .766   .759
          d.    -.075  -.046  -.044     -.075   .571   .575
N=60      a.     .030   .028   .032      .030   .217   .262
          b.     .006  -.004   .002      .006   .497   .668
          c.    -.019   .002   .002     -.019   .718   .712
          d.    -.034  -.027  -.027     -.034   .467   .461
N=100     a.     .006   .001  -.004      .006   .128   .161
          b.    -.058  -.034  -.021     -.058   .390   .576
          c.     .079   .018   .016      .079   .653   .658
          d.    -.015   .000   .000     -.015   .354   .357

As expected, when the true population mean happens to be at the center point (5,5) of the constrained space, Table 3 shows that the estimated bias is quite small for all three estimators, which tends to support the hypothesis that they may be unbiased estimators in this circumstance. In addition to being small, the estimated bias for the three estimators tends to be positive almost as often as negative, and since estimated skewness in this case is approximately zero for all three estimators, this further suggests that the true value of the bias for all three estimators in this special situation is zero.

It is interesting to note that while the estimated bias is quite small for all three estimators when the mean is at the center point, the UMLE tends to have a larger average estimated bias in Table 3 than the other two estimators. Furthermore, Table 2 indicates that the UMLE tends to have a larger estimated bias more frequently, in addition to having a larger average value. This is particularly interesting because the UMLE is known to be unbiased even for small samples.

At the center point, the MDE tends to have a slightly smaller average estimated bias than the CMLE, as can be seen in Table 3. Table 2 indicates that the MDE has the smallest estimated bias more frequently than the CMLE when the mean is at (5,5). The estimated bias of the CMLE is almost equal to that of the MDE for variance-covariance specifications c and d, which suggests that the MDE and the CMLE may be giving nearly equal estimates when the covariance is zero.
Increasing the sample size when the population mean is at the (5,5) center point does not appear to influence the size of the estimated bias of any of the three estimators, which is to be expected because it is essentially zero initially and therefore not subject to further reduction.

When the true population mean of the random number generator is shifted to the (4,4) corner point, the average estimated bias of the CMLE and MDE becomes larger than that of the UMLE and strictly positive, as it tends toward the center point of the constrained space, as predicted by Penneck (29, p.17). Table 2 indicates that the UMLE has the smallest bias every time, for all sample sizes and specifications, when the mean is at the (4,4) corner point. The average estimated bias of the MDE tends to be slightly larger than that of the CMLE, especially for specifications a and b, which require nonzero covariances. However, the estimated bias of the MDE is smaller than that of the CMLE more frequently for small sample sizes, as shown in Table 2. For sample size 100 the CMLE tends to have a smaller estimated bias more frequently than the MDE. The relative performance of the CML and MD estimators in this situation is primarily dependent upon the slope of the major axis of the likelihood ellipse. The CMLE does better than the MDE when the slope of the major axis of the likelihood ellipse is negative in the (4,4) case. A negative slope means that the CMLE is more likely to select the (4,4) corner point or the nearest side points when the (4,4) corner point is the true population mean.

Table 3 indicates that the estimated bias for the MDE and the CMLE decreases in all cases as sample size increases when the mean is at the (4,4) corner point. The UMLE gives the same values for estimated bias at the (4,4) corner point as it gave for the (5,5) center point since it is not influenced in any way by the constrained space. Sample size does not appear to influence the estimated bias of the UMLE, which is shown in Table 3 to be practically zero anyway.

Estimated Variance

In all cases the average estimated variance of the UMLE is substantially larger than that of either the CMLE or the MDE, as shown in Table 5. When the mean is at the (5,5) center point the estimated variance of the UMLE is almost always larger than that of the other estimators, as indicated in Table 4.

TABLE 4
AVERAGE RANKS OF SMALLER ESTIMATED VARIANCE FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.13   1.88
          b.     3.00   1.00   2.00      3.00   1.38   1.63
          c.     3.00   1.50   1.50      3.00   1.75   1.25
          d.     3.00   1.50   1.50      3.00   1.75   1.25
          Total  3.00   1.25   1.75      3.00   1.50   1.50
N=60      a.     3.00   1.13   1.88      3.00   1.25   1.75
          b.     3.00   1.13   1.88      3.00   1.50   1.50
          c.     2.88   1.88   1.25      3.00   1.75   1.25
          d.     2.75   1.94   1.31      3.00   1.75   1.25
          Total  2.90   1.52   1.58      3.00   1.56   1.44
N=100     a.     2.75   1.25   2.00      3.00   1.25   1.75
          b.     2.50   1.38   2.13      3.00   1.38   1.63
          c.     2.50   1.50   2.00      3.00   1.25   1.75
          d.     2.50   1.88   1.63      3.00   1.25   1.75
          Total  2.56   1.50   1.94      3.00   1.28   1.72
TOTAL            2.82   1.42   1.76      3.00   1.45   1.55

For sample size 30 the UMLE's estimated variance is always the largest; as the sample size increases a few exceptions occur, but the UMLE still has the largest estimated variance in most cases. This result seems intuitively plausible since the UMLE can choose points anywhere in two-dimensional space, whereas the constrained estimators are limited to choosing from among nine particular points.
However, as sample size increases the variation of the unconstrained sample estimates should be expected to decrease. If this decrease is substantial enough, most if not all of this variation may take place within the grid of the nine constrained space points. This situation might cause the UMLE to have a smaller variance than either of the constrained estimators in some cases, especially when the sample size is large.

TABLE 5
AVERAGE VALUES OF ESTIMATED VARIANCE FOR POINTS MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML    MD
N=30      a.    10.415   .879   .880     10.415   .859   .820
          b.    10.693   .851   .868     10.693   .691   .804
          c.    14.342   .890   .897     14.342   .847   .840
          d.    14.267   .885   .881     14.267   .855   .846
N=60      a.     5.454   .812   .829      5.454   .781   .750
          b.     5.526   .802   .832      5.526   .592   .724
          c.     6.752   .852   .854      6.752   .779   .777
          d.     6.470   .838   .836      6.470   .777   .774
N=100     a.     2.967   .754   .782      2.967   .655   .636
          b.     3.278   .771   .799      3.278   .468   .618
          c.     4.093   .804   .806      4.093   .710   .698
          d.     4.051   .789   .788      4.051   .678   .671

The average estimated variance of the MDE in Table 5 is slightly larger than that of the CMLE when the mean is at the (5,5) center point, except for specification d, which requires equal variances with zero covariance. There the MDE's estimated variance is slightly smaller than that of the CMLE. The table of ranks indicates that the MDE's estimated variance is often larger than that of the CMLE, although specification d is again an exception. Since the average estimated sample correlation coefficient was approximately -.003, the CMLE may have had a slight tendency to select the upper left-hand and lower right-hand corner points when the MDE chose the left or right-hand side points, respectively. The ranks of the CMLE and the MDE are equal for specification c at N=30, and the MDE has a smaller rank than the CMLE at N=60 for specification c. Since specification c requires zero covariance, it is likely that this results in CML and MD estimates that are nearly identical. In any event the estimated variances of the MDE and the CMLE are very close to one another. As sample size increases when the mean is at the (5,5) center point, the estimated variances of all three estimators decrease, with that of the UMLE decreasing most substantially, as indicated in Table 5.

When the population mean is at the (4,4) corner point the estimated variance of the UMLE is always larger than that of the CMLE or the MDE, as can be seen in Table 4. The average value of the estimated variance of the MDE is smaller than that of the CMLE when the population mean is at the (4,4) corner point, except for specification b. The average estimated sample correlation coefficient for specification b was substantially negative, resulting in a negative slope for the major axis of the likelihood ellipse. The unequal variances may have accentuated the elongation of the likelihood ellipse in this case. These developments would tend to cause the CMLE to be more likely to choose the (4,4) corner point or a nearby side point than it otherwise would.

The ranks of the estimated variance of the MDE are smaller than those of the CMLE when the sample size is 60, equal when the sample size is 30, and larger when the sample size is 100. In general the estimated variance of the MDE and that of the CMLE are so close to one another that no clear cut pattern emerges.
However, the estimated variances of both of the constrained estimators are clearly smaller for the (4,4) corner point case than for the (5,5) center point case. This reduction in estimated variance may help offset the increase in estimated bias that occurs when shifting from the (5,5) center point to the (4,4) corner point. The average estimated variances of all three estimators substantially decrease as sample size increases, as shown in Table 5 for both the (5,5) center point case and the (4,4) corner point case.

MSE A

The sampling results for estimated mean squared error A are summarized in Table 6. In the case where the constrained space consists of a finite number of points, Ramsey (30, p.17) has indicated that "it appears that nothing can be said in general about the mean squared error gain for the minimum distance estimator. A similar tentative conclusion holds for the constrained maximum likelihood estimator for finite sample sizes." In the absence of an analytical proof of mean squared error gain for the MDE or the CMLE, sampling experiments provide a means of determining the relative values of mean squared error for the alternative estimators. In this instance the sampling experiments indicate that in almost all cases the estimated MSE A of the UMLE is larger than that of either the CMLE or the MDE.

TABLE 6
AVERAGE RANKS OF SMALLER ESTIMATED MSE A FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.50   1.50      3.00   1.75   1.25
          d.     3.00   1.50   1.50      3.00   1.75   1.25
          Total  3.00   1.25   1.75      3.00   1.62   1.38
N=60      a.     3.00   1.13   1.88      3.00   1.38   1.63
          b.     3.00   1.13   1.88      3.00   1.50   1.50
          c.     2.63   2.00   1.38      3.00   1.75   1.25
          d.     2.75   2.00   1.25      3.00   1.75   1.25
          Total  2.84   1.56   1.60      3.00   1.59   1.41
N=100     a.     2.75   1.25   2.00      3.00   1.38   1.63
          b.     2.50   1.38   2.13      3.00   1.50   1.50
          c.     2.50   1.50   2.00      3.00   1.38   1.63
          d.     2.50   1.81   1.69      3.00   1.38   1.63
          Total  2.56   1.48   1.96      3.00   1.41   1.59
TOTAL            2.80   1.43   1.77      3.00   1.54   1.46

In particular, Table 7 reveals that when the mean is at the (5,5) center point, the UMLE's average estimate of MSE A is larger than that of either the CMLE or the MDE. As sample size increases the UML estimate of MSE A decreases substantially, while the estimates of the CMLE and MDE decrease moderately. This decrease is due entirely to the decrease in variance as sample size increases, since the bias remains essentially unchanged. Table 6 shows that the UMLE estimate of MSE A is almost always larger than the CMLE and MDE estimates. The frequency with which the UMLE is larger decreases as sample size increases.

TABLE 7
AVERAGE VALUES OF ESTIMATED MSE A FOR POINTS MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML     MD
N=30      a.    10.415   .879   .880     10.415   1.639   1.386
          b.    10.694   .851   .868     10.694   1.014   1.352
          c.    14.401   .892   .900     14.401   1.404   1.402
          d.    14.269   .885   .881     14.269   1.471   1.465
N=60      a.     5.457   .812   .830      5.457   1.386   1.187
          b.     5.526   .802   .832      5.526    .839   1.170
          c.     6.758   .854   .856      6.758   1.324   1.329
          d.     6.485   .840   .838      6.485   1.316   1.303
N=100     a.     2.969   .754   .783      2.969   1.080    .968
          b.     3.281   .772   .799      3.281    .620    .950
          c.     4.094   .805   .806      4.094   1.140   1.117
          d.     4.058   .790   .789      4.058   1.037   1.026

When the mean is at the (4,4) corner point, the average value of the UMLE estimate of MSE A is still larger than that of either of the other estimators. Table 6 indicates that with the mean at (4,4) the UMLE estimate is always the largest.
However, the CMLE and MDE average estimates are larger than they were in the (5,5) case. All the average estimates decrease as the sample size increases, but again the UMLE estimate of MSE A decreases faster than that of either the CMLE or the MDE. Ramsey (30, p.25) has indicated that for the situation where the constrained space consists of only a finite number of points, mean squared error gains can be shown to hold asymptotically under suitable regularity conditions in the multivariate case. However, the above sampling results suggest that for small samples the mean squared error gain is even greater than for large samples.

In comparing the CML and MD average values of the estimates of MSE A, Table 7 shows very little difference between their average values in the (5,5) case. When the mean is shifted to the (4,4) point there is only slightly more difference. When comparing the average ranks of the CML and MD estimates, Table 6 shows a difference between the (5,5) and the (4,4) case. When the mean is at the center point, the CMLE rank is more frequently smaller than the MDE's, while the reverse is true at the corner point, at least most of the time, for specifications a, c, and d. Since the estimated variance is dominant over the estimated bias in the calculation of estimated MSE A even in the (4,4) case, the explanation relating to the negative estimated sample correlation coefficient for the estimated variance would also apply to estimated MSE A. In particular, the CMLE may have a greater tendency to select the (4,4) corner point or nearby points than the MDE does in this special circumstance.

MSE B

The sampling results for estimated mean squared error B are summarized in Table 8 and Table 9. When the mean is at the (5,5) center point the UMLE almost always gives the largest estimate of MSE B, but does so less frequently as sample size increases, as can be seen in Table 8.

TABLE 8
AVERAGE RANKS OF SMALLER ESTIMATED MSE B FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.63   1.38      3.00   1.75   1.25
          d.     3.00   1.50   1.50      3.00   1.63   1.38
          Total  3.00   1.28   1.72      3.00   1.59   1.41
N=60      a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.88   1.13      3.00   2.00   1.00
          d.     2.50   2.13   1.38      3.00   1.75   1.25
          Total  2.87   1.50   1.63      3.00   1.69   1.31
N=100     a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     2.50   1.50   2.00      3.00   1.50   1.50
          c.     3.00   1.25   1.75      3.00   1.25   1.75
          d.     2.50   1.88   1.63      3.00   1.25   1.75
          Total  2.75   1.41   1.84      3.00   1.37   1.63
TOTAL            2.87   1.40   1.73      3.00   1.55   1.45

Table 9 shows that the average value of the estimated MSE B of the UMLE is considerably larger than those of the CMLE and MDE. As the sample size increases, the average values given by all three estimators decrease, with the average value of the UMLE estimate decreasing most substantially. When the mean is moved to the (4,4) corner point the UMLE always has the largest estimate of MSE B. Table 9 shows that the average values of estimated MSE B for the CMLE and MDE have increased overall, but continue to decrease as sample size increases.

TABLE 9
AVERAGE VALUES OF ESTIMATED MSE B FOR POINTS MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML     MD
N=30      a.     6.901   .821   .838      6.901   1.321   1.180
          b.     9.555   .840   .862      9.555    .967   1.289
          c.     7.615   .743   .742      7.615    .927    .918
          d.    13.889   .897   .892     13.889   1.457   1.441
N=60      a.     3.566   .728   .768      3.566   1.072    .946
          b.     4.696   .778   .822      4.696    .784   1.151
          c.     3.608   .657   .657      3.608    .815    .813
          d.     6.884   .850   .847      6.884   1.294   1.291
N=100     a.     1.962   .655   .696      1.962    .793    .716
          b.     2.773   .742   .772      2.773    .572    .908
          c.     2.174   .565   .566      2.174    .649    .639
          d.     3.952   .805   .803      3.952   1.090   1.084

When the average ranks of the CMLE and MDE estimates of MSE B are compared in Table 8, it can be seen that the CMLE on the average gives smaller estimates more frequently when the mean is at (5,5). However, closer examination reveals that when the covariance is zero the MDE more frequently gives a smaller estimate of MSE B. When the mean is shifted to the (4,4) point, the average rank of the MDE is smaller than that of the CMLE. Table 8 also shows that when the covariance is nonzero (specifications a and b) the CMLE and MDE have the smallest estimate of MSE B equally often. Table 9 shows that the average values of the CMLE and MDE estimates of MSE B are very close, particularly when the covariance is zero. When the mean is shifted to (4,4) the difference between the CMLE and MDE estimates increases, but is still small. In particular, the MDE has a slightly smaller estimated MSE B when the variances are equal, while the CMLE has a slightly smaller MSE B when the variances are unequal. Thus the MDE tends to have a slight edge over the CMLE when the likelihood ellipse approximates a circle. Only specification b in the (4,4) case showed the estimated MSE B of the CMLE to be slightly smaller than that of the MDE. Since the average estimated sample correlation coefficient tended to be negative in this case, the slope of the major axis of the likelihood ellipse tended to be negative, which tended to keep the CML estimates closer to the (4,4) point than those of the MDE. The opposite result would be expected if the slope had been positive.

MSE D

The sampling results for estimated mean squared error D are summarized in Table 10 and Table 11. Since estimated MSE D is a matrix, an estimator is said to have a smaller estimated MSE D when the difference matrix is positive semi-definite. The difference matrix is obtained by subtracting the MSE D matrix of the first estimator from the MSE D matrix of a second estimator, as sketched below. The average values given in Table 11 are the values of the determinants of the MSE D difference matrices.

TABLE 10
AVERAGE RANKS OF SMALLER ESTIMATED MSE D FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.00   2.00
          b.     3.00   1.00   2.00      2.75   1.00   2.25
          c.     3.00   1.00   2.00      3.00   1.75   1.25
          d.     3.00   1.00   2.00      3.00   1.50   1.50
          Total  3.00   1.00   2.00      2.94   1.31   1.75
N=60      a.     2.00   1.50   2.50      2.75   1.25   2.00
          b.     2.50   1.25   2.25      2.75   1.25   2.00
          c.     2.25   1.75   2.00      3.00   1.25   1.75
          d.     2.50   1.50   2.00      3.00   1.50   1.50
          Total  2.31   1.50   2.19      2.88   1.31   1.81
N=100     a.     2.00   1.50   2.50      2.75   1.25   2.00
          b.     2.50   1.25   2.25      2.75   1.25   2.00
          c.     2.00   1.50   2.50      3.00   1.00   2.00
          d.     2.50   1.50   2.00      3.00   1.00   2.00
          Total  2.25   1.44   2.31      2.88   1.12   2.00
TOTAL            2.52   1.31   2.17      2.90   1.25   1.85

Table 10 shows, in both the (5,5) center point case and the (4,4) corner point case, that the CMLE frequently has the smallest estimated MSE D, followed by the MDE, and then by the UMLE with the largest estimate. As sample size increases the prominence of this pattern diminishes. Table 11 shows a large range in the size of the determinants of the MSE D difference matrices for U-C and U-M. When one of the constrained estimators has a larger MSE D matrix than the UMLE, it is only slightly larger.
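The positive semi-definiteness comparison underlying these MSE D rankings can be carried out numerically. A minimal sketch (illustrative only; it uses an eigenvalue test with a small tolerance, which reduces to checking the diagonal elements and the determinant in the 2 x 2 case):

    import numpy as np

    def is_psd(m, tol=1e-10):
        """True if the symmetric matrix m is positive semi-definite."""
        return bool(np.all(np.linalg.eigvalsh(m) >= -tol))

    def has_smaller_mse_d(mse_d_first, mse_d_second):
        """True if the first estimator has the smaller estimated MSE D,
        i.e. if (second minus first) is positive semi-definite."""
        return is_psd(mse_d_second - mse_d_first)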
As sample size increases, the average values of the determinants decrease for U-C and U-M in both the (5,5) and (4,4) cases. These two columns also show that the determinants are usually smaller when the variances are unequal (specifications a and c) than when they are equal (specifications b and d). The values of specification c are especially small.

TABLE 11
AVERAGE VALUES OF ESTIMATED MSE D FOR POINTS MODEL

                     Mean at (5,5)                    Mean at (4,4)
                 U-C       U-M      C-M          U-C       U-M      C-M
N=30      a.    16.549    18.132   -.120        15.277    17.084   -.036
          b.    38.466    42.530   -.187        33.351    30.105    .075
          c.     3.186     3.305   -.000         4.864     5.071    .000
          d.   168.535   168.662   -.000       153.552   153.945    .000
N=60      a.     2.684     3.009   -.055         2.555     3.121    .004
          b.     7.253     8.765   -.134        29.919    34.323   -.080
          c.     -.011      .001   -.000          .779      .833   -.000
          d.    36.253    36.281   -.000        30.704    30.731   -.000
N=100     a.      .332      .390   -.018          .553      .737    .003
          b.     1.697     2.208   -.053         1.643      .344    .112
          c.     -.231     -.234   -.000          .281      .275   -.000
          d.     9.872     9.888    .000         7.861     7.090   -.000

The values of bias and variance in this case are small and very nearly equal for all three estimators. The second variate has a considerably larger average estimated variance than the first variate in this instance, and since the covariance is zero, the MDE and the CMLE may have an unusually strong tendency to choose the points immediately above and below the (5,5) center point, as well as the center point itself, when the mean is at (5,5). As the sample size increases, the variance of the second variate declines sufficiently to allow a sufficient proportion of the UML estimates to fall between the (5,4) and (5,6) points, so that the estimated MSE D of the UMLE actually became smaller than that of the constrained estimators. While the CML MSE D matrix is almost always smaller than the MD MSE D matrix in the (5,5) case, it is by only a small amount. When the mean is at (4,4) each of the two constrained estimators has the smallest MSE D matrix half of the time.

MSE E

The sampling results for estimated mean squared error E are summarized in Table 12 and Table 13. As in the other mean squared error cases, the UMLE usually gives the largest estimated MSE E when compared to the CMLE and MDE. Table 12 shows that when the mean is at the (5,5) center point, the UMLE almost always has the largest estimated MSE E, although the frequency of this decreases as the sample size increases. When the mean is moved to the (4,4) corner point, the UMLE always has the largest estimated MSE E.

TABLE 12
AVERAGE RANKS OF SMALLER ESTIMATED MSE E FOR POINTS MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.75   1.25      3.00   1.50   1.50
          b.     3.00   1.75   1.25      3.00   1.50   1.50
          c.     3.00   1.50   1.50      3.00   1.75   1.25
          d.     3.00   1.50   1.50      3.00   2.00   1.00
          Total  3.00   1.62   1.38      3.00   1.69   1.31
N=60      a.     3.00   1.75   1.25      3.00   1.50   1.50
          b.     3.00   1.75   1.25      3.00   1.50   1.50
          c.     3.00   2.00   1.00      3.00   2.00   1.00
          d.     2.50   2.25   1.25      3.00   1.75   1.25
          Total  2.87   1.94   1.19      3.00   1.69   1.31
N=100     a.     3.00   1.25   1.75      3.00   1.50   1.50
          b.     2.50   2.00   1.50      3.00   1.50   1.50
          c.     3.00   1.25   1.75      3.00   1.25   1.75
          d.     2.50   2.00   1.50      3.00   1.25   1.75
          Total  2.75   1.625  1.625     3.00   1.37   1.63
TOTAL            2.87   1.73   1.40      3.00   1.58   1.42

Table 13 shows that the average values of estimated MSE E given by all three estimators decrease as sample size increases for both the (5,5) and (4,4) cases. Again, as in previous measures of mean squared error, the estimated values given by the UMLE decrease much more rapidly than do those of the CMLE and MDE. It can also be seen in Table 13 that each estimator usually gives smaller average estimates of MSE E when the variances are not equal (specifications a and c) than when the variances are equal (specifications b and d). This is not the case, however, for the CMLE when the mean is at (4,4). The larger estimated MSE E for the equal variance cases may be explained by noting that the sum of squares of the variances listed in Table 1 is greater than the sum of their cross-products. The exception for the CMLE in the (4,4) case is again due to a negative average estimated correlation coefficient. In all cases the average value of the UML estimated MSE E is considerably larger than the estimates of the other two estimators.

TABLE 13
AVERAGE VALUES OF ESTIMATED MSE E FOR POINTS MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML     MD
N=30      a.    26.919   .670   .594     26.919   1.379    .830
          b.    53.723   .699   .616     53.723    .792   1.615
          c.    11.950   .530   .526     11.950    .567    .542
          d.   192.630   .803   .794    192.630   1.792   1.746
N=60      a.     6.699   .510   .465      6.699    .824    .532
          b.    13.617   .602   .497     13.617    .568   1.300
          c.     3.097   .393   .392      3.097    .367    .356
          d.    47.219   .722   .717     47.219   1.428   1.408
N=100     a.     2.055   .399   .399      2.055    .431    .299
          b.     4.775   .528   .452      4.775    .315    .817
          c.     1.043   .262   .263      1.043    .170    .170
          d.    15.585   .647   .643     15.585   1.028   1.025

When the performance of the CMLE and MDE are compared in Table 12, it can be seen that the MDE more frequently gives the smallest estimate of MSE E in both the (5,5) and the (4,4) case, especially for smaller sample sizes. However, Table 13 shows that there is little actual difference between the sizes of the CML and MD estimates of MSE E. When the mean is shifted to (4,4) both the CML and MD average estimates usually increase in size, but still remain much smaller than the UMLE average estimate.

Summary of Discrete Points Model Sampling Results

In summarizing the discrete points model, all three estimators appeared to be unbiased when the mean was at the (5,5) center point. The CMLE and the MDE had small to moderate estimated positive bias when the mean was shifted to the (4,4) corner point, and the size of their biases was nearly the same. Also at the (4,4) point, the estimated biases of the CMLE and MDE decreased as sample size increased, but remained unchanged at close to zero in the (5,5) case.

Calculations of the average estimated variance of each of the three estimators showed it to be much larger for the UMLE than for either the CMLE or the MDE in both the (5,5) and the (4,4) situations. As in the case of bias, the two constrained estimators had average estimated variances of nearly the same size. As sample size increased, the estimated variances of all three estimators decreased. Although that of the UMLE declined most dramatically, it still remained the largest.

For all four of the mean squared errors which were examined, the UMLE gave substantially larger average estimates than did the CMLE or the MDE. The estimates of MSE A, B and E always decreased as sample size increased for all three estimators. Because MSE D was presented in matrix form, it was not determined if the estimates were also decreasing there.
In particular, the MSE A average estimate of all three estimators was dominated by the effect of the variance, especially in the (5,5) case. In the (4,4) case, the bias affected the CMLE and the MDE about equally. For both the MSE A and MSE B situations, the CMLE tended to give smaller estimates than the MDE when the mean was at (5,5), and the MDE tended to give smaller estimates when the mean was at (4,4). In the case of estimated MSE D the CMLE performed slightly better than the MDE in many cases. In the case of estimated MSE E the MDE usually did slightly better than the CMLE.

In the discrete points case, the estimated skewness of all three estimators was nearly always equal and very small relative to the standard deviation when the mean was at (5,5). There was also about an equal number of positive and negative values. This suggests that all three estimators had symmetric distributions in the (5,5) case. In the (4,4) corner point case, the small estimated skewness of the UMLE remained unchanged. The average estimated skewness of the CMLE and the MDE increased, but remained close to one another. The ratio of average estimated skewness to standard deviation for the constrained estimators ranged from about 0.2 to 2.0. While the UMLE had about an equal number of negative and positive estimates of skewness, the constrained estimators always gave positive estimates in the (4,4) case.

In both the (5,5) and (4,4) cases, the estimated kurtosis of the UMLE, which was standardized relative to the normal distribution, was quite small and fluctuated between negative and positive values. The estimated kurtosis of the CMLE and MDE was almost always negative, and the ratio of their kurtosis to the standard deviations ranged from about -.04 to -1.4.

Also in the discrete points case, all four estimated moments of the CMLE and the MDE were close to one another. The first moments of all three estimators tended to be very close in the (5,5) case, while the CMLE and MDE estimated first moments were slightly smaller than that of the UMLE in the (4,4) case. The second estimated moment of the UMLE was somewhat larger than those of the constrained estimators in many cases. The third estimated moment of the UMLE was usually larger than the estimated third moments of the constrained estimators in the (5,5) case. However, the reverse was true in the (4,4) case. The fourth estimated moment of the UMLE was considerably larger than those of the constrained estimators in almost all cases when the mean was at (5,5) and was often larger in the (4,4) case.

4.4 Square Model

In the second model the constrained space is defined as the points in and on a square which is located with corner points at (4,4), (4,6), (6,4), and (6,6). As in the first model, the midpoint of the constrained space is at (5,5). The constrained space defined as a square effectively restricts both variates of the bivariate normal distribution to finite intervals. Such restrictions may occur in regression analysis as double inequality restrictions on each of two regression coefficients. As in the previous case, the characteristics of the estimators and their distributions will be analyzed. As before, the UML estimate for this model is $\hat{\underline{\theta}}_u' = (\hat{\theta}_{u1}, \hat{\theta}_{u2}) = (\bar{x}, \bar{y})$.

MD and CML Estimates

The minimum distance estimate and the constrained maximum likelihood estimate may be calculated as follows. If the UML estimate lies within the square or on its edge, then the UML estimate, the CML estimate, and the MD estimate are all equal.
If the UML estimate lies outside the square, then the MD estimate, $\hat{\underline{\theta}}_m' = (\hat{\theta}_{m1}, \hat{\theta}_{m2})$, can be calculated by projecting directly onto the nearest side of the square in a horizontal or vertical direction, as appropriate. If the UML estimate lies in one of the corner quadrants formed by extending the sides of the square outward from the corner points, then the corner point itself corresponds to the MD estimate.

The CML estimate, $\hat{\underline{\theta}}_c' = (\hat{\theta}_{c1}, \hat{\theta}_{c2})$, is calculated by a search routine that involves minimizing the exponent of the likelihood function, which in this case is equivalent to maximizing the likelihood function itself. If the UML estimate lies directly to one side of the square, then that side is searched for the point that corresponds to the maximum of the likelihood function. If the UML estimate lies in one of the corner quadrants, then the two nearest sides of the square are searched. Therefore, as in the discrete points model, the CML estimate, $\hat{\underline{\theta}}_c$, is obtained by minimizing the exponent

$$C = \sum_{i=1}^{n}\left[\left(\frac{x_i-\theta_{c1}}{\sigma_x}\right)^2 - 2\rho\left(\frac{x_i-\theta_{c1}}{\sigma_x}\right)\left(\frac{y_i-\theta_{c2}}{\sigma_y}\right) + \left(\frac{y_i-\theta_{c2}}{\sigma_y}\right)^2\right]$$

subject to the condition that the solution lies on the appropriate boundary of the constrained space.
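The projection rule and the boundary search just described might be sketched as follows (illustrative only, with the square taken as [4,6] x [4,6]; for brevity this sketch searches the entire boundary on a grid rather than only the nearest side or sides, and the step size is an arbitrary choice rather than a detail of the original routine):

    import numpy as np

    LO, HI = 4.0, 6.0                     # the sides of the square

    def md_estimate(uml):
        """Clip each coordinate to [4, 6]: this projects onto the
        nearest side, or onto the corner itself in a corner quadrant."""
        return np.clip(uml, LO, HI)

    def cml_estimate(sample, sigma_x, sigma_y, rho, step=0.01):
        """Grid search of the square's boundary for the point that
        minimizes the exponent C (maximizes the likelihood)."""
        uml = sample.mean(axis=0)
        if np.all((LO <= uml) & (uml <= HI)):
            return tuple(uml)             # interior: UML = MD = CML
        def c_value(p):
            zx = (sample[:, 0] - p[0]) / sigma_x
            zy = (sample[:, 1] - p[1]) / sigma_y
            return np.sum(zx ** 2 - 2.0 * rho * zx * zy + zy ** 2)
        ticks = np.arange(LO, HI + step, step)
        boundary = ([(t, LO) for t in ticks] + [(t, HI) for t in ticks]
                    + [(LO, t) for t in ticks] + [(HI, t) for t in ticks])
        return min(boundary, key=c_value)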
Sampling Results

Bias

The sampling results for estimated bias for the square model are summarized in Table 14 and Table 15.

TABLE 14
AVERAGE RANKS OF SMALLER ABSOLUTE ESTIMATED BIAS FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     2.63   1.63   1.75      1.00   2.50   2.50
          b.     3.00   1.50   1.50      1.00   2.50   2.50
          c.     3.00   1.63   1.38      1.00   2.50   2.50
          d.     3.00   1.63   1.38      1.00   2.63   2.38
          Total  2.91   1.59   1.50      1.00   2.53   2.47
N=60      a.     2.88   1.88   1.25      1.00   2.38   2.63
          b.     2.88   1.63   1.50      1.00   2.50   2.50
          c.     3.00   1.63   1.38      1.00   3.00   2.00
          d.     3.00   1.50   1.50      1.00   2.75   2.25
          Total  2.94   1.66   1.40      1.00   2.66   2.34
N=100     a.     2.50   2.00   1.50      1.00   2.50   2.50
          b.     2.63   1.88   1.50      1.00   2.50   2.50
          c.     2.50   1.63   1.88      1.00   2.38   2.63
          d.     2.50   1.63   1.88      1.00   2.38   2.63
          Total  2.53   1.78   1.69      1.00   2.44   2.56
TOTAL            2.79   1.68   1.53      1.00   2.54   2.46

As expected, when the true population mean happens to be at the (5,5) center point of the constrained space, the estimated bias is quite small for all three estimators, which tends to support the hypothesis that they may be unbiased estimators in this circumstance. In addition to being small, the estimated bias in the (5,5) case tends to be positive almost as much as negative, at least for sample sizes 60 and 100. Since the estimated skewness for these three estimators is quite small, and therefore suggests that their distributions are symmetric, the frequency of positive and negative values further suggests that the true value of bias for all three estimators in this special situation is zero. While the estimated bias is quite small for all three estimators, the UMLE tends to have a larger estimated bias than the other two estimators.

TABLE 15
AVERAGE VALUES OF ESTIMATED BIAS FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.    -.006  -.006   .001     -.006   .299   .352
          b.    -.031  -.018  -.003     -.031   .573   .742
          c.    -.187  -.049  -.041     -.187   .762   .759
          d.    -.075  -.044  -.045     -.075   .572   .574
N=60      a.     .030   .030   .029      .030   .244   .281
          b.     .006   .001   .005      .006   .494   .672
          c.    -.019   .005   .005     -.019   .717   .716
          d.    -.034  -.023  -.023     -.034   .478   .474
N=100     a.     .006   .006   .007      .006   .171   .205
          b.    -.058  -.025  -.020     -.058   .379   .578
          c.     .079   .011   .011      .079   .655   .655
          d.    -.015  -.010  -.010     -.015   .369   .371

The estimated bias of the CMLE is almost identical to that of the MDE in the (5,5) case. The table of ranks indicates that the MDE had a smaller estimated bias slightly more frequently than the CMLE, although both of these constrained estimators did somewhat better than the UMLE in terms of frequency of smaller bias. The estimated bias of all three estimators remains essentially unchanged as sample size increases in the (5,5) case.

When the true population mean of the random number generator is shifted to the corner point (4,4), the estimated bias of the CMLE and MDE becomes larger than that of the UMLE and strictly positive, so that it tends toward the center point of the constrained space. The rank table indicates that the UMLE always has the smallest estimated bias in the (4,4) case. The estimated bias of the CMLE and the MDE is essentially the same except for specification b. Since specification b requires equal variances with nonzero covariance, the likelihood ellipse should approximate a circle. However, in this case the average estimated standard deviations are approximately 15 and 18, with a substantially negative estimated correlation coefficient. Consequently, the CMLE is able to choose points closer to the (4,4) corner point than those points chosen by the MDE, on the average, under specification b. When sample size increases in the (4,4) case, the estimated bias of the UMLE remains virtually unchanged. However, the estimated bias of the CMLE and the MDE declines in all cases as sample size increases when the mean is at the (4,4) corner point. Overall there is not much difference between the MDE's estimated bias and that of the CMLE, although both constrained estimators do have a larger estimated bias than the UMLE in the (4,4) case.

Estimated Variance

The sampling results for estimated variance for the square model are summarized in Table 16 and Table 17.

TABLE 16
AVERAGE RANKS OF SMALLER ESTIMATED VARIANCE FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.25   1.75
          b.     3.00   1.00   2.00      3.00   1.38   1.63
          c.     3.00   1.63   1.38      3.00   1.88   1.13
          d.     3.00   1.50   1.50      3.00   1.75   1.25
          Total  3.00   1.28   1.72      3.00   1.56   1.44
N=60      a.     3.00   1.13   1.88      3.00   1.13   1.88
          b.     3.00   1.00   2.00      3.00   1.38   1.63
          c.     3.00   1.88   1.13      3.00   2.00   1.00
          d.     3.00   2.00   1.00      3.00   2.00   1.00
          Total  3.00   1.50   1.50      3.00   1.62   1.38
N=100     a.     3.00   1.00   2.00      3.00   1.25   1.75
          b.     3.00   1.00   2.00      3.00   1.25   1.75
          c.     3.00   1.38   1.63      3.00   1.75   1.25
          d.     3.00   1.63   1.38      3.00   1.63   1.38
          Total  3.00   1.25   1.75      3.00   1.47   1.53
TOTAL            3.00   1.34   1.66      3.00   1.55   1.45

The UMLE always gives the largest estimate of variance, both when the mean is at the (5,5) center point and at the (4,4) corner point. The estimated variance of the UMLE must necessarily be equal to or greater than that of the constrained estimators because the constrained space square is a convex set.

TABLE 17
AVERAGE VALUES OF ESTIMATED VARIANCE FOR SQUARE MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML    MD
N=30      a.    10.415   .829   .835     10.415   .811   .783
          b.    10.693   .804   .825     10.693   .654   .773
          c.    14.342   .863   .862     14.342   .817   .809
          d.    14.267   .853   .851     14.267   .823   .808
N=60      a.     5.454   .751   .775      5.454   .717   .691
          b.     5.526   .737   .766      5.526   .542   .682
          c.     6.752   .804   .805      6.752   .741   .738
          d.     6.470   .794   .793      6.470   .731   .727
N=100     a.     2.967   .693   .710      2.967   .605   .572
          b.     3.278   .664   .725      3.278   .396   .564
          c.     4.093    ...    ...      4.093   .643   .646
          d.     4.051   .727   .726      4.051   .620   .619

Table 17 of the average values of the estimates shows that the UML estimate of variance is much larger than either of the two constrained estimates in both the (5,5) and (4,4) cases. This difference is greatest when the true population mean is on the boundary, but it is also very large when the mean is in the center of the constrained space. All of the estimates decrease as sample size increases, with the UML estimate showing the most substantial reduction. Consequently, the advantage in terms of smaller estimated variance of the MDE and CMLE over the UMLE is especially apparent in small samples, but was substantial even for the largest sample size.

A comparison of the CML and MD estimates in Table 16 indicates that the CMLE more frequently gives the smallest estimate when the mean is at (5,5). The situation is reversed in the (4,4) case, but the difference between the average ranks is not as great. Table 17 shows that the average values of the two constrained estimates are very close to one another, particularly when the covariance is zero (specifications c and d). A zero covariance implies that the major axis of the likelihood ellipse tends to be either vertical or horizontal, and therefore its point of tangency with the square (which has sides that are also either vertical or horizontal) represents not only the CML estimate, but the MD estimate as well. When the mean is moved from the center to the corner point, both of the constrained estimators give smaller average estimates of variance, as expected.

Estimated MSE A

The sampling results for estimated MSE A for the square model are summarized in Table 18 and Table 19.

TABLE 18
AVERAGE RANKS OF SMALLER ESTIMATED MSE A FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.38   1.63
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.63   1.38      3.00   1.88   1.13
          d.     3.00   1.50   1.50      3.00   1.63   1.38
          Total  3.00   1.28   1.72      3.00   1.59   1.41
N=60      a.     3.00   1.13   1.88      3.00   1.38   1.63
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.88   1.13      3.00   2.00   1.00
          d.     3.00   2.00   1.00      3.00   2.00   1.00
          Total  3.00   1.50   1.50      3.00   1.72   1.28
N=100     a.     3.00   1.00   2.00      3.00   1.38   1.63
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.38   1.63      3.00   1.75   1.25
          d.     3.00   1.63   1.38      3.00   1.63   1.38
          Total  3.00   1.25   1.75      3.00   1.56   1.44
TOTAL            3.00   1.34   1.66      3.00   1.62   1.38

As in the case of estimated variance, the UMLE always gives the largest estimate of MSE A, both in the case where the mean is at the (5,5) center point and when it is at the (4,4) corner point. The similarities between the estimated variance and the estimated MSE A are also evident in Table 19, where the average values of the estimates of MSE A of all three estimators are nearly the same as their estimates of variance, particularly in the (5,5) case, where the bias is essentially zero, and for the UMLE in the (4,4) case. In terms of MSE A, which considers each parameter separately, Ramsey (30, p.12) has proven analytically that the mean squared error of the MDE is less than that of the UMLE. The sampling experiments for the square model completely support Ramsey's theorem, since in all cases the estimated MSE A of the UMLE is larger than that of either the CMLE or the MDE. This held true for the experiments performed with the true population mean at the (5,5) center point as well as for those carried out with the mean at the (4,4) corner point.
As sample size increases, the UMLE, CMLE, and MDE all give smaller estimates of MSE A, with those of the UMLE decreasing most rapidly. This result suggests that all three estimators may be consistent estimators and that they may converge in terms of MSE A as sample size increases.

TABLE 19
AVERAGE VALUES OF ESTIMATED MSE A FOR SQUARE MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML     MD
N=30      a.    10.415   .829   .835     10.415   1.607   1.346
          b.    10.694   .804   .825     10.694    .983   1.324
          c.    14.401   .866   .865     14.401   1.375   1.368
          d.    14.269   .853   .851     14.269   1.452   1.424
N=60      a.     5.457   .751   .776      5.457   1.317   1.132
          b.     5.526   .737   .766      5.526    .785   1.133
          c.     6.758   .806   .806      6.758   1.293   1.289
          d.     6.485   .795   .795      6.485   1.268   1.266
N=100     a.     2.969   .693   .710      2.969   1.052    .901
          b.     3.281   .665   .726      3.281    .540    .898
          c.     4.094   .743   .744      4.094   1.064   1.070
          d.     4.058   .728   .727      4.058    .981    .978

While the sizes of the average estimates of the CMLE and MDE are very close to one another in all cases, the CMLE more frequently has the smallest estimate when the mean is at (5,5). However, the MDE most frequently has the smallest estimate when the mean is at (4,4). The average values table shows a departure from the similarities with the variance estimates. Instead of decreasing, the CML and MD estimates of MSE A increase when the mean is shifted to the (4,4) corner point. This is due entirely to the increase in estimated bias, since estimated variance decreased slightly when the mean shifted to (4,4). This result differs from the mean squared error graph drawn by Penneck (29, p.22), which shows the maximum mean squared error of the MDE at the midpoint of the restricted interval, at least in the univariate case. The CML estimate usually increases more than the MD estimate, except for specification b, where the variances are equal and the covariance is nonzero. Here the CML average estimate remains smaller than the MD average estimate.

Estimated MSE B

The sampling results for estimated MSE B are summarized in Table 20 and Table 21.

TABLE 20
AVERAGE RANKS OF SMALLER ESTIMATED MSE B FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.50   1.50      3.00   1.75   1.25
          d.     3.00   1.50   1.50      3.00   1.75   1.25
          Total  3.00   1.25   1.75      3.00   1.62   1.38
N=60      a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   2.00   1.00      3.00   2.00   1.00
          d.     3.00   2.00   1.00      3.00   2.00   1.00
          Total  3.00   1.50   1.50      3.00   1.75   1.25
N=100     a.     3.00   1.00   2.00      3.00   1.50   1.50
          b.     3.00   1.00   2.00      3.00   1.50   1.50
          c.     3.00   1.50   1.50      3.00   1.75   1.25
          d.     3.00   1.00   2.00      3.00   1.75   1.25
          Total  3.00   1.12   1.88      3.00   1.62   1.38
TOTAL            3.00   1.29   1.71      3.00   1.66   1.34

TABLE 21
AVERAGE VALUES OF ESTIMATED MSE B FOR SQUARE MODEL

                   Mean at (5,5)            Mean at (4,4)
                 UML     CML    MD        UML     CML     MD
N=30      a.     6.901   .756   .780      6.901   1.286   1.131
          b.     9.555   .784   .813      9.555    .930   1.255
          c.     7.615   .669   .667      7.615    .882    .874
          d.    13.889   .865   .861     13.889   1.425   1.402
N=60      a.     3.566   .655   .700      3.566   1.007    .895
          b.     4.696   .709   .757      4.696    .742   1.106
          c.     3.608   .576   .575      3.608    .768    .765
          d.     6.884   .800   .799      6.884   1.256   1.255
N=100     a.     1.962   .577   .608      1.962    .760    .660
          b.     2.773   .644   .697      2.773    .513    .860
          c.     2.174   .490   .491      2.174    .596    .599
          d.     3.952    ...    ...      3.952   1.032   1.030

Ramsey (30, p.22) has proven in his theorem 3 that even in the multiparameter situation the mean squared error B of the UMLE is larger than that of the MDE when the parameter space is a convex set. Ramsey's analytical proof is further verified by the sampling experiments for the square model, in that the estimated MSE B of the UMLE was always larger than that of the MDE for all sample sizes and for both the (5,5) centered population mean and the (4,4) corner mean. Table 21 shows that the average estimated MSE B of the UMLE is considerably larger than either of the constrained estimates. As sample size increases, the estimates of MSE B of all of the estimators decrease in both the (5,5) and (4,4) cases. As with variance and MSE A, the UML estimates decrease most rapidly, but remain larger than the constrained estimates. Table 21 also shows that almost all the average estimates are smaller when the variances are unequal (specifications a and c) than when the variances are equal (specifications b and d). Here again this is probably due to the fact that the sum of the squares of the variances is greater than the sum of their cross-products, as given in Table 1.

The average ranks table for the CML and MD estimates of MSE B shows that the CML estimate is more frequently smaller when the mean is at (5,5), while the MD estimate is more frequently smaller when the mean is at (4,4). Table 21 indicates that the CML and MD average estimates of MSE B are very close to one another, especially when the covariance is zero (specifications c and d). This result relates to the previous explanation concerning the major (and minor) axis of the likelihood ellipse being parallel to a side of the square. Shifting the mean to (4,4) almost always increases the size of the CML and MD average estimates, which again is a result of the increased estimated bias of the constrained estimators as the true population mean moves toward the boundary of the constrained space.

Estimated MSE D

The sampling results of estimated MSE D for the square model are summarized in Table 22 and Table 23. As in the points model, MSE D is a matrix, and the average values given in Table 23 are the values of the determinants of the MSE D difference matrices.

TABLE 22
AVERAGE RANKS OF SMALLER ESTIMATED MSE D FOR SQUARE MODEL

                   Mean at (5,5)           Mean at (4,4)
                 UML    CML    MD        UML    CML    MD
N=30      a.     3.00   1.00   2.00      3.00   1.00   2.00
          b.     3.00   1.00   2.00      2.75   1.00   2.25
          c.     3.00   1.00   2.00      3.00   1.50   1.50
          d.     3.00   1.25   1.75      3.00   1.50   1.50
          Total  3.00   1.06   1.94      2.94   1.25   1.81
N=60      a.     3.00   1.00   2.00      3.00   1.00   2.00
          b.     3.00   1.00   2.00      2.75   1.25   2.00
          c.     3.00   1.50   1.50      3.00   2.00   1.00
          d.     3.00   1.50   1.50      3.00   1.75   1.25
          Total  3.00   1.25   1.75      2.94   1.50   1.56
N=100     a.     3.00   1.00   2.00      2.75   1.25   2.00
          b.     3.00   1.00   2.00      2.75   1.25   2.00
          c.     3.00   1.00   2.00      3.00   1.00   2.00
          d.     3.00   1.00   2.00      3.00   1.25   1.75
          Total  3.00   1.00   2.00      2.87   1.19   1.94
TOTAL            3.00   1.10   1.90      2.92   1.31   1.77

Table 22 shows that the UMLE always gives the largest estimate of MSE D when the mean is at (5,5). When the mean is moved to the (4,4) corner point, it still gives the largest estimate most frequently, but not as often as the sample size increases. The CMLE gives a smaller estimate of MSE D more frequently than does the MDE in both the (5,5) and (4,4) cases. Table 23 indicates that the difference matrices for U-C and U-M are much larger than the matrix for C-M in both the (5,5) and (4,4) cases.

TABLE 23
AVERAGE VALUES OF ESTIMATED MSE D FOR SQUARE MODEL

                     Mean at (5,5)                    Mean at (4,4)
                 U-C       U-M      C-M          U-C       U-M      C-M
N=30      a.    17.353    18.852   -.121        15.695    17.643   -.023
          b.    39.251    43.089   -.193        33.842    30.537    .077
          c.     4.829     4.866   -.000         5.673     5.794   -.000
          d.   169.349   169.460    .000       154.337   154.337    .001
N=60      a.     3.056     3.410   -.061         2.823     3.388   -.004
          b.     7.696     9.161   -.132         6.009     3.610    .128
          c.      .665      .682   -.000         1.116     1.131    .000
          d.    36.854    36.862    .000        31.115    31.126   -.000
N=100     a.      .537      .642   -.024          .628      .857    .005
          b.     1.959     2.475   -.065         1.828      .469    .120
          c.      .057      .058   -.000          .376      .376   -.000
          d.    10.277    10.275   -.000         8.197     8.211   -.000

Since the determinants of U-C and
Since the determinants of U-C and 90 TABLE 23 AVERAGE VALUES OF ESTIMATED MSE D FOR SQUARE MODEL Mean at (5,5) Mean at (4,4) U-C U-M C-M U-C U-M C-M a. 17.353 18.852 -.121 a. 15.695 17.643 -.023 N=30 b. 39.251 43.089 -.193 b. 33.842 30.537 .077 c. 4.829 4.866 -.000 c. 5.673 5.794 -.000 d. 169.349 169.460 .000 d. 154.337 154.337 .001 a. 3.056 3.410 -.061 a. 2.823 3.388 -.004 N=60 b. 7.696 9.161 -.l32 b. 6.009 3.610 .128 c. .665 .682 -.000 c. 1.116 1.131 .000 d. 36.854 36.862 .000 d. 31.115 31.126 —.000 a. .537 .642 -.024 a. .628 .857 .005 N=100 b. 1.959 2.475 -.065 b. 1.828 .469 .120 c. .057 .058 -.000 c. .376 .376 -.000 d. 10.277 10.275 -.000 d. 8.197 8.211 -.000 U-M are strictly positive here as are the diagonal elements of the corresponding difference matrices, the MSE D of the UMLE is clearly larger than that of the MDE and the CMLE. Also in both the (5,5) and (4,4) cases the U-C and U-M de- terminants always decrease as sample size increases which suggests the possible convergence of the unconstrained and constrained estimators in terms of MSE D as sample size in- creases. When the mean is shifted from (5,5) to (4,4) the U-C and U-M determinants almost always decrease which proba- bly reflects the increased estimated bias Of the CMLE and the MDE in the (4,4) case since the UMLE is not affected by the constraints. When the mean is at (5,5), the C-M deter- minant is almost always negative, while in the (4,4) situa- tion there is an equal number of positive and negative 91 TABLE 24 AVERAGE RANKS OF SMALLER ESTIMATED MSE E FOR SQUARE MODEL Mean at (5,5) Mean at (4,4) UML CML MD UML CML MD a. 3.00 1.75 1.25 a. 3.00 1.50 1.50 V=3O b. 3.00 2.00 1.00 b. 3.00 1.50 1.50 ‘ c. 3.00 1.50 1.50 c. 3.00 1.75 1.25 d. 3.00 1.50 1.50 d. 3.00 1.75 1.25 Total 3.00 1.69 1.31 3.00 1.62 1.38 a. 3.00 2.00 1.00 a. 3.00 1.50 1.50 N=6O b. 3.00 2.00 1.00 b. 3.00 2.00 1.00 c. 3.00 2.00 1.00 c. 3.00 2.00 1.00 d. 3.00 2.00 1.00 d. 3.00 2.00 1.00 Total 3.00 2.00 1.00 3.00 1.87 1.13 a. 3.00 1.75 1.25 a. 3.00 1.50 1.50 N=100 b. 3.00 2.00 1.00 b. 3.00 1.50 1.50 c. 3.00 1.25 1.75 c. 3.00 1.50 1.50 d. 3.00 1.25 1.75 d. 3.00 1.75 1.25 Total 3.00 1.56 1.44 3.00 1.56 1.44 TOTAL 3.00 1.75 1.25 3.00 1.68 1.32 values. Table 23 also shows the C-M determinant is almost zero whenever the covariance is zero. This reflects the closeness of the CMLE to the MDE in terms of MSE D. Estimated MSE E The sampling results for estimated MSE E for the square model are summarized in Table 24 and 25. The average ranks table indicates that the UMLE always gives the largest esti- mate of MSE E in both the (5,5) and (4,4) cases. Table 25 also shows its estimates to be considerably larger than 92 TABLE 25 AVERAGE VALUES OF ESTIMATED MSE E FOR SQUARE MODEL Mean at (5,5) Mean at (4,4) UML CML MD UML CML MD a. 26.919 .564 .508 a. 26.919 1.284 .733 N=3O b. 53.723 .605 .540 b. 53.723 .711 1.523 c. 11.950 .409 .407 c. 11.950 .465 .454 d. 192.630 .748 .741 d. 192.630 1.691 1.626 a. 6.699 .414 .373 a. 6.699 .711 .435 M=60 b. 13.617 .500 .413 b. 13.617 .501 1.198 ‘ c. 3.097 .279 .277 c. 3.097 .272 .268 d. 47.219 .639 .638 d. 47.219 1.320 1.309 a. 2.055 .306 .285 a. 2.055 .383 .231 N=100 b. 4.775 .404 .353 b. 4.775 .249 .730 c. 1.043 .177 .177 c. 1.043 .119 .120 d. 15.585 .549 .549 d. 15.585 .906 .905 those of the CMLE and MDE. As sample size increases, the average estimates of all three estimators decrease with those of the UMLE decreasing most rapidly. This result sug- gests the possible consistency and convergence of the three estimators. 
Comparing the CML and MD estimate average ranks, Table 24 Shows that the MDE most frequently gave the smallest estimate Of MSE E, regardless of the location of themean. The average values of the CML and MD estimates are very close to one another, particularly when the covariance is zero (Specifications c and d). Zero covariance implies the parallel relationship between the likelihood ellipse and the square that often results in equal estimates for the CMLE and the MDE as previously discussed. When the mean is 93 shifted from (5,5) to (4,4) almost all of the constrained estimates of MSE E increase, especially for small sample Sizes which probably reflects the increased estimated bias observed as the true mean Shifts to the boundary of the con- strained Space. Summary of Square Model Sampling Results All three estimators appeared to be unbiased when the population mean was in the center of the constrained space. When the true population mean shifted to the boundary of the constrained Space, the UMLE appeared to remain unbiased but the CMLE and the MDE appeared to have small to moderate es- timated biases which were nearly the same Size. The esti- mated bias of the constrained estimators when the mean was at the boundary decreased as sample size increased. The estimated variances of the constrained estimators were considerably smaller than that of the UMLE at both the center and on the boundary of the constrained Space. The estimated variances of all the estimators tended to decline as sample size increased. The UMLE always had the largest estimated mean squared error for MSE A, B, and E, and almost always for MSE D. The estimated MSE A, B, and E of all three estimators always de- creased as sample size increased. It was not determined if the estimated MSE D was decreasing for all the estimators because it was in matrix form. For the estimates of MSE A and B the CMLE tended to do slightly better than the MDE when 94 the true population mean was in the center of the constrained space, but the reverse was true when the mean was at the boundary of the constrained space. For MSE D the CML esti- mate was generally slightly smaller than the MD estimate. For MSE E the MD estimate was generally slightly smaller than the CML estimate. When the true population mean was at the center of the constrained Space, the estimates of the skewness of the UMLE, the CMLE, and the MDE were very small relative to their standard deviations and the estimates were nearly always equlal. There were about an equal number of positive and neg’ative values which suggests that these estimators have asymmetric distributions when the population mean is at the center of the constrained space. The UMLE continued to have small estimated skewness when the mean shifted to the bound- ary of the constrained space, however the estimates of the Skewness of the CMLE and the MDE, while remaining nearly enzual to one another, increased. The ratio of the average «estimated skewness to the standard deviations for the con- strained estimators ranged from about 0.5 to 6.5 when the Ixzpulation mean was at the boundary, while the ratio for the UMLE remained essentially zero. The estimated kurtosis of the UMLE was quite small in alhuost all cases and fluctuated between positive and nega- tiAna'values. The estimates of the kurtosis of both of the constrained estimators were nearly the same and were almost always negative. 
When the mean was at the center point, 95 the ratio of the average estimates of the kurtosis of the constrained estimators to their standard deviations ranged from about -1 to -2. When the mean was at the boundary, the average ratio of the estimated kurtosis of the constrained estimators to their standard deviations ranged from about -.5 to -7.0. The CML and MD estimates of the four moments were quite close to one another in each case. When the mean was in the center of the constrained Space, the estimated first moments of all three estimators were nearly the same. When the mean Shifted to the boundary of the constrained space the esti- mated first moments of the CMLE and the MDE remained close to one another but were somewhat larger than that of the UMLE. The estimated second moment of the UMLE was somewhat larger than those of the constrained estimators when the mean was in the center of the constrained space, but the es- timated second moments of all three estimators were about the same when the mean shifted to the boundary of the con- strained space. The estimated third moment of the UMLE was greater than that of the constrained estimators when the mean was in the center of the constrained space, but became about equal to those of the constrained estimators when the mean was Shifted to the boundary. The estimated fourth moment of the UMLE was considerably larger than that of the constrained estimators when the mean was at the center point of the constrained space, and was about equal to the con- strained estimates when the mean was at the boundary. The 96 estimated third and fourth moments of the UMLE exhibited considerably greater variation than those of the constrained estimators when the mean was at the boundary of the con- strained space. 4.5 Elliptical Model An alternative formulation of the constrained Space in the continuous case is an ellipse. As in the case of the discrete points and the continuous square, the ellipse may also be centered around the point (5,5). In particular, the ellipse may be given by the equation: (y-S)2 (x-S)2 _ - 3 + 2 l (35) The UML estimate is the average of the observed sample values, and is designated for this bivariate normal situa- tion as E6 = (8u1,8u2) = (2,?) as described above. If the UML estimate is in or on the ellipse then the CML and MD estimates are equal to the UML estimate. Otherwise the MD estimate can be obtained by finding the values of 81 and 82 that minimize the expression D2 = (81-36)2 + (82-?)2 where D2 represents the square of the distance between the UML esti- mate and the MD estimate subject to the condition that the solution be on the ellipse. A boundary search procedure is used to obtain the MD estimates. If the UML estimate falls outside of the ellipse, the CML estimate is obtained by maxi- mizing the likelihood function (or minimizing the exponent of 97 the likelihood function) subject to the condition that the solution be a point on the constrained space ellipse. The CML estimates are also obtained by a boundary search proce- dure. Sampling Results Since in the discrete points model and in the square model the Shift from the (5,5) center point to the (4,4) corner point was accomplished by shifting both coordinates of the mean of the bivariate normal in the same direction and by the same amount, the two variates of the bivariate normal were considered together in the presentation of tables and exposition for the discrete points model and the square model. 
However, in the elliptical model the shift from the (5,5) center point to the (5,5-/3) bottom point on the ellipse is accomplished by changing only the second coordi- nate of the true population mean. Consequently the two variates of the bivariate normal will be treated separately in the tables and exposition for the elliptical model. Estimated Bias The sampling results for estimated bias for the ellipti- cal model are summarized in Table 26 and Table 27. When the true population mean is at the (5,5) center point both variates of the UMLE give the largest estimates of bias most frequently. When the mean is moved to the (5,5-/3) bottom point on the ellipse, however, the first variate of the UMLE 98 gives the largest estimate of bias less often than the CMLE. Meanwhile the second variate always gives the smallest es- timate of bias. The average values table shows that the es— timated bias of all three estimators is quite small and aboutfequal for all three estimators when the mean is at the center of the constrained space and also for the first variate when the mean is shifted to the bottom point on the constrained space ellipse. This result suggests that all TABLE 26 AVERAGE RANKS OF SMALLER ABSOLUTE ESTIMATED BIAS FOR ELLIPTICAL MODEL Mean at (5,5) V1 V2 UML CML MD UML CML MD a. 2.75 1.75 1.50 a. 2.50 1.75 1.75 N=30 b. 2.50 1.50 2.00 b. 3.00 1.25 1.75 C. 3.00 1.00 2.00 C. 2.75 1.25 2.00 d. 3.00 1.00 2.00 d. 3.00 1.25 1.75 Total 2.81 1.31 1.88 2.81 1.38 1.81 a. 3.00 1.50 1.50 a. 2.00 1.75 2.25 N=60 b. 3.00 1.75 1.25 b. 3.00 1.50 1.50 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 3.00 1.75 1.25 d. 2.75 1.75 1.50 Total 3.00 1.62 1.38 2.69 1.62 1.69 a. 2.75 1.75 1.50 a. 2.50 2.00 1.50 N=100 b. 2.50 2.00 1.50 b. 2.00 2.00 2.00 C. 2.25 1.50 2.25 C. 2.50 2.00 1.50 d. 2.25 2.00 1.75 d. 2.50 1.75 1.75 Total 2.44 1.81 1.75 2.37 1.94 1.69 TOTAL 2.75 1.58 1.67 2.62 1.65 1.73 99 TABLE 26 — Continued Mean at (5,5-/3) V1 V2 UML CML MD UML CML MD a. 1.50 3.00 1.50 a. 1.00 2.50 2.50 N=30 b. 1.75 2.75 1.50 b. 1.00 2.50 2.50 C. 3.00 1.25 1.75 C. 1.00 2.50 2.50 d. 3.00 1.25 1.75 d. 1.00 2.50 2.50 Total 2.31 2.06 1.63 1.00 2.50 2.50 a. 1.25 3.00 1.75 a. 1.00 2.50 2.50 N=60 b. 1.00 3.00 2.00 b. 1.00 2.25 2.75 C. 3.00 1.75 1.25 C. 1.00 2.50 2.50 d. 3.00 1.75 1.25 d. 1.00 3.00 2.00 Total 2.06 2.38 1.56 1.00 2.56 2.44 a. 1.00 3.00 2.00 a. 1.00 2.50 2.50 N=100 b. 1.00 3.00 2.00 b. 1.00 2.50 2.50 C. 2.75 2.00 1.25 C. 1.00 2.50 2.50 d. 2.50 2.00 1.50 d. 1.00 3.00 2.00 Total 1.81 2.50 1.69 1.00 2.62 2.38 TOTAL 2.06 2.31 1.63 1.00 2.56 2.44 three estimators may be unbiased in this special circum- stance. The exception is when the mean is Shifted to (5,5-/3). While this obviously does not affect the UMLE, it greatly increases the average estimated bias of the second ‘variates of the constrained estimators which become strictly positive. For both variates and for both means, the esti- mates of the UMLE are about equally negative and positive. Increasing the sample size does not appear to affect the size of the estimated bias of the UMLE which is practically zero anyway . 100 TABLE 27 AVERAGE VALUES OF ESTIMATED BIAS FOR ELLIPTICAL MODEL Mean at (5,5) v1 v2 UML CML MD UML CML MD a. -.006 -.009 -.006 a. -.234 -.072 -.090 N=30 b. -.045 -.025 -.018 b. -.031 -.029 —.022 c. -.006 -.000 .001 c. -.243 -.066 -.085 d. -.075 -.051 -.056 d. .035 .028 .028 a. .030 .029 .018 a. .032 .015 .020 N=60 b. .052 .033 .029 b. .006 .003 -.002 c. .030 .025 .011 c. .076 .040 .044 d. -.034 -.019 -.021 d. -.018 -.015 -.015 a. .006 .006 .008 a. 
.005 .005 .004 N=100 b. .034 .013 .009 b. -.058 -.027 -.032 c. .006 .007 .007 c. .014 .021 .021 d. -.015 -.014 -.015 d. -.016 -.013 -.012 Mean at (5,5-/3) v1 v2 UML CML MD UML CML MD a. -.006 -.083 -.035 a. -.234 1.177 1.100 ‘_ b. -.045 -.190 -.094 b. -.031 1.079 1.097 “-30 c. -.006 .001 .002 c. —.243 1.167 1.084 d. —.075 -.048 -.050 d. .035 .863 .863 a. .030 -.045 -.020 a. .032 1.055 1.012 _ b. .052 -.173 -.087 b. .006 .927 .937 N-50 c. .030 .028 .012 c. .076 1.042 .992 d. -.034 -.006 -.009 d. -.018 .632 .630 a. 0006 -0058 -0033 a. .005 .854 .824 _ b. .034 -.195 —.105 b. -.058 .744 .752 N-loo c. .006 .007 .005- c. .014 .828 .795 d. -.015 -.007 -.008 d. -.016 .474 .473 101 When the CMLE and MDE average ranks are compared it can be seen that the two estimators give the smallest estimates almost equally often in the (5,5) case with the CMLE doing slightly better for both variates. However when the mean is movedwto the bottom point at (5,5-/3), the variates perform differently. The first variate of the CMLE more frequently gives the largest estimate while that of the MDE more often gives the smallest. The second variate of the CMLE also gives the largest estimate most often, with the MDE average rank not far behind. Table 27 Shows that the average esti- mates of bias of both estimators are usually fairly close to one another. Also, the average estimates do not usually seem to be affected by sample size. The exception to this state- ment is the case of the second variate when the mean is at (5,5-/3). Increasing the sample size here reduced the average estimates of the CMLE and the MDE. This suggests the possibility that both constrained estimators may be asympto- tically unbiased. The second variate here is also the only place where all of the estimates of bias of the constrained estimators are positive. Elsewhere the number of positive and negative values are about equal. Estimated Variance The sampling results for estimated variance for the elliptical model are summarized in Table 28 and Table 29. The UMLE always gives the largest estimate of variance re— gardless of which variate or which mean is being examined. 102 Table 29 Shows that the UMLE'S average estimate of variance is usually considerably larger than the CML and MD estimates except for the first variate under Specification c where in this case the average estimated variance of the first vari- ate itself was quite small relative to its value for Speci- fications a, b, and d in this instance. AS a result all three estimators had smaller estimates for variance. The UMLE estimated variance is closer to that of the constrained TABLE 28 AVERAGE RANKS OF SMALLER ESTIMATED VARIANCE FOR ELLIPTICAL MODEL ——~.— ”*1“... -~ Mean at (5,5) v1 v2 UML CML MD UML CML MD a. 3.00 1.50 1.50 a. 3.00 1.50 1.50 1_30 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 *‘ c. 3.00 1.50 1.50 c. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.37 1.63 a. 3.00 1.50 1.50 a. 3.00 1.50 1.50 fl-60 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 “ c. 3.00 1.50 1.50 c. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.63 1.37 3.00 1.37 1.63 a. 3.00 1.75 1.25 a. 3.00 1.25 1.75 N_100 b. 3.00 2.00 1.00 , b. 3.00 1.00 2.00 ‘ c. 3.00 1.50 1.50 c. 3.00 1.50 1.50 d. 3.00 2.00 1.00 d. 3.00 1.00 2.00 Total 3.00 1.81 1.19 3.00 1.19 1.81 TOTAI. 3.00 1.69 1.31 3.00 1.31 1.69 103 TABLE 28 - Continued Mean at (5,5-/3) V1 V2 UML CML MD UML CML MD a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N=3O b. 3.00 1.75 1.25 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 
3.00 1.75 1.25 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.25 1.75 a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N=6O b. 3.00 1.75 1.25 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.25 1.75 d. 3.00 1.75 1.25 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.19 1.81 a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N=100 b. 3.00 11.50 1.50 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.25 1.75 d. 3.00 2.00 1.00 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.19 1.81 TOTAL 3.00 1.62 1.38 3.00 1.21 1.79 estimators were small to begin with since they were re- stricted to lie in the constrained Space, while that of the UMLE tends to decrease rapidly as a larger and larger pro- portion of its estimates fall within the constrained space. AS sample size increases the UMLE average estimate always de- creases fairly substantially. The average ranks of the two constrained estimators Show that the first variate of the MDE most frequently gives the smallest estimate of variance while the second variate of the (HTLE most often gives the smallest estimate, regardless of 104 TABLE 29 AVERAGE VALUES OF ESTIMATED VARIANCE FOR ELLIPTICAL MODEL Mean at (5,5) v1 v2 UML CML MD UML CML MD a. 3.382 .836 .635 a.' 10.415 1.431 1.731 d=30 b. 7.415 .918 .814 b. 10.693 1.355 1.512 c. .830 .535 .271 c. 14.342 1.750 2.142 d. 13.474 .957 .956 d. 14.267 1.442 1.444 a. 1.674 .682 .555 a. 5.454 1.399 1.589 N=60 b. 3.863 .873 .805 b. 5.526 1.266 1.368 c. .457 .375 .239 c. 6.752 1.700 1.904 d. 7.283 .960 .962 d. 6.470 1.291 1.286 a. .954 .538 .468 a. 2.967 1.265 1.370 N=100 b. 2.264 .793 .731 b. 3.278 1.128 1.221 c. .255 .237 .181 c. 4.093 1.610 1.696 . 3.840 .918 .916 d. 4.051 1.231 1.233 Mean at (5,5—/3) v1 v2 UML CML MD UML - CML MD a. 3.382 .788 .580 a. 10.415 1.174 1.443 N=30 b. 7.415 .822 .746 b. 10.693 1.102 1.281 c. .830 .520 .244 c. 14.342 1.501 1.814 d. 13.474 .907 .903 d. 14.267 1.286 1.283 a. 1.674 .606 .489 a. 5.454 .997 1.147 N=60 b. 3.863 .744 .698 b. 5.526 .902 1.020 c. .457 .368 .223 c. 6.752 1.364 1.511 d. 7.283 .871 .873 d. 6.470 1.069 1.062 a. .954 .443 .371 a. 2.967 .737 .811 N=100 b. 2.264 .611 .587. b. 3.278 .674 .766 c. .255 .226 .141 c. 4.093 1.024 1.089 d. 3.840 .765 .763 d. 4.051 .830 .831 105 the location of the mean. The average values table shows that the average estimates of the CMLE and MDE are rela- tively close to one another compared to the much larger Size of the UMLE. In addition the constrained average estimates always decrease as sample Size increases. The estimated variance of the constrained estimators for the second vari- ate is always larger than that for the first variate because the major axis of the constrained space ellipse is parallel to the axis of the second variate which allows a greater range for variation for the second variate than for the first variate. Estimated MSE A The sampling results for estimated MSE A for the ellip- tical model are summarized in Table 30 and Table 31. The UMLE always gives the largest estimate of MSE A when the mean iS at the (5,5) center point. When the mean is moved to the (5,5-/3) bottom point the UMLE still always gives the larg- est estimate for the first variate, however it occasionally does not give the largest estimate for the second variate. 
The estimated MSE A of the UMLE is occasionally smaller than that of the constrained estimators for the second variate partly because the estimated variance of the UMLE is gener- ally smaller than usual in this Case and partly because the estimated bias of the constrained estimators is considerably larger for the second variate when the population mean is set at the bottom point on the ellipse. Table 31 Shows that the v“ .. ,. 5‘ WV. 106 UMLE average estimate of MSE A is usually considerably larger than the average estimates of the CMLE and MDE. All Of the UMLE average estimates decrease as sample Size in- creases. .Comparing the CMLE and the MDE in the average ranks table Shows that the MD estimate of the first variate is most frequently the smallest in both the center point and bottom point mean situations. On the other hand, the CML TABLE 30 AVERAGE RANKS OF SMALLER ESTIMATED MSE A FOR ELLIPTICAL MODEL Mean at (5,5) V1 V2 UML CML MD UML CML MD a. 3.00 1.50 1.50 a. 3.00 1.50 1.50 N=30 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.38 1.62 a. 3.00 1.50 1.50 a. 3.00 1.50 1.50 N=60 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.62 1.38 3.00 1.38 1.62 a. 3.00 1.75 1.25' a. 3.00 1.25 1.75 N=100 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 3.00 2.00 1.00 d. 3.00 1.00 2.00 Total 3.00 1.81 1.19 3.00 1.19 1.81 TOTAL 3.00 1.68 1.32 3.00 1.32 1.68 v I’ I' ’(1 l4: TABLE 30 - Continued 1W Mean at (5,5-/3) V1 V2 UML CML MD UML CML MD a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N—3O b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 _ C. 3.00 1.50 1.50 C. 2.75 1.00 2.25 d. 3.00 1.75 1.25 d. 3.00 1.50 1.50 Total 3.00 1.69 1.31 2.94 1.12 1.94 a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N=60 b. 3.00 2.00 1.00 b. 3.00 1.00 2.00 C. 3.00 1.50 1.50 C. 2.75 1.25 2.00 d. 3.00 1.75 1.25 d. 3.00 2.00 1.00 Total 3.00 1.69 1.31 2.94 1.31 1.75 a. 3.00 1.50 1.50 a. 3.00 1.00 2.00 N=100 b. 3.00 2.00 1.00 b. 3.00 1.25 1.75 C. 3.00 1.50 1.50 C. 2.75 1.25 2.00 d. 3.00 2.00 1.00 d. 3.00 1.75 1.25 Total 3.00 1.75 1.25 2.94 1.31 1.75 TOTAL 3.00 1.71 1.29 2.94 1.25 1.81 estimate for the second variate is most frequently the smallest regardless of where the mean is. Table 31 Shows that the average MSE A estimates of the CMLE and the MDE are relatively close to one another. The average constrained estimates for the second variate are larger than those for the first variate. Shifting the mean to the (5,5-/3) bottom point of the ellipse slightly reduces the Size of the aver- age constrainted estimates for the first variate but in- creases the average estimates for the second variate which re- flects the positive bias for the second variate at the bottom 108 TABLE 31 AVERAGE VALUES OF ESTIMATED MSE A FOR ELLIPTICAL MODEL Mean at (5,5) V1 V2 UML CML MD UML CML MD a. 3.387 .839 .639 a. 10.415 1.431 1.731 N=30 b. 7.417 .919 .814 b. 10.694 1.356 1.513 C. .830 .535 .271 C. 14.401 1.755 2.149 d. 13.509 '.959 .958 d. 14.269 1.442 1.444 a. 1.675 .683 .556 a. 5.457 1.400 1.590 N=60 b. 3.866 .874 .806 b. 5.526 1.266 1.368 C. .458 .376 .239 C. 6.758 1.701 1.906 d. 7.283 .960 .962 d. 6.485 1.293 1.289 a. .954 .539 .469 a. 2.969 1.266 1.371 N=100 b. 2.265 .793 .731 b. 3.281 1.129 1.222 C. .255 .237 .181 C. 4.094 1.611 1.696 d. 3.846 .918 .916 d. 4.058 1.232 1.234 Mean at (5,5-/3) V1 V2 UML CML MD UML CML MD a. 3.387 .798 .580 a. 10.415 2.455 2.622 N=30 b. 7.417 .859 .754 b. 
10.694 2.267 2.485 C. .830 .520 .244 C. 14.401 2.862 2.989 d. 13.509 .910 .906 d. 14.269 2.793 2.769 a. 1.675 .632 .495 a. 5.457 1.844 1.934 N=60 b. 3.866 .774 .705 b. 5.526 1.762 1.897 C. .458 .368 .224 C. 6.758 2.449 2.495 d. 7.283 .871 .873 d. 6.485 2.251 2.243 a. .954 .468 .378 a. 2.969 1.274 1.306 N=100 b. 2.265 .649 .599 b. 3.281 1.228 1.333 C. .255 .226 .141 C. 4.094 1.710 1.721 d. 3.846 .765 .763 d. 4.058 1.582 1.583 109 TABLE 32 AVERAGE RANKS OF SMALLER ESTIMATED MSE B FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) UML CML MD UML CML MD a. 3.00 1.50 1.50 a. 3.00 1.50 1.50 N=30 b. 3.00 1.00 2.00 b. 3.00 l.25 1.75 c. 3.00 1.25 1.75 C. 3.00 l.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.31 1.69 3.00 1.44 1.56 a. 3.00 1.75 1.25 a. 3.00 l.50 1.50 N=60 b. 3.00 1.00 2.00 b. 3.00 1.25 1.75 C. 3.00 1.50 1.50 C. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 2.00 1.00 Total 3.00 1.44 1.56 3.00 l.56 1.44 a. 3.00 1.25 1.75 a. 3.00 1.50 1.50 N=lOO b. 3.00 1.00 2.00 b. 3.00 1.25 1.75 C. 3.00 l.50 1.50 C. 3.00 1.50 1.50 d. 3.00 1.00 2.00 d. 3.00 2.00 1.00 Total 3.00 1.19 1.81 3.00 1.56 1.44 TOTAL 3.00 1.31 1.69 3.00 1.52 1.48 point on the constrained space ellipse. In all cases the average constrained estimate of MSE A almost always declines as sample size increases which suggests that the CMLE and the MDE may be consistent estimators of the population mean. Estimated MSE B The sampling results for estimated MSE B for the ellip- tical model are summarized in Table 32 and Table 33. variates are combined under the definition of estimated The "a. \- ..5. .\ “' 110 TABLE 33 AVERAGE VALUES OF ESTIMATED MSE B FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) UML CML MD UML CML MD a. 6.901 1.135 1.185 a. 6.901 1.627 1.601 N=30 b. 9.555 1.137 1.164 b. 9.555 1.563 1.620 c. 7.615 1.145 1.210 c. 7.615 1.691 1.617 d. 13.889 1.201 1.201 d. 13.889 1.851 1.837 a. 3.566 1.041 1.073 a. 3.566 1.238 1.215 N=60 b. 4.696 1.070 1.087 b. 4.696 1.268 1.301 c. 3.608 1.038 1.073 c. 3.608 1.409 1.359 d. 6.884 1.127 1.126 d. 6.884 1.561 1.558 a. 1.962 .902 .920 a. 1.962 .871 .842 N=100 b. 2.773 .961 .976 b. 2.773 .938 .966 c. 2.174 .924 .938 c. 2.174 .968 .931 d. 3.952 1.075 1.075 d. 3.952 1.174 1.173 MSE B. The UMLE always gives the largest estimate of MSE B regardless of the location of the mean. Table 33 shows that the average UML estimate is always considerably larger than those of the CMLE and the MDE, which as in the case of the square tends to reconfirm Ramsey's theorem (30, p.22) for convex sets indicating that the MSE B of the UMLE should be larger than that of the MDE. UML average estimates decline. As sample size increases, the The average ranks table shows that the CML estimate of MSE B is most frequently smallest when the mean is at (5,5). The MDE gives smaller estimates slightly more often than the CMLE when the mean is moved to (5,S-/3). Table 33 shows that the CML and MD average estimates are relatively close to one another particularly when the variances are equal and the 111 ‘TABLE 34 AVERAGE RANKS OF SMALLER ESTIMATED MSE D FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) UML CML MD UML CML MD a. 3.00 1.00 2.00 2.75 1.00 2.25 N=30 b. 3.00 1.00 2.00 3,00 1.00 2.00 C. 3.00 1.00 2.00 2.75 1.00 2.25 d. 3.00 1.00 2.00 3.00 1.50 1.50 Total 3.00 1.00 2.00 2.87 1.13 2.00 a. 3.00 1.00 2.00 2.75 1.00 2.25 N'6O b. 3.00 1.00 2.00 3.00 1.00 2.00 — C. 3.00 1.00 2.00 2.75 1.25 2.00 d. 3.00 1.00 2.00 3.00 1.50 1.50 Total 3.00 1.00 2.00 2.87 1.19 1.94 a. 3.00 1.00 2.00 a 2.75 1.00 2.25 N‘lOO b. 
3.00 1.00 2.00 b. 3.00 1.00 2.00 _ C. 3.00 1.00 2.00 C. 2.75 1.25 2.00 d. 3.00 1.00 2.00 d. 3.00 1.50 1.50 Total 3.00 1.00 2.00 2.87 l.l9 1.94 TOTAL 3.00 1.00 2.00 2.87 1.17 1.96 covariance is nonzero. Shifting the mean to (5,5-/3) usually increases the size of the average estimates as a re- sult of the increase in bias. ple size the smaller the increase. increased, estimators decline. Estimated MSE D However, the larger the sam- As the sample size is all of the average estimates of the constrained The sampling results of estimated MSE D for the ellip- tical model are summarized in Table 34 and Table 35. As for 112 TABLE 35 AVERAGE VALUES OF ESTIMATED MSE D FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) U-C U-M C-M U-C U-M C-M a. 15.048 17.103 -.100 a. 13.529 15.300 -.046 N=3O b. 35.476 38.466 -.115 b.. 32.244 33.514 -.061 C. 3.734 6.841 -.104 C. 3.571 6.681 -.035 d. 160.863 160.854 -.000 d. 144.515 144.868 .000 a. 2.226 2.816 -.037 a. 2.232 2.699 -.013 N=60 b. 6.077 7.096 -.051 b. 5.623 6.010 -.023 C. .418 1.064 -.028 C. .383 .998 -.007 d. 32.819 32.830 -.000 d. 27.140 27.185 -.000 a. .297 .423 -.010 a. .468 .604 -.003 N=100 b. 1.285 1.618 -.023 b. 1.629 1.710 -.009 C. .043 .178 -.005 C. .068 .269 -.001 d. 8.254 8.252 -.000 d. 7.595 7.599 -.000 the two previous models, MSE D is a matrix and the average values given in Table 35 are the values of the determinants of the MSE D difference matrices. Table 34 shows that when the mean is at (5,5) the CMLE always gives the smallest estimate of MSE D, the MDE gives the second smallest estimate, and the UMLE always gives the largest estimate. When the mean is moved to (5,5-/§)there are a few exceptions, but the general pattern remains the same. The average values table shows that the estimated MSE D matrices of the CMLE and the MDE are always smaller that that of the UMLE. Table 35 also shows that shifting the mean to (S,5-/§) usually decreases the average difference between the UMLE and the constrained estimators' MSE D matrices, but less so as sample size increases. It can also be seen that the CML average estimate of the MSE D matrix is always slightly smaller or the same as the average MD esti- mate. The difference is always zero when the variances are equa1»and the covariance is zero (specification d) where the likelihood "ellipse" approximates a circle. Shifting the mean to (5,5-/3) decreases the difference between the CML and MD estimated MSE D matrices. Increasing the sample size also decreases the difference which suggests that the constrained estimators may converge asymptotically. Estimated MSE E The sampling results for estimated MSE E for the ellip- tical model are summarized in Table 36 and Table 37. The UMLE always gives the largest estimate of MSE E regardless of the location of the mean. Table 37 shows that its average estimates are considerably larger than those of the CMLE and the MDE. As sample size increases, the estimates of all three estimators decrease, with the UMLE estimate declining most rapidly suggesting its convergence asymptotically to the constrained estimators. The MDE more frequently gives the smallest estimate of MSE than does the CMLE. This is true in both the (5,5) and the (5,5-/3) cases. Table 37 shows that the constrained average estimates are relatively close to one another par- ticularly when the variances are equal and the covariance is zero (specification d). 
Shifting the mean to (5,5-f3) tends 114 TABLE 36 AVERAGE RANKS OF SMALLER ESTIMATED MSE E FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) UML CML MD UML CML MD a. 3.00 2.00 1.00 a. 3.00 1.50 1.50 N=30 b. 3.00 2.00 1.00 b. 3.00 2.00 1.00 c. 3.00 2.00 1.00 c. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 1.50 1.50 Total 3.00 1.87 1.13 3.00 1.62 1.38 a. 3.00 2.00 1.00 a. 3.00 1.50 1.50 “:60 b. 3.00 2.00 1.00 b. 3.00 2.00 1.00 * c. 3.00 2.00 1.00 c. 3.00 1.50 1.50 d. 3.00 1.50 1.50 d. 3.00 2.00 1.00 Total 3.00 1.87 1.13 3.00 1.75 1.25 a. 3.00 2.00 1.00 a. 3.00 1.50 1.50 N=100 b. 3.00 2.00 1.00 b. 3.00 1.75 1.25 . 3.00 2.00 1.00 c. 3.00 1.50 1.50 d. 3.00 1.00 2.00 d. 3.00 2.00 1.00 Total 3.00 1.75 1.25 3.00 1.69 1.31 TOTAL 3.00 1.83 1.17 3.00 1.69 1.31 to increase the average estimates of the CMLE and the MDE, especially for small sample sizes which reflects the in- creased estimated bias for the constrained estimators. When the sample size is 100, the estimates are actually decreased in some cases which results from the compensating decrease in estimated variance especially for the second variate. 115 TABLE 37 AVERAGE VALUES OF ESTIMATED MSE E FOR ELLIPTICAL MODEL Mean at (5,5) Mean at (5,5-/3) UML CML MD UML CML MD a. 26.919 1.192 1.021 a. 26.919 1.907 1.414 N—30 b. 53.723 1.245 1.108 b. 53.723 1.878 1.666 _ C. 11.950 .938 .583 C. 11.950 1.489 .730 d. 192.630 1.383 1.383 d. 192.630 2.532 2.500 a. 6.699 .907 .773 a. 6.699 1.061 .831 N=60 b. 13.617 1.066 .934 b. 13.617 1.254 1.139 C. 3.097 .639 .455 C. 3.097 .901 .557 d. 47.219 1.241 1.241 d. 47.219 1.960 1.958 a. 2.055 .623 .559 a. 2.055 .514 .411 N—lOO b. 4.775 .829 .742 b. 4.775 .688 .645 — C. 1.043 .382 .306 C. 1.043 .387 .243 d. 15.585 1.130 1.131 d. 15.585 1.209 1.207 Summary of Elliptical Model Sampling Results In the elliptical model the estimated bias was essen- tially zero for both variates of all three estimators when the mean was in the center of the constrained space and for the first variate when the mean was at the bottom of the con- strained space ellipse. For the second variate when the mean was at the bottom of the ellipse, the UML estimate continued to appear to be unbiased but both of the constrained esti- mators had moderate positive estimated bias that declined as sample size increased. than that of the constrained estimators. However, The estimated variance of the UMLE was always greater the esti- mated variance of the UMLE tended to converge toward that of 116 the constrained estimators as sample size increased. The estimated variance of the CMLE was quite close to that of the MDE in most cases. The estimated variance of each of the three estimators always decreased as sample size increased. The UMLE always had the largest estimated mean squared error for MSE A, B, and E. It also always had the largest estimate of MSE D when the mean was at the center of the constrained space, and almost always had the largest estimate of MSE D when the mean was at the bottom of the ellipse. The MDE tended to have the smallest estimate of MSE A for the first variate while the CMLE tended to have the smallest es— timate of MSE A for the second variate. For estimated MSE B, the CMLE tended to have the smallest estimate when the mean was at the center of the ellipse, while the MDE tended to have the smallest estimate when the mean was at the bottom of the ellipse. 
The CMLE always had the smallest estimate of MSE D when the mean was in the center of the Constrained space and almost always had the smallest estimate when the mean was at the bottom. The MDE almost always had the smallest estimate of MSE E when the mean was at the center of the constrained space and usually had the smallest estimate when the mean was at the bottom. In all of the estimates of mean squared error as variously defined, the CMLE's estimate of MSE was quite close to that of the MDE. When the true population mean was at the center of the constrained space, the estimates of the skewness of the UMLE, the CMLE, and the MDE were very small relative to their 117 standard deviations, and the estimates were nearly always equal. There were about an equal number of positive and negative values which suggests that these estimators have symmetric distributions when the population mean is at the center of the constrained space. The UMLE as well as the first variate of the constrained estimators continued to have small estimated skewness when the mean was shifted to the bottom of the elliptical constrained space. However for the second variate the estimates of skewness of the CMLE and the MDE increased and were almost always positive when the mean shifted to the boundary. The ratio of the average es— timates of skewness to the standard deviations for the second variates of the constrained estimators when the mean was at the boundary ranged from about 0.5 to 5.0. The estimated kurtosis of the UMLE was quite small in almost all cases and fluctuated between positive and nega- tive values. When the mean was at the center point the es— timates of kurtosis of both the constrained estimators were nearly the same and were usually negative. The ratio of kurtosis to their standard deviations ranged from about -0.1 to -1.0. When the mean was shifted to the boundary of the constrained space the ratios of kurtosis to the standard deViations for the first variates ranged from -0.5 to -1.5 While the ratios of the second variate ranged from -l.0 to +6.0. The CML and MD estimates of the four moments were quite (Hinge to one another in each case. When the mean was in the 118 center of the constrained space, the estimated first moments of all three estimators were nearly the same. When the mean shifted to the boundary the first moments for the first variate of the three estimators were about the same, but the UMLE's.first moment for the second variate tended to be smaller than those of the constrained estimators. The esti- mated second moment of the UMLE was somewhat larger than those of the constrained estimators when the mean was in the center of the constrained space. When the mean was at the bottom of the ellipse, the second moments of the three esti- mators for the first variate were about the same, but the second moment of the UMLE for the second variate tended to be smaller than those of the constrained estimators. The estimated third moment of the UMLE was greater than those of the constrained estimators when the mean was in the center of the constrained space and was greater for the first vari- ate when the mean was at the boundary. The estimated third moment of the UMLE for the second variate was often smaller but occasionally somewhat larger than those of the con- strained estimators, and in general exhibited greater varia- tion than those of the constrained estimators. 
The estimated fourth moment of the UMLE was generally quite a bit larger than those of the constrained estimators, although occasion- ally its estimated fourth moment for the second variate was smaller than those of the constrained estimators when the mean was at the boundary. 119 4.6 ‘Conclusion The above Monte Carlo sampling experiments have clearly demonstrated the superiority of the MDE and the CMLE over the UMLE in terms of the various definitions of mean squared errorvunder different specifications and conditions as sug— gested by Ramsey (30, p.20). In addition, the virtual equi- valence of the MDE and the CMLE in a number of situations has been demonstrated. In general, there has not been a great deal of difference between the results obtained for the MDE and those obtained for the CMLE. In those circumstances where the CMLE gave a smaller mean squared error than the MDE, the MDE may be preferred because of its ease of calcu- lation relative to the difficulties encountered in calcula- ting the CMLE for situations involving finitely bounded compact sets. Now that the MDE has been shown to lead to considerable reduction in mean squared error, examples of its use in single sample situations will be presented in Chapter V. a; F‘u .. . 5L n- 5. ha '5‘. CHAPTER V APPLYING THE MINIMUM DISTANCE ESTIMATOR Having discussed the testing procedures of Chapter II and the minimum distance estimating procedures of Chapters III and IV, Chapter V will now demonstrate the use of the tests and the minimum distance estimator in single sample situations where restrictions are imposed on various produc- tion functions. This chapter will demonstrate under what circumstances and to what extent in these particular examples use of the minimum distance estimator can result in a reduc- tion in the estimated variance of the regression coefficients and how this reduction in estimated variance is not substan- tially offset by an increase in estimated bias in the calcu- lation of estimated mean squared error. Restrictions may be imposed for any number of reasons including a_priori theoretical considerations, restrictions suggested by previous research and published studies, or by testing procedures such as those described in Chapter III. In the heuristic examples given below of applying the mini- mum distance estimator, no attempt has been made to justify each restriction demonstrated although some such reasons may occur to the reader for these or for additional restrictions in these or in other examples of interest. This technique may be applied to data from Kendrick (21, p.192) using a log-linear representation of a CBS pro- duction function derived from a truncated Taylor series ex- pansion of the form: 120 a u v. ar v. . we r . b‘ t H .. 1~ 9 logQ = b0 + bllogK + bzlogL + b3(logK-logL)2 + e (56) where Q = output, K = capital, L = labor, and e - error term. ho is the constant intercept term, b1 may be referred to as the capital coefficient, b2 may be referred to as the labor coefficient, and b3 is the coefficient of the substitution term. The above equation may be estimated by ordinary least squares. Tests may then be performed on various proposed restrictions which might be incorporated by use of the mini- mum distance estimator. Tests of these inequality restric- tions are performed as special cases of the test procedures discussed in Chapter III and are in terms of the student-t distribution. 
Table 39 lists some possible restrictions on the coefficients of labor and capital where the test statis- tic derived from the regression for each proposed restric- tion must be greater than the critical value of -l.7 for the restriction to be accepted at the 5% level of significance. None of the coefficients violated the restriction that the coefficients be greater than zero. Consequently, this re- striction was accepted at the 5% level for the coefficients of capital and labor at each of the three sample sizes. The restriction that the coefficients by greater than one-half was violated for the labor coefficient in both the sample size 60 and sample size 78 cases. However, these violations were not substantial enough to cause the one-half restric- tion to be rejected in any of the tests of these coeffici- ents. All of the labor coefficients but none of the capital ‘a- 'N 51 coefficients violated the restriction requiring the coeffi— cients to be greater than one, but this restriction was re- jected only for the labor coefficient in the sample size 78 case. All of the coefficients violated the.restriction that the coefficients be greater than two. Moreover, the restric- tion was rejected at the 5% level in every case except for the capital coefficient for sample size 60 where the esti- mated variance was noticeably larger than for the other two sample sizes. This larger variance might explain why the test statistic was not sufficiently large enough to result in the rejection of the restriction at the 5% level in this one case. The estimated variance of the minimum distance estima- tor, the estimated absolute value of the bias of the minimum distance estimator, and the estimated mean squared error of the minimum distance estimator are also given in Table 38 along with the percent reduction in mean squared error re- sulting from the use of the minimum distance estimator in- stead of the unconstrained estimator. The maximum absolute bias was also calculated and happens to correspond to the estimated absolute bias under the restrictionz% 3.2 in each case. The reduction in estimated variance was generally not substantially different although usually slightly larger than that of the estimated mean squared error. The reduc- tion in estimated variance tended to increase as progres- sively more stringent restrictions were placed on the regres- sion coefficients. Thus in this sense the efficiency of the 123 TABLE 38 ESTIMATED VARIANCE, ABSOLUTE BIAS AND MEAN SQUARED ERROR OF THE MDE FOR THE KENDRICK REGRESSIONS 6 Est. Test 6m Est. Est. Est. MSE u Var Stat. 
> Var |Bias| MSE Reduc— 0 — 0 8 6 tion u m m m N=30 Labor .655 .108 2.0 0 .106 .001 .106 3% .5 L .059 .068 .063 42% 1.0 1 .001 .131 .017 85% 4.1 2 .000 .131 .000 100% Capital 1.246 .036 6.5 0 .036 .000 .036 % 3.9 8 .036 .000 .036 0% 1.3 1 .031 .007 .031 16% 4.0 2 .000 .076 .000 100% N=60 Labor .282 .382 .5 0 .204 .130 .221 42% .4 8 .078 .247 .139 64% 1.2 1 .000 .247 .048 87% 2.8 2 .000 .247 .001 99% Capital 1.658 .482 2.4 0 .479 .001 .479 1% 1.7 k .445 .010 .445 8% .9 1 .353 .063 .357 26% .5 2 .074 .277 .150 69% N=78 Laxn: .428 .059 1.8 0 .055 .003 .055 6% .3 b .013 .096 .023 62% 2.4 1 .000 .097 .001 99% 6.5 2 .000 .097 .000 100% Capital 1.549 .053 6.7 0 .053 .000 .053 0% 4.6 8 .053 .000 .053 0% 2.4 1 .052 .000 .052 % 2.0 2 .000 .092 .001 97% 124 estimates of the coefficients of labor and capital for this production function tended to improve substantially when the minimum distance estimator was used to impose effective a_ prior} restrictions in the estimation process. The contri- bution of the minimum distance estimator to production func- tion theory consists of the extent to which this improvement in efficiency is important to the production function theo- rist which in turn may depend upon the nature and stringency of the restrictions that such a theorist deems reasonable to impose. In another example, this approach may be applied to the production functions estimated by Ramsey and Zarembka (32, p.5) in their study of alternative function forms of the aggregate production function. The stochastic formula- tions of the five functional forms used were the Cobb- Douglas (CD) production function: 1n y1 = 1n «0 + «1 1n ki + «2 1n Li + ui; the constant elasticity of substitution (CES) production function: a ' I .p/v _ p p j. — élki + ézLi + u21 l the variable elasticity of substitution (VES) production function: JG LL 125 1n yi = 1n y + ¢(l-6p) 1n ki + 0C(Spln(Li+(p-1)ki) + u .; 31 the generalized production function (GPF): ln-yi + yyi = ln «0 + «1 1n ki + «2 ln Li + uni; and the quadratic production function (QP): _ 2 y. _ «0 + alLi + azki + a3Li + «uki + asLiki + uSi where yi is the value added of aggregate output of manufac- turing industry in each state in 1957, k1 is the value of capital services, and Li is the labor input in terms of em- ployment in each state in 1957. The CD and QP production functions were estimated by ordinary least squares while the other three were estimated by maximum likelihood. Table 39 indicates the reduction in mean squared error which is achieved by using the minimum distance estimator for the capital and labor coefficients. A percent reduction approaching 100% implies an estimated variance approaching zero and suggests that fewer and fewer observations fall in the constrained region. The sensitivity of the reduction in estimated variance to the restriction appears to depend upon the size of the unconstrained estimate relative to its esti— mated variance. In this case slightly tighter restrictions resulted in substantial reductions in estimated variance. The degrees of freedom for the Ramsey-Zarembka regressions q u .3" $- vav . ‘ - 9 I 126 TABLE 39 ESTIMATED VARIANCE, ABSOLUTE BIAS AND MEAN SQUARED ERROR OF THE MDE FOR THE RAMSEY-ZAREMBKA REGRESSIONS 8n Est. Test 0 Est. Est. Est. MSE Var. Stat. Var |Bias| MSE Reduc- 0 > 8 0 0 tion u —- m m m I. 
C-D Labor .689 .003 13.2 0 .003 .000 .003 % 3.6 2 .003 .000 .003 0% 6.0 l .000 .021 .000 100% -25.2 2 .000 .021 .000 100% Capital .313 .003 5.7 0 .003 .000 .003 0% 3.4 8 .000 .022 .000 100% -12.4 1 .000 .022 .000 100% -30.6 2 .000 .022 .000 100% II. CES Labor .096 .000 16.0 0 .000 .000 .000 0% -150.7 1 .000 .002 .000 100% -317.3 2 .000 .002 .000 100% Capital .657 .986 .7 O .613 .150 .635 36% .2 8 .400 .323 ' .504 49% .3 1 .204 .396 .361 63% 1.4 2 .067 .396 .090 91% III. VES Labor 1.152 .007 13.6 0 .007 .000 .007 0% 7.7 8 .007 .000 .007 % 1.8 l .007 .001 .007 6% 10.0 2 .001 .034 .000 100% Capital -.138 .007 1.6 0 .003 .034 .004 50% ' 7.4 k .001 .034 .000 100% -13.2 1 .001 .034 .000 100% -24.9 2 .001 .034 .000 100% j as Labor caPit 127 TABLE 39 - Continued 8n Est. Test 8 Est. Est. Est. MSE Var. Stat. > Var IBiasI MSE Reduc- 8 — e 8 8 tion 11 m m III IV. GPE Labor .678 .002 13.8 0 .002 .000 .002 0% 3.6 8 .002 .000 .002 0% - 6.6 1 .000 .020 .000 100% -27.0 2 .000 .020 .000 100% Capital .300 .003 5.8 0 .003 .000 .003 0% - 3.8 k .000 .021 .000 100% —13.5 1 .000 .021 .000 100% —32.7 2 .000 .021 .000 100% v. QP Labor .004 .000 6.7 0 .000 .000 .000 0% - 826.7 8 .000 .000 .000 100% -l660.0 1 .000 .000 .000 100% -3326.7 2 .000 .000 .000 100% Capital 3.318 .687 4.0 0 .687 .000 .687 0% 3.4 8 .687 .001 .687 0% 2.8 1 .687 .002 .687 0% 1.6 2 .626 .015 .627 8% range from 43 for the quadratic production function to 47 for the constant elasticity of substitution production function resulting in a critical value of approximately -1.7 for the tests of the restrictions at the 5% level of significance. The results range from almost all of the proposed restric- tions being rejected for the capital coefficient of the variable elasticity of substitution production function to all of the proposed restrictions being accepted for the capital coefficient of the quadratic production function. The coefficients whose test statistics showed the greatest /\ Ch 11 128 sensitivity to the ever more stringent restrictions had the smallest estimated variances. In particular, the estimated variances of the coefficients of labor in both the CBS and QP models were very small resulting in test statistics that went from positive to quite large negative values. On the other hand, the estimated variances of the coefficients of capital in both the CBS and QP models were quite large re- sulting in test statistics that did not change radically as more stringent restrictions were imposed. Just as in the Monte Carlo experiments of Chapter IV, the estimated vari- ance tended to dominate the estimated bias in the calcula- tion of estimated mean squared error. The maximum absolute bias again happens to correSpond to the estimated absolute bias under the restriction 0m :.2 in each case except for that relating to the capital coefficient for the quadratic production function where the maximum absolute bias is .331. In general, tighter restrictions result in a considerable reduction in variance especially where the minimum distance estimate is at the boundary of the constrained space. Ferguson's (8, p.135) time-series study of a CBS pro- duction function provides a convenient example of the effect of the minimum distance estimator on the estimated variance of the regression coefficient representing the estimated elasticity of substitution. The elasticity of substitution between labor and capital was estimated in the lumber, fur- niture, paper, chemicals, petroleum, metal 1, and machinery industries with the equation: l - a v ”(‘1 .AU' I LU mo. 
H O C. L .1 in 7,“ 1 31 \ U. 2. t ‘1‘ v; p 129 TABLE 40 ESTIMATED VARIANCE, ABSOLUTE BIAS AND MEAN SQUARED ERROR OF THE MDE FOR THE FERGUSON REGRESSIONS —~ 8 Industry u Est. Test 8m Est. Est. Est. MSE Var. Stat. > Var lBiasl MSE Reduc- 0 — 0 0 tion u m m m Food .241 .040 1.2 0 .033 .008 .033 18% - 1.3 8 .001 .080 .004 89% - 3.8 1 .000 .080 .000 99% - 8.8 2 .000 .080 .000 100% Tobacco 1.183 .212 2.6 0 .212 .010 .212 0% 1.5 8 .191 .003 .191 10% .4 1 .107 .106 .118 44% - 1.8 2 .000 .184 .011 95% Apparel 1.084 .026 6.8 0 .026 .000 .026 % 3.7 8 .026 .001 .026 0% .5 1 .014 .030 .015 40% - 5.7 2 .000 .064 .000 100% Lumber .905 .004 13.5 0 .004 .000 .004 0% 6.0 8 .004 .000 .004 0% - 1.4 1 .000 .027 .000 91% -16.3 2 .000 .027 .000 100% Furniture 1.123 .002 25.0 0 .002 .000 .002 0% 13.8 8 .002 .000 .002 0% 2.7 l .002 .001 .002 0% -19.5 2 .000 .018 .000 100% Paper 1.016 .004 16.9 0 .004 .000 .004 0% 8.6 8 .004 .000 .004 0% .3 1 .002 .017 .002 47% -16.4 2 .000 .024 .000 100% Printing 1.147 .096 3.7 0 .096 .002 .096 0% 2.1 8 .096 .006 .096 0% .5 1 .052 .063 .056 42% - 2.8 2 .000 .124 .001 99% Chemicals 1.248 .005 17.3 0 .005 .000 .005 0% 10.4 8 .005 .000 .005 0% 3.4 1 .005 .001 .005 0% -10.4 2 .000 .029 .000 100% 'u (D rr ,u L: t, 3 ‘ t (I) Lu I in () »—_7 far TABLE 40 - Continued 130 > Industry Bu Est. Test 0m Est. Est. Est. MSE Var. Stat. > Var lBiasl MSE Reduc- 0 '— 0 tion u m m m Petroleum 1.300 .022 8.7 0 .022 .000 .022 0% 7 5.4 8 .022 .000 .022 0% 2.0 1 .022 .003 .022 0% - 4.7 2 .000 .059 .000 100% Rubber .759 .314 1.4 0 .274 .007 .274 12% .5 8 .167 .116 .181 42% - .4 1 .057 .223 .107 66% - 2.2 2 .000 .223 .010 97% Leather .865 .020 6.2 0 .020 .000 .020 0% 2.6 8 .020 .003 .020 0% - 1.0 1 .000 .056 .004 82% - 8.1 2 .000 .056 .000 100% Glass .666 .221 1.4 O .197 .002 .197 11% .4 8 .107 .115 .120 45% - .7 1 .020 .188 .055 75% - 2.8 2 .000 .188 .003 99% Metal 1 1.200 .011 11.4 0 .011 .000 .011 0% 6.7 8 .011 .000 .011 0% 1.9 l .011 .003 .011 0% - 7.6 2 .000 .042 .000 100% Metal 2 .926 .068 3.6 0 .068 .002 .068 0% 1.6 8 .064 .001 .064 6% - .3 1 .016 .104 .026 61% - 4.1 2 .000 .104 .000 100% Machinery 1.041 .002 25.4 0 .002 .000 .002 0% 13.2 8 .002 .000 .002 0% 1.0 1 .001 .003 .001 24% -23.4 2 .000 .016 .000 100% Electrical .643 .130 1.8 0 .126 .004 .126 0% .4 8 .066 .083 .072 44% - 1.0 1 .002 .144 .022 83% - 3.8 2 .000 .144 .000 100% Transport .237 .314 .4 0 .162 .124 .178 43% - .5 8 .052 .223 .102 68% - 1.4 1 .000 .223 .032 90% - 3.1 2 .000 .223 .002 99% l r—T! (I) We r81 tic 131 TABLE 40 - Continued —_ —~ Industry 8n Est. Test 8m Est. Est. Est. MSE Var. Stat. > Var. [Biasl MSE Reduc- 0 — 0 0 0 tion u m m m. Instruments .763 .084 2.6 0 .084 .006 .084 0% . .9 8 .060 .026 .061 27% .8 1 .005 .116 .018 78% 4.3 2 .000 .116 .000 100% Textiles 1.104 .194 2.5 0 .194 .010 .194 0% 1.4 8 .169 .008 .169 13% .2 1 .085 .128 .101 48% 2.0 2 .000 .176 .007 97% log v = a + b1 log w + u where v is value added per man-year, w is the real wage rate, and b1 is an estimate of the elasticity of substitu- tion between labor and capital. The other 12 industries had estimates based on the equation: log v = a + b1 log w + bzt + u where t is an index of time. The observations were for the years 1949 through 1961 except for rubber, glass, and metal 1 which only covered up through 1958 because of material changes in industry designations for those three industries. The equations for all 19 industries were estimated by ordi- nary least squares. 
The rubber, glass, and metal 1 equations have about seven degrees of freedom, with a critical value for the tests of approximately -1.9 at the 5% level. All the other equations have about ten degrees of freedom, with a critical value of about -1.8. None of the coefficients in the Ferguson regressions violated the restriction that the coefficients be greater than zero. Consequently, this restriction was accepted at the 5% level in the tests for every industry. The restriction that the coefficients be greater than one-half was violated only by the food and transport industries, but not by a sufficient amount for the restriction to be rejected. Nine industries had coefficients that violated the restriction that they be greater than one, although only in the case of the food industry was this violation substantial enough to cause the restriction to be rejected at the 5% level. The coefficients of all 19 industries violated the restriction that they be greater than two, and the restriction was rejected for all 19 industries at the 5% level, although the rejection was marginal for the tobacco industry. Consequently, it is not surprising that the percentage reduction in mean squared error approached 100% for the restriction that the coefficients be greater than two. The estimated absolute bias under the restriction θ_m = 2 is the same as the maximum absolute bias in each case. In virtually every case the estimated bias was offset by the substantial reduction in estimated variance brought about by using the minimum distance estimation technique.

The above examples demonstrate that the minimum distance estimator can be used to impose a priori inequality restrictions that result in a substantial reduction in estimated variance, although, as expected, this effect is greatest when the restrictions are violated by the unrestricted estimates. This result confirms the usefulness and effectiveness of the minimum distance estimator in incorporating inequality restrictions into the estimation procedure, as demonstrated by the Monte Carlo experiments in Chapter IV.

Conclusion

This dissertation has emphasized the importance of using all information, both a priori and empirical, in the estimation process, in particular when the a priori information is in the form of inequality restrictions. Procedures for testing the appropriateness of inequality restrictions have been developed. The distributional properties of the minimum distance estimator have been compared to those of the unconstrained maximum likelihood estimator and the constrained maximum likelihood estimator in extensive Monte Carlo sampling experiments. It has been shown that the minimum distance estimator is greatly superior to the unconstrained maximum likelihood estimator in terms of various definitions of mean squared error. The equivalence of the constrained maximum likelihood estimator and the minimum distance estimator in a number of circumstances has been indicated by noting that in general there has not been a substantial difference between the distributional properties of the minimum distance estimator and those of the constrained maximum likelihood estimator. A single sample approximation of the variance, absolute bias, and mean squared error of the minimum distance estimator has been developed, and many examples of applying the minimum distance estimator in regression analysis have been presented indicating the effect of increasingly stringent inequality restrictions.
Thus, in general, the usefulness and effectiveness of the minimum distance estimator have been demonstrated.

LIST OF REFERENCES

1. Bancroft, T. A. "On Biases in Estimation Due to the Use of Preliminary Tests of Significance." The Annals of Mathematical Statistics 15 (1944): 190-204.

2. Berman, Simeon M. Mathematical Statistics. Scranton: International Textbook Company, 1971.

3. Brown, Murray, ed. The Theory and Empirical Analysis of Production. New York: National Bureau of Economic Research, 1967.

4. Chernoff, Herman, and Moses, Lincoln E. Elementary Decision Theory. New York: John Wiley & Sons, 1959.

5. Chipman, J. S., and Rao, M. M. "The Treatment of Linear Restrictions in Regression Analysis." Econometrica 32 (1964): 198-209.

6. Chow, G. C. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions." Econometrica 28 (1960): 591-605.

7. Cramér, Harald. Mathematical Methods of Statistics. Princeton: Princeton University Press, 1946.

8. Ferguson, C. E. "Time Series Production Functions and Technological Progress in American Manufacturing Industry." Journal of Political Economy, April 1965: 135-147.

9. Ferguson, Thomas S. Mathematical Statistics: A Decision Theoretic Approach. New York: Academic Press, Inc., 1967.

10. Fisher, F. M. "Tests of Equality Between Sets of Coefficients in Two Linear Regressions: An Expository Note." Econometrica 38 (1970): 361-66.

11. Fisz, Marek. Probability Theory and Mathematical Statistics. New York: John Wiley & Sons, Inc., 1963.

12. Freund, John E. Mathematical Statistics. 2d ed. Englewood Cliffs: Prentice-Hall, Inc., 1971.

13. Graybill, Franklin A. An Introduction to Linear Statistical Models. Vol. 1. New York: McGraw-Hill Book Co., Inc., 1961.

14. Hodges, J. L., and Lehmann, E. L. "Testing the Approximate Validity of Statistical Hypotheses." Journal of the Royal Statistical Society, Series B, Vol. 16 (1954): 261-268.

15. Hoel, Paul G.; Port, Sidney C.; and Stone, Charles J. Introduction to Statistical Theory. Boston: Houghton Mifflin Co., 1971.

16. Hogg, Robert V., and Craig, Allen T. Introduction to Mathematical Statistics. 3rd ed. New York: Macmillan Publishing Co., Inc., 1970.

17. Hussain, Muhammad. On Inequality Constraints in Regression Analysis. Ph.D. dissertation, George Washington University, 1972.

18. Johnson, Norman L., and Kotz, Samuel. Distributions in Statistics: Continuous Multivariate Distributions. New York: John Wiley & Sons, Inc., 1972.

19. Judge, G. G., and Takayama, T. "Inequality Restrictions in Regression Analysis." Journal of the American Statistical Association 61 (1966): 166-181.

20. Kendall, Maurice G., and Stuart, Alan. The Advanced Theory of Statistics. Vol. 2. New York: Hafner Publishing Co., 1961.

21. Kendrick, John W. Long Term Economic Growth 1860-1970. Bureau of Economic Analysis, Social and Economic Statistics Administration, U.S. Department of Commerce, 1973.

22. Kmenta, Jan. Elements of Econometrics. New York: Macmillan Publishing Co., Inc., 1971.

23. Kmenta, Jan. "On Estimation of the CES Production Function." University of Wisconsin Social Systems Research Institute Series, no. 6410, October 1964.

24. Lehmann, E. L. Testing Statistical Hypotheses. New York: John Wiley & Sons, Inc., 1959.

25. Lovell, Michael, and Prescott, Edward. "Multiple Regression with Inequality Constraints: Pretesting Bias, Hypothesis Testing and Efficiency." Journal of the American Statistical Association 65 (1970): 913-25.

26. Meyer, Paul L. Introductory Probability and Statistical Applications. 2d ed. Reading, Mass.: Addison-Wesley Publishing Co., 1970.
27. Mood, Alexander M., and Graybill, Franklin A. Introduction to the Theory of Statistics. 2d ed. New York: McGraw-Hill Book Co., 1963.

28. Moran, P. A. P. An Introduction to Probability Theory. London: Oxford University Press, 1968.

29. Penneck, Stephen J. Maximum Likelihood Estimation Using Inequality Constraints. Master's thesis, University of Birmingham, 1974.

30. Ramsey, James B. "The Maximum Likelihood Estimation of Parameters Contained Within Finitely Bounded Compact Sets: Some Preliminary Results." Mimeographed. East Lansing, Mich.: Michigan State University, 1974.

31. Ramsey, James B. "Mixtures of Distributions and Maximum Likelihood Estimation of Parameters Contained in Finitely Bounded Compact Spaces." Mimeographed. East Lansing, Mich.: Michigan State University, 1975.

32. Ramsey, James B., and Zarembka, Paul. "Specification Error Tests and Alternative Functional Forms of the Aggregate Production Function." Journal of the American Statistical Association 66 (1971): 471-77.

33. Rao, C. Radhakrishna. Linear Statistical Inference and Its Applications. New York: John Wiley & Sons, Inc., 1965.

34. Theil, Henri. Principles of Econometrics. New York: John Wiley & Sons, Inc., 1971.

35. Theil, Henri, and Goldberger, A. "On Pure and Mixed Statistical Estimation in Economics." International Economic Review 2 (1961): 65-78.

36. Theil, Henri, and Van de Panne, C. "Quadratic Programming as an Extension of Classical Quadratic Maximization." Management Science 7 (1960): 1-20.

37. Toro, Carlos, and Wallace, T. D. "A Test of the Mean Square Error Criterion for Restrictions in Linear Regression." Journal of the American Statistical Association 63 (1968): 558-572.

38. Wallace, T. D. "Weaker Criteria and Tests for Linear Restrictions in Regression." Econometrica 40 (1972).

39. Wallace, T. D., and Anderson, R. L. "Notes on Exact Linear Restrictions and the Generalized Linear Hypothesis." Unpublished paper.

40. Wilks, S. S. Mathematical Statistics. Princeton: Princeton University Press, 1947.

41. Zellner, Arnold. "Linear Regression with Inequality Constraints on the Coefficients." Mimeographed Report 6109 of the International Center for Management Science, 1961.