This is to certify that the dissertation entitled

ROBUSTNESS OF THE TOBIT ESTIMATORS TO HETEROSKEDASTICITY AND NON-NORMALITY

presented by Abbas Arabmazar has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.

Major professor

Date: November, 1981

ROBUSTNESS OF THE TOBIT ESTIMATORS TO HETEROSKEDASTICITY AND NON-NORMALITY

By

Abbas Arabmazar

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of DOCTOR OF PHILOSOPHY

Department of Economics

1981

ABSTRACT

ROBUSTNESS OF THE TOBIT ESTIMATORS TO HETEROSKEDASTICITY AND NON-NORMALITY

By

Abbas Arabmazar

It is known that the estimates of the parameters in the Tobit model and other limited, truncated, and censored dependent variable models are not robust against misspecification of the model. The inconsistency of the censored Tobit estimator when the errors are heteroskedastic is shown. To investigate the severity of this inconsistency, a simple model of a constant-term-only regression is utilized, and the value of the asymptotic bias (inconsistency) is calculated for a variety of parameter values. Using the Lagrangian multiplier test principle, a test of heteroskedasticity is derived for both the truncated and censored Tobit models. It is also shown that the usual Tobit MLE which assumes normality is inconsistent when the disturbances are in fact non-normal. For the simple Tobit model with a constant term as the only regressor, the asymptotic bias of the normal MLE is calculated for a variety of non-normal errors.
To My Parents, and my brothers, Rasool, Ali, and Amir

ACKNOWLEDGMENTS

First of all, I would like to express my sincere gratitude to my committee chairperson, Professor Peter Schmidt. It was his constant attention, helpful comments, editorial skill, and general encouragement that made completion of this study possible. I also wish to thank the other members of my dissertation committee, William Quinn, John Goddeeris, and James Johannes. In addition, I would like to acknowledge the generous moral and financial support of members of my family throughout these years. Finally, I owe thanks to my typist, Mrs. Nancy Heath.

TABLE OF CONTENTS

LIST OF TABLES . . . vi
LIST OF FIGURES . . . vii

Chapter
I. INTRODUCTION . . .
   1.1 Review of Literature and Statement of the Problem . . .
II. ROBUSTNESS TO HETEROSKEDASTICITY . . . 10
   2.1 Introduction . . . 10
   2.2 Derivation of the Inconsistency . . . 13
   2.3 Calculation of the Inconsistency . . . 18
   2.4 Conclusions . . . 21
III. TEST FOR HETEROSKEDASTICITY . . . 25
   3.1 Introduction . . . 25
   3.2 The Model and the Test Statistic . . . 28
   3.3 The Statistic for the Truncated Case . . . 30
   3.4 The Statistic for the Censored Case . . . 34
   3.5 Conclusions . . . 37
IV. ROBUSTNESS TO NON-NORMALITY . . . 38
   4.1 Introduction . . . 38
   4.2 The Model and Its Estimators . . . 41
   4.3 Derivation of the Inconsistency . . . 47
   4.4 Calculations . . . 50
   4.5 Extension to the Regression Case . . .
   4.6 Derivation of the Inconsistency and Calculations . . .
   4.7 Conclusions . . .
   Appendix A . . . 73
   Appendix B . . . 86
V. CONCLUSIONS . . . 93
BIBLIOGRAPHY . . . 97

LIST OF TABLES

2.1 Probability limits of estimates of μ and σ, r1 = r2 = .5 . . . 22
2.2 Probability limits of estimates of μ and σ, r1 = .2, r2 = .8 . . . 23
2.3 Probability limits of estimates of μ and σ, r1 = .8, r2 = .2 . . . 24
Truncated moments of selected zero-mean (symmetric) distributions . . . 58
Asymptotic biases--t5 (variance known) . . . 59
Asymptotic biases--t5 . . . 60
Asymptotic biases--t10 . . . 61
Asymptotic biases--t20 (variance known) . . . 62
Asymptotic biases--Laplace . . . 63
Asymptotic biases--Logistic . . . 64
Asymptotic biases of regression coefficients, Censored--t5 (p = .5) . . . 65
Asymptotic biases of regression coefficients, Censored--t5 . . . 66

LIST OF FIGURES

4.1 Asymptotic Bias of Estimates--t10 Distribution . . . 67
4.2 Asymptotic Bias of Estimates--t10 Distribution . . . 68
4.3 Asymptotic Bias of Estimates--Censored Sample . . . 69
4.4 Asymptotic Bias of Estimates--Truncated Sample . . . 70
4.5 Asymptotic Bias of Estimates--Binary Sample . . . 71
Asymptotic Bias of Estimates--Logistic Distribution . . . 72

CHAPTER I

INTRODUCTION

1.1 Review of Literature and Statement of the Problem

The estimation of models with qualitative and limited dependent variables has received considerable theoretical and empirical attention. The appealing feature of these models is that they allow one to answer a variety of questions based on an incomplete sample, which could not previously be answered. Although our study is primarily concerned with what is known in the econometric literature as the Tobit model, it should be noted that similar considerations apply for a wide range of models with qualitative and limited dependent variables which might loosely be grouped under the heading of "sample selection models."

In the case of the ordinary regression model, there exists a substantial literature on the violation of basic assumptions, such as heteroskedasticity, non-normality, autocorrelation, etc., whereas there is relatively little corresponding analysis for the case of limited dependent variable models.
A selection of papers which apply the Tobit model to economic problems includes Tobin (1958), Cragg (1971), Gronau (1973, 1974), Heckman (1974, 1976), Lee and Trost (1978), Hausman and Wise (1977), and Nelson (1977). Other related papers are McFadden (1974) and Schmidt and Strauss (1975).

Before I formally define the model, it is necessary to define two statistical terms, which are used throughout the study. The term censored is applied to a sample in which some observations are recorded only as below (or above) some threshold, the exact value in such a case not being observed (having been censored). The term truncated is applied to samples in which such observations, i.e., those below some threshold, are excluded entirely. Note that in the econometric literature, however, the term truncated is often applied to the censored sample case, apparently in reference to the variable rather than the sample.

Censored Tobit model.--Consider the regression model defined by

y_i = x_i β + ε_i   if RHS > 0
y_i = 0             if RHS ≤ 0,   (i = 1, 2, ..., T)   (1.1.1)

where β is a k-component column vector of unknown parameters, x_i is a k-component row vector of known constants, and the ε_i's are independent with distribution N(0, σ²). This model was first suggested by Tobin (1958). Amemiya (1973) points out that any non-zero known constant can be considered as a censoring point of the model, without much extra complication, so the zero censoring point is not as restrictive as it may seem.

Truncated Tobit model.--Consider the regression model defined by

y_i = x_i β + ε_i   if and only if RHS > 0   (1.1.2)

where all the variables are defined as above. Note that the observations on y are truncated to the left of zero.

The Tobit model is a special case of the sample selection model of Heckman (1976), while the Probit model (binary dependent variable) is a special case of the Tobit model. See Goldberger (1980).
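The distinction between the censored model (1.1.1) and the truncated model (1.1.2) can be illustrated by simulation. The following sketch is my own illustration (not from the dissertation); the sample size and parameter values are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Latent regression with a constant regressor only:
# y* = 0.5 + eps, eps ~ N(0, 1)  (illustrative values).
T = 100_000
y_star = 0.5 + rng.standard_normal(T)

# Censored Tobit sample (1.1.1): negative y* are recorded as zeros,
# so the total sample size T is preserved and the censored fraction is known.
y_censored = np.maximum(y_star, 0.0)

# Truncated Tobit sample (1.1.2): non-positive y* are dropped entirely,
# so the number of discarded observations is not observed.
y_truncated = y_star[y_star > 0]

print(len(y_censored), len(y_truncated))
```

The retained fraction in the truncated sample estimates P(y* > 0) = Φ(0.5), which is exactly the information the censored sample keeps but the truncated sample loses.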
It is well known that ordinary least squares will produce biased and inconsistent estimates of the regression parameters in the Tobit model. Maximum likelihood estimation is being used with increasing frequency to avoid this inconsistency. There are other procedures suggested by Heckman (1976, 1979) and Amemiya (1973). The strong consistency and asymptotic normality of the maximum likelihood estimators of the regression parameters in the Tobit model have been proved by Amemiya (1973).

In this study we investigate the robustness of the Tobit normal maximum likelihood estimators to two types of misspecification: heteroskedasticity and non-normality. From what is known so far, it seems that the assumptions required of these models are quite strong, and any violation, such as heteroskedasticity or non-normality, may result in an asymptotic bias as severe as in the naive ordinary least squares (OLS) formulation. This is in contrast with OLS estimators in ordinary regression models, which are consistent when disturbances are heteroskedastic or when the assumption of normality of the error terms is violated.

Maddala and Nelson (1975) show that the Tobit estimators are inconsistent when the model is misspecified to be homoskedastic. This is in unfavorable contrast with complete sample least squares estimators, which are consistent under heteroskedasticity. Hurd (1979) proves that the MLE's of the truncated Tobit model are inconsistent under heteroskedasticity in the simple case in which the only right-hand variable is the constant; i.e., estimating the mean of a truncated normal random variable. He introduces heteroskedasticity into the model by assuming that one subset of the observations has a disturbance variance different from the rest. In other words, the sample comes from two normal populations, with equal location parameters and different scale parameters, sampled with equal probabilities.
Taking the probability limits of the first order conditions, he is able to get an implicit form of the asymptotic bias or inconsistency. By numerically solving for the bias, he comes up with answers to the obvious questions of the direction and severity of the bias. Using different sets of parameter values, he concludes that the bias is substantial even when the heteroskedasticity is in the range to be expected in empirical work. He also finds that the sign of the asymptotic bias is not generally known; and he states that "it would be surprising" if these results did not generally hold for the censored Tobit model. The robustness of MLE's of the Tobit model is also considered by Nelson (1979) and Maddala (1979).

Chapter II extends Hurd's analysis to the censored Tobit model. The implicit form of the inconsistency of the MLE's in the censored case is derived, and it is calculated for a variety of parameter values. This turns out to make a surprisingly large difference: the robustness of the MLE's to heteroskedasticity is much greater in the censored case than in the truncated case.

Chapter III contains a simple test for heteroskedasticity based on the Lagrange multiplier (LM) test, which is also known as Rao's efficient score test. In the Tobit model defined in (1.1.1) or (1.1.2), we specify the nature of heteroskedasticity as σ_i² = z_i α, where α is a p-component column vector unrelated to the coefficients β, and z_i is a p-component row vector (with first element unity). In other words, the variance of the disturbances is a linear function of a set of exogenous variables (the elements of z_i). This allows the null hypothesis of homoskedasticity to be stated as α₂ = ... = α_p = 0, for then z_i α = α₁ = σ² is constant. The LM test statistic is obtained from the result of maximizing the likelihood subject to the parameter constraints implied by the null hypothesis, and can be computed from the Lagrangian multipliers corresponding to the constraints or from the first order conditions, as in Rao (1973).
Silvey (1959) showed that the LM test has the same asymptotic power as the likelihood ratio (LR) test. For more on LM tests see, for example, Silvey (1959), Rao (1973), and Breusch and Pagan (1979, 1980). Breusch and Pagan pointed out several applications of the LM test to model specification in econometrics and found that in many instances the LM statistic can be computed by a regression using the residuals of the fitted model. Lee (1981b) also suggests an LM test for misspecifications in sample selection models, based on the Pearson family of distributions.

The robustness of the Tobit estimator to non-normality is also a potentially important point, since there is typically not any compelling reason to believe that the disturbances are normal. The MLE's based on the normality assumption are inconsistent when the disturbances are non-normal; see Goldberger (1980) or Nelson (1981). This is not true for least squares estimators in the standard linear regression model, which are unbiased and consistent under violation of the distributional assumptions. An obvious question here is the severity of the inconsistency under different conditions.

Goldberger (1980) considers a simple case of estimation of the mean of a truncated random variable, and analyzes the asymptotic bias caused by violation of the normality assumption. He considers some symmetric distributions (Student, Logistic, Laplace) to be the true distributions that were misspecified as normal, and numerically calculates the inconsistency or asymptotic bias of the estimator of the mean of each distribution, given that their variances are known. He finds, not surprisingly, that the bias is negligible when the truncation is mild, and that it is substantial when the truncation is extreme (less than 15 percent of the population is retained in the selected population). He concludes that the size of the bias for moderate degrees of truncation is unexpected.
An obvious extension of Goldberger's work is to consider the non-normality bias of the censored Tobit model. It is also interesting to know what the effect of estimating the variance would be on the size of the bias of the estimate of the mean, both in the truncated and the censored case; that is, it is worth knowing whether the assumption of known variance matters.

Chapter IV considers the non-normality bias and its severity in the Tobit model, both truncated and censored, while relaxing the assumption of known disturbance variance. The implicit form of the inconsistency is derived. The numerical value of the bias for different symmetric distributions, under a variety of parameter values, is calculated. Chapter IV also contains a cross-distributional comparison and a comparison of different estimators for each distribution. To generalize the results, the Probit model (binary dependent variable model) has also been considered in this analysis. Appendix A includes the derivation of the first and second truncated moments of some selected symmetric distributions, which are necessary to evaluate the probability limits of the first order conditions. Furthermore, the analysis is extended to a regression with one dummy explanatory variable, and the robustness of estimates of the slope coefficients to violations of the normality assumption is investigated.

There are other works on model specification and related topics in the Tobit model. For a general test of misspecification in the censored Tobit model, see Nelson (1981). His test for a univariate censored normal model is based on the general specification test principle suggested by Hausman (1978). Lee (1981c) introduces a general LM test for selectivity bias, homoskedasticity and normality, based on the Pearson family of distributions. For other specification error tests in limited dependent variable models, see Olsen (1979) and Maddala (1979).
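Truncated moments of the kind derived in Appendix A can be cross-checked numerically. The sketch below (my own illustration, not from the text) compares a quadrature-based first truncated moment of a zero-mean logistic variable with a Monte Carlo estimate; the scale value is an arbitrary assumption.

```python
import numpy as np
from scipy import integrate, stats

# First moment of a zero-mean logistic variable truncated to y > 0:
# E[y | y > 0] = E[y * 1(y > 0)] / P(y > 0).
s = 1.5  # illustrative scale parameter
dist = stats.logistic(loc=0.0, scale=s)

num, _ = integrate.quad(lambda y: y * dist.pdf(y), 0.0, np.inf)
trunc_mean_quad = num / dist.sf(0.0)

# Monte Carlo check of the same quantity.
rng = np.random.default_rng(2)
draws = dist.rvs(size=1_000_000, random_state=rng)
trunc_mean_mc = draws[draws > 0].mean()

print(trunc_mean_quad, trunc_mean_mc)
```

For the logistic this matches the closed form E[y | y > 0] = 2 s ln 2, which is the kind of expression the appendix derivations produce.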
CHAPTER II

ROBUSTNESS TO HETEROSKEDASTICITY

2.1 Introduction

One of the basic assumptions of the regression model is that the disturbances have a common constant variance. This is known as homoskedasticity, and the violation of this assumption is known as heteroskedasticity. In many econometric studies, especially those based on cross-section data, the assumption of a constant variance for the disturbance term is unrealistic. In consumer budget studies the variance of the error term very likely increases with household income; see Prais and Houthakker (1955). Likewise, in cross-section studies of the firm the disturbance variance probably increases with the size of the firm. Heteroskedasticity also naturally arises (1) when the observations are based on average data, and (2) in a number of "random coefficient" models; see Judge et al. (1980).

In the case of the ordinary regression model, there exists a substantial literature on the nature of heteroskedastic disturbances, their consequences and effects on inference and test procedures, and remedies for the problem. It is well known that the consequences of heteroskedasticity are two-fold. The ordinary least squares estimates of the regression parameters are still unbiased and consistent, but are inefficient, and the estimates of the variances are biased. This will affect tests of hypotheses and inferences on the regression parameters. In most studies, a simple parametric form of heteroskedasticity has been assumed to simplify the problem. In other words, it is assumed that the variance of the disturbances is a pre-specified function of one or several independent variables. For an up-to-date review of the literature, see Judge et al. (1980).

For limited dependent variable models, there has been relatively little work on the problem of heteroskedastic disturbances.
Maddala and Nelson (1975) showed that in limited dependent variable models, contrary to the complete sample regression case, the estimators are not even consistent when the error terms are heteroskedastic. To illustrate the nature of the problem, following Maddala and Nelson, consider a censored Tobit model, as defined in equation (1.1.1). The locus of expected values of y_i is given by

E(y_i) = (x_i β) Φ(x_i β/σ) + σ φ(x_i β/σ)   (2.1.1)

where φ and Φ are respectively the standard normal density and cdf. Now suppose that the "true" model is heteroskedastic with parameters σ_oi and β_o, and designate the "true" locus as E_o(y_i). Then

E_o(y_i) = (x_i β_o) Φ(x_i β_o/σ_oi) + σ_oi φ(x_i β_o/σ_oi)   (2.1.2)

It is immediately obvious that the presence of the variance term in the expected value locus is the source of the difference in the two expressions and, in turn, of the estimation bias.

The main questions to be answered are (1) what are the consequences of heteroskedasticity; (2) how can we detect it and test for it; and (3) what can be done to correct for it. Hurd (1979) has presented some evidence on these points, for the truncated Tobit model, as defined in equation (1.1.2). For the simplified version of the model, assuming that the constant term is the only regressor, he proved that the maximum likelihood estimator is inconsistent when the error terms are heteroskedastic, and derived an implicit form of the inconsistency of the estimate of the mean of a normal random variable. His assumption on the nature of heteroskedasticity is that the sample comes from a mixture of two normal populations with equal means but unequal variances. Assuming equal proportions for each population, he calculated the asymptotic bias (or inconsistency) of the estimated mean for different sets of parameter values. He concludes that, as one might expect, the bias depends on the degree of heteroskedasticity, the scale of the problem, and the amount of truncation.
The bias increases with the degree of truncation as well as with the degree of heteroskedasticity. He argues that the bias is extreme for what seem like modest amounts of heteroskedasticity, and so it may be a serious empirical problem.

In the rest of this chapter, Hurd's analysis is extended to the censored Tobit model, and the asymptotic bias is calculated for a variety of parameter values. The results are surprisingly different from Hurd's. In particular, the results for the censored case are more optimistic than for the truncated case. Heteroskedasticity of given severity causes less inconsistency in the former than in the latter. In the case considered here, moderate amounts of heteroskedasticity do not cause serious inconsistency, except when the fraction of non-limit observations is very low.

2.2 Derivation of the Inconsistency

In this section we derive (implicit) expressions for the inconsistency of the censored Tobit estimator under a simple form of heteroskedasticity. To keep things tractable, we consider the case of estimation of only the mean and variance of a censored normal; that is, the only explanatory variable is a constant term.

In this case, the assumption underlying the censored Tobit model is that there is a random sample of size T on an unobservable y* which is distributed as N(μ, σ²), while we observe instead y = max(0, y*). Thus negative y*'s are simply observed as zeroes. This contrasts with the case considered by Hurd, in which only positive y's were observed and the fraction of limit observations is unknown. Let the number of positive y's be n, and index these as y_1, ..., y_n. (Thus there are T − n values of zero for y in the sample of size T.) The censored Tobit log likelihood function is

L = constant − n ln σ − (1/2σ²) Σ_{i=1}^n (y_i − μ)² + (T − n) ln Φ(−μ/σ)   (2.2.1)

where Φ is the cdf of the unit normal distribution.
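As an illustration (my own sketch, not part of the dissertation), the log likelihood (2.2.1) is straightforward to code and maximize numerically; the data-generating values below are arbitrary assumptions.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def censored_loglik(params, y):
    """Log likelihood (2.2.1) for a censored N(mu, sigma^2) sample;
    y holds zeros for the T - n censored observations."""
    mu, sigma = params
    if sigma <= 0:
        return -np.inf
    pos = y[y > 0]
    return (-len(pos) * np.log(sigma)
            - np.sum((pos - mu) ** 2) / (2 * sigma ** 2)
            + (len(y) - len(pos)) * norm.logcdf(-mu / sigma))

# Simulate a correctly specified censored sample with mu = -0.5, sigma = 1,
# then fit it by maximum likelihood.
rng = np.random.default_rng(3)
y = np.maximum(-0.5 + rng.standard_normal(50_000), 0.0)
fit = minimize(lambda p: -censored_loglik(p, y), x0=[0.0, 1.0],
               method="Nelder-Mead")
print(fit.x)  # should be close to (-0.5, 1.0) under correct specification
```

Under correct specification the estimates recover the truth; the point of this chapter is that they fail to do so once the disturbances are heteroskedastic.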
If we let μ̂ and σ̂ denote the MLE's, they satisfy the first-order conditions for maximization of L, which can be written as:

(1/n) Σ_{i=1}^n (y_i − μ̂) − ((T − n)/n) σ̂ m(−μ̂/σ̂) = 0   (2.2.2a)

−σ̂² + (1/n) Σ_{i=1}^n (y_i − μ̂)² + ((T − n)/n) μ̂ σ̂ m(−μ̂/σ̂) = 0   (2.2.2b)

Here m(t) is defined as φ(t)/Φ(t), where φ is the density of the unit normal distribution and Φ is (as above) the corresponding cdf.

We now wish to take probability limits of (2.2.2) under heteroskedasticity. Following Hurd, we assume that there are two distinct variances, σ₁² and σ₂². These are sampled in proportions r₁ and r₂ = 1 − r₁. That is, with probability r₁ we sample y* from N(μ, σ₁²), while with probability r₂ we sample y* from N(μ, σ₂²); then we observe y = max(0, y*). Thus of the total of T observations, we suppose that for T₁ of these the underlying variable has variance σ₁² while for T₂ it has variance σ₂², where as T → ∞, T₁/T → r₁ and T₂/T → r₂. Correspondingly, of the n positive observations, let n₁ and n₂ represent the numbers of observations with underlying variances σ₁² and σ₂², respectively. Then we can note the facts:

n/T → r₁ Φ(t₁) + r₂ Φ(t₂)   (2.2.3a)

(T − n)/n → [r₁ Φ(−t₁) + r₂ Φ(−t₂)] / [r₁ Φ(t₁) + r₂ Φ(t₂)]   (2.2.3b)

s_i ≡ n_i/n = (n_i/T)/(n/T) → r_i Φ(t_i) / [r₁ Φ(t₁) + r₂ Φ(t₂)],   i = 1, 2   (2.2.3c)

(1/n_j) Σ_{i∈j} y_i → μ + σ_j m(t_j),   j = 1, 2   (2.2.3d)

(1/n_j) Σ_{i∈j} (y_i − μ)² → σ_j² − μ σ_j m(t_j),   j = 1, 2   (2.2.3e)

In all of the above, t₁ = μ/σ₁, t₂ = μ/σ₂, and all limits are taken as T (hence T₁, T₂, n, n₁, n₂) approaches infinity.

To take probability limits of the first order conditions (2.2.2), we first rewrite them slightly:

(n₁/n)(1/n₁) Σ_{i∈1} (y_i − μ̂) + (n₂/n)(1/n₂) Σ_{i∈2} (y_i − μ̂) − ((T − n)/n) σ̂ m(−t̂) = 0   (2.2.4a)

−σ̂² + (n₁/n)(1/n₁) Σ_{i∈1} (y_i − μ̂)² + (n₂/n)(1/n₂) Σ_{i∈2} (y_i − μ̂)² + ((T − n)/n) μ̂ σ̂ m(−t̂) = 0   (2.2.4b)

where t̂ = μ̂/σ̂. Now we use (2.2.3) to take the probability limits in (2.2.4). Defining μ̃ = plim μ̂, σ̃ = plim σ̂, t̃ = μ̃/σ̃,
- [rl ¢(-tl) + r2 ¢(-t2)] o m(-t) = 0 (2.2.5a) -52 + 31 {of - uol m(tl) + 2(u - fi)[u + 01 m(t1)1 ~2 + (u - u2)} + s2 {03 - uoz m(tz) + 2(u - fi)[u + 02 m(t2)1 + (fiz - u2)} r1¢(-tl) + r2 ¢(-t2) r1¢(tl) + r2 ¢(t2) 0 (2.2.5b) t: o: B A I a: v I) + Equation (2.2.5) implicitly defines the proba- bility limits fl and 8 of the MLE's, in terms of the para- meters u, 01’ 02, rl and r2. It is easy to see that when 01 = 02 (and hence t1 = t2), the solution fi = u 5 = 0 satisfies (2.2.5); the Tobit estimator is consistent under homoskedasticity. It is also not difficult to verify that i = u, 6 = a does not satisfy (2.2.5) when 01 # 02; the Tobit estimator is inconsistent under heteroskedasticity. Finally, although it is not obvious from (2.2.5), the 18 inconsistency must depend strongly on the degree of censoring of the sample (which depends on t1 and t2). This must be so since in completely uncensored samples, the Tobit estimator becomes least squares, which is robust to heteroskedasticity. 2.3 Calculation of the Inconsistency For given parameter values (u, 01' 02, r1, r2) we can calculate the inconsistency of the Tobit MLE's by solving (2.2.5) numerically for fl and 6. We have done so, holding 02 = l for all cases, for n = l, 0, -l, -2, for 15 values of 01 between .1 and 10, and for three sets of rl and r2. These results are given in Tables 2.1 (r1 = r2 = .5), 2.2 (r1 = .2, r2 = 8), and 2.3 (r1 = .8, r2 = .2). It should be noted that setting 02 = l is innocuous, since the results are invariant with respect to scale; for example, the ratio fi/u is the same with u = -l, 01 = 10, 02 = l as with u = -.2, 01 = 2, 02 = .2. The solution of (2.2.5) was found using the Newton— Raphson iterative scheme, from arbitrary starting values. Many sets of starting values were tried. In all cases precisely two sets of solutions to (2.2.5) were found, with one set having 5 negative and the other having 6 -positive. 
Tables 2.1 to 2.3 give the solution corresponding to positive σ̃, for obvious reasons.

We will first discuss the results in Table 2.1, concentrating on the estimate of μ, since under heteroskedasticity it is not entirely clear what consistency of σ̂ would mean. As would be expected, the degree of inconsistency depends heavily on the degree of heteroskedasticity and on the degree of truncation of the sample. For given σ₁ (degree of heteroskedasticity), the inconsistency of μ̂ increases as the degree of truncation increases; that is, as μ decreases. This is entirely as expected, for the reason given at the end of Section 2.2. For given μ, the inconsistency increases as the degree of heteroskedasticity increases, that is, as σ₁ becomes further from one (the value of σ₂). This is also as expected, though it should be recognized that changing σ₁ for fixed μ also alters the degree of truncation and the scale. This is why, e.g., the results for σ₁ = .1, σ₂ = 1 are not identical to those for σ₁ = 10, σ₂ = 1.

It should be noted that our results for the Tobit model are much more optimistic than Hurd's results for the related truncated normal model. To pick a few examples: with μ = 0, σ₁ = .5 we find a bias of −.02 whereas in the truncated normal model the bias is −1.47; with μ = −1, σ₁ = 2 we find a bias

LM = d̃′ Ĩ⁻¹ d̃

where Ĩ is the information matrix when the null hypothesis is true, evaluated at the restricted estimates θ̃. The LM statistic is asymptotically distributed as χ² with q degrees of freedom. The form d̃′ Ĩ⁻¹ d̃ based on the first order conditions is the "score" statistic, Rao (1973), while the equivalent form based on the Lagrange multipliers for the constraints is called the Lagrangian multiplier statistic; the two test statistics are identical. We employ the score test form because it is based on the restricted model and the restricted estimates, which are relatively simple to calculate in our case.
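For a concrete, hypothetical illustration of the score form LM = d̃′ Ĩ⁻¹ d̃ (my own example, not from the thesis), consider testing H0: μ = 0 in an i.i.d. N(μ, σ²) sample. The restricted MLE sets μ̃ = 0 and σ̃² equal to the raw second moment; the score for μ is Σ y_i / σ̃² and the information is n/σ̃², so the statistic collapses to n ȳ²/σ̃².

```python
import numpy as np

def lm_mean_zero(y):
    """Score (LM) statistic for H0: mu = 0 in an N(mu, sigma^2) sample.
    Restricted MLE: mu = 0, sigma2 = mean(y^2).
    Score d = sum(y)/sigma2, information I = n/sigma2,
    so LM = d^2 / I = n * ybar^2 / sigma2, asymptotically chi^2(1) under H0."""
    n = len(y)
    sigma2 = np.mean(y ** 2)      # restricted MLE of the variance
    d = y.sum() / sigma2          # score at the restricted estimates
    info = n / sigma2             # information under H0
    return d ** 2 / info

rng = np.random.default_rng(4)
y0 = rng.standard_normal(5_000)        # H0 true: LM should be a chi^2(1) draw
y1 = 0.2 + rng.standard_normal(5_000)  # H0 false: LM should be large
print(lm_mean_zero(y0), lm_mean_zero(y1))
```

Only the restricted estimates are needed, which is exactly the computational advantage exploited in Chapter III.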
3.3 The Statistic for the Truncated Case Adopting the linear model in (3.2.1) with the heteroskedasticity assumption of (3.2.2), the truncation is introduced into the model by assuming, as in (1.1.2) that we observe yi if and only if yi is positive. That is, only the observations with positive values, say i = l, . . ., n, from a possible sample of size T are available for yi. To 31 test for the null hypothesis of homoskedasticity, equation (3.2.3), we follow the test procedure explained in Section 3.2. The log likelihood of the sample can be written as: :3 1 2 1 n L(a,B) = Constant - 5 2 in 011 - 7 2 0. i=1 i= n . 2 (yi-xiB) - iiltn 4 (xiB/oi) where 4 is cdf of unit normal distribution. The first order conditions are n " o 2:1 ’v 3 _ 3L _ 4 -l 1 1i d --—————- - Z (20.) 3(aIB) i=1 1 2 ’ 20i xi 0 v2i L— ..n L— _.J v . _ n 4 -1 11 i=1 v2. 1 where Ki is a (p+k) x 2 nonrandom matrix, and Vli’ vZi are random variables, with zero mean conditional on sample inclusion, defined as V11 = $1 ‘ Oi m(xiB/Oi) v = s2 4 02 + o (x B) m(x B/o-) 2i 'i i i i i 1 in which m(o) = ¢(-)/4(-), o is unit normal density and 4 is as above. The information matrix is 32 = 3L 3L I E ‘ETHTET 'STETETT’ n a. b. , = z (409)“1 K- 1 K- . 1 1 1 i=1 - b c i i where I is a square matrix of dimension (p+k) and _ 2 ai -' Oi - Oi m(xiB/Oi) [(xiB) + Oi m(xiB/Oi)] bi = a1 m(xie/oi)[(xis)2 + Oi + (xie) Oi m(xiB/oi)] _ 4 _ 2 2 ci — 20i (xiB) Oi m(xiB/oi)[(xiB) + Oi + (x18) oi m(xiB/Oi)] The vector of first derivatives (score), evaluated at the restricted MLE's (a, B, o),and denoted by d,is. ~ 2' G a = (254) 1 2 where Z is a (n x p) matrix with zi as its ith row, X is a (n x k) matrix having xi as its ith row, and I = ~ ~ 1 ' vj (vjl, Vj2’ . . ., vjn) . Note that the first element and the last k elements of d are zeros; only the elements corresponding to 33 oi, i = 2, . . ., p, are non—zero. 
The information matrix evaluated at the restricted MLE's is z'cz 282 Z'BX I = (268) 1 ~2 ~4 20 x'Bz 4o X'AX where c =-Diag (8i), B = Diag (Si), and A = Diag (Si). The LM statistic, using the inverse rule for partitioned matrices, is 1 ', ~ ZV2 Al, 1. LM = v2 Z[a'cz-z'BX(x'AX)' x’szl‘ (3.3.1) which is distributed asymptotically as x2 with p-l degrees of freedom, given that the null hypothesis of homaskedastic- ity is true. h Note that the complete sample heteroskedasticity .test considered by Breusch and Pagan (1979) is a special case of (3.3.1) in which B = 0, C = 264 In, and 421 = E§-52. In this case, the test statistic is equal to LM = —l—-e' 2(2'2)"l 2'4 .4 2 2 20 which is equal to one-half of the regression sum of squares n o ~-2 ~2 in the regressron of 0 ti on the 2's. For the special case of only a constant regressor, for which x = (l, l, . . ., l)’ is n x l and 5i = c for every i, the test statistic can be reduced to IIH 1 ~ LM = vé Z (2'2)- c v2 which is equal to the regression sum of squares in the % regression of E— 621, the standardized Tobit residuals, on the 2's. 3.4 The Statistic for the Censored Case Using the model in (3.2.1) and the assumption of (3.2.2), we now assume that we observe not yi but Iiyi’ where Ii is an indicator defined as 1 if yi > 0 0 if yi : 0 which is equivalent to the model in (1.1.1). The log likelihood function is - T _ _ l 2 _ l L(a.B) - constant 2 in Ci 2 .: “MI-i H i "Mr-3 2 (yi - x18) - (l - Ii) Rn ¢ (xiB/ci) i l where ¢ is unit normal cdf. Define uoi = Ii - @(xiB/oi) = (Ii—l) + ¢(-xiB/oi) u1i = Iiei - 01¢ (xiB/Oi) 35 “21 = 116% - [0i ¢(xiB/oi) - oi(xiB)¢(xiB/oi)] "11 = “11 + °im(’xiB/°i’ uoi _ 2 w21 ‘ “21 ' [Oi + Oi (x15) m(-xiB/Oi)l uoi where ¢ is the unit normal density, m (-) = ¢(-)/0(-), and wli' w2i are zero-mean random variables. Then the first order conditions and the information matrix can be written as I o zi “11 T 4 -l w d = 3L/3(a,8) = Z (ZOi) 2i i=1 2 0 20. x! l 1 L— .— T w . E Z (20:)-1 K1 11 1:1 w2i T d. 
$$ I = \sum_{i=1}^{T}(4\sigma_i^8)^{-1}K_i\begin{pmatrix}d_i & g_i\\ g_i & h_i\end{pmatrix}K_i', $$

where $K_i$ is as in Section 3.3, $(w_{1i}, w_{2i})'$ is a random vector, $d_i = E(w_{1i}^2)$, $g_i = E(w_{1i}w_{2i})$ and $h_i = E(w_{2i}^2)$.

Let $\tilde d$ and $\tilde I$ denote the first order conditions and information matrix evaluated at the restricted MLE's $(\tilde\alpha, \tilde\beta, \tilde\sigma)$. Then

$$ \tilde d = (2\tilde\sigma^4)^{-1}\begin{pmatrix}Z'\tilde w_2\\ 2\tilde\sigma^2 X'\tilde w_1\end{pmatrix}, $$

where $Z$ and $X$ are as in Section 3.3 (here with $T$ rows), and $\tilde w_j = (\tilde w_{j1}, \ldots, \tilde w_{jT})'$. Note that, as in the truncated case, the first element and the last $k$ elements of $\tilde d$ are zeros. Also

$$ \tilde I = (4\tilde\sigma^8)^{-1}\begin{pmatrix}Z'\tilde H Z & 2\tilde\sigma^2 Z'\tilde G X\\ 2\tilde\sigma^2 X'\tilde G Z & 4\tilde\sigma^4 X'\tilde D X\end{pmatrix}, $$

where $\tilde H = \mathrm{Diag}(\tilde h_i)$, $\tilde G = \mathrm{Diag}(\tilde g_i)$, and $\tilde D = \mathrm{Diag}(\tilde d_i)$. The test statistic

$$ \mathrm{LM} = \tilde w_2'Z\bigl[Z'\tilde H Z - Z'\tilde G X(X'\tilde D X)^{-1}X'\tilde G Z\bigr]^{-1}Z'\tilde w_2 $$

is asymptotically distributed as $\chi^2$ with $p-1$ degrees of freedom.

For the special case of only a constant regressor, $\tilde h_i = \tilde h$ for every observation. Then the test statistic reduces to

$$ \mathrm{LM} = \frac{1}{\tilde h}\,\tilde w_2'Z(Z'Z)^{-1}Z'\tilde w_2, $$

which is equal to the regression sum of squares in the regression of $\tilde h^{-1/2}\tilde w_{2i}$ on the $z$'s.

3.5 Conclusions

It is important to be able to test for heteroskedasticity in the Tobit model, because under heteroskedasticity the Tobit estimates are inconsistent and the usual tests are invalid. The Lagrangian multiplier test principle has been adopted, and the test statistic has been derived for the null hypothesis of homoskedasticity (in both the truncated and censored cases), against the alternative that the error variance is a linear function of exogenous variables. The test statistics that result are not too difficult to calculate, and thus should be useful in applied work which uses the Tobit model.

CHAPTER IV

ROBUSTNESS TO NON-NORMALITY

4.1 Introduction

In a standard linear regression model, the least squares estimators are unbiased and consistent even when the assumption of normality of the disturbances is violated.
Normality (or some other distributional assumption) is necessary for hypothesis testing in finite samples, but does not affect the mean or probability limit of the least squares estimates.

In the Tobit model, the situation is quite different. The usual MLE which assumes normality (which we will refer to as the "normal MLE") is inconsistent when the disturbances are non-normal. Thus estimates of the Tobit model are not robust to violations of the distributional assumption for the disturbances.

While this chapter is concerned specifically with a special case of the Tobit model, it should be noted that similar considerations apply in a wide variety of models with qualitative and limited dependent variables which might loosely be grouped under the heading of "sample selection models." There has been a proliferation of such models in recent years, because they allow one to answer questions that could not previously be answered. However, virtually all of these models hinge on normality, or some equally specific distributional assumption, and their robustness has basically not been investigated. This is a potentially important point, since there is typically not any compelling reason to believe that disturbances are normal.

The robustness of estimators of the Tobit model to heteroskedasticity was considered in Chapter II. The general conclusions were that the MLE's are not consistent under heteroskedasticity, and that the bias has a direct relationship with the degree of truncation and the severity of the heteroskedasticity. In Chapter III we developed a computationally simple test for heteroskedasticity in the Tobit model based on the class of Lagrangian multiplier (LM) tests. The LM tests, in general, have the same asymptotic properties and power as likelihood ratio (LR) tests, as proved by Silvey (1959).
As for robustness to non-normality, Goldberger (1980) has considered the truncated version of the Tobit model, as defined in (1.1.2), under the assumption that the only regressor is a constant term and that the disturbance variance is known (i.e., estimating the mean of a truncated random variable). He derives the asymptotic bias (inconsistency) of the normal MLE and calculates the numerical value of the bias for a variety of non-normal errors, by assuming that the true distribution of the disturbances is a symmetric distribution (Student, logistic, Laplace) other than normal. His calculations suggest that the bias is small when the truncation is mild and substantial under extreme truncation of the sample. He concludes that the bias for moderate degrees of truncation is unexpectedly large.

This chapter includes an obvious extension of Goldberger's work, namely the non-normality bias in the censored Tobit model. For the sake of tractability, the simple case of only a constant regressor is considered. To provide insight into whether the assumption of known variance matters, this assumption has been relaxed for both the truncated and censored cases. To generalize the results, the bias in the Probit (binary dependent variable) model has also been investigated.

The organization of this chapter is as follows. Section 4.2 contains the models and their estimators. Section 4.3 includes the derivation of the (implicit form of the) inconsistency for the censored, truncated, and binary cases when the variance of the error terms is unknown. The numerical calculations are presented in Section 4.4. The results are tabulated for different distributions in Tables 4.2 to 4.6. Cross-distributional comparisons and comparisons of different estimators for each distribution can be seen in Figures 4.1 to 4.6. Section 4.5 contains an extension of the above analysis to the regression case with one dummy explanatory variable.
In Section 4.6 the inconsistency of the slope coefficient is derived and numerical calculations of the inconsistency are illustrated. The results of the regression case are tabulated in Tables 4.7 to 4.9. Finally, Section 4.7 contains the conclusions. The main conclusions are that (1) the bias is generally less in the censored case than in the truncated case; (2) the assumption of known variance makes a substantial difference in the results; (3) the bias from non-normality can be substantial, and in fact for severely truncated samples can be larger than for the uncorrected least squares estimators.

Appendix A includes the derivation of the first and second truncated moments of selected symmetric distributions (Student, logistic, Laplace) which are necessary to evaluate the bias. A summary of the results in Appendix A is presented in Table 4.1. Appendix B contains the evaluation of some probability limits used in the derivation of the bias in the regression case.

4.2 The Model and Its Estimators

We will concentrate on the special case in which the model contains only a constant term; that is, we are attempting to estimate the mean of a population. The relevance of this case to more complicated models can be questioned, but it seems to be an obvious starting point, and it allows some results that would not otherwise be possible.

Thus we suppose that we have (in principle) a random sample, $y_i^*$, $i = 1, 2, \ldots, T$, from a distribution with mean $\mu$ and variance $\sigma^2$. If the $y_i^*$ were all observed, the normal MLE would be the sample average, and it would be robust to non-normality. However, we now consider three alternative assumptions about what is observed.

1. Censored Case. Assume that we observe not $y_i^*$, but rather $y_i = \max(0, y_i^*)$. Letting the first $n$ $y$'s be the positive observations, the normal log likelihood is

$$ L = \text{constant} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\mu)^2 + (T-n)\ln\Phi(-\mu/\sigma), \qquad (4.2.1) $$

where $\Phi$ is the N(0,1) cdf.
The normal MLE's $\hat\mu$, $\hat\sigma$ satisfy the first order conditions

$$ \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat\mu) - \frac{T-n}{n}\,\hat\sigma\, m(-\hat\mu/\hat\sigma) = 0 \qquad (4.2.2a) $$

$$ -\hat\sigma^2 + \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat\mu)^2 + \frac{T-n}{n}\,\hat\mu\,\hat\sigma\, m(-\hat\mu/\hat\sigma) = 0 \qquad (4.2.2b) $$

Here $m(\cdot) = \phi(\cdot)/\Phi(\cdot)$, where $\phi$ is the N(0,1) density and $\Phi$ is as above.

2. Truncated Case. Here we observe $y_i\ (= y_i^*)$ if and only if $y_i^* > 0$. The normal log likelihood is

$$ L = \text{constant} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}(y_i-\mu)^2 - n\ln\Phi(\mu/\sigma), \qquad (4.2.3) $$

so that the first order conditions which yield the normal MLE's are

$$ \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat\mu) - \hat\sigma\, m(\hat\mu/\hat\sigma) = 0 \qquad (4.2.4a) $$

$$ -\hat\sigma^2 + \frac{1}{n}\sum_{i=1}^{n}(y_i-\hat\mu)^2 + \hat\mu\,\hat\sigma\, m(\hat\mu/\hat\sigma) = 0 \qquad (4.2.4b) $$

3. Binary (Probit) Case. Suppose that we observe

$$ y_i = \begin{cases}1 & \text{if } y_i^* > 0\\ 0 & \text{if } y_i^* \le 0.\end{cases} \qquad (4.2.5) $$

In this case only $\mu/\sigma$ is identified. Taking $\sigma = 1$ as a normalization, the normal log likelihood function is

$$ L = n\ln\Phi(\mu) + (T-n)\ln\Phi(-\mu), \qquad (4.2.6) $$

so that the first order condition for the normal MLE $\hat\mu$ is

$$ m(\hat\mu) - \frac{T-n}{n}\, m(-\hat\mu) = 0. \qquad (4.2.7) $$

For completeness, we also list as possible estimators the sample means

$$ \bar y = \frac{1}{n}\sum_{i=1}^{n} y_i \qquad (4.2.8a) $$

$$ \bar y^* = \frac{1}{T}\sum_{i=1}^{T} y_i \qquad (4.2.8b) $$

The first is the mean of the truncated sample (positive observations), and could be used in either the censored or truncated case, while the second is the mean of the censored sample, and could be used in the censored case. Clearly these are estimators which do not attempt to correct for the bias due to censoring or truncation.

4.3 Derivation of the Inconsistency

We now assume that the true distribution is some symmetric distribution other than normal. Let $z$ be a random variable with such a distribution, satisfying $E(z) = 0$, and let $f$ and $F$ denote the density and cdf of $z$. The variable $y^*$ is assumed to be related to $z$ by $y^* = \mu + bz$, where $b^2 = [\mathrm{var}(z)]^{-1}$. Thus $\varepsilon \equiv bz$ has mean zero and variance one, so that comparisons among distributions will not be confused by differences in scale.
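Before turning to probability limits, note that the finite-sample estimators of Section 4.2 are easy to compute. The following sketch (an added illustration, not part of the original text; it assumes SciPy) simulates a censored normal sample with $\mu = 0.5$, $\sigma = 1$ and solves the first order conditions (4.2.2a)-(4.2.2b) for $(\hat\mu, \hat\sigma)$; under correctly specified normal errors, the solution recovers the true values.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import fsolve

rng = np.random.default_rng(1)
mu, sigma, T = 0.5, 1.0, 200_000
ystar = mu + sigma * rng.standard_normal(T)
pos = ystar[ystar > 0]                    # positive part of the censored sample
n = pos.size

def m(t):                                 # m(.) = phi(.)/Phi(.)
    return norm.pdf(t) / norm.cdf(t)

def foc(theta):                           # equations (4.2.2a) and (4.2.2b)
    mh, sh = theta
    r = (T - n) / n
    g1 = (pos - mh).mean() - r * sh * m(-mh / sh)
    g2 = -sh**2 + ((pos - mh)**2).mean() + r * mh * sh * m(-mh / sh)
    return [g1, g2]

mu_hat, sig_hat = fsolve(foc, x0=[pos.mean(), pos.std()])
print(mu_hat, sig_hat)                    # approximately (0.5, 1.0)
```

The same solver applied to the truncated conditions (4.2.4) would omit the $(T-n)/n$ terms and flip the sign of the argument of $m(\cdot)$.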
To derive the inconsistency of the various normal MLE's, we take the probability limits of the first order conditions, and then solve the resulting equations for the probability limits of the estimators. We note the facts that as $T \to \infty$,

$$ \frac{T-n}{n} \;\to\; \frac{P(y^* \le 0)}{P(y^* > 0)} = \frac{F(-\mu/b)}{F(\mu/b)} \equiv A, \qquad (4.3.1a) $$

$$ \frac{1}{n}\sum_{i=1}^{n} y_i \;\to\; E(y^*\,|\,y^* > 0) = \mu + bE(z\,|\,z > -\mu/b) \equiv B, \qquad (4.3.1b) $$

$$ \frac{1}{n}\sum_{i=1}^{n} y_i^2 \;\to\; E(y^{*2}\,|\,y^* > 0) = \mu^2 + b^2E(z^2\,|\,z > -\mu/b) + 2b\mu E(z\,|\,z > -\mu/b) \equiv C, \qquad (4.3.1c) $$

where $F$, $A$, $B$, and $C$ are determined by the true distribution. Thus to evaluate the probability limits of the first order conditions, we need the cdf and the first two truncated moments of the distribution of $z$. These are given in Table 4.1 for the distributions which we use, namely the Laplace, logistic and t distributions.

Let $\tilde\mu$ and $\tilde\sigma$ represent the probability limits of the MLE's $\hat\mu$ and $\hat\sigma$. Then taking probability limits of the first order conditions, and using (4.3.1), we obtain the following.

Censored Case

$$ \tilde\mu + A\tilde\sigma\, m(-\tilde\mu/\tilde\sigma) - B = 0 \qquad (4.3.2a) $$
$$ \tilde\mu^2 - \tilde\sigma^2 + A\tilde\mu\tilde\sigma\, m(-\tilde\mu/\tilde\sigma) - 2B\tilde\mu + C = 0 \qquad (4.3.2b) $$

Truncated Case

$$ \tilde\mu + \tilde\sigma\, m(\tilde\mu/\tilde\sigma) - B = 0 \qquad (4.3.3a) $$
$$ \tilde\mu^2 - \tilde\sigma^2 + \tilde\mu\tilde\sigma\, m(\tilde\mu/\tilde\sigma) - 2B\tilde\mu + C = 0 \qquad (4.3.3b) $$

Binary Case

$$ m(\tilde\mu) - A\, m(-\tilde\mu) = 0 \qquad (4.3.4) $$

(Actually, (4.3.4) can be simplified to $\Phi(\tilde\mu) = F(\mu/b)$, but is written as above to maintain uniformity of notation.)

These can then be solved numerically for $\tilde\mu$ or $\tilde\sigma$. For example, in the censored case we would solve (4.3.2a) and (4.3.2b) for $\tilde\mu$ and $\tilde\sigma$. This solution will depend on $\mu$ and $\sigma$ and on the form of the distribution chosen. In the case in which $\sigma$ is assumed to be known, we set $\tilde\sigma = \sigma$ and solve (4.3.2a) only for $\tilde\mu$; that is, we ignore (4.3.2b). Similar statements apply to the truncated and binary cases.

The inconsistency of the sample averages given by (4.2.8) can be expressed explicitly. We have

$$ \bar y \;\to\; E(y^*\,|\,y^* > 0) = B \qquad (4.3.5a) $$
$$ \bar y^* \;\to\; P(y^* > 0)\,E(y^*\,|\,y^* > 0) = F(\mu/b)B, \qquad (4.3.5b) $$

and the inconsistency is just $\mu$ minus this expression.
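To make the procedure concrete, the following sketch (an added illustration, assuming SciPy) evaluates $A$ and $B$ of (4.3.1) for standardized Laplace errors at $\mu = 0$, $\sigma = 1$, and solves (4.3.2a) with $\sigma$ treated as known. The root $\tilde\mu$ is the probability limit of the censored normal MLE, and $\tilde\mu - \mu$ is its asymptotic bias.

```python
import numpy as np
from scipy.stats import norm, laplace
from scipy.integrate import quad
from scipy.optimize import brentq

b = 1.0 / np.sqrt(2.0)            # standardized Laplace: var(z) = 2, so b^2 = 1/2
mu, sigma = 0.0, 1.0              # sigma treated as known

def m(t):                         # m(.) = phi(.)/Phi(.)
    return norm.pdf(t) / norm.cdf(t)

c = -mu / b                                         # truncation point for z
P = 1.0 - laplace.cdf(c)                            # P(y* > 0)
Ez = quad(lambda z: z * laplace.pdf(z), c, np.inf)[0] / P
A = laplace.cdf(c) / P                              # (4.3.1a)
B = mu + b * Ez                                     # (4.3.1b)

# (4.3.2a) with sigma~ = sigma: mu~ + A*sigma*m(-mu~/sigma) - B = 0
mu_t = brentq(lambda u: u + A * sigma * m(-u / sigma) - B, -5.0, 5.0)
print(mu_t - mu)                                    # asymptotic bias
```

In this configuration $m(0)$ exceeds $B$, so the root lies slightly below $\mu = 0$: a small negative bias, in line with the mild censored-case biases at moderate censoring discussed in Section 4.4.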
4.4 Calculations

In this section we report the results of our calculations of the inconsistency of the estimators under non-normality. The distributions considered are t (with 5, 10 and 20 degrees of freedom), Laplace (double exponential), and logistic. In all cases $\sigma = 1$, while $\mu$ varies from $-3.0$ to $3.0$.

The results are given in Tables 4.2 to 4.6, with each table representing a different distribution. The first and second columns give $\mu$ and $P(y^* > 0)$, the latter being a measure of the degree of truncation or censoring in the population. The next two columns give the asymptotic biases of the censored mean $\bar y^*$ and the truncated mean $\bar y$, as defined in (4.2.8) above. The next three columns present the asymptotic bias of the censored, truncated and binary normal MLE's of $\mu$ when $\sigma$ is known; the next two columns give the same information for the case of unknown $\sigma$. Finally, the last two columns present the probability limit $\tilde\sigma$ of the estimate of $\sigma$, for the truncated and censored cases (with unknown $\sigma$, obviously). Our results for the truncated case and known $\sigma$ correspond to the results of Goldberger (1980), though with sign changes since he considered upper truncation.

Most of the results are qualitatively similar for all distributions. First, the asymptotic biases of all estimators except the binary estimator disappear as $\mu$ gets large (that is, as the sample becomes complete), as they must. Second, it makes a considerable difference whether one knows the disturbance variance. The estimators which assume a known $\sigma$ generally have a much smaller bias than those which also estimate $\sigma$. This is not uniformly true (it can't be, since the biases change signs, and thus each equals zero at some point), but for many values of $\mu$ the difference is huge. Since $\sigma$ will in practice never be known, our results are more pessimistic than Goldberger's, which were based on known $\sigma$.
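The contrast between known and unknown $\sigma$ can be reproduced directly from (4.3.3). The sketch below (an added illustration, assuming SciPy; logistic errors, $\mu = 0$, a 50 percent truncated population) solves (4.3.3a) alone with $\tilde\sigma = 1$ fixed, and then solves (4.3.3a)-(4.3.3b) jointly; in this configuration the joint solution lies much farther from $\mu$ than the known-variance solution, illustrating the point just made.

```python
import numpy as np
from scipy.stats import norm, logistic
from scipy.integrate import quad
from scipy.optimize import brentq, fsolve

b = np.sqrt(3.0) / np.pi          # standardized logistic: var(z) = pi^2/3
mu = 0.0

def m(t):                         # m(.) = phi(.)/Phi(.)
    return norm.pdf(t) / norm.cdf(t)

c = -mu / b
P = 1.0 - logistic.cdf(c)
Ez = quad(lambda z: z * logistic.pdf(z), c, np.inf)[0] / P
Ez2 = quad(lambda z: z * z * logistic.pdf(z), c, np.inf)[0] / P
B = mu + b * Ez                                     # (4.3.1b)
C = mu**2 + b**2 * Ez2 + 2.0 * b * mu * Ez          # (4.3.1c)

# sigma known (= 1): solve (4.3.3a) alone
mu_known = brentq(lambda u: u + m(u) - B, -8.0, 8.0)

# sigma unknown: solve (4.3.3a) and (4.3.3b) jointly
def plim_eqs(t):
    u, s = t
    return [u + s * m(u / s) - B,
            u**2 - s**2 + u * s * m(u / s) - 2.0 * B * u + C]

mu_unk, sig_unk = fsolve(plim_eqs, x0=[0.0, 1.0])
print(mu_known - mu, mu_unk - mu, sig_unk)
```

The known-variance root sits close to $\mu$, while the joint root does not, which is exactly the pattern the tables are summarizing.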
Figure 4.1, which plots the biases for the $t_{10}$ case, gives a good visual illustration of the two points just made. It also illustrates a third major conclusion: the bias of the censored estimator is generally considerably less than that of the estimator from the truncated sample. Again, this is not uniformly so, but for many values of $\mu$ the difference is quite considerable. This can be seen perhaps more clearly in Figure 4.2, in which the vertical axis is inflated.

A fourth point concerns the comparison of the normal MLE's with the sample means $\bar y$ and $\bar y^*$, which do not correct for truncation or censoring bias. When $\sigma$ is known, the bias of the normal MLE is less than the bias of the sample mean, for both truncated and censored samples, for all of the distributions and parameter values we considered. Thus it is better to correct than not to correct, even though the correction is biased, for all cases we considered. (Whether this is true in general is an interesting question.) However, this is not necessarily the case when $\sigma$ is unknown; then the biased correction can be worse than no correction. This can be seen in the tables for both the truncated and censored cases, or in Figure 4.1 for the truncated case.

A comparison across distributions is illustrated in Figure 4.3 for the case of a censored sample with known variance, in Figure 4.4 for a truncated sample with known variance, and in Figure 4.5 for a binary sample. As would be expected, the bias is worse for distributions which are more non-normal (e.g., $t_5$ or Laplace vs. $t_{20}$).

Finally, Figure 4.6 illustrates a comparison of the biases in the censored, truncated and binary cases. The binary bias does not go to zero as $\mu \to \infty$. It is interesting that where any two of the bias curves intersect, all three do; Goldberger has shown that this is so for any distribution (personal communication).
For the distributions we consider, the censored bias is always between the binary and truncated biases, though we have not proved this as a general result.

4.5 Extension to the Regression Case

In this section the robustness to non-normality of the Tobit estimator of regression parameters is investigated. For a simple linear regression with one explanatory variable, the asymptotic bias of the slope coefficient and constant term has been derived for both the censored and truncated Tobit models under the assumption that the regressor is a dummy variable. Suppose we have (in principle) a random sample of size $T$ on the regression

$$ y_i^* = \alpha + \beta x_i + \varepsilon_i, $$

where the $\varepsilon_i$'s are iid as $N(0, \sigma^2)$ and $x_i$ is a non-random variable with known values for the $T$ observations. If the value of $y_i^*$ were known for all $T$ observations, this would be the ordinary regression case and the normal MLE's (least squares) would be robust to non-normality. Let us now consider two alternative assumptions about what is observed on $y_i^*$, given that $x_i$ is always observed.

1. Censored Tobit Model. Assume that we observe not $y_i^*$ but rather $y_i = \max(0, y_i^*)$. Assuming that the first $n$ $y$'s are the positive observations, the normal log likelihood function can be written as

$$ L = \text{constant} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - (\alpha+\beta x_i)\bigr)^2 + \sum_{i=1}^{T-n}\ln\Phi\!\left(-\frac{\alpha+\beta x_i}{\sigma}\right), \qquad (4.5.1) $$

where $\Phi$ is the cdf of the unit normal and the second sum runs over the censored observations. The normal MLE's $\hat\alpha$, $\hat\beta$, $\hat\sigma$ can be computed from the first order conditions:

$$ \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr) - \frac{1}{n}\sum_{i=1}^{T-n}\hat\sigma\, m\!\left(-\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.2a) $$

$$ \frac{1}{n}\sum_{i=1}^{n} x_i\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr) - \frac{1}{n}\sum_{i=1}^{T-n} x_i\,\hat\sigma\, m\!\left(-\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.2b) $$

$$ -\hat\sigma^2 + \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr)^2 + \frac{1}{n}\sum_{i=1}^{T-n}(\hat\alpha+\hat\beta x_i)\,\hat\sigma\, m\!\left(-\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.2c) $$

where $m(\cdot) = \phi(\cdot)/\Phi(\cdot)$, $\phi$ is the unit normal density, and $\Phi$ is the unit normal cdf.

2. Truncated Tobit Model. Here we observe $y_i\ (= y_i^*)$ if and only if $y_i^* > 0$. In other words, only positive values of $y_i^*$ are observed, and therefore used in estimation.
The normal log likelihood is

$$ L = \text{constant} - n\ln\sigma - \frac{1}{2\sigma^2}\sum_{i=1}^{n}\bigl(y_i - (\alpha+\beta x_i)\bigr)^2 - \sum_{i=1}^{n}\ln\Phi\!\left(\frac{\alpha+\beta x_i}{\sigma}\right), \qquad (4.5.3) $$

and the first order conditions which yield the normal MLE's are

$$ \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr) - \frac{1}{n}\sum_{i=1}^{n}\hat\sigma\, m\!\left(\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.4a) $$

$$ \frac{1}{n}\sum_{i=1}^{n} x_i\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr) - \frac{1}{n}\sum_{i=1}^{n} x_i\,\hat\sigma\, m\!\left(\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.4b) $$

$$ -\hat\sigma^2 + \frac{1}{n}\sum_{i=1}^{n}\bigl(y_i - (\hat\alpha+\hat\beta x_i)\bigr)^2 + \frac{1}{n}\sum_{i=1}^{n}(\hat\alpha+\hat\beta x_i)\,\hat\sigma\, m\!\left(\frac{\hat\alpha+\hat\beta x_i}{\hat\sigma}\right) = 0 \qquad (4.5.4c) $$

4.6 Derivation of the Inconsistency and Calculations

Now assume, as before, that the true distribution of the disturbances is a symmetric distribution other than normal. Let $z$ be a random variable with such a distribution with zero mean, and let $f$ and $F$ denote the density and cdf of $z$. Furthermore, let $bz$ be the error term, where $b^2 = [\mathrm{var}(z)]^{-1}$, so that the errors have a variance of one. Thus we have the regression

$$ y_i^* = \alpha + \beta x_i + \varepsilon_i, \qquad \varepsilon_i = bz_i. $$

At this stage, in order to make the derivations of the bias tractable, we assume that $x$ is a dummy variable which takes the value of unity $p$ percent of the time and, therefore, the value of zero $1-p = q$ percent of the time (in the original hypothetical population). Putting it differently, $x$ can be regarded as a Bernoulli random variable such that $P(x = 1) = p$ and $P(x = 0) = q = 1-p$.

Let $\tilde\alpha$, $\tilde\beta$, and $\tilde\sigma$ denote the probability limits of the MLE's $\hat\alpha$, $\hat\beta$, and $\hat\sigma$. Then taking the probability limits of the first order conditions in (4.5.2) and (4.5.4), using the results of Appendix B, and rearranging terms, we obtain the following.

Censored Case

$$ \tilde\alpha + p_1\tilde\beta + \gamma p_2\tilde\sigma\, m(-\tilde t_1) + \gamma q_2\tilde\sigma\, m(-\tilde t_0) - A = 0 \qquad (4.6.1a) $$

$$ p_1(\tilde\alpha + \tilde\beta) + \gamma p_2\tilde\sigma\, m(-\tilde t_1) - p_1 D = 0 \qquad (4.6.1b) $$

$$ \tilde\alpha^2 + p_1\tilde\beta^2 - \tilde\sigma^2 + 2p_1\tilde\alpha\tilde\beta + \gamma p_2(\tilde\alpha+\tilde\beta)\tilde\sigma\, m(-\tilde t_1) + \gamma q_2\tilde\alpha\tilde\sigma\, m(-\tilde t_0) - 2A\tilde\alpha - 2p_1 D\tilde\beta + C = 0 \qquad (4.6.1c) $$

where $\tilde t_0 = \tilde\alpha/\tilde\sigma$, $\tilde t_1 = (\tilde\alpha+\tilde\beta)/\tilde\sigma$, and the values of $A = E(y_i^*\,|\,y_i^* > 0)$, $D = E(y_i^*\,|\,y_i^* > 0,\ x_i = 1)$, $C = E(y^{*2}\,|\,y^* > 0)$, $p_1 = (1-q_1)$, $p_2 = (1-q_2)$ and $\gamma$ depend on the distribution of $z$.

Truncated Case
$$ \tilde\alpha + p_1\tilde\beta + p_1\tilde\sigma\, m(\tilde t_1) + q_1\tilde\sigma\, m(\tilde t_0) - A = 0 \qquad (4.6.2a) $$

$$ \tilde\alpha + \tilde\beta + \tilde\sigma\, m(\tilde t_1) - D = 0 \qquad (4.6.2b) $$

$$ \tilde\alpha^2 + p_1\tilde\beta^2 - \tilde\sigma^2 + 2p_1\tilde\alpha\tilde\beta + p_1(\tilde\alpha+\tilde\beta)\tilde\sigma\, m(\tilde t_1) + q_1\tilde\alpha\tilde\sigma\, m(\tilde t_0) - 2A\tilde\alpha - 2p_1 D\tilde\beta + C = 0 \qquad (4.6.2c) $$

Calculations. To illustrate the calculation of the inconsistency of the regression coefficients under non-normality, the Student-t distribution with 5 degrees of freedom is used. Equations (4.6.1a) and (4.6.1b) are solved for $\tilde\alpha$ and $\tilde\beta$, assuming the variance is known ($\tilde\sigma = 1$). For the unknown-variance case, the full system (4.6.1) is solved to calculate the values of $\tilde\alpha$, $\tilde\beta$ and $\tilde\sigma$. As mentioned earlier, the simple regression model with a dummy variable discussed above can be interpreted as a mixture of two random variables with expected values $\mu_1 = \alpha$ and $\mu_2 = \alpha + \beta$, where $\mu_2$ is selected $p$ percent of the time. The probability limits of the estimates of the two means are $\tilde\mu_1 = \tilde\alpha$ and $\tilde\mu_2 = \tilde\alpha + \tilde\beta$, and so $\tilde\beta = \tilde\mu_2 - \tilde\mu_1$. The numerical results are presented in Tables 4.7 to 4.9. Table 4.7 provides the asymptotic biases of the estimates of $\beta$, $\mu_1$ and $\mu_2$, while moving the means of the two random variables in opposite directions from zero. This keeps the degree of truncation more or less constant. The case of moving the means together in the same direction, keeping their distance constant (= 1), is presented in Table 4.8. Both Tables 4.7 and 4.8 assume $p = .5$. Table 4.9 includes the biases when only the degree of contamination ($p$) is changed.

Table 4.7 presents the true values and biases of $\beta$, $\mu_2$ and $\mu_1 = \alpha$. For the case where $\sigma$ is known, the results are identical to the non-regression case (constant term as only regressor) considered earlier (e.g., for $\mu_2 = -3$, the bias of the estimate of $\mu_2$ is .55 here and in Table 4.2). This is because we are in fact estimating the two means separately. With variance unknown, the results are different from the non-regression case because of the extra restriction (same variance in both samples) which is imposed here.
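The censored dummy-regressor MLE just described can also be computed directly by maximizing the log likelihood (4.5.1) on simulated data. The sketch below (an added illustration, assuming SciPy; it maximizes the likelihood numerically rather than solving (4.5.2)) uses normal errors, so the estimates recover $(\alpha, \beta, \sigma)$ and provide a correctly specified baseline against which the non-normal biases in Tables 4.7 to 4.9 can be read.

```python
import numpy as np
from scipy.stats import norm
from scipy.optimize import minimize

rng = np.random.default_rng(2)
T, p = 40_000, 0.5
alpha, beta, sigma = -1.0, 2.0, 1.0       # mu1 = alpha, mu2 = alpha + beta
x = (rng.random(T) < p).astype(float)     # dummy regressor
y = np.maximum(0.0, alpha + beta * x + sigma * rng.standard_normal(T))
pos = y > 0

def negll(theta):                         # minus the log likelihood (4.5.1)
    a, b_, ls = theta
    s = np.exp(ls)                        # enforce s > 0
    xb = a + b_ * x
    ll = np.where(pos,
                  -np.log(s) - 0.5 * ((y - xb) / s) ** 2,
                  norm.logcdf(-xb / s))
    return -ll.sum()

res = minimize(negll, x0=[0.0, 1.0, 0.0], method="Nelder-Mead",
               options={"xatol": 1e-6, "fatol": 1e-6, "maxiter": 5000})
a_hat, b_hat, s_hat = res.x[0], res.x[1], np.exp(res.x[2])
print(a_hat, b_hat, s_hat)                # approximately (-1, 2, 1)
```

Replacing the normal draws by standardized non-normal ones (e.g., scaled $t_5$) while keeping the normal likelihood would reproduce, in large samples, the kind of biases tabulated below.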
The extra restriction leads to smaller biases for the estimates of $\mu_1$, $\mu_2$ and $\beta$. In general, the bias of the MLE of $\beta$ increases as the absolute value of $\beta$ increases.

Table 4.8 changes the degree of truncation, while keeping the distance between the two means, $\beta$, constant. The results suggest that the inconsistency of $\tilde\beta$, as well as of each mean, increases with an increase in the degree of truncation, but at a smaller rate than the asymptotic bias of separate estimates of each mean (as in Table 4.2).

Table 4.9 presents the biases as $p$ is increased (keeping $\mu_1$, $\mu_2$ and therefore $\beta$ constant), where $p$ is the proportion of the sample with the larger mean. This may also be viewed as decreasing the degree of truncation, and it will reduce the asymptotic bias of the estimate of the slope coefficient $\beta$. It will also reduce the bias of the estimate of $\mu_1 = \alpha$, which is the same type of effect as in Table 4.7, since the means of the two samples have been chosen to be symmetric with the equal unknown variance.

4.7 Conclusions

We have considered the asymptotic bias of the normal MLE for censored, truncated and binary samples, when the disturbances are, in fact, non-normal. We state three practical conclusions.

1. The bias from non-normality can be substantial. This is especially true in the realistic case in which the disturbance variance is unknown. In fact, the bias of the normal MLE can be larger than the bias of the uncorrected sample mean.

2. The censored estimator is usually much less biased than the truncated estimator. Therefore, the limit observations should be used if they are available.

3. The bias due to non-normality depends on the degree of censoring (or truncation) of the sample. For example, for samples that are 75 percent complete ($\mu$ = roughly 1.2 above) there is virtually no bias. For samples that are 50 percent complete ($\mu = 0$), the bias is substantial for truncated samples, though not for censored samples.
For samples that are largely incomplete, the bias is substantial in all cases.

These results, if general, are of considerable practical importance. For example, they give guidance as to cases in which the approach of Lee (1981a, 1981b), based on a less restrictive distributional assumption than normality, may be worth considering. Therefore, it is important to stress that our results may fail to be general, for at least two reasons. First, we have estimated only a simple model, not a general regression. Second, for reasons of tractability we have considered only a limited number of non-normal distributions, all of which are symmetric. Undoubtedly there exist distributions for which one would obtain results that are more pessimistic.

TABLE 4.1.--Truncated moments of selected (symmetric) distributions

[The entries of Table 4.1 -- the cdf and first two truncated moments of the Student, logistic and Laplace distributions, derived in Appendix A -- are not legible in this copy.]
TABLE 4.2.--Asymptotic biases--t5
TABLE 4.3.--Asymptotic biases--t10
TABLE 4.4.--Asymptotic biases--t20
TABLE 4.5.--Asymptotic biases--Laplace

[The numerical entries of Tables 4.2 through 4.5 are not legible in this copy.]
q.mI vouuocaua oouoacoo vouaocsuh vouoacoo auucwn vuumucsua vauoucau vouaocaha vuuoucoo AOAcxvm : 3305.5. 3 a .555. 3 1 so: 3930 oqunumodunnoacwn OAHOuEu¢II . o . v H.313 64 TABLE 4.7.--Asymptotic biases of regression coefficients, Censored - t5 (p = .5) True Values Bias (o-known) Bias (o-unknown) B uz “1 B uz “1 B uz “1 o -6.0 -3.0 3.0 .55 .55 .00 .57 .57 .00 -.01 -2.0 -1.0 1.0 -.11 -.10 .01 -.ll -.10 .01 -.00 -0.8 -0.4 0.4 -.08 -.08 .00 -.08 -.09 -.01 .01 0.0 0.0 0.0 .00 -.04 -.04 .00 -.05 -.05 .02 0.8 0.4 -0.4 .08 .00 -.08 .08 -.01 —.09 .01 2.0 1.0 -l.0 .ll .01 -.10 .ll .01 -.10 -.00 6.0 3.0 -3.0 -.55 .00 .55 -.57 .00 .57 -.Ol 65 TABLE 4.8.-~Asymptotic biases of regression coefficients, Censored - t5 (p = .5) True Values Bias (o-unknown) 8 Hz “1 B u2 “1 0 1.0 -3.0 —4.0 .25 -5.30 -5.55 2.29 1.0 -2.0 -3.0 .22 -2.51 -2.73 1.28 1.0 -1.0 -2.0 .17 - .76 - .93 .51 1.0 0.0 -1.0 .14 - .08 - .22 .09 1.0 1.0 0.0 .04 .02 - .02 - .03 1.0 2.0 1.0 -.01 ' .01 .02 - .04 1.0 3.0 2.0 -.01 .00 .01 - .03 1.0 4.0 3.0 .00 .00 .00 - .01 66 TABLE 4.9.--Asymptotic biases of regression coefficients, Censored - t5 . True Values Bias (0 unknown) 8 “2 ”1 p B “2 “1 0 2.0 1.0 —l.0 .l .34 .00 -.34 .18 2.0 l.0 -l.0 .2 .23 .01 -.22 .09 2.0_ 1.0 -l.0 .3 .17 .01 -.16 .05 2.0 1.0 -l.0 .4 .13 .01 -.12 .02 2.0 1.0 -1.0 .5 .ll .01 -.10 -.00 2.0 1.0 -l.0 .6 .10 .02 -.08 -.02 2.0 1.0 -1.0 .7 .08 .02 -.06 -.03 2.0 1.0 -l.0 .8 .07 .02 -.05 -.04 2.0 1.0 -l.0 .9 .06 .02 -.04 -.05 Asymptotic Bias .5. 
[Figures 4.1 through 4.6 (plots of asymptotic bias against μ) are not reproduced in this scan; only the captions are recoverable:]

Figure 4.1.--Asymptotic Bias of Estimates of μ, t10 Distribution ((1) variance known; (2) variance unknown)

Figure 4.2.--Asymptotic Bias of Estimates of μ, t10 Distribution (variance known)

Figure 4.3.--Asymptotic Bias of Estimates of μ, Censored Sample (variance known)

Figure 4.4.--Asymptotic Bias of Estimates of μ, Truncated Sample (variance known)

Figure 4.5.--Asymptotic Bias of Estimates of μ, Binary Sample

Figure 4.6.--Asymptotic Bias of Estimates of μ, Logistic Distribution (variance known)

APPENDIX A

TRUNCATED MOMENTS OF SELECTED DISTRIBUTIONS

1. Student distribution. Let z be distributed as Student with n degrees of freedom, and denote the density and cdf of z by f and F. Then

    f(z) = h(n)(1 + z²/n)^(-(n+1)/2),   z ∈ (-∞, +∞)

where h(n) = [√n B(1/2, n/2)]^(-1), E(z) = 0 and Var(z) = n/(n-2), and

    F(z) = ∫_{-∞}^{z} f(t) dt

The truncated moments are

    E(z | z > -x) = [(n + x²)/(n-1)] · f(x)/F(x)

    E(z² | z > -x) = [n/(n-2)] · I_T(α, β)/(2F(x)),         x ≤ 0
                   = [n/(n-2)] · [2 - I_T(α, β)]/(2F(x)),   x > 0

where T = n/(n + x²), α = (n-2)/2, β = 3/2, and I_T(α, β) is the incomplete Beta function ratio.

Proof:

    E(z | z > -x) = [1/F(x)] ∫_{-x}^{+∞} z f(z) dz = [(n + x²)/(n-1)] · f(x)/F(x)   (Note A1)

For the second moment,

    E(z² | z > -x) = [h(n)/F(x)] ∫_{-x}^{+∞} z² (1 + z²/n)^(-(n+1)/2) dz

For x ≤ 0 the range of integration is (|x|, +∞); for x > 0 it splits into (-x, 0) and (0, +∞). Using the results in Notes A3 and A4,

    = [h(n)/F(x)] · (1/2) n^(3/2) B(α, β) I_T(α, β),         x ≤ 0
    = [h(n)/F(x)] · (1/2) n^(3/2) B(α, β) [2 - I_T(α, β)],   x > 0

and since h(n) n^(3/2) B(α, β) = n B(α, β)/B(1/2, n/2) = n/(n-2) (Note A3), the stated result follows.

2. Logistic distribution. Let z be distributed as logistic, and denote the density and cdf of z by f and F respectively. Then

    f(z) = e^(-z)(1 + e^(-z))^(-2) = e^z(1 + e^z)^(-2),   z ∈ (-∞, +∞)

where E(z) = 0 and Var(z) = π²/3,

    F(z) = (1 + e^(-z))^(-1) = e^z(1 + e^z)^(-1)

and so 1 - F(z) = (1 + e^z)^(-1) = e^(-z)(1 + e^(-z))^(-1). The truncated moments are

    E(z | z > -x) = -x - ln(1 - F(x))/F(x)

    E(z² | z > -x) = x² + [2/F(x)][x ln(1 - F(x)) - g(-e^x)],                x ≤ 0
                   = x² + [2/F(x)][x ln F(x) + g(-e^(-x)) + π²/6 - x²/2],    x > 0

where g(t) = Σ (t^n/n²).

Proof:

    E(z | z > -x) = [1/F(x)] ∫_{-x}^{+∞} z f(z) dz
                  = [1/F(x)][-x F(x) - ln(1 - F(x))]   (Note A6)
                  = -x - ln(1 - F(x))/F(x)

For the second moment, split ∫_{-x}^{+∞} z² f(z) dz at zero and use the results of Notes A5 and A7:

    E(z² | z > -x) = [1/F(x)][x² F(x) + 2x ln(1 - F(x)) - 2g(-e^x)],              x ≤ 0
                   = [1/F(x)][π²/3 - x²(1 - F(x)) + 2x ln F(x) + 2g(-e^(-x))],    x > 0

which rearrange to the stated forms.

3. Laplace (double exponential) distribution. Let z be distributed as Laplace and denote the density and cdf of z by f and F. Then

    f(z) = (1/2) e^(-|z|),   z ∈ (-∞, +∞)

where E(z) = 0 and Var(z) = 2, and

    F(z) = (1/2) e^z,                              z ≤ 0
         = 1 - (1/2) e^(-z) = (2e^z - 1)/(2e^z),   z > 0

The truncated moments are

    E(z | z > -x) = 1 - x,                x ≤ 0
                  = (1 + x)/(2e^x - 1),   x > 0

    E(z² | z > -x) = 1 + (1 - x)²,                   x ≤ 0
                   = 2 + [1 - (1 + x)²]/(2e^x - 1),  x > 0

Proof:
    E(z | z > -x) = [1/F(x)] ∫_{-x}^{+∞} z f(z) dz
                  = [1/F(x)] · (1/2) e^x (1 - x) = 1 - x,                      x ≤ 0   (Note A8)
                  = [1/F(x)] · (1/2) e^(-x) (1 + x) = (1 + x)/(2e^x - 1),      x > 0   (Note A9)

    E(z² | z > -x) = [1/F(x)] ∫_{-x}^{+∞} z² f(z) dz
                   = [1/F(x)] · (1/2) e^x [1 + (1 - x)²] = 1 + (1 - x)²,       x ≤ 0   (Note A8)
                   = [1/F(x)] [2 - (1/2) e^(-x)(x² + 2x + 2)]
                   = [4e^x - (x² + 2x + 2)]/(2e^x - 1)
                   = 2 + [1 - (1 + x)²]/(2e^x - 1),                            x > 0   (Note A9)

Note (A1)

    ∫_{-x}^{+∞} z (1 + z²/n)^(-(n+1)/2) dz = [-(n/(n-1))(1 + z²/n)^(-(n-1)/2)]_{-x}^{+∞}
        = (n/(n-1))(1 + x²/n)^(-(n-1)/2)
        = [(n + x²)/(n-1)](1 + x²/n)^(-(n+1)/2)

Note (A2)

Substitute w = n/(n + z²), so that 1 + z²/n = 1/w, z² = n(1 - w)/w, and dw = -2nz(n + z²)^(-2) dz. Then, for z ≥ 0,

    z²(1 + z²/n)^(-(n+1)/2) dz = -(1/2) n^(3/2) w^(α-1)(1 - w)^(β-1) dw

where α = (n-2)/2 and β = 3/2.

Note (A3)

    ∫_{0}^{+∞} z²(1 + z²/n)^(-(n+1)/2) dz = (1/2) n^(3/2) ∫_{0}^{1} w^(α-1)(1 - w)^(β-1) dw   (Note A2)
        = (1/2) n^(3/2) B(α, β)

where B(α, β) is the complete Beta function. Also, using the properties B(α, β) = Γ(α)Γ(β)/Γ(α+β) and Γ(α+1) = αΓ(α), it follows that

    B(α, β)/B(1/2, n/2) = B((n-2)/2, 3/2)/B(1/2, n/2) = 1/(n-2)

Note (A4)

    ∫_{0}^{x} z²(1 + z²/n)^(-(n+1)/2) dz = (1/2) n^(3/2) ∫_{T}^{1} w^(α-1)(1 - w)^(β-1) dw   (Note A2; T = n/(n + x²))
        = (1/2) n^(3/2) [B(α, β) - ∫_{0}^{T} w^(α-1)(1 - w)^(β-1) dw]
        = (1/2) n^(3/2) [B(α, β) - B(α, β) I_T(α, β)]

where I_T(α, β) is the incomplete Beta function ratio.

Note (A5)

Some facts (summations are from n = 1 to n = +∞):

    Σ (-1)^(n-1) n^(-2) = π²/12
    Σ (-1)^(n-1) n^(-2) e^(-n|x|) = -Σ n^(-2)(-e^(-|x|))^n = -g(-e^(-|x|))
    Σ (-1)^(n-1) n^(-1) e^(-n|x|) = ln(1 + e^(-|x|))
    Σ (-1)^(n-1) e^(-n|x|) = e^(-|x|)(1 + e^(-|x|))^(-1)
    Σ (-1)^(n-1) n e^(-n|x|) = e^(-|x|)(1 + e^(-|x|))^(-2)
    ∫_{-∞}^{+∞} z² e^(-z)(1 + e^(-z))^(-2) dz = π²/3

For the last statement, see Johnson and Kotz (1970), Chapter 22, p. 4.
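The truncated-mean formulas above admit a quick numerical cross-check. The sketch below is illustrative only (Python, standard library; the helper names, the evaluation points x, and the quadrature settings are our own choices, not from the text): it compares each closed form for E(z | z > -x) with direct quadrature.

```python
import math

def simpson(g, a, b, m=50_000):
    # composite Simpson's rule with m (even) subintervals
    h = (b - a) / m
    s = g(a) + g(b)
    for i in range(1, m):
        s += g(a + i * h) * (4 if i % 2 else 2)
    return s * h / 3.0

def trunc_mean(f, x, tail=50.0):
    # E(z | z > -x) for density f, by direct quadrature over (-x, tail)
    return simpson(lambda z: z * f(z), -x, tail) / simpson(f, -x, tail)

# Student t, n = 5:  E(z | z > -x) = ((n + x^2)/(n - 1)) f(x)/F(x)
n, x = 5, 1.0
B = math.gamma(0.5) * math.gamma(n / 2) / math.gamma((n + 1) / 2)  # B(1/2, n/2)
h_n = 1.0 / (math.sqrt(n) * B)
f_t = lambda z: h_n * (1.0 + z * z / n) ** (-(n + 1) / 2)
F_x = simpson(f_t, -50.0, x)                       # F(x)
t_err = abs((n + x * x) / (n - 1) * f_t(x) / F_x - trunc_mean(f_t, x))

# Logistic:  E(z | z > -x) = -x - ln(1 - F(x)) / F(x)
F_l = lambda z: 1.0 / (1.0 + math.exp(-z))
f_l = lambda z: math.exp(-z) / (1.0 + math.exp(-z)) ** 2
xl = -0.7
log_err = abs(-xl - math.log(1.0 - F_l(xl)) / F_l(xl) - trunc_mean(f_l, xl, tail=40.0))

# Laplace:  E(z | z > -x) = 1 - x (x <= 0),  (1 + x)/(2 e^x - 1) (x > 0)
f_d = lambda z: 0.5 * math.exp(-abs(z))
lap_err1 = abs(1.5 - trunc_mean(f_d, -0.5, tail=40.0))     # x = -0.5, so 1 - x = 1.5
xp = 1.2
lap_err2 = abs((1 + xp) / (2 * math.exp(xp) - 1) - trunc_mean(f_d, xp, tail=40.0))

print(t_err, log_err, lap_err1, lap_err2)          # all close to zero
```

The finite upper limit `tail` stands in for +∞; it is adequate here because each of the three densities has an integrable, rapidly decaying right tail.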
Note (A6)

    ∫_{-x}^{+∞} z e^(-z)(1 + e^(-z))^(-2) dz = ∫_{-x}^{+∞} z Σ (-1)^(n-1) n e^(-nz) dz   (Note A5)
        = Σ (-1)^(n-1) n ∫_{-x}^{+∞} z e^(-nz) dz
        = Σ (-1)^(n-1) (-x e^(nx) + n^(-1) e^(nx))   (integration by parts)
        = -x Σ (-1)^(n-1) e^(nx) + Σ (-1)^(n-1) n^(-1) e^(nx)
        = -x e^x (1 + e^x)^(-1) + ln(1 + e^x)   (Note A5)

Note (A7)

    ∫_{-|x|}^{0} z² e^(-z)(1 + e^(-z))^(-2) dz = ∫_{-|x|}^{0} z² Σ (-1)^(n-1) n e^(nz) dz   (Note A5)
        = Σ (-1)^(n-1) n^(-2) ∫_{-n|x|}^{0} s² e^s ds   (s = nz)
        = Σ (-1)^(n-1) n^(-2) [2 - e^(-n|x|)(n²x² + 2n|x| + 2)]
        = 2 Σ (-1)^(n-1) n^(-2) - x² Σ (-1)^(n-1) e^(-n|x|)
          - 2|x| Σ (-1)^(n-1) n^(-1) e^(-n|x|) - 2 Σ (-1)^(n-1) n^(-2) e^(-n|x|)

Using the facts in Note A5, we get

        = π²/6 - x² e^(-|x|)(1 + e^(-|x|))^(-1) - 2|x| ln(1 + e^(-|x|)) + 2g(-e^(-|x|))

where g(t) = Σ (t^n/n²).

Note (A8)

Assume x ≤ 0 and use integration by parts:

    ∫_{-x}^{+∞} (1/2) z e^(-z) dz = (1/2) e^x (1 - x)
    ∫_{-x}^{+∞} (1/2) z² e^(-z) dz = (1/2) e^x [1 + (1 - x)²]

Note (A9)

Assume x > 0 and do integration by parts:

    ∫_{-x}^{+∞} (1/2) z e^(-|z|) dz = ∫_{-x}^{0} (1/2) z e^z dz + ∫_{0}^{+∞} (1/2) z e^(-z) dz
        = (1/2) e^(-x)(1 + x)
    ∫_{-x}^{+∞} (1/2) z² e^(-|z|) dz = ∫_{-x}^{0} (1/2) z² e^z dz + ∫_{0}^{+∞} (1/2) z² e^(-z) dz
        = 2 - (1/2) e^(-x)(x² + 2x + 2)

APPENDIX B

SOME PROBABILITY LIMITS

Let y_i = α + βx_i + bz_i, where the z_i are iid symmetric random variables with E(z_i) = 0, and denote the density and cdf of z by f and F. Assume that b² = (Var(z_i))^(-1), which makes the variance of bz unity as a normalization, and that x_i is a dummy variable (Bernoulli random variable) with P(x_i = 1) = p and P(x_i = 0) = q = 1 - p.
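The probabilities and limits derived below for this model lend themselves to a Monte Carlo spot-check. The sketch below (Python, standard library; the parameter values and the logistic choice for z are illustrative assumptions, not from the text) simulates the model and compares sample frequencies with the expressions labeled (B3) and (B5) below.

```python
import math
import random

random.seed(7)
alpha, beta, p = 0.2, 0.5, 0.4            # illustrative parameter values
b = math.sqrt(3.0) / math.pi              # b^2 = 1/Var(z); Var(z) = pi^2/3 for logistic z
F = lambda v: 1.0 / (1.0 + math.exp(-v))  # logistic cdf; by symmetry P(z > -c) = F(c)

N = 200_000
n_pos = k1 = 0
for _ in range(N):
    x = 1 if random.random() < p else 0
    u = random.random()
    z = math.log(u / (1.0 - u))           # logistic draw by inverse cdf
    if alpha + beta * x + b * z > 0:      # y_i = alpha + beta*x_i + b*z_i > 0
        n_pos += 1
        k1 += x

q = 1.0 - p
P_pos = p * F((alpha + beta) / b) + q * F(alpha / b)  # (B3): P(y > 0)
p1 = p * F((alpha + beta) / b) / P_pos                # (B5): P(x = 1 | y > 0)
sim_P, sim_p1 = n_pos / N, k1 / n_pos
print(sim_P, P_pos)    # simulated vs. theoretical P(y > 0)
print(sim_p1, p1)      # simulated vs. theoretical p1
```

Any symmetric z could be substituted, provided b is reset to match the normalization b² = 1/Var(z).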
Then

    P(y > 0 | x = 1) = P(z > -(α+β)/b) = F((α+β)/b)                                  (B1)
    P(y > 0 | x = 0) = P(z > -α/b) = F(α/b)                                          (B2)
    P(y > 0) = P(y > 0 | x = 1)P(x = 1) + P(y > 0 | x = 0)P(x = 0)
             = pF((α+β)/b) + qF(α/b)                                                 (B3)
        See (B1) and (B2).
    P(y < 0) = pF(-(α+β)/b) + qF(-α/b)                                               (B4)
    P(x = 1 | y > 0) = P(y > 0 | x = 1)P(x = 1)/P(y > 0)
                     = pF((α+β)/b)/[pF((α+β)/b) + qF(α/b)] ≡ p₁                      (B5)
    P(x = 1 | y < 0) = pF(-(α+β)/b)/[pF(-(α+β)/b) + qF(-α/b)] ≡ p₂                   (B6)
    E(bz | y > 0, x = 1) = E(bz | α + β + bz > 0) = bE(z | z > -(α+β)/b)             (B7)
    E(bz | y > 0, x = 0) = bE(z | z > -α/b)                                          (B8)
    E(y | y > 0, x = 1) = E(α + β + bz | α + β + bz > 0)
                        = (α+β) + bE(z | z > -(α+β)/b)                               (B9)
    E(y | y > 0, x = 0) = α + bE(z | z > -α/b)                                       (B10)
    E(y² | y > 0, x = 1) = E[(α + β + bz)² | α + β + bz > 0]
                         = (α+β)² + 2(α+β)bE(z | z > -(α+β)/b) + b²E(z² | z > -(α+β)/b)   (B11)
    E(y² | y > 0, x = 0) = α² + 2αbE(z | z > -α/b) + b²E(z² | z > -α/b)              (B12)
    E(y | y > 0) = p₁E(y | y > 0, x = 1) + q₁E(y | y > 0, x = 0), (p₁ + q₁ = 1)      (B13)
        See (B9) and (B10).
    E(y² | y > 0) = p₁E(y² | y > 0, x = 1) + q₁E(y² | y > 0, x = 0)                  (B14)

The following are some probability limits. Let α̃, β̃, σ̃ be the probability limits of α̂, β̂, σ̂. As T → +∞ (n → +∞), we have the following probability limits:

    (1/n) Σ_{i=1}^{n} xᵢ² = (1/n) Σ_{i=1}^{n} xᵢ = k₁/n → P(x = 1 | y > 0) = p₁      (B15)
        See (B5).
    [1/(T-n)] Σ_{i=1}^{T-n} xᵢ² = [1/(T-n)] Σ_{i=1}^{T-n} xᵢ = k₂/(T-n)
        → P(x = 1 | y < 0) = p₂                                                      (B16)
        See (B6).
    (T-n)/n → P(y < 0)/P(y > 0) ≡ γ                                                  (B17)
        See (B3) and (B4).
    k₂/n = [(T-n)/n][k₂/(T-n)] → γp₂                                                 (B18)
    (T-n-k₂)/n = (T-n)/n - k₂/n → γ - γp₂ = γq₂                                      (B19)
    (1/n) Σ_{i=1}^{n} xᵢyᵢ = (k₁/n)(1/k₁) Σ_{xᵢ=1} yᵢ → p₁E(y | y > 0, x = 1)        (B20)
        See (B9).
    (1/n) Σ_{i=1}^{n} [yᵢ - (α̂ + β̂xᵢ)] = (1/n) Σ yᵢ - β̂(1/n) Σ xᵢ - α̂
        → E(y | y > 0) - p₁β̃ - α̃                                                    (B21)
        See (B15) and (B13).
    (1/n) Σ_{i=1}^{n} xᵢ[yᵢ - (α̂ + β̂xᵢ)] = (1/n) Σ xᵢyᵢ - α̂(1/n) Σ xᵢ - β̂(1/n) Σ xᵢ²
        → p₁[E(y | y > 0, x = 1) - α̃ - β̃]                                           (B22)
        See (B15) and (B20).
    (1/n) Σ_{i=1}^{n} [yᵢ - (α̂ + β̂xᵢ)]² → E(y² | y > 0) + p₁β̃² + 2p₁α̃β̃
        - 2p₁β̃[α + β + bE(z | z > -(α+β)/b)] - 2α̃E(y | y > 0) + α̃²                 (B23)
    (1/n) Σ_{i=1}^{T-n} m(-(α̂ + β̂xᵢ)/σ̂)
        = (1/n) Σ_{i=1}^{k₂} m(-(α̂+β̂)/σ̂) + (1/n) Σ_{i=1}^{T-n-k₂} m(-α̂/σ̂)
        = (k₂/n) m(-(α̂+β̂)/σ̂) + [(T-n-k₂)/n] m(-α̂/σ̂)
        → γp₂ m(-(α̃+β̃)/σ̃) + γq₂ m(-α̃/σ̃)                                           (B24)
        See (B18) and (B19).
    (1/n) Σ_{i=1}^{T-n} xᵢ m(-(α̂ + β̂xᵢ)/σ̂) = (1/n) Σ_{i=1}^{k₂} m(-(α̂+β̂)/σ̂)
        → γp₂ m(-(α̃+β̃)/σ̃)                                                          (B25)
        See (B18).
    (1/n) Σ_{i=1}^{T-n} (α̂ + β̂xᵢ) m(-(α̂ + β̂xᵢ)/σ̂)
        = α̂(1/n) Σ m(-(α̂ + β̂xᵢ)/σ̂) + β̂(1/n) Σ xᵢ m(-(α̂ + β̂xᵢ)/σ̂)
        → α̃·(B24) + β̃·(B25)                                                         (B26)
    (1/n) Σ_{i=1}^{n} m((α̂ + β̂xᵢ)/σ̂) = (1/n) Σ_{i=1}^{k₁} m((α̂+β̂)/σ̂) + (1/n) Σ_{i=k₁+1}^{n} m(α̂/σ̂)
        → p₁ m((α̃+β̃)/σ̃) + q₁ m(α̃/σ̃)                                               (B27)
        See (B15).
    (1/n) Σ_{i=1}^{n} xᵢ m((α̂ + β̂xᵢ)/σ̂) = (1/n) Σ_{i=1}^{k₁} m((α̂+β̂)/σ̂)
        → p₁ m((α̃+β̃)/σ̃)                                                            (B28)
    (1/n) Σ_{i=1}^{n} (α̂ + β̂xᵢ) m((α̂ + β̂xᵢ)/σ̂)
        = α̂(1/n) Σ m((α̂ + β̂xᵢ)/σ̂) + β̂(1/n) Σ xᵢ m((α̂ + β̂xᵢ)/σ̂)
        → α̃·(B27) + β̃·(B28)                                                         (B29)

CHAPTER V

CONCLUSIONS

The Tobit model is being employed with increasing frequency in economics and other areas. The assumptions underlying the model are quite strong, and more attention must be paid to the effects of violating those assumptions to avoid erroneous inferences.

The implication of heteroskedasticity in the Tobit model is investigated in Chapter II. The model considered has only a constant term, and heteroskedasticity is introduced into the model by assuming two distinct subsamples, each with a different variance of the normal random error. Calculating the asymptotic mean of the estimator for a variety of parameter values, the main conclusions can be summarized as follows: (1) heteroskedasticity will lead to inconsistent estimates of the coefficients, the severity of which increases with the severity of heteroskedasticity and the degree of truncation or censoring of the sample; and (2) heteroskedasticity of a given severity causes less inconsistency in the censored Tobit model than in the corresponding truncated model.
While it is dangerous to generalize the results of our simple model, it is reasonable to conclude that moderate heteroskedasticity (say, variances differing by a factor of two) is not likely to cause substantial inconsistency in the censored Tobit model unless the sample is heavily censored (more than half of the observations at the limit). The study of the truncated case by Hurd (1979) is less optimistic. It should also be noted that heteroskedasticity invalidates the usual test statistics, even in the complete-sample regression case.

Considering these implications of heteroskedasticity in the Tobit model, some general test for heteroskedasticity would be useful. Such a test is developed in Chapter III, using the Lagrangian multiplier test principle. Assuming (as an alternative to homoskedasticity) that the variance of the error terms is a linear function of some exogenous variables, the test statistics are given for both the truncated and censored cases. Although an interpretation of the form of the test statistics is not simple, the test statistics are not difficult to calculate, using the estimated Tobit residuals. This test has the same asymptotic properties as the likelihood ratio test.

It should be added that if a theoretical specification of the cause of heteroskedasticity can be made, the model can be corrected for heteroskedasticity by taking it into account in the estimation. However, the attraction of the Lagrangian multiplier procedure is that one can test for heteroskedasticity first, to see whether or not the more cumbersome estimation procedure is necessary.

The robustness of Tobit estimators to non-normality is considered in Chapter IV. The asymptotic bias of the estimate of the mean of a random variable, for censored, truncated and binary samples, when normality is assumed but the distribution is in fact non-normal, has been calculated using a variety of parameter values. Our analysis suggests that:
1. The bias due to non-normality can be substantial. This is especially true in the realistic case in which the disturbance variance is unknown. In fact, the bias of the normal MLE can be larger than the bias of the uncorrected sample mean.

2. The censored estimator is usually much less biased than the truncated estimator. Therefore, the limit observations should be included if available.

3. The bias due to non-normality depends on the degree of censoring (or truncation) of the sample. For example, for samples that are 75 percent complete, there is virtually no bias, while for largely incomplete samples the bias is substantial. One practical significance of our results is that, for largely incomplete samples, estimation methods based on less restrictive distributional assumptions may be worth considering.

It is important to stress that our results may fail to be general for two reasons. First, we have estimated only a mean, not a regression. Second, for the sake of tractability, we have considered only a limited number of non-normal distributions, all of which are symmetric.

BIBLIOGRAPHY

Aitchison, J., and S. D. Silvey (1958), "Maximum-likelihood Estimation of Parameters Subject to Restraints," Annals of Mathematical Statistics, 29, 813-828.

(1960), "Maximum-likelihood Estimation Procedures and Associated Tests of Significance," Journal of the Royal Statistical Society, Series B, 22, 145-171.

Amemiya, T. (1973), "Regression Analysis when the Dependent Variable Is Truncated Normal," Econometrica, 41, 997-1016.

(1977), "A Note on a Heteroskedastic Model," Journal of Econometrics, 6, 365-370.

Breusch, T. S., and A. R. Pagan (1979), "A Simple Test for Heteroskedasticity and Random Coefficient Variation," Econometrica, 47, 1287-1294.

(1980), "The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics," Review of Economic Studies, XLVII, 239-253.

Cragg, J. G.
(1971), "Some Statistical Models for Limited Dependent Variables with Applications to the Demand for Durable Goods," Econometrica, 39, 829-844.

Goldberger, A. S. (1980), "Abnormal Selection Bias," SSRI Discussion Paper 8006, University of Wisconsin, Madison.

Gronau, R. (1973), "The Effect of Children on the Household's Value of Time," Journal of Political Economy, 81, 168-199.

(1974), "Wage Comparisons--A Selectivity Bias," Journal of Political Economy, 82, 1119-1144.

Hausman, J. A. (1978), "Specification Tests in Econometrics," Econometrica, 46, 1251-1272.

Hausman, J. A., and D. A. Wise (1977), "Social Experimentation, Truncated Distributions, and Efficient Estimation," Econometrica, 45, 919-938.

Heckman, J. (1974), "Shadow Prices, Market Wages, and Labor Supply," Econometrica, 42, 679-694.

(1976), "The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models," Annals of Economic and Social Measurement, 5, 475-492.

(1979), "Sample Selection Bias as a Specification Error," Econometrica, 47, 153-161.

Hurd, M. (1979), "Estimation in Truncated Samples When There is Heteroskedasticity," Journal of Econometrics, 11, 247-258.

Johnson, N., and S. Kotz (1969), Discrete Distributions, New York: Wiley.

(1970), Continuous Univariate Distributions, Vols. I and II, New York: Wiley.

Judge, G. G., W. E. Griffiths, R. C. Hill, and T. C. Lee (1980), The Theory and Practice of Econometrics, New York: Wiley.

Lee, L. F. (1981a), "Estimation of Some Non-Normal Limited Dependent Variable Models," Discussion Paper 43, Center for Econometrics and Decision Sciences, University of Florida.

(1981b), "A Specification Test for Normality Assumption for the Truncated and Censored Tobit Models," Discussion Paper 44, CEDS, University of Florida.

(1981c), "A Specification Test for Normality in the Generalized Censored Regression Models," Unpublished Paper.

Lee, L. F., and R. P.
Trost (1978), "Estimation of Some Limited Dependent Variable Models with Applications to Housing Demand," Journal of Econometrics, 8, 357-382.

Maddala, G. S. (1979), "Specification Errors in Limited Dependent Variable Models," Discussion Paper, University of Florida.

Maddala, G. S., and F. D. Nelson (1975), "Specification Errors in Limited Dependent Variable Models," NBER Working Paper No. 96.

McFadden, D. (1974), "Conditional Logit Analysis of Qualitative Choice Behavior," in Frontiers in Econometrics, New York: Academic Press.

Nelson, F. D. (1977), "Censored Regression Models with Unobserved, Stochastic Censoring Thresholds," Journal of Econometrics, 6, 309-328.

(1979), "The Effect of and A Test for Misspecification in the Censored-Normal Model," Caltech Social Science Working Paper No. 291.

(1981), "A Test for Misspecification in the Censored Normal Model," Econometrica, forthcoming.

Olmsted, J. M. (1961), Advanced Calculus, New Jersey: Prentice-Hall.

Olsen, R. J. (1979), "Tests for the Presence of Selectivity Bias and Their Relation to Specification of Functional Form and Error Distribution," ISPS Working Paper 812, Yale University.

Pearson, K. (1968), Tables of the Incomplete Beta-Function, 2nd Ed., Cambridge: The University Press.

Prais, S. J., and H. S. Houthakker (1955), The Analysis of Family Budgets, New York: Cambridge University Press.

Rao, C. R. (1973), Linear Statistical Inference and Its Applications, 2nd Ed., New York: Wiley.

Schmidt, P. (1976), Econometrics, New York: Marcel Dekker.

Schmidt, P., and R. P. Strauss (1975), "The Prediction of Occupation Using Multiple Logit Models," International Economic Review, 16, 471-486.

Silvey, S. D. (1959), "The Lagrangian Multiplier Test," Annals of Mathematical Statistics, 30, 389-407.

Smirnov, N. V. (1961), Tables for the Distribution and Density Functions of t-Distribution, New York: Pergamon Press.

Tobin, J.
(1958), "Estimation of Relationships for Limited Dependent Variables," Econometrica, 26, 24-36.