SEMIPARAMETRIC ESTIMATION OF MULTIVARIATE TOBIT MODEL

By

Bih-Shiow Chen

A DISSERTATION

Submitted to Michigan State University in partial fulfillment of the requirements for the degree of

DOCTOR OF PHILOSOPHY

Department of Economics

1988

ABSTRACT

SEMIPARAMETRIC ESTIMATION OF MULTIVARIATE TOBIT MODEL

By

Bih-Shiow Chen

The purpose of this dissertation is to study distribution-free methods of estimation for the simultaneous equation Tobit model. The simultaneous equation model studied here contains censored dependent variables, and also some dependent variables of the usual continuous type. The typical treatment of this kind of model is to assume that the error terms follow a multivariate normal distribution and to estimate the parameters by the method of maximum likelihood. If the normality assumption for the error terms is correct, the normal MLE is consistent and asymptotically efficient. However, if the normality assumption is incorrect, the normal MLE is inconsistent, and it is therefore desirable to have available estimators whose consistency does not hinge on a specific distributional assumption. In this dissertation we propose such estimators and consider their efficiency.

Our method of estimation involves estimating the reduced form equations by distribution-free methods, and then deriving estimates of the structural parameters from the estimates of the reduced form parameters by the minimum distance method. As expected, our estimator (like other robust estimators) is inefficient relative to the normal MLE when the error terms are indeed normal. In this dissertation we measure the extent of this inefficiency for particular parameter values and sequences of exogenous variables by comparing the asymptotic covariance matrices of our estimator and of the normal MLE.

From our efficiency comparison experiments we find three important results. First, our robust estimators become less efficient relative to the normal MLE when the correlation between the error terms in the different equations is increased. Second, our estimators become less efficient relative to the normal MLE as the degree of censoring increases. Third, the comparison between our estimators, which are based on Powell's CLAD and SCLS estimators, also depends heavily on the degree of censoring.

ACKNOWLEDGEMENTS

I would like to express my deepest thanks to Professor Peter Schmidt, chairman of my dissertation committee. His insightful suggestions, enthusiastic encouragement, and everlasting help during the period of my dissertation writing are most highly appreciated. Without his help I could not have finished my dissertation. I also want to thank all other members of my dissertation committee (Professor Daniel S. Hamermesh, Professor Richard T. Baillie, and Professor Jeff E. Biddle) for their kindness and help.

Most importantly, I owe my greatest thanks to my husband, Tzong-Rong Tsai, for his continuing support and never-ending encouragement. Without his unselfish love and considerate help, I could not have finished my graduate study. Also, I would like to thank my son, Jay, who brightened my life with his joy and love. Finally, I want to thank my parents for their love and care since my childhood.
TABLE OF CONTENTS

CHAPTER ONE. INTRODUCTION

CHAPTER TWO. DISTRIBUTION-FREE ESTIMATION OF THE SIMULTANEOUS EQUATIONS TOBIT MODEL
  I. Introduction
  II. One Continuous and One Censored Dependent Variable
      A. Application of CLAD Estimation
      B. Application of SCLS Estimation
  III. m Continuous and n Censored Dependent Variables
      A. Application of CLAD Estimation
      B. Application of SCLS Estimation

CHAPTER THREE. DERIVATION OF ASYMPTOTIC COVARIANCE MATRICES WHEN THE ERROR TERMS ARE BIVARIATE NORMAL

CHAPTER FOUR. RELATIVE EFFICIENCY OF THE MLE AND SEMIPARAMETRIC ESTIMATES
  Reduced Form Equations with One Explanatory Variable
  Reduced Form Equations with Two Explanatory Variables
  Structural Equations

CHAPTER FIVE. SUMMARY AND CONCLUDING REMARKS

APPENDIX A. Proof of Consistency of the Estimator of Σ12 in II.A.1.b of Chapter Two
APPENDIX B. Proof of Consistency of the Estimators of B and Σ12 in II.B.1.b of Chapter Two
APPENDIX C. Proof of Consistency of the Estimator of Σpq in III.A.1.b of Chapter Two
APPENDIX D. Proof of Consistency of the Estimator of Σpq in III.B.1.b of Chapter Two
APPENDIX E. Second Derivatives of the Log Likelihood Function
APPENDIX F. Some Expectations of Truncated Univariate or Bivariate Normal Distributions
APPENDIX G. The Asymptotic Covariance Matrix of the CLAD Estimate
APPENDIX H. The Asymptotic Covariance Matrix of the SCLS Estimate
APPENDIX I. The Identified Structural Models Corresponding to Reduced Form Equations with π2 = (1, 0)', (1, 1)', or (0, 1)'

BIBLIOGRAPHY

CHAPTER ONE

INTRODUCTION

The purpose of this study is to present and investigate distribution-free methods of estimation for the simultaneous equations Tobit model. The model we consider is essentially the model of Nelson and Olsen (1978). Some equations have censored dependent variables while others may have continuous dependent variables, and each dependent variable may appear as an endogenous explanatory variable in other equations. However, Nelson and Olsen assumed normally distributed error terms. In our model this assumption is relaxed, and we adopt distribution-free estimation methods whose consistency does not depend on normality.

This study is an extension to the simultaneous equations case of a large recent literature on the robust estimation of the (single equation) Tobit model. The Tobit model was originally proposed by Tobin (1958). It is of the form

(1)  Yt* = Xt β + εt,   t = 1, ..., T,

where Xt is the t-th observation on a row vector of exogenous explanatory variables, εt is an unobservable error term, and Yt* is the t-th realization of an unobservable dependent variable. We are assumed to observe

(2)  Yt = max(0, Yt*),

and the error terms εt are assumed to be independently and identically distributed (iid) as N(0, σ²).

The standard method of estimation of the Tobit model has been maximum likelihood. The form of the likelihood clearly depends on the assumption of normality of the errors. Arabmazar and Schmidt (1982) and Goldberger (1983) have shown that the normal maximum likelihood estimator is inconsistent when the distribution of the errors is non-normal. Because the assumption of normality is often questionable, there has been considerable interest in finding estimators which are "robust" in the sense of being consistent without having to make specific distributional assumptions for the errors. Such distribution-free consistent estimators have been developed by Duncan (1986), Fernandez (1986), Gourieroux et al. (1987), Horowitz (1986), and Powell (1984, 1986a, 1986b). (In related work, Chamberlain (1987) and Cosslett (1987) have derived bounds for the asymptotic efficiency of such estimators, and Newey (1987a) and Smith (1987) have provided tests of the normality assumption in the Tobit model.)
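To make the censoring in (1)-(2) concrete, the following is a minimal simulation sketch of a single-equation censored regression. The sample size, coefficient values, and normal errors are illustrative assumptions only, not values used anywhere in the dissertation, and any error distribution could be substituted.

# Minimal sketch of the censored regression model (1)-(2); all numbers illustrative.
import numpy as np

rng = np.random.default_rng(0)
T = 5000                                   # sample size (illustrative)
beta = np.array([1.0, -0.5])               # hypothetical coefficient values
X = np.column_stack([np.ones(T), rng.normal(size=T)])
eps = rng.normal(size=T)                   # any error distribution could be used here
y_star = X @ beta + eps                    # latent variable, equation (1)
y = np.maximum(0.0, y_star)                # observed variable, equation (2)

print("fraction of censored observations:", np.mean(y == 0.0))

# Least squares applied to the censored y is biased toward zero, which is why
# estimators built for the censored model (the normal MLE, or the
# distribution-free CLAD and SCLS estimators discussed below) are needed.
b_ls = np.linalg.lstsq(X, y, rcond=None)[0]
print("least squares on censored y:", b_ls, "true beta:", beta)

The attenuated least squares fit in the last two lines is only a reminder that the censored observations cannot be treated as ordinary data; the substantive question in this dissertation is what to do when, in addition, the error distribution is unknown.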
However, while all of these estimators are known to be consistent, not all have known asymptotic distributions, and for some (e.g. the estimator of Fernandez (1986)) T^(1/2)(β̂ - β) does not even have an asymptotic distribution. In this study we will therefore focus on the methods suggested by Powell (1984, 1986b), the "censored least absolute deviations" (CLAD) estimator and the "symmetrically censored least squares" (SCLS) estimator. These estimators are consistent for a wide class of distributions, and they have known asymptotic normal distributions. They are also consistent even in the presence of heteroskedasticity (which the normal MLE is not).

For the usual linear simultaneous equations model, it is possible to estimate the reduced form equations by ordinary least squares, and then to derive structural estimates from the reduced form estimates by the minimum distance method. This procedure was suggested by Chamberlain (1983), who proved that it led to asymptotically efficient estimates of the structural parameters, being essentially an efficient version of indirect least squares. He also showed that it is an approximate version of three stage least squares.

In this study we propose a method of estimating the structural parameters of a simultaneous equations Tobit model which combines the distribution-free estimators of Powell with Chamberlain's minimum distance estimator. Specifically, we solve the simultaneous equations Tobit model for its reduced form, and we estimate the reduced form by ordinary least squares (for those equations whose dependent variables are continuous) and by one of Powell's methods (for those equations whose dependent variables are censored). We derive the asymptotic covariance matrix of these reduced form estimates; this is non-trivial because it involves the joint distribution of two different kinds of estimates. Following Chamberlain, we then derive structural estimates using the minimum distance method. Subject to suitable regularity conditions, this procedure leads to consistent and asymptotically normal estimates of the structural parameters of our simultaneous equations Tobit model.

There is some previous literature on estimation of the simultaneous equations Tobit model by methods other than MLE. The earliest contributions were motivated by computational considerations, since the joint MLE involves integrals of a multivariate normal distribution (the number of integrals being the number of censored variables) and is computationally demanding. Under the assumption of normally distributed error terms, Nelson and Olsen (1978) suggested a two-stage estimator. In the first stage, they estimated the reduced-form equations separately by ordinary least squares and single-equation Tobit maximum likelihood. Then they estimated the structural equations separately by the same methods, but with the predicted values for the endogenous explanatory variables from the first stage substituted into the equations. Amemiya (1979) derived the asymptotic covariance matrix for the Nelson-Olsen estimator and proposed an asymptotically more efficient estimator. He estimated the reduced-form equations as Nelson and Olsen did, but to estimate the structural equations he suggested a form of generalized least squares estimation.
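To fix ideas about the minimum distance step that recurs throughout this dissertation, the sketch below shows how structural parameters could be recovered from reduced-form estimates in the two-equation model of Chapter 2, where π1 = (β1 + γ1β2)/(1 - γ1γ2) and π2 = (β2 + γ2β1)/(1 - γ1γ2). The reduced-form estimates, their covariance matrix, the exclusion restrictions, and the choice of optimizer are all placeholders for illustration; in the dissertation the first-stage estimates come from OLS and from Powell's CLAD or SCLS estimators.

# Minimal sketch of the minimum distance (Chamberlain-type) second step:
# choose delta to minimize [pi_hat - f(delta)]' Omega_hat^{-1} [pi_hat - f(delta)].
import numpy as np
from scipy.optimize import minimize

def f(delta):
    # Map structural parameters (gamma1, gamma2, beta11, beta22) to the stacked
    # reduced-form coefficients (pi1', pi2')'.  The exclusion restrictions
    # (beta12 = beta21 = 0) are an illustrative identifying assumption.
    gamma1, gamma2, beta11, beta22 = delta
    beta1 = np.array([beta11, 0.0])
    beta2 = np.array([0.0, beta22])
    denom = 1.0 - gamma1 * gamma2
    pi1 = (beta1 + gamma1 * beta2) / denom
    pi2 = (beta2 + gamma2 * beta1) / denom
    return np.concatenate([pi1, pi2])

def md_objective(delta, pi_hat, Omega_inv):
    d = pi_hat - f(delta)
    return d @ Omega_inv @ d

# Placeholder first-stage output; in practice pi_hat comes from OLS (continuous
# equation) and CLAD or SCLS (censored equation), and Omega_hat is the estimated
# joint asymptotic covariance matrix derived in Chapter 2.
pi_hat = np.array([1.2, -0.4, 0.8, 0.3])
Omega_inv = np.linalg.inv(0.01 * np.eye(4))

result = minimize(md_objective, x0=np.zeros(4), args=(pi_hat, Omega_inv), method="BFGS")
print("gamma1, gamma2, beta11, beta22:", result.x)

Under the regularity conditions stated in Chapter 2, the resulting structural estimates are asymptotically normal with covariance matrix (F'Ω^(-1)F)^(-1), where F = ∂f(δ)/∂δ'.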
Lee (1981) demonstrated that his G2SLS (generalized two-stage least squares) estimator is asymptotically more efficient than one version of the AGLS (Amemiya's generalized least squares) estimator. Amemiya (1983) compared the AGLS estimator with the Lee-Maddala-Trost (1980) G2SLS estimator for the simultaneous equations Tobit model, and proved that his AGLS estimator is asymptotically more efficient in most cases. Finally, Newey (1987b) provided a definitive treatment of the efficiency of such two-step estimators by showing that the methods which they use to move from reduced-form estimates to structural estimates are dominated by the minimum distance method (which he calls the "minimum chi-square" method), though a form of the AGLS estimator is asymptotically equivalent to the minimum distance estimator. If the minimum distance method is used, the efficiency of the structural estimates simply depends on the efficiency of the first-stage reduced form estimates from which they are derived. For the case in which only one equation has a censored dependent variable, efficient estimation of the reduced form requires augmenting the reduced form equation for the censored dependent variable with the residuals from the other reduced form equations. It is not clear how this result generalizes to models with more than one censored variable.

Because they all rely on single-equation Tobit MLE in the first stage, none of the estimators just discussed is robust to non-normality of the errors. However, if the errors are indeed normal, these estimators may be more efficient than our estimator, because the single-equation Tobit MLE will be more efficient than Powell's estimators. Newey (1985) is apparently the only previous treatment of a simultaneous equations Tobit model which does not impose normality. His model contains only one censored dependent variable with endogenous explanatory variables. He suggested estimating the reduced form equation for the censored dependent variable by Powell's symmetrically censored least squares method, as we do, but he considered AGLS-type estimation of the structural coefficients rather than minimum distance. From Newey (1987b) it is apparent that this is not a substantive difference for the case of only one censored dependent variable, but our treatment is more general in the sense of more readily accommodating an arbitrary number of such variables.

If the errors actually are normal, our estimators are less efficient than the normal MLE. This is the price one pays for gaining robustness to non-normality. It is natural to ask how high this price is likely to be. We attempt to gather some evidence on this question by comparing the asymptotic covariance matrices of our estimates with those of the normal MLE. For given values of the parameters and given sequences of exogenous variables, this is done by calculating the asymptotic covariance matrices of the MLE and of our estimators; it requires a complicated simulation because certain expectations are analytically intractable. An interesting finding is that the efficiency loss varies directly with the degree of censoring. This is the complement to the result (Arabmazar and Schmidt (1982)) that the size of the inconsistency of the normal MLE caused by non-normal errors also varies directly with the degree of censoring.

The structure of the dissertation is as follows.
Chapter 2 sets out the model to be considered, and it derives the asymptotic distributions of the estimates of the reduced form and structural parameters. It treats both the homoskedastic and the heteroskedastic cases. For ease of exposition, it discusses two-equation models before going on to a treatment of the general model with m continuous dependent variables and n censored dependent variables. Chapter 3 derives the asymptotic covariance matrices of the various estimators for the special case that the error terms are bivariate normal. Chapter 4 then reports the results of several experiments which measure the efficiency loss from using our distribution-free estimators (rather than the normal MLE) when the error terms are bivariate normal. Chapter 5 gives our concluding remarks.

CHAPTER TWO

DISTRIBUTION-FREE ESTIMATION OF THE SIMULTANEOUS EQUATIONS TOBIT MODEL

I. INTRODUCTION

In this chapter we discuss the distribution-free estimation of the simultaneous equations Tobit model. We begin in section II with the simple case of a two-equation model in which one dependent variable is continuous while the other is censored. In section III we treat the general case with an arbitrary number of equations and an arbitrary number of each type of dependent variable.

The basic principle of estimation is straightforward. We begin by estimating the reduced form. Those reduced form equations with continuous dependent variables are estimated by ordinary least squares, while those reduced form equations with censored dependent variables are estimated by the censored least absolute deviations (CLAD) estimator or the symmetrically censored least squares (SCLS) estimator of Powell (1984, 1986b). We derive the (joint) asymptotic distribution of the reduced form estimates and a consistent estimator of their asymptotic covariance matrix. The structural coefficients are then estimated by the minimum distance method of Chamberlain (1983).

II. ONE CONTINUOUS AND ONE CENSORED DEPENDENT VARIABLE

In this section we consider the simple two-equation model

(1)  Y1 = γ1 Y2* + X β1 + ε1

(2)  Y2* = γ2 Y1 + X β2 + ε2

where Y2* is a latent T×1 variable; Y1 and Y2* are T×1 vectors; X is a T×K matrix of exogenous variables; ε1 and ε2 are unobservable T×1 error terms; γ1 and γ2 are unknown scalar parameters to be estimated, with 1 - γ1γ2 ≠ 0; and β1 and β2 are K×1 parameter vectors to be estimated. Some elements of β1 and β2 may be known to equal zero. Note that we do not observe Y2*, but we do observe the T×1 vector Y2, which is related to Y2* as follows:

(3)  Yt2 = max(0, Yt2*),   t = 1, 2, ..., T.

The reduced form of this model is

(4)  Y1 = X (β1 + γ1β2)/(1 - γ1γ2) + (ε1 + γ1ε2)/(1 - γ1γ2) = X π1 + v1

(5)  Y2* = X (β2 + γ2β1)/(1 - γ1γ2) + (ε2 + γ2ε1)/(1 - γ1γ2) = X π2 + v2

where Y1, Y2*, and X are the same as above; π1 and π2 are unknown K×1 parameter vectors to be estimated; and v1 and v2 are unobservable T×1 error terms.

A. Application of CLAD Estimation

1. Estimation of the Reduced-Form Parameters (π1 and π2)

We may estimate π1 by ordinary least squares (OLS), and π2 by the censored least absolute deviations (CLAD) estimator introduced by Powell (1984). The CLAD estimator of π2 is defined by Powell (1984) to be the value that minimizes the criterion function

(6)  ST(π2) = (1/T) Σt |Yt2 - max(0, Xtπ2)|.

The CLAD estimator is shown by Powell to be consistent and asymptotically normal, subject to certain regularity conditions. The assumptions are as follows:

(A1) The parameter vector π2 is an interior point of a compact parameter space.
(A2) (Xt, vt1, Vta” is a sequence of independently not (necessarily) identically distributed random vectors. (A3) Buxtn5 < x0 for all t and some positive x0. UT. the smallest characteristic root of the matrix E[(1/T)E1(Xtfla 2 so)xt'xt], has UT > we whenever T > To, some positive so, Do. and To. (A4) Defining Gt(z, *2: r) e E[1(lXtfigl 1 uxtuz)nxtnrl. the function Gt is 0(2) for 2 near zero, fig near we and r : O, 1, a, 3, uniformly in t, i.e.. Gt(z, fig. 1") S K12 if 0 i Z S 60. "fig - "all < 80 r = O, 1, 2, 3, for some positive H1 and 60. (A5) The conditional distribution of Vta given Xt has median zero for all t, and the corresponding distribution functions for the {vta} are continuously differentiable in a uniform neighborhood of zero. with density functions [ft(AlX)l which are uniformly bounded above and uniformly bounded away from zero. i.e.. ft(A|Xt) < L and ft(llxt) > k > 0 whenever IAI < k, some positive L and k, all t. (A6) The conditional density function ft(A|Xt) of the {Vta} is Lipschitz continuous: lft(A1|Xt) - ft(Angt)l 1 LOIA1 - Aal some Lo > 0 (A7) (a) There exist positive finite constants 5 and A such that, for all t, E(|vt13l1+5) < A and E()xt3xtxli+6) < A (J. x = 1. ..., K); (b) fiT = (1/T)EE(Xt’Xt) is nonsingular for (all) T sufficiently large. such that det NT > 5 > 0. (A8) There exist positive finite constants 6 and A such that for all t, E(|vt13xt3xtxli+6) < A (J.K=1....,K). ’ (A9) (a) E(Xt’vt1) = o. (b) EIXtJ§t|3*5 < A < m for some 5 > O, J : 1, ..., k and all t. (1/T)’%Ext’vt1 (c) E : var is uniformly positive (1/T)‘%§xt'zt definite, where at : 1(XtVa > 0)[1/a - 1(vta < 0)) (A10) There exist positive constants 6 and A such that for all t E()xtJ3xtht1)1+6) < A (J, x. 1 = 1. .... K) (A11) E(|1(xtwa > 0)xthtK|1+5) < A < m for some 5 > 0, all t, and J, x = 1. ..., x. (A12) BrgtvtixtJXtK|1+6 < A < m for some a > 0, all t. and a. Consistency and Asymptotic Distribution of (al. fig) To consider the Joint distribution of (fii. fie). we consider e1 — w, (X'X/T)-1(T-%)(§xt'vt1) E91 (7) 1T“) = a2 - we (cT/a)'1(T-%)(§xt'§t) F02 where (8a) CT 5 E[(1/T)Eaf(O|Xt)[1(Xtfla > 0)]xt'xt3 (£(01xt) = f(0) for homoscedasticity case) (8b) 91 : (T'“)(Ext’vt1) (8c) ea = (T-%)(Ext'gt) and (8d) gt = [1(xtna > 0)][1/2 - 1(vt2 < 0)] (Note that 5t is as given by Powell (1984, p.320, equation (A.14).) White (1980a) has proved that (T)%(a1 - «1) is consistent and asymptotically normal under Assumptions (A2), (A7), (A8) and (A9)(a)(c). (T)%(a2 - we) is also consistent and asymptotically normal under Assumptions (A1). (A2), (A3). (A4), (A5), and (A6) according to Powell (1984). The Joint asymptotic normality of fii 1'1 (T“) is derived below. fie - "2 e1 811 212 WhiCh is uniformly III M Ill Let COV 92 z1’12 222 positive definite under Assumption (A9). 8 is calculated as follows. 211 = covr(T-%)(§xt'vt1)) : (1/T)[EE(vat1Xt’Xt)] = vT (By Assumption (A2), (Xt, Vt1) is a sequence of independently distributed random vectors and E(Xt'vt1) : O by Assumption (A9).) Baa = cov{(T‘“)(EXt’Et)J = (1/T)t§cov(xt':t)1 (By Assumptions (A2), (Xt. Vta) is a sequence of independently distributed random vectors.) cov(xt':t) = E<§t3xt'xt) - [E(Xt’Et)]a ' gtaxt'xt = (1/4)[1(xtw2 > 0)xt*xt] E(§t2Xt’Xt) : (1/4)E([1(tha > 0)]xt'xt) E 0)][1/2 - 1(vta < 0)]; = Ewanlxlxt'[1(Xt"2 > 0))[1/2 - 1(vta < 0)); = Exlxt’[1(Xt"a > 0)])EVUJxr1/a - 1(Vta < 0)] = o (v Eualx[1/a - 1(vt2 < 0)] = o by Assumption (A5).) 
A cov(Xt'§t) = (1/4)E([1(th2 > 0))xt'xt) So, 322 : (1/T)[Ecov(xt’Et)] : (1/4T)EE[[1(Xtfla > 0)]xt'xt) = (1/4)HT where MT : (1/T)EE{[1(Xtfla > 0)]Xt'Xt] 213 : cov[(T'“)EXt'§t, (T'“)€Xt’vt1] Etrtr'%)§xt’zt)r(T'%)"t'vt11'3 — t EI(T‘”)§Xt’EtIEI(T'“)§Xt'Vt11 El[(T'“)§Xt'Etl[(T'“)Ext'vt11’l (7 E[(T‘“)Ext'vt1l = 0) : (1/T)EE(EtVt1Xt’Xt) 1L. According to the modified multivariate Liapounov Theorem of White (1980b). if Assumptions (A2), (A5). (A8) and (A9) are satisfied. the asymptotic distribution of 91 V 512 is N(O, ), where v : plim VT, e2 5'1‘": Mu/4 fl” = plim MT. E12 = plim 812. E9 1 1 The asymptotic distribution of then is F92 EVE’ fifi1afi’ N(0, fifi’1afi’ F(fi§/4)F’ where e = plim E = plim (X’X/T)'1 = fi‘1 e : plim F = plim (CT/2)-1 = (6/2)‘1 Therefore, a, — w, fi-ivrri fi'1512(5/a)'1 (T%) —> N(0, fie - we fi'15’12(5/2)'1 6-1fi.e-1 b. Consistent Estimator of Asymptotic Covariance Matrix A consistent estimator of E'1VE'1 is (X’X/T)'1(1/T)(Evtiaxt’xt)(X’X/T)'1, where at. : Yti - Xtfi1~ This has been proved by White (1980a). A consistent estimator of 6‘1fi,6"1 is éT‘ifiTéT'i as was proved by Powell (1984), where 5T 2 2(TeT)-1§[1(xtaa > 0))t1(o 1 Vta s aT))xt'xt 9T 5 (1/T)Ef1(xtfia > 0)]xt'xt Vta = Yta - Xtfia- 15 Here 5T is an appropriately chosen function of the data, and it is assumed that there is some non-stochastic sequence (cT) such that plim éT/CT : 1, cT : 0(1). cT-i = o(T%). That is, 5T ten§s>$o zero in probability, but at a rate slower than T‘“. (Note that 5T» NT. 5T are as given by Powell (1984. pp.312-314. equations (5.1), (5.3), and (5-6))~) The consistent estimator of (1/T)EE[§tvt1Xt'Xt] is (1/T)E[1(Xtaa > 0))[1/2 — 1(Vta < 0)]vt1xt'xt. The proof is in Appendix A. 2. Estimation of Structural-Form Parameters (Y1. 81. ya, sand 82) To estimate y1, 91’ ya, and 83, we adopt the minimum distance method. Minimum distance estimation for simultaneous equations is a generalization of three-stage least squares estimation according to Chamberlain (1983). The only difference is that in the usual linear model the reduced form is estimated by least squares, whereas we use least squares plus a semi—parametric method. The minimum distance estimator of a : (Y1! 91', ya, 83’)’ is derived from 3%? [a - f(a))'e-1ta — f(a)1 where w : f(a); a is a consistent estimator of Q - the asymptotic covariance matrix of T%(a - w)), and Q is positive definite. T : parameter space for a. To adopt the minimum distance principle, we need to add some assumptions. These assumptions are suggested by Chamberlain (1983). They are as follows. Assumption 1: T is a compact subset of R2K+a that contains the true value do and a neighborhood of do. Assumption 2: f(d) is continuous and has second partial derivatives on T; f(a) : f(d°) for a E 7 implies that a = «0; rank (F) : ax + a, where F : 6f(a°)/0a’- If the Assumptions above are satisfied, then a is consistent, and asymptotically normal: The - a) —> mo. 9). where o = (P'n-1P)-1 and P = o£(d°)/od'. B. Application of SCLS Estimation 1. Estimation of the Reduced-Form Parameters (W1 and we) We may estimate W1 by ordinary least square (OLS) and we by symmetrically censored least squares (SCLS) estimation for Tobit equation, which was introduced by Powell (1986). The SCLS estimator of we is defined to be the value that minimizes the criterion function sT(w2) = Eth - max(Yt/a, Xt"2)12 + {1(Yt > extw2)((Yt/a)2 - [max(o, Xt"2)]23 Powell shows that the SCLS estimator is consistent and asymptotically normal, subJect to certain regularity conditions. 
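Both Powell criteria used in this chapter can be written out directly. The sketch below codes the CLAD objective of equation (6) and the SCLS objective just stated, and minimizes each with a generic derivative-free optimizer on simulated placeholder data; it is only meant to make the two criteria concrete, not to reproduce the computational methods one would use in a serious application, and the data-generating values are illustrative assumptions.

# Minimal sketch of the CLAD and SCLS criteria for the censored reduced-form
# equation Yt2 = max(0, Xt pi2 + vt2); all numbers below are placeholders.
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
T = 2000
X = np.column_stack([np.ones(T), rng.normal(size=T)])
pi2_true = np.array([0.5, 1.0])                         # illustrative values
v2 = rng.standard_t(df=5, size=T)                       # symmetric, median-zero, non-normal
y2 = np.maximum(0.0, X @ pi2_true + v2)

def clad(pi2):
    # Equation (6):  ST(pi2) = (1/T) sum_t | Yt2 - max(0, Xt pi2) |
    return np.mean(np.abs(y2 - np.maximum(0.0, X @ pi2)))

def scls(pi2):
    # ST(pi2) = sum_t { [Yt2 - max(Yt2/2, Xt pi2)]^2
    #                   + 1(Yt2 > 2 Xt pi2) [ (Yt2/2)^2 - max(0, Xt pi2)^2 ] }
    xb = X @ pi2
    term1 = (y2 - np.maximum(y2 / 2.0, xb)) ** 2
    term2 = (y2 > 2.0 * xb) * ((y2 / 2.0) ** 2 - np.maximum(0.0, xb) ** 2)
    return np.sum(term1 + term2)

start = np.linalg.lstsq(X, y2, rcond=None)[0]           # crude starting value
pi2_clad = minimize(clad, x0=start, method="Nelder-Mead").x
pi2_scls = minimize(scls, x0=start, method="Nelder-Mead").x
print("CLAD:", pi2_clad, "SCLS:", pi2_scls, "true:", pi2_true)

Because the error distribution used above is symmetric with median zero but not normal, both minimizers remain consistent for π2 under the median and symmetry conditions maintained for CLAD and SCLS respectively, which is the point of using these criteria in place of the normal MLE.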
The assumptions are as follows: (A1)’ Same as (A1) in II.A.1. (A2)’ (A3)' (A4)' (A5)’ (A5)’ (A7)’ (A8)’ (A9)’ Same as (A2) in II.A.1. Buxtu4+fl < x0 for all t and some positive x0 and n. UT. the minimum characteristic root of the matrix AT = (1/T)EE[1(Xtfla z 5°)xt'xt], has uT > no whenever T > To, some positive 20, v0, and To. The conditional distribution of Vta given Xt is continuously and symmetrically distributed about zero, with densities which are bounded above and continuous and positive at zero, uniformly in t. That is, if F(AlXt, t) a Ft(x) is the conditional c.d.f. of Vta given Xt, then dFt(i) : ft(A)dA, where ft(l) : ft(—A), ft(x) < Lo and ft(l) > so whenever IAI < 50, some positive Lo and 50. Same as (A7) in II.A.1. Same as (A8) in II.A.1. Same as (A9) in II.A.1. Same as (A10) in II.A.1. El‘tVtixtJth'1+6 < A < m for some 5 > 0, all t, and J. K = 1. ..., K; E(|1(-tha < Vta < xtw2)xthtK|1+5) < A < m for some 5 > 0, all t. and J, k : 1, ..., K; where 5t : 1(tha > O)min[max(vta, 'Xt"2)- tha]. a. Consistency and Asymptotic Distribution of (51. 62) To consider the Joint distribution of (a1. fie), we cons ider a, - w, (X’X/T)'1(T‘“)(2Xt'vt1) A91 (TM) : t 5 ea — "a cor-1 ('r-M) ”EX" 5.) Bee where CT E (1/T)EE[1(_Xt"2 < Vta < Xt"2)Xt’Xt3 91 = (T‘ulfxt’vt1 ea = (T‘“)(Ext'§t) and Et = 1(Xtfla > O)min[max(vta, -Xtfia), Xtfla] A s (X'X/T)‘1 B a CT’1 White (1980a) has proved that fi1 is consistent and that (T)“(a1 — W1) is asymptotically normal under Assumptions (A2)’, (A5)’. (A6)' and (A7)'(c). Also fig is consistent and (T)“(fia — n2) is asymptotically normal under Assumptions (A1)’. (A2)', (A3)’, and (A4)’. according to Powell (1986). The asymptotic normality of *1 - 1"1 (T“) is derived below. *2 - "2 e1 211 212 Let cov a E e which is uniformly 92 2-12 822 positive definite under Assumption (A7)’. 2 is calculated as follows: 211 : VT (the same as in II.A.1.a) 232 : cov[(T'“)(EXt’Et)] = (l/T)[ECOV(Xt’Et)] (By Assumption (Aa)', (Xt- Vta) is a sequence of independently distributed random vectors.) cov(xt'gt) = E(§t3Xt’Xt) — [E(xt';t))3 v gtaxt'xt : [1(XtVe > O)]min[vt23, (xtwa)2)xt'xt E(;t3xt'xt) : E{[1(tha > O)]min[vtaa, (xtwa)31xt'xt) E(Xt’§t) : Elxt'[1(Xth > 0)]min[max(vta, -Xth), th213 : Exixt’[1(xtwa > 0)]3Ewu|x[min[max(vt2, -Xtfla), thall If Xtfla 1 o, E(Xt’§t) = o. If Xtfia > o. the conditional distribution of {min[max(vta, —tha), th33 given Xt is continuously and symmetrically distributed about zero under Assumption (A4)’. Then E(Xt’§t) = Ex(Xt’)Ewulximin[max(vt2, -tha), Xtflall : o. A cov(Xt'§t) = E([1(tha > O)]min[vtaa, (xtwa)a]xt'xt) So, 232 = (1/T)[Ecov(xt‘§t)] . = (1/T)EE{[1(tha > 0)]min[vtaa, (Xtfl2)a]Xt'th = DT 212 : (1/T)EE(§tVt1Xt'Xt) (The calculation steps are the same as in II.A.1.a.) According to the modified multivariate Liapounov Theorem by White (1980b), if Assumptions (A2)’, (A4)', and (A7)‘ are satisfied, the asymptotic distribution of CI 91 51a is N(o, ), where B = plim DT. Ga 3'12 5 V = Plim VT. E12 : plim 213. Also A61 AVA’ Afi1afi’ —> N(o, ). Beg AE'125’ fibfi’ 20 whore A : plim A = Plim (it’ll/'1‘)"1 = 3‘1 a = plim B = plim cT-i = 5-1 Therefore, G1 - 171 fi-ivfi-1 fi'15128—1 (T94) —> mo. a3 ' "a fi'15’125‘1 6'156‘1 b. Consistent Estimator of Asymptotic Covariance Matrix The consistent estimator of fi'1Vfi‘1 is the same as in II.A.1.b. The consistent estimator of 6 is 8 = (1/T)E[1(-Xtfia < Vta < xtaa)lxt'xt. Where Vta = Yta ' Xtfia = Xt"2’+ Vta” ' xtfia = Vta” ‘ XtBt- (Vta* a maxivta, —th21, 3t = fig - we). The proof is in Appendix B. 
The consistent estimation of D is 5T = (1/T)E[1(Xtfig > O)min{vtaa. (xtee)33xt'xt. This was proved by Powell (1986). The consistent estimator of (1/T)EE[§tVt1xt'th is (1/T)E[1(Xta2 > 0)]min[max(vta, -Xtfia)» Xtfialvtixt'xt. The proof is in Appendix B. 2. Estimation of Structural-Form Parameters (Y1. 91, ya, and 82) To estimate Y1. 81. ya, and 92» we adopt the minimum distance estimator, as shown in II.A.E. The‘only difference is in the asymptotic variance-covariance matrix Q of T%(fi - w). 21 111. m CONTINUOUS AND n CENSORED DEPENDENT VARIABLES In this section we consider the general model (1) Yt* : Yt*B + xtr + at t = 1, .... T or (a) Y” = Y*B + XF + e where Yt* is a 1x6 row vector of endogenous variables; Xt is a 1xK row vector of exogenous variables; B is a 6x6 nonsingular matrix to be estimated and I - B is nonsingular; P is a finite unknown KxG matrix to be estimated; 6t is a 1x6 row vector of unobservable error terms; Y” is a TxG matrix of endogenous variables; X is a TxK matrix of exogenous variables: 6 is a TxG matrix of unobservable error terms. We do not observe (all of) Yt*, but we observe the 1x6 vector Yt, defined as follows. (3) Ytl : Yt1* 1 : 1, ..., m (4) Ytl : max(0, Yt1*), i : m+1, .... G where G : m + n. 80 we do observe the first m variables in Yt*, but the last n variables are censored from below at zero. The reduced form of this model is (5) Yt* : xtw + vt or (6) Y” : Xv + v where Y*, Yt*, X, and Xt are the same as above; w : P(I ~ B)“1 is a KxG matrix of reduced form 22 parameters; vt = €t(I - B)“1 is a 1x6 row vector of unobservable error terms. A. Application of GLAD Estimation 1. Estimation of Reduced-Form Parameters w Let vi be the 1th column of w. Then we may estimate V1 (1 : 1, ..., m) by ordinary least squares (OLS) and up (p : m+1, ..., G) by Powell’s censored least absolute deviation (CLAD) estimator. The assumptions are as follows: (A1) The parameter vectors {up} (p = m+1, ..., G) are interior points of the compact parameter space 0. (A2) (Xt, Vt)’ is a sequence of independently not (necessarily) identically distributed random vectors. (A3) Euxtu5 < k0 for all t and some positive k0. UT, the smallest characteristic root of the matrix E[(1/T)E1(Xt1rp 2. Eolxt'XtJ' p : m+1, ..., G has UT > v0 whenever T > To. some positive 50, uo, and To. (A4) Defining th(z, ap. r) a E[1(Ixt'ep| 1 nxtnz)uxtnrl. thq(z, fip. fiq. r) a E[1(|Xt’fipl 1 "thz. lXt’fiql 1 uxtuz)nxturl the functions th and thq are all 0(2) for 2 near zero, 5p near up. fig near flq, p, q = m+1, ..., G, p ¢ q and r = o, i, a. 3, uniformly in t, i.e.. th(z, fip, r) 1 kpz, thq(z. fip. fiq. r) 1 kpqz. if 0 i z i 60. "I‘l’p " 17p" < 600 “‘fi’q ‘ "q" < 601 (A5) (A6) (A7) (A8) (A9) (A10) 23 r = O, 1. a. 3, for some positive kp, kpq and 50. The conditional distribution of vtp (p : m+1, ..., G) given Xt has median zero for all t, and the corresponding distribution functions for the [vtpl are continuously differentiable in a uniform neighborhood of zero, with density functions {ftp(AlX)l whiCh are uniformly bounded above and uniformly bounded away from zero, i.e.. ftp(AlXt) < L and ftp(xlxt) > k > 0 whenever Ill < k, some k > 0, all t. The conditional density function ftp(llxt) of the {vtpz is Lipschitz continuous: lftp(A1lXt) - ftp(lalxt)| 1 LOIA1 - A3! some Lo > o (a) Erxtjxtxre < m J, x = 1. ..., K; (b) 9 e E(X’X/T) has uniformly full column rank. (a) E(Xt'vt1) = 0, i = 1, ..., m; (b) EnxtJvt1|3+6 < m, i = 1, ..., m, J = 1, .... 
K; (c) v : var(vecT‘%X'V) is uniformly positive definite, where W : (v1, ..., vm, §m+1, ..., is) is a TxG matrix. (a v EIXtJEtp|2*5 < A < w for some 5 > O, p = m+1, ..., G; J = 1, ..., k; (b) VP : var(vecT‘KX'wp) is uniformly positive definite, where Wp = (Em+1' ..., :6) is a Txn matrix. El1(xtwp > 0)xthtK|1+6 < A < m for some a > 0, all t, P = m+1, ..., G and J, K = 1, .... K. 24 (A11) ElttpvtixtJXtK|1+3 < A < m for some 3 > 0, all t, 1:1,...,m,p:m+1,....GaJldJ,K:1,...,K. ElitpithtJXtKI1+5 < A < m for some 5 > 0, all t, P.Q=m+1,...,G.p¢qandJ,k:1,...,K. where Etp = [1(Xtflp > 0)][1/2 - 1(th < 0)]. th = [1(xtwq > 0)111/2 - 1(vtq < 0)] a. Consistency and Asymptotic Distribution of (fi1, ..., am, fim+1- ..., *6) To consider the Joint distribution of (a1. ..., am, am+11 ..., as), we consider J a, - w, - J (X’X/T)'1(T'%)Ext’vt1 - [ E191 - am - Wm (X'X/Tl‘11T‘“)§Xt'vtm Emem (TK) --------- : ------------------------ E ------ as - "a [CTa/al'itT'“)§xt'€ta Bees *6 - 1rG J L [CTG/21'1(T'“)Ext’§te J EGeG J where ch e 3((1/T)Eafp(ouxt)[1(xtwp > 0)]xt'xt) p = m+1, ..., G; E1 = (X’X/T)’1 i : 1. ..., m: 6i = (T‘“)§Xt’vti3 2p = tch/al'i: op = (T‘%)Ext’§tp and §tp : [1(Xt'rrp > 0)][1/2 — 1(vtp < 0)]; or = m+1. Define a Txm matrix W, = (v1. .... vm). Then VecT-“X’Wi 25 is :asymptotically normal under Assumptions (A2), (A7), and (A8). VecT-“X’Wp is also asymptotically normal under Assumptions (A1), (A2). (A3). (A4). (A5), (A6) and (A9). The asymptotic normality Of vecT‘KX‘W is derived below. F ‘ , ' 61 i 311 --~ z31m ; E1d - 816 6111 2m1 '-- Emm : Ema 2me Let cov —--- E E E ----------------------------- ed XV011 '-- B'dm E Edd 11- EGG L 96 J 3’61 -~- 2'Gm 1 26d 1 2Ge _ t E11 E12 2’12 322 L I: is calculated as below: 211 has typical element Ehi' h,i : 1, 2, ..., m, given by an, : cov[(T‘%)EXt'vth, (T‘%)8Xt’vt1] : (1/T)EE(thVtiXt'Xt)l (By Assumption (A2), (Xt, vt) is a sequence of independently distributed random vectors and E(Xt’vt1) : O by Assumption (A8).) 322 has typical element qu, p,q : m+1, ..., G, given by zpq = COV[(T-%)EXt’Etpg (T'%)§Xt'itql EtttT-Mlgxt'ztpl[(T'%)§Xt'thl'l — E[(T'%)Ext'Etp]E[(T”“)EXt’§tql E(t(T'%)§xt':tpl[(T-%)§xt';tql'l (r E(Xt’§tp) = Etxt'[1(xtwp > 0)][1/2 — 1(vtp < 0)]; 26 Exqulxixtl[1(Xt"p > 0)][1/2 - 1(vtp < 0)); O by Assumption (A5)) (1/T)EE(Etp§tht'Xt) (1/4)MTp if p = q. where MTp : (1/T)EE([1(Xtfip > 0))xt'xtl.) 812 has typical element 81p, 1 : 1, ..., m; p : m+1, given by zip : cov[(T'%)§xt'gtp. (T-%)zxt'vt1) El[(T'%)EXt’Etpl[(T‘“)EXt’thl’l - EE(T'%)§Xt'Etp]E[(T’%)EXt’Vt1] EI[(T'%)EXt‘Etp][(T'“)Ext’vtll'l (7 EttT-%){xt'vt,l = 0) (i/T)EE(§tth1Xt'Xt) According to the modified multivariate Liapounov Theorem by White (1980b), if Assumptions (A2), (A5), and (A9) are satisfied. (TK) The typical By calculation above, 1"1 1Tm A11 A12 --------- is N(0, ). "a A’12 A22 1rG A (block) element of A11 is plim(Eth1E1) |x[1/2 - 1(vtp < 0)) (A8). the asymptotic distribution of vecT'WX’v is asymptotically normal. 27 Q‘ifin19'1, where 9'1 : plim Eh or plim E1, and where Ehi = plim Ehii h,i : 1, 2, ..., m. The typical element of A12 is plim(Eh2hpEp) : 9-ibnpgp; h = 1, ..., m; p = m+1, .... G; where bhp = plim zhp and where ep = plim Ep : plim(ch/a)‘1. The typical element of A22 is Epfipng, p,q : m+1, ..., G. Where fipq = plim qu. b. 
Consistent Estimator of Asymptotic Covariance Matrix The consistent estimator of plim CTp and plim MTp is 5Tp and fiTp, which was proved by Powell (1984), where eTp e 2(TaT)-1Er1(xtap > 0)][1(O 1 vtp 1 5T))xt'xt flTp E (1/T)E[1(Xtfip > 0)lXt’Xt vtp = Ytp - xtap p = m+1, ..., G 5T is an appropriately chosen function of the data, such that plim ET/CT = 1, cT : 0(1). cT-i : o(T%). T—>m The consistent estimator of Phi (h,i : 1, ..., m) is (1/T)Evtnvtlxt'xt. where vth = Yth - xtah and at, : Yti - xtai- The proof is similar to Theorem 6.3 by White (1984). The consistent estimator of 51p (1 = 1, ..., m; p : m+1, .... G) is (1/T)E[1(Xtfip > 0)][1/2 - 1(vtp < 0)]Vtixt'xt- The proof is the same as in Appendix A. The consistent estimator of qu (p.q = m+1, .... G) is (1/T)E[1(Xtep > 0)][1/2 - 1(th < 0)][1(Xtfiq > 0)][1/2 - 1(vtq < 0)]Xt’Xt, where p, q = m+1, ..., G and p ¢ q. The proof is given in Appendix C. 28 2. Estimation of Structural-Form Parameters B : (9h Be) and P = ( Y1. .... YG) All steps regarding the estimation of structural-form parameters are the same as in II.A.a. The only difference is the asymptotic variance-covariance matrix Q of T“(& - w)- B. Application of SCLS Estimation 1. Estimation of Reduced-Form Parameters n We may estimate "1 (i : 1, ..., m) by ordinary least squares (OLS) and up (p : m+1, ..., G) by symmetrically censored least squares (SCLS). The assumptions are as below: (Ai)’ Same as (A1) in III.A.1. (A2)’ Same as (A2) in III.A.1. (A3)' Euxtn4+n < KO for all t and some positive K0 and n. vT, the smallest characteristic root of the matrix NT : E[(1/T)Ei(thp 2 So)Xt’Xt1» p = m+1, .... G, has ”T > uo whenever T > To, some positive 80, uo, and To. (A4)' The conditional distribution of vtp (p : m+1, .... G) given Xt is continuously and symmetrically distributed about zero, with densities which are bounded above and continuous and positive at zero, uniformly in t. That is, if Fp(xlXt, t) E Ftp(X) is the conditional c.d.f. of vtp given Xt. then dFtp(A) : ftp(x)dx, where ftp(x) : ftp(-A), ftp(X) < Lo and ftp(X) > so whenever IA! < 50, some positive Lo and 60. (A5)’ Same as (A7) in III.A.1. 6 > 0. = G. P ¢ Q. Where gtp th ' ‘36) I 1"1 {(1/T)EE[1(—thp < vtp < xtwp)xt'xt] P (X’X/T)'1 all t, 29 Same as (A8) in III.A.1. Same as (A9) in III.A.1. Elvtpl < A < m uniformly in t and p ElitpvtixtJXtK|1+6 < A < m for some m+1, ..., G and J. K ElitpithtJthl1*5 < A < m for some all t, and J, K = 1. E(l1(—xtwp < vtp < xtwp)xthtK|1+6) p = m+1, .... G and J, 1(thp > 0)min[max(vtp, 1(thq > O)min[max(vtq. a. Asymptotic Distribution of (a1. .... am. fim+1- To consider the Joint distribution of we consider (X'X/T)‘1(T‘“)Ext’vt1 (X’X/T)‘1(T’%)Ext’vtm (CTd)-1(T_%)Ext'5ta (CTG>" 0)]min[max(vta, 'Xt"2)' xtwa] a = m+L Define a Txm matrix W1 : (v1, ..., vm). Then VecT-“X’Wi is asymptotically normal under Assumptions (A2)’, (A5)’, and (A6)'. Vec(T)-%x'vp (vp : (zm,1. ..., £6). a Txn matrix) is also asymptotically normal under Assumptions (A1)'. (A2)’. (A3)', and (A4)'. The asymptotic normality of vec(T)'%X'V (W = (v1. ..., vm, §m+1, ..., £6), a TxG matrix) is derived below. I 91 i 311 -- 21m 1 31a --- 216 6:11 Eml - 3mm : Ema '-' 8me Let cov -——— E 2 E ——————————————————————————— ed Edi '-' 80cm 3 Bad - - EGG 96 J EG1 ... EGm : BGa - EGG L. I. _ $11 512 2'13 222 2 is calculated as below. has typical element Ehiv h,i : 1,2, ..., m, given by = (1/T)EE(Vtthixt’Xt)3 (same as in III.A.1.a.) has typical element qu, p,q: m+1, ..., G, given bY = COV[(T'“)Ext’£tp. 
(T'“)Ext'itq] = E{[(T'“)§Xt’itp][(T‘%)Ext'itq]’l — 31 EL(T-%)§xt':tp1Et(T’%)§xt'th1 = Et[(T'%)§xt'§tp1ttT‘%)§xt'§tq1'1 (-.- E(xt':tp) = Etxtwuxtwp > O))min[max(vtp, -xt1rp). thpll : Exixt'[1(xtwp > 0)]1Ev¢lx{min[max(vtp, -thp), Xtflpll If xtwp s o, mxt'gtp) = o. If Xtvp > O, the conditional distribution of (mintmax(vtp, -thp), thp] given Xt is continuously and symmetrically distributed about zero by Assumption (A4)'. Then E(Xt’§tp) : Ex(xt’)Ev¢,x{min[max(vtp, —thp), thpll = O.) = (1/T)EE(§tp§tht'xt) 212 has typical element 21p. 1 : 1, ..., m; p = m+1, ..., G, given by zip : (1/T)EE(§tthixt'xt) (The calculation steps are the same as in III.A.1.a.) According to the modified multivariate Liapounov Theorem by White (1980b). if Assumptions (A2)’, (A4)’, (A6)’, and (A7)’ are satisfied, vecT'WX'W is asymptotically normal. By the calculation above, the asymptotic distribution of 32 *1 - "1 6m - Trm A11 A12 (TK) --------- is N(O. L fie — “a A’ia A22 fiG - 1TG L .- The typical (block) element of A11 is plim(Ah2n1A1) = Q‘1fih19'1. where 9'1 : plim Ah or plim A1 and where fihi : plim Bhi‘ h,i : 1,2. ..., m. The typical element of A13 is plim AnshpAp = Q‘iehpxp; h : 1..... m; p = m+1,.... G; where Enp : plim zhp and where AP : plim Ap : plim(CTp)'1. The typical element of A23 is Apfipqzq. p,q : m+1,..., G; where fipq : plim DTp (if p : q) and DTp : (i/T)EE{[1(Xtfla > O)]min[vtaa, (xtwa)21xt’xtz. b. Consistent Estimator of Asymptotic Covariance Matrix The consistent estimator of plim CTp (p : m+1, is ap = (1/T)E[1(-xtap < vtp < xtap)lxt'xt. where th : Ytp - Xtap : thp + vtp* — Xtfip : vtp* - xtstp. (vtp* : maxivtp, -thp), 3tp : ap - up.) The proof is the same as in Appendix B. The consistent estimator of plim DTp (p = m+1, ..., G) is sTp = (1/T)E[1(xtap > 0)m1n[$tpa. (xtap)21xt'xt. This was proved by Powell (1986). The consistent estimator of qu is (i/T)E[1(Xtfip > O)]min[max(vtp, -Xtfip). Xtfip][1(xtfiq > 33 O)]Ihin[max(vtq, —Xtfiq), Xtfiqlxt'xt- The proof is given in Appendix D. The consistent estimator of 5hi is (l/T)Evthvt1xt'xt. where Vth : Yth — Xtfih and th : Yti - Xtfil. The proof is similar to Theorem 6.3 by White (1984). The consistent estimator of hip is (i/T)E[1(Xti}p > O)]min[max($tp, -xtap), xtaplvtixt'xt. The proof is the same as in Appendix B. 2. Estimation of Structural-Form Parameters B - (81, Be) and F = ( Y1. .-.. YG) It is the same as in II.A.e. The only difference is the asymptotic covariance matrix Q of T“(z¢r — 1T). CHAPTER THREE DERIVATIOK OF ASYHPTOTIC COVARIAHCE MATRICES WHEN THE ERROR TERMS ARE BIVARIATE HORHAL The purpose of this chapter and of the next chapter is to compare the efficiency of our semiparametric estimates to the efficiency of the maximum likelihood estimates, when the error terms are bivariate normal. Our estimates will be less efficient than the MLE. and we wish to see how large this efficiency difference is. The actual comparison of these efficiencies will be given in the next chapter, by comparing the asymptotic covariance matrices of our estimates and of the MLE for given parameter values and exogenous variables. In this chapter we perform the preliminary task of deriving the asymptotic covariance matrix of the MLE, and we show the simplifications that arise in the formulae for the asymptotic covariance matrices of our estimators in the case that the error terms are bivariate normal. 
In this chapter and in chapter four we restrict our attention to the following two-equation model: (1) Yti : Xtfl1 + Vti (2) Yta‘ : Xtfla + Vta Yta = Yta” if Yt2* > O : 0 otherwise We assume Vti and Vta are independently and identically distributed as bivariate normal with zero mean and covariance matri X 31+ IIIIIIIEaIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII|| via 012 (3) ' . U12 322 First, we derive the asymptotic covariance matrix of the maximum likelihood estimator (MLE). Under certain regularity conditions, the MLE is consistent. asymptotically normal and efficient. The asymptotic covariance matrix of the MLE is therefore equal to the Cramer Rao lower bound, which is the inverse 0f the information matrix. The information matrix iS the expectation of minus the second derivatives 0f the 108 likelihood function. That is, aalogL(e) I = -E ( —————————- ). where e is the vector of unknown aeoe’ Parameters. The log likelihood function for the model (1) - (a) is: (4) lnL(e) = c - Tlno, - (Ext)1noa - —%—(Ext)1n(i - pa) + Exti‘[(Yt1 ' Xt"1)a/‘°1a * 39(Yt1 - Xt"1)(Yta - XtVa)/Uiba + (Yta - Xtflg)2/Uaa]/[2(1 — p3)11 + E(i — xt)[-(Yt1 - th1)3/ao133 + 5(1 - xt)1no(wt) where wt = [-Xtfia - pca(Yt1 - xtw1)/o11/[o3(1 — 93)“); At = 1(Yta > 0); e : (“1" "2'. 013. 623. p)’ : §(o) is the standard univariate normal distribution function. With the loglikelihood function stated above, we may calculate the elements of the information matrix as follows. 36 ' (Details are provided in Appendix E and Appendix F. ) —E[alnL(e)/6w16fl1’] = -E[pa - 1 — §(at)p2]Xt'Xt/[c12(1 + EE(1 - it)paztxt'xt/[o12(1 - p3 (where at : (th‘a)/‘oa. and 2t = ¢2(wt)/§3(wt) + wt¢(wt)/§(wt).) -E[61nL(9)/6w16wa’] : -§pi(at)xt'xt/[o1oa(i - 92)] — EE(1 — xt)pztxt'xt/[o1oa(i - pa —E[OlnL(e)/Ov16o12] = -§p(1 - ap3)¢(at)xt'/[2o13(1 - pa + EE(1 - At)Xt’{th/[2613(1 - pa) + pazt(Yt1 — Xtfl1)/[2614(1 - 93)] (where Mt = ¢(Wt)/§(wt)-) -E[alnL(e)/6w16o22] = -§962¢(at)xt’/[Eviba3(1 - p3)] + E§(1 — it)p(tha)ZtXt’ /[aciba3(1 - 93)] -E[OlnL(e)/0W1OP) = E¢(at)xt'/[o1(1 - pa): - EE(1 - At)Xt’[Mt/[c1(i - p2)% + pZt[p(Xtfla)/Ua + (Yt1 - xtw1)/o /[o1(1 - p2)311 -E[a1nL(e)/awaow2'1 : E§(at)xt'xt/[o23(1 - pa) + EE(1 — xt)ztxt'xt/[o32(1 - 93)] -E[61nL(e)/ow36o12] = —Ep3¢(at)xti/[ao13oa(i - p3)] - 32(1 - xt)pzt(Yt1 - xtw1)xt' /[ao13og(1 - p3)1 -E[61nL(e)/awaao221 = -§(pa - 2)¢(at)Xt'/[2623(1 - 93)] - 35(1 - xt)xt'{Mt/taoa3(1 - pa)“ + (xtwa)zt/[aoa4(1 - 93)]: -E[BlnL(e)/6wadp] -E[o1nL(e)/eo1aao131 37 -Ep¢(at)xt'/Ioa(1 - p2)1 + EE(1 - At)xt’[th/[oa(1 — 93)“) + Zt[P(Xt"a)/Ua + (Yti — th1)/c1) /[oa(1 - p3)311 -T/(2o14) - E[-a + 392/2 + paN1t(Xtfla) /(aoaa)1§(at)/[ao14(1 - pa): + Eli 9(at)1[1 - paflat(xtw2)/Uaal/o14 + 33(1 - it)(3pht(Yt1 - xtn1) /[4o15(1 - 93)“) + paZt(Yt1 - thila /[4b15(1 - 92)]; (where "it = 62¢(at)/§(at). 
and Kat = -Ua¢(at)/[1 - §(at)l-) -E[61nL(6)/Ooiaooaal —E[61nL(e)/6o126p] -E[ainL(e)/aoaaaoaal -E[61nL(e)/Ocaadp] ~§p3§(at)to23 — H1t(XtWa)l /[4v1aoa4(1 - p2>1 + E¥(i - At)p(tha)Zt(Yt1 - Xtfli) /[4613oa3(1 — pa): -§§(at)[p - 9‘92 + 1)/2 + 9(1 - pa)n1t(xtwa)/(aoaa)1 /[o12(1 — p2)21 - EE(1 - At)(Yt1 - xtw1) {Mt/rao13(1 - p3)¥1 + pzt[p(xtwa)/oa + (Yti - th1)o1] /[ao13(1 - 92,2], ~§o(at)/(ao24) — {9(at) [-1 + ape/4 + (4 - 393)ult(xtwa)/(4oaa)1 /toa4(1 - pa): + E§(1 — xt)£3Mt(xtwa)/[4oa5(1 - 93)“) + (xtwa)azt/[4v36(1 - 93)]: -§p§(at)[1 - N1t(tha)/Uaa]/[2baa(1 — pa w— 38 - 35(1 - At)!p(xtwa)Ht/[2og3(1 - pa)x + (Xtfla)ZtEP(XtW2)/ba + (Yti - X;W1)/U1l/[3Ua3(1 - 93>311 -E[61nL(9)/OPOPJ = -§(1 + 93)§(at)/(1 - p3)3 - E§(at)[294 - 4 + (p4 - ape + 3)/uaa /(1 - 92,3 + 35(1 - At) {ut[(1 + 2p2)o1(xtwa) + 3po3(Yt1 - Xf /[u1va(1 4 p3>¥1 + zttp 0)xt'xt1-1 L. The derivation of the above matrix is in Appendix G. The asymptotic covariance matrix of the SCLS estimate is: 39 ‘013(X'X)'1 '013(X'X)‘1 cov( A ) = "2 o12(x'xr1 (1/T)6‘1DE‘1 where '6 = (1/T)E1(xtwa > onamat) - ilxt'Xt h = (i/T)§1(xtna > 0){c23[a@(at) — 1 — 2at¢(at)] + 2(xtwa)2[1 — §(at)11xt'xt The derivation of the above matrix is in Appendix H. To make our comparison complete, we also consider the estimator of Amemiya (1979). Amemiya adopted the method suggested by Nelson and Olsen (1978) of estimating the first reduced-form equation by OLS and the second reduced-form equation by (single equation) MLE. Therefore. Amemiya's estimate is as efficient as the Joint MLE estimate of the reduced-form coefficients when the correlation between the error terms in the two equations is zero. However, when the correlation between the error terms is not zero, Amemiya’s estimate is less efficient than the Joint MLE estimate. Amemiya's estimate of the reduced—form coefficients should be more efficient than semiparametric estimates when the error terms are bivariate normal. To derive estimates of the structural coefficients from the estimates of the reduced-form coefficients, Amemiya adopted a method he called generalized least squares, and which we will refer to as "AGLS". Hewey (1987b). section 4 has shown that Amemiya's GLS procedure is asymptotically equivalent to the minimum distance method that we use, so that the AGLS 40 structural estimates are asymptotically as efficient as Joint MLE when the correlation between the error terms is zero. When this correlation is non-zero we should expect the AGLS estimates to be more efficient than our distribution-free estimates, but less efficient than the Joint MLE. Incidentally, for models (like our present model) with only one censored variable, Hewey (i987b, section 5) shows that asymptotically efficient structural estimates can be derived by AGLS (or minimum distance method) based on conditional MLE estimates of the reduced form equations. We do not need to consider this possibility separately because we already include one asymptotically efficient estimator (Joint MLE) in our comparisons. The asymptotic covariance matrix of Amemiya’s estimate is: . o12(X’X)‘1 o1a(x'X)-1 W1 cov( ) : fie o12(x'X)-1 -(I,O){E[6310gL(e)/Oeiaei’]3'1(I,0)' where 61' = (we'. baa), O is a kx1 column vector of zeroes, I is an kxk identity matrix, 63103L(91) gAtxt'xt EBtXt' E -——————————— : 69 69 ’ EB X EC 1 1 t t t t t At -!at¢(at) - ¢3(at)/[1 - §(at)1 - 9(at)1/caa Bt = Iat3¢(at) + ¢(at) - at¢3(at)/[1 - o 0)][1/2 - 1(Vt2 < 0))vt1xt’xt. The proof is as below. 
a = (1/T)E[1(Xtaa > 0)][1/2 — 1(vt2 < 0)]vt1xt'xt - (1/T)E[1(Xtfla > 0)][1/2 - 1(vta < 0)]vt1Xt'Xt IA (1/T)El[1(Xtfia > 0)][1/2 - 1(vta < 0)]Vt1Xt’Xt - [1(Xt"2 > 0)][1/2 - 1(vta < 0)]vt1Xt'th IA (1/T)E(1/2)l1(Xtfia > 0)Vt1Xt’Xt — 1(Xt"2 > 0)Vt1Xt'Xt| + (1/T)El[1(Xtfia > 0)][1(Vta < 0)]Vt1Xt’Xt — [1(Xtfla > 0)][1(Vta < 0)]vt1Xt’th G1 + 02 IA (3/2T)E[1(1thal 1 uxtuueg — wau)llvt1lflxtfla + (3/2T)§[1(Ixtwel 1 uxtuuea - wgfl)]lvt1lflxtfla + (1/T)E[1(Ivtal 1 uxtuuaa — flau)llvt1lflxtfla + (1/T)E[1(Ivtal 1 “Xtflflfig — flaH)]|Vt1lHXtfla (3/2)C + (3/2)C’ + D + D’ (T 1. d1 part Xtfia tha a1 (1) + + ¢ 0 (ii) + - ¢ 0 (111) - + ¢ 0 - - - 0 Case (1)? Xtfia > 0, Xtfla >70. 79 Case (11): 80 d1 1 (1/T)§(1/2)lvt1Xt’Xt - Vt1Xt’th = (i/T)E(1/2)lXt(W1 - fi1)Xt’th 1 (1/T)EHW1 - ainuxtn3 L 0 (by consistency of W1 and Assumption (A2)) xtaa > o, Xtfla 1 0. d1 1 (1/T)E[1(lxtwal S HXtHflfig - flau)]lvt1lHXtHa : (1/2)C' Case (111): Xtfia 1 0, Xtfla > o. (1) (2) (3) (4) (5) Xtfie + + 61 1 (1/T)E[1(1thal 1 uxtuuaa - vaH)]lvt1lHXtua : (1/2)C xt“2 Vta Vta 0‘2 + + + : 0 + + - ¢ 0 + - + ¢ 0 — + + : 0 + + + : 0 - + + : o + - + = O + + - ¢ 0 _ — + ¢ 0 _ + — : O + - - ¢ 0 (the same as Case (1) above) - - + = 0 _ + - = o (6) + Case (1): Case (2): Case (3): Case (4): 81 _ — . - ¢ 0 Vtg l O, Vta < O. Xtfia > O. Xtfla > O. lvtal 1 lvta - Vt2' ‘ Vta = Yt2” - xt"2 Vte = Yta - Xtfia Yta - Xtfia l o <=> maxto, Ytg*] - Xtfia l 0 <=> 712* > 0 (7 Xtfia > 0) <=> Yt2 : Yt2' 5 th2 ' Vtai = IYt2 ' XtTTa - Yt2 + XtfiaI = IXt(fig - We)! «a 1 (1/T)E[1(|Vtal 1 uxtuuaa - figfl)llvt1lflxtna : D Vta < 0, Vta l o. Xtfia > 0. xtwa > o. v12 : Yt2” - thg l 0 Yt2* z Xtva > o 1 Yt2 = Yt2” thal 5 tha - Vtal = 'Yta - Xt"2 - Yta + Xtfi'al = IXt(fia - Wa)| «a s (1/T)E[1(lvtal 1 uxtuuaa - wanlllvt1luxtu3 : D’ Vta l 0, Vta < o, Xtfia 1 0, xtwa > 0. «a 1 (1/T)§[1(Ixtwan 1 uxtuuaa - flafl)llvt1lflxtna = C Vta < o, v12 1 o, xtaa > o, XtWa 1 0. ca 1 (i/T)E[1(Ivtal 1 uxtnuaa - we")llvt1|uxtua = D’ or 62 1 (i/T)E[1(lxtwel 1 nxtuuea - waulllvt1luxtu3 : C' 82 case (5): Vtg < O, Vta < 0, Xtfig 1 0, tha > 0. ca 1 (1/T)§[1(Ixtwa| 1 uxtuuaa - wafl)llvt1lnxtfla = C Case (6): Vte < o, Vta < 0, Xtfia > o, Xt"2 1 0. Ga 1 (1/T)§[1(1xtwal 1 uxtnuaa — flafl)llvt11flxtfla :C’ ) plim C : 0. The proof is similar to that of p.323 (A27) in Powell’s [1984] paper. We may also prove plim C' : plim C = O as below. °t1 = Yt1 - Xtfi1 = xtfl’1 + Vt1 - xtr‘ri = vt1 + Xt("1 ' 51) 19111 = Ivt1 + xt(w1 - £1)! 1 Ivt1l + lxt(a1 - "1)1 (1/T)§[1(lxtflgl 1 uxtuuag - wafl)llvt1luxtua — (1/T)E[1(|thgl 1 "Xtflflfig - nau)l|vtinuxtn3 = (1/T)E[1(lxtwal 1 nxtuuaa - flafl)l(lvt1l - lvt1l)HXtN2 1 (i/T)§[1(|thal 1 uxtnuaa — flafl)]HXtH3N&1 - «in By Assumption (A4) and the consistency of ai, (1/T)§[1(IX.wZI 1 uxtuuaa - we")]HXtH3Hfi1 - win —> 0(1) a (1/T)§11(Ixtwgl 1 “Xtflflfia — wafllllvt1lflxtfla L (1/T)E[1(|Xtflgl 1 uxtnuaa - flafl)llvt1luxtua = 0 SO, plim C’ = plim C : 0. By the same way as above, D and D’ could be shown to converge to zero in probability. So, (1/T)E[1(Xt%2 > 0)][1/2 -1(Vt2 < 0)]vt1xt'xt L (1/T)E[1(tha > 0)][1/2 -1(vta < 0))Vt1Xt’Xt By Assumption (A12), (1/T)§§tvtixtlxt R (1/T)EE[§tvt1Xt’Xt]. APPENDIX B Proof of Consistency of Estimator of 6 and E12 in II.B.1.b. 0f Chapter Two .The consistent estimator Of 5 is 6 = (1/T)E[1(-Xtfia < v.2 < Xtfig)]Xt’Xt. Where Vta = Yta ' Xtfia = Xtfla + Vta” - Xtfia = Vt2” ‘ tht- (Vt2* = maXIVt2' 'Xtflal. 
5t = 72 ‘ We) l11/T)E[1('Xtfia < v13 < Xtfia)1XtJth - (1/T)E[1(-th2 < Vt2 < Xtfla)1XtJthl 1 (1/T)E|[1(-xtwa — xtst < vt2* - xtst < Xtfla + xtst)] — [11'Xt72 < vta < tha)]luxtna : (1/T)§1[1(-Xt"2 < vt2* < Xtva + extst)l - [11'Xtflg < vta < Xth)]lHXtH2 = (i/T)E|[1(—tha < vta < xtwa + extst)] — [1(‘Xtflg < vtg < xtwa)lluxtu2 (v v13” : max(vt3. —th2) : Vta' : Vt2 if vt2* > “Xt"2-) Since Euth4+fl is uniformly bounded. this term converges to zero almost surely by the strong consistency of fig- A (i/T)§[1(-Xtfia < eta < Xtfia)]Xt'Xt L (1/T)E[1(-tha < vta < xtw2)lxt'xt (1/T)E[1(-Xtfla < v12 < Xtfla)]Xt’Xt L (1/T)EE[1(-Xtfle < vta < xtfla)]Xt'Xt by Assumption (A9)' and the law of large numbers. 85 84 ‘froof of Consistency of E12 in.II.B.1.b. of Chapter Two The consistent estimator of (1/T)EE[§tv11Xt‘Xt] is (1/T)E[1(Xtfia > 0)]min[max(vta, -Xtfia)» xtaalvtixt'xt. The proof is as follows. 1(1/T)E[1(Xtfia > 0)]min[max(vtg, -Xtfia), Xtfialvtlxthtk - (1/T)E[1(Xtfla > 0)]m1n[max(vta, ~Xtfla), Xtflalvt1XtJthl IA (1/T)E|[1(Xtfia > 0)]min[max(vt2, *Xtfia). Xtfialvt1XtJth - [1(Xtfla > 0)]min[max(vta, -Xtfla), xtwalvt1xtJXtK| IA (1/T)E1[1(tha > -Xt3t)]min[max(vta* — xtst, 'Xtfla ‘ Xt3t)1 Xt"2 + Xtfitl ' [1(Xt72 > 0)]m1n[max(vta, -Xt72)- tha11vt1lflxtfla + (1/T)E|[1(tha > -Xtat)]min[max(vta' - xtxt, -xtwa - xtst). tha + xtatllna1 - wiuuxtu3 : «1 + d2 Let r s l[1(xtn2 > —Xt3t)]min[max(vta* — x131, —th3 — tht), tha + X131] - [1(Xt"2 > 0)]min[max(vta, 'Xt32): thall Case 1. tha > -xtst and Xtfia > 0. (a) max(vta, -tha) < Xt"2 (i) -Xt72 < Vt2 < xtva Vt2* : max(vta, ‘Xt"2) : Vt2 P S 'Vt2 — tht - Vtal = IXtBtl (ii) v12 < 'Xt"2 < Xt"2 vt2* : max(vta, 'XtV2) : 'Xt"2 P 1 l-tha - tht - (-xtw2)| = Ithtl (b) max(vta, -tha) > Xtfla 85 P 1 Ixtwa + Xt3t - th2l = lxtstl Case a: xtVa s -xt3t and tha > o. r : |-min[max(vta. -tha), XtVEJI s lxtwal s lXtStl Case 3: tha > —xt3t and XtVE S O. This implies lthal < lxtstL r :’|min[max(vta* — xtst, -tha — xtst), Xtfla + thtJI S lxtval + IXtStl s alxtstl From cases 1.2 and 3. we Know «2 s (a/T)§nxtn3uatulvt1l. Since Euxtu4+fl is uniformly bounded, and E(Xt’vt1) = o, (2/T)Enxtu3fl8tfllvt1l converges to zero almost surely by the strong consistency of fig. So. plim a1 = 0. a2 (1/T)§'[1(Xt"a > —Xt3t)]min[max(vta* - Xtat, -XtW2 - xt3t)- Xtva + thtllflfii — wiuuxtn3 IA (1/T)E(lxtfl3l + lXt8t|)flfi1 - «1nuxtn3 = L Case 1: Xtfla > ‘xt3t and Xt"2 S 0. This implies lxtwal < 'Xt3t'- L < (a/T)§ua1 ~ w1uustnuxtu4 Since Euxtn4+fl is uniformly bounded. (a/T)EH%1 - w1HH8tHHXtH4 converges to zero almost surely by the strong consistency of %1 and fig. Case a: Xtva > -xt3t and Xtfla > o. This implies lthgl > lthtl. L < (a/T)§uai - «1nuwaunxtu4 86 Since Ellxtu4H1 is uniformly bounded, and we is finite. (E/T)§Hfi1 - wiunwguuxtu4 converges to zero almost surely by the strong consistency of fii and fig. (1/T)E[1(Xt€ya > 0)]mintmaxwta. ~Xtfia). Xt%2]9t1Xt’Xt z. (1/T)§[1(xtwa > O)]min[max(vt3, -xtw2). xtwawuxt'xt By Assumption (A9)' and the law of large numbers. (i/T)E[1(tha > 0)]min[max(vtg. 'XtVE)’ xthJvtixt'xt L (i/T)EE[1(tha > 0)]min[max(vt2. —tha), thalvtixt'xt z (i/T)EE(EtVt1Xt'Xt) APPENDIX C Proof of Consistency of Estimator of fipq 1n III.A.i.b. of Chapter TWO The consistent estimator of fipq is (1/T)§[1(xtap > 0)1[1/a - 1(vtp < 0)1[1(xtaq > 0)1[1/a - 1(vtq < 0)]xt'xt. The proof is as below. a = (i/T)E[1(Xtfip > 0)1[1/a — 1(vtp < 0)][1(xtaq > 0)][1/2 - 1(vtq < 0)]xt'xt - (1/T)§[1(xtwp > 0)][1/2 - i(vtp < 0)][1(xtwq > 0)][1/2 - 1(vtq < 0)]Xttxt s (i/T)El[1(xtfip > o. 
xtaq > 0)]{1/4 - [1(vtp < 0)] - [1(vtq < 0)] - [1(vtp < o, vtq < 0)]: — [1(xtwp > o, thq > 0)]{1/4 — [1(vtp < 0)] — [1(vtq < 0)] - [1(vtp < o. vtq < c)13qutu3 s (i/4T)El[i(Xtfip > o. xtaq > 0)] - [1(xtwp > o, xtwq > 0)]!"th3 + (1/T)El[i(Xtfip > o. xtaq > 0)][1(vtp < 0)] - [1(xtwp > o, xtwq > 0))[1(vtp > 0)]qutu3 + (1/T)§I[1(xtap > 0. xtaq > 0)1[1(vtq < 0)] — [1(xtwp > 0. thq > 0)J[1(vtq > onnuxtua + (1/T)EI[1(xtap > o. xtaq > 0)][1(vtp < o. vtq < 0)] - [1(xtwp > o, xtwq > 0)][1(vtp > o. vtq > onthna :a1+a3+a3+a4 «1 : (1/4T)E|[1(Xtfip > O, Xtaq > 0)] — [1(xtvp > o, xtwq > onluxtua s (1/T)§[1(Ixtwp| s uxtnuap — up". Ixtwa s uxtunaq - «qu)1ux£n3 87 88 For any n > o, Pri(1/T)§[1(Ixtwpl i uxtunap - up". lxtwa 5 uxtuuaq - qu)]nxtu3 > n! IA Prt(1/T)§[1(Ixtwpn s uxtnz). lthql s nxtuz)1uxtu3 > n; + Pr(uap - up" > 2. "sq - qu > 2) IA (1/n)(1/T)§Pr(uxtwpu s "XtHz). lxtwa s uxtnz)nxtn2 + Pr(flfip - up" > 2. "fig - wqu > z)(by Markov’s inequality) IA (1/n)K32 + Pr(flfip - flp" > 2, "fig - "q" > Z) By choosing z sufficiently small. Pr(||fiP - up" > 2. "fig - wqu > 2) can be made arbitrarily small for large T by the consistency of ap and aq. So. plim a1 : 0. a4 : (i/T)E|[1(Xtap > o, xtaq > 0)][1(vtp < o. vtq < 0)] - [1(xtwp > o, xtwq > 0)1[1(vtp > o. vtq > o>1luxtua IA (1/T)§[1(Ixtwpl s uxtnuap — wpn)1uxtu2 + (1/T)§[1(Ixtwq| s uxtuuaq - qu))uxtu3 + (1/T)§[1(Ivtpl s uxtunap - wpn)1uxtua + (1/T)E[1(|thl s nxtunaq - «qu)1uxtu3 :A1+A2+A3+A4 ('.' Xtfip Xtfi'q th th Xt‘fl’p Xtflq th th (X4 + + — - - + - — A1 + + — - + - - - A2 + + - - + + + — A3 + + - - + + - + A4 — + - - + + — —- A1 + — — - + + — - A2 + + + - + + — - A3 89 + + - + -+ + — - A4 ) These terms can each be shown to converge to zero in probability by that of p. 323 (A27) in Powell’s [1984] paper. So. plim a4 : 0. By the same way. we can show that plim «a = 0 and plim a3 = 0. Therefore. (1/T)E[1(Xt€rp > 0)][1/2 - “th < 0)][1(Xta'q > 0)][1/2 - 1(vtq < 0)]xt'xt a (1/T)E[i(xtwp > 0)][1/2 — 1(vtp < 0)][1(xt1rq > 0)][1/2 - 1(vtq < 0)]xt'xt = (1/T)§Etp§tht’xt By Assumption (A9). (umgztpthxt'xt .r. (umgmztpthxt'xt) APPENDIX D Proof of Consistency of Estimator of fipq in III.B.1.b of Chapter TWO The consistent estimator of (i/T)§E[§tp§tht’xtl is (1/T)§[1(xtap > 0)1m1n[max(vtp. -xtap). xtap1[1(xtaq > 0)]mintmax(vtq. -Xtfiq). xtanXt'xt. The proof is as below. a = 1(1/T)§[1(xtap > 0)1m1n[max(vtp. -xtap). xtap1[1(xtaq > O)]min[max(vtq. -Xtfiq). XtfiqIXtJth - (i/T)E[1(thp > 0)]min[max(vtp. —xtwp). xtwp][1(xtwq > 0)]min[max(vtq. -Xtflq), Xtflq1XtJXtK' IA O)]min[max(vtq. -Xtfiq). Xtfiqlxtjxtki- [1(Xtflp > 0) )min [maX(th, —Xt1rp) . Xtfl'p] [1 (th'q > 0) ]min [maX(th, -Xtflq), XthlxtJthl (1/T)E'[1(xt"p > -Xt3tp)]min[max(vtp* - thtp, -thp - Xtfltp). thp + thtp][1(Xtflq > ‘thtq)]m1n[max(vtq* ' xtstq. -Xtflq - thtq), Xtflq + (1/T)§I[1(xtap > 0)1m1ntmax(vtp. -xtap). xtap1[1(xtaq > > 0)]mintmax(vtq. -thq). xtwaluxtua rnxtna Case 1. Xt‘n’p > ‘Xtatp. Xth > ‘thtq and Xtflp > 0. Xtflq > O. (a) max(vtp, -Xtflp) < thp and max(vtq. —thq) < Xt"2 (i) -thp < vtp < xtvp and —thq < vtq < xtwa vtp* : max(vtp. ~thp) : vtp vtq' : max(vtq. -xtwq) - vtq r s |(th - thtp)(th - Xt$tq> - thvtq' 9O IA IA 91 ‘thXtStq - thxt3tp + xtstpxtstql thxtfitql + thqxtstpl + 'Xt3tpxt3tq' vtpuuxtuustqn + thqIHXtuflotp" + natpunstqnuxtua F1 Since EHXtH4*“. Elvtpl. and Elvtql are bounded uniformly in t. riuxtna converges to zero almost surely by the strong consistency of fip and fiq. (11) th < 'Xtflp < Xtflp and th < 'Xtflq < xtflq th* th" P ( : max(vtp. -Xtvp) : -thp : max(vtq. 
APPENDIX D

Proof of Consistency of Estimator of $\hat\Sigma_{pq}$ in III.B.1.b of Chapter Two

The consistent estimator of $(1/T)\sum_t E[\xi_{tp}\xi_{tq} X_t'X_t]$ is

$$(1/T)\textstyle\sum_t [1(X_t\hat\pi_p > 0)] \min[\max(\hat v_{tp}, -X_t\hat\pi_p), X_t\hat\pi_p][1(X_t\hat\pi_q > 0)] \min[\max(\hat v_{tq}, -X_t\hat\pi_q), X_t\hat\pi_q] X_t'X_t.$$

The proof is as below. Write $\hat\delta_p = \hat\pi_p - \pi_p$ and $\hat\delta_q = \hat\pi_q - \pi_q$.

$$\alpha = \Big| (1/T)\textstyle\sum_t [1(X_t\hat\pi_p > 0)] \min[\max(\hat v_{tp}, -X_t\hat\pi_p), X_t\hat\pi_p][1(X_t\hat\pi_q > 0)] \min[\max(\hat v_{tq}, -X_t\hat\pi_q), X_t\hat\pi_q] X_{tj}X_{tk}$$
$$\qquad - (1/T)\textstyle\sum_t [1(X_t\pi_p > 0)] \min[\max(v_{tp}, -X_t\pi_p), X_t\pi_p][1(X_t\pi_q > 0)] \min[\max(v_{tq}, -X_t\pi_q), X_t\pi_q] X_{tj}X_{tk} \Big|$$
$$\le (1/T)\textstyle\sum_t \big| [1(X_t\pi_p > -X_t\hat\delta_p)] \min[\max(v_{tp}^* - X_t\hat\delta_p, -X_t\pi_p - X_t\hat\delta_p), X_t\pi_p + X_t\hat\delta_p][1(X_t\pi_q > -X_t\hat\delta_q)] \min[\max(v_{tq}^* - X_t\hat\delta_q, -X_t\pi_q - X_t\hat\delta_q), X_t\pi_q + X_t\hat\delta_q]$$
$$\qquad - [1(X_t\pi_p > 0)] \min[\max(v_{tp}, -X_t\pi_p), X_t\pi_p][1(X_t\pi_q > 0)] \min[\max(v_{tq}, -X_t\pi_q), X_t\pi_q] \big| \, \|X_t\|^2 \;\equiv\; (1/T)\textstyle\sum_t r \, \|X_t\|^2.$$

Case 1: $X_t\pi_p > -X_t\hat\delta_p$, $X_t\pi_q > -X_t\hat\delta_q$ and $X_t\pi_p > 0$, $X_t\pi_q > 0$.
(a) $\max(v_{tp}, -X_t\pi_p) < X_t\pi_p$ and $\max(v_{tq}, -X_t\pi_q) < X_t\pi_q$.
(i) $-X_t\pi_p < v_{tp} < X_t\pi_p$ and $-X_t\pi_q < v_{tq} < X_t\pi_q$: then $v_{tp}^* = \max(v_{tp}, -X_t\pi_p) = v_{tp}$, $v_{tq}^* = \max(v_{tq}, -X_t\pi_q) = v_{tq}$, so
$$r \le |(v_{tp} - X_t\hat\delta_p)(v_{tq} - X_t\hat\delta_q) - v_{tp}v_{tq}| \le |v_{tp}X_t\hat\delta_q| + |v_{tq}X_t\hat\delta_p| + |X_t\hat\delta_p X_t\hat\delta_q| \le |v_{tp}|\,\|X_t\|\,\|\hat\delta_q\| + |v_{tq}|\,\|X_t\|\,\|\hat\delta_p\| + \|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 \equiv r_1.$$
Since $E\|X_t\|^{4+\delta}$, $E|v_{tp}|$ and $E|v_{tq}|$ are bounded uniformly in $t$, $r_1\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ and $\hat\pi_q$.
(ii) $v_{tp} < -X_t\pi_p < X_t\pi_p$ and $v_{tq} < -X_t\pi_q < X_t\pi_q$: then $v_{tp}^* = -X_t\pi_p$ and $v_{tq}^* = -X_t\pi_q$, so
$$r \le |(X_t\pi_p + X_t\hat\delta_p)(X_t\pi_q + X_t\hat\delta_q) - (X_t\pi_p)(X_t\pi_q)| \le |(X_t\pi_p)(X_t\hat\delta_q)| + |(X_t\hat\delta_p)(X_t\pi_q)| + |(X_t\hat\delta_p)(X_t\hat\delta_q)| \le \|\pi_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 + \|\pi_q\|\,\|\hat\delta_p\|\,\|X_t\|^2 + \|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 \equiv r_2.$$
Since $E\|X_t\|^{4+\delta}$ is uniformly bounded and $\pi_p$, $\pi_q$ are finite, $r_2\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ and $\hat\pi_q$.
(iii) $-X_t\pi_p < v_{tp} < X_t\pi_p$ and $v_{tq} < -X_t\pi_q < X_t\pi_q$: then $v_{tp}^* = v_{tp}$ and $v_{tq}^* = -X_t\pi_q$, so
$$r \le |(v_{tp} - X_t\hat\delta_p)(-X_t\pi_q - X_t\hat\delta_q) - v_{tp}(-X_t\pi_q)| \le |v_{tp}X_t\hat\delta_q| + |X_t\hat\delta_p X_t\pi_q| + |X_t\hat\delta_p X_t\hat\delta_q| \le |v_{tp}|\,\|X_t\|\,\|\hat\delta_q\| + \|\hat\delta_p\|\,\|\pi_q\|\,\|X_t\|^2 + \|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 \equiv r_3.$$
Since $E\|X_t\|^{4+\delta}$ and $E|v_{tp}|$ are uniformly bounded and $\pi_q$ is finite, $r_3\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ and $\hat\pi_q$.
(iv) $v_{tp} < -X_t\pi_p < X_t\pi_p$ and $-X_t\pi_q < v_{tq} < X_t\pi_q$: the proof is the same as in (iii).
(b) $\max(v_{tp}, -X_t\pi_p) > X_t\pi_p$ and $\max(v_{tq}, -X_t\pi_q) > X_t\pi_q$:
$$r \le |(X_t\pi_p + X_t\hat\delta_p)(X_t\pi_q + X_t\hat\delta_q) - X_t\pi_p X_t\pi_q| \le \|\pi_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 + \|\pi_q\|\,\|\hat\delta_p\|\,\|X_t\|^2 + \|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 \equiv r_4.$$
Since $E\|X_t\|^{4+\delta}$ is uniformly bounded and both $\pi_p$ and $\pi_q$ are finite, $r_4\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ and $\hat\pi_q$.

Case 2: $X_t\pi_p \le -X_t\hat\delta_p$ or $X_t\pi_q \le -X_t\hat\delta_q$, and $X_t\pi_p > 0$, $X_t\pi_q > 0$. This implies that $|X_t\pi_p| \le |X_t\hat\delta_p|$ or $|X_t\pi_q| \le |X_t\hat\delta_q|$. Then
$$r = \big| -\{\min[\max(v_{tp}, -X_t\pi_p), X_t\pi_p]\}\{\min[\max(v_{tq}, -X_t\pi_q), X_t\pi_q]\} \big| \le |(X_t\pi_p)(X_t\pi_q)| \le |X_t\hat\delta_p|\,|X_t\pi_q| \ \ (\text{or } |X_t\pi_p|\,|X_t\hat\delta_q|) \le \|\pi_q\|\,\|\hat\delta_p\|\,\|X_t\|^2 \ \ (\text{or } \|\pi_p\|\,\|\hat\delta_q\|\,\|X_t\|^2) \equiv r_5.$$
Since $E\|X_t\|^{4+\delta}$ is uniformly bounded and $\pi_q$ (or $\pi_p$) is finite, $r_5\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ (or $\hat\pi_q$).

Case 3: $X_t\pi_p > -X_t\hat\delta_p$, $X_t\pi_q > -X_t\hat\delta_q$, and $X_t\pi_p \le 0$ or $X_t\pi_q \le 0$. This implies that $|X_t\pi_p| < |X_t\hat\delta_p|$ (or $|X_t\pi_q| < |X_t\hat\delta_q|$). Then
$$r = \big| \{\min[\max(v_{tp}^* - X_t\hat\delta_p, -X_t\pi_p - X_t\hat\delta_p), X_t\pi_p + X_t\hat\delta_p]\}\{\min[\max(v_{tq}^* - X_t\hat\delta_q, -X_t\pi_q - X_t\hat\delta_q), X_t\pi_q + X_t\hat\delta_q]\} \big|$$
$$\le |(X_t\pi_p + X_t\hat\delta_p)(X_t\pi_q + X_t\hat\delta_q)| \le |X_t\pi_p X_t\pi_q| + |X_t\pi_p X_t\hat\delta_q| + |X_t\hat\delta_p X_t\pi_q| + |X_t\hat\delta_p X_t\hat\delta_q|$$
$$\le 2|X_t\hat\delta_p|\,|X_t\pi_q| + 2|X_t\hat\delta_p|\,|X_t\hat\delta_q| \ \ (\text{or } 2|X_t\pi_p|\,|X_t\hat\delta_q| + 2|X_t\hat\delta_p|\,|X_t\hat\delta_q|)$$
$$\le 2\|\pi_q\|\,\|\hat\delta_p\|\,\|X_t\|^2 + 2\|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 \ \ (\text{or } 2\|\pi_p\|\,\|\hat\delta_q\|\,\|X_t\|^2 + 2\|\hat\delta_p\|\,\|\hat\delta_q\|\,\|X_t\|^2) \equiv r_6.$$
Since $E\|X_t\|^{4+\delta}$ is uniformly bounded and $\pi_q$ (or $\pi_p$) is finite, $r_6\|X_t\|^2$ converges to zero almost surely by the strong consistency of $\hat\pi_p$ and $\hat\pi_q$.

Therefore

$$(1/T)\textstyle\sum_t [1(X_t\hat\pi_p > 0)] \min[\max(\hat v_{tp}, -X_t\hat\pi_p), X_t\hat\pi_p][1(X_t\hat\pi_q > 0)] \min[\max(\hat v_{tq}, -X_t\hat\pi_q), X_t\hat\pi_q] X_t'X_t$$
$$- (1/T)\textstyle\sum_t [1(X_t\pi_p > 0)] \min[\max(v_{tp}, -X_t\pi_p), X_t\pi_p][1(X_t\pi_q > 0)] \min[\max(v_{tq}, -X_t\pi_q), X_t\pi_q] X_t'X_t \to 0,$$

and the latter average is $(1/T)\sum_t \xi_{tp}\xi_{tq} X_t'X_t$. By Assumption (A8)$'$ and the law of large numbers,

$$(1/T)\textstyle\sum_t \xi_{tp}\xi_{tq} X_t'X_t - (1/T)\sum_t E(\xi_{tp}\xi_{tq} X_t'X_t) \to 0.$$

APPENDIX E

SECOND DERIVATIVES OF THE LOG LIKELIHOOD FUNCTION

To derive the second derivatives of the log likelihood function, we use the following relationships frequently:

$$\partial\Phi(w_t)/\partial w_t = \phi(w_t)$$
$$\partial\phi(w_t)/\partial w_t = -w_t\phi(w_t)$$
$$\partial[\phi(w_t)/\Phi(w_t)]/\partial w_t = -[\phi^2(w_t)/\Phi^2(w_t) + w_t\phi(w_t)/\Phi(w_t)]$$

where $\Phi(\cdot)$ is the standard univariate normal distribution function and $\phi(\cdot)$ is the standard univariate normal density function. The second derivatives of the log likelihood function are as below:

$$\frac{\partial^2\ln L(\theta)}{\partial\pi_1\partial\pi_1'} = -\sum_t\frac{\lambda_t(X_t'X_t)}{\sigma_1^2(1-\rho^2)} - \sum_t\frac{(1-\lambda_t)(X_t'X_t)}{\sigma_1^2} - \sum_t\frac{(1-\lambda_t)\rho^2}{\sigma_1^2(1-\rho^2)}\Big[\frac{\phi^2(w_t)}{\Phi^2(w_t)} + \frac{w_t\phi(w_t)}{\Phi(w_t)}\Big]X_t'X_t,$$

where $\lambda_t = 1(Y_{t2} > 0)$ and $w_t = -[X_t\pi_2 + \rho(\sigma_2/\sigma_1)(Y_{t1} - X_t\pi_1)]/[\sigma_2(1-\rho^2)^{1/2}]$ is the standardized argument of $\Phi(\cdot)$ in the censored part of the likelihood.

In taking expected values of these derivatives we use $1(v_{t2} > -X_t\pi_2) = 1(Y_{t2} > 0)$, $1(v_{t2} < -X_t\pi_2) = 1(Y_{t2} = 0)$, $a_t = X_t\pi_2/\sigma_2$, $N_{1t} = \sigma_2\phi(a_t)/\Phi(a_t)$ and $N_{2t} = -\sigma_2\phi(a_t)/[1 - \Phi(a_t)]$, together with the following expectations:

$$E[\lambda_t(Y_{t1} - X_t\pi_1)] = \rho\sigma_1\phi(a_t)$$
$$E[\lambda_t(Y_{t1} - X_t\pi_1)^2] = \Phi(a_t)[\sigma_1^2 - \rho^2\sigma_1^2 N_{1t}(X_t\pi_2)/\sigma_2^2]$$
$$E[\lambda_t(Y_{t2} - X_t\pi_2)] = \sigma_2\phi(a_t)$$
$$E[\lambda_t(Y_{t2} - X_t\pi_2)^2] = \Phi(a_t)[\sigma_2^2 - N_{1t}(X_t\pi_2)]$$
$$E[\lambda_t(Y_{t1} - X_t\pi_1)(Y_{t2} - X_t\pi_2)] = \Phi(a_t)\{\rho\sigma_1[\sigma_2^2 - N_{1t}(X_t\pi_2)]/\sigma_2\}$$
$$E[(1-\lambda_t)(Y_{t1} - X_t\pi_1)] = -\rho\sigma_1\phi(a_t)$$
$$E[(1-\lambda_t)(Y_{t1} - X_t\pi_1)^2] = [1 - \Phi(a_t)][\sigma_1^2 - \rho^2\sigma_1^2 N_{2t}(X_t\pi_2)/\sigma_2^2]$$
$$E[(1-\lambda_t)(Y_{t2}^* - X_t\pi_2)] = -\sigma_2\phi(a_t)$$
$$E[(1-\lambda_t)(Y_{t2}^* - X_t\pi_2)^2] = [1 - \Phi(a_t)][\sigma_2^2 - N_{2t}(X_t\pi_2)]$$
$$E[(1-\lambda_t)(Y_{t1} - X_t\pi_1)(Y_{t2}^* - X_t\pi_2)] = [1 - \Phi(a_t)]\{\rho\sigma_1[\sigma_2^2 - N_{2t}(X_t\pi_2)]/\sigma_2\}$$
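Because the derivative of the ratio $\phi(w)/\Phi(w)$ is the preliminary identity most easily mis-stated, a quick finite-difference check is a useful safeguard. The short script below is an illustrative verification only, not part of the dissertation; it uses SciPy for the standard normal density and distribution function.

```python
import numpy as np
from scipy.stats import norm

# Check d/dw [phi(w)/Phi(w)] = -[phi(w)^2/Phi(w)^2 + w*phi(w)/Phi(w)]
w = np.linspace(-3.0, 3.0, 13)
h = 1e-6
mills = lambda x: norm.pdf(x) / norm.cdf(x)
numeric = (mills(w + h) - mills(w - h)) / (2 * h)   # central difference
analytic = -(norm.pdf(w)**2 / norm.cdf(w)**2 + w * norm.pdf(w) / norm.cdf(w))
assert np.allclose(numeric, analytic, atol=1e-5)
```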
APPENDIX F

SOME EXPECTATIONS OF TRUNCATED UNIVARIATE OR BIVARIATE NORMAL DISTRIBUTIONS

II. Truncated univariate normal distribution (Johnson and Kotz 1970, pp. 81-83)

(1) $X \sim N(0, \sigma^2)$, $X \ge \sigma c_1$:
$$E(X) = \sigma\phi(c_1)/[1 - \Phi(c_1)] = M_1, \qquad V(X) = \sigma^2 - M_1(M_1 - \sigma c_1), \qquad E(X^2) = \sigma^2 + M_1(\sigma c_1).$$

(2) $X \sim N(0, \sigma^2)$, $X \le \sigma c_2$:
$$E(X) = \sigma[-\phi(c_2)/\Phi(c_2)] = M_2, \qquad V(X) = \sigma^2 - M_2(M_2 - \sigma c_2), \qquad E(X^2) = \sigma^2 + M_2(\sigma c_2).$$

(3) $X \sim N(0, \sigma^2)$, $\sigma c_1 \le X \le \sigma c_2$:
$$E(X) = \sigma[\phi(c_1) - \phi(c_2)]/[\Phi(c_2) - \Phi(c_1)] = M_3,$$
$$V(X) = \sigma^2 - M_3^2 + \sigma^2[c_1\phi(c_1) - c_2\phi(c_2)]/[\Phi(c_2) - \Phi(c_1)],$$
$$E(X^2) = \sigma^2 + \sigma^2[c_1\phi(c_1) - c_2\phi(c_2)]/[\Phi(c_2) - \Phi(c_1)].$$

III. Truncated bivariate normal distribution (Johnson and Kotz 1972, p. 113)

Let $(V_1, V_2)$ be bivariate normal with means zero and covariance matrix
$$\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix},$$
truncated on $V_2 > h$ (or $V_2 < k$, or $h < V_2 < k$). Then

$$E(V_1) = \rho(\sigma_1/\sigma_2)E(V_2 \mid V_2 > h)$$
$$E(V_1V_2) = \rho(\sigma_1/\sigma_2)E(V_2^2 \mid V_2 > h)$$
$$E(V_1^2) = \rho^2(\sigma_1^2/\sigma_2^2)E(V_2^2 \mid V_2 > h) + \sigma_1^2(1 - \rho^2),$$

with the corresponding conditional moments of $V_2$ for the other truncation regions.

APPENDIX G

THE ASYMPTOTIC COVARIANCE MATRIX OF THE CLAD ESTIMATE

The asymptotic covariance matrix of the CLAD estimate for the case of independently and identically distributed error terms is shown in Case One of Chapter Two as follows:

$$\begin{pmatrix} (1/T)\hat M^{-1}\hat V\hat M^{-1} & (1/T)\hat M^{-1}\hat\Sigma_{12}[f(0)\hat M_x]^{-1} \\ (1/T)[f(0)\hat M_x]^{-1}\hat\Sigma_{12}'\hat M^{-1} & (1/T)[2f(0)]^{-2}\hat M_x^{-1} \end{pmatrix}$$

Now we assume $v_{t1}$ and $v_{t2}$ are independently and identically distributed as bivariate normal with mean zero and covariance matrix
$$\begin{pmatrix} \sigma_1^2 & \sigma_{12} \\ \sigma_{12} & \sigma_2^2 \end{pmatrix}.$$
Treating $X$ as fixed, and using the assumption of normality, we have:

$$\hat M^{-1} = \operatorname{plim}(X'X/T)^{-1} = (X'X/T)^{-1}$$
$$V = \operatorname{plim}(1/T)\textstyle\sum_t(v_{t1}^2 X_t'X_t) = (1/T)\sigma_1^2(X'X)$$
$$\hat M_x = \operatorname{plim}[(1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)X_t'X_t] = (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)X_t'X_t$$
$$f(0) = 1/[(2\pi)^{1/2}\sigma_2]$$
$$\Sigma_{12} = \operatorname{plim}(1/T)\textstyle\sum_t E(v_{t1}\xi_t X_t'X_t) = (1/T)\textstyle\sum_t E(v_{t1}\xi_t)X_t'X_t$$
$$E(v_{t1}\xi_t) = E\{v_{t1}[1(X_t\pi_2 > 0)][1/2 - 1(v_{t2} < 0)]\} = -1(X_t\pi_2 > 0)E[v_{t1}1(v_{t2} < 0)]$$
$$E[v_{t1}1(v_{t2} < 0)] = (1/2)E(v_{t1} \mid v_{t2} < 0) = (1/2)\rho(\sigma_1/\sigma_2)E(v_{t2} \mid v_{t2} < 0) = -\rho\sigma_1/(2\pi)^{1/2} \quad\text{(see Appendix F)}$$
$$\therefore\ \Sigma_{12} = (1/T)\textstyle\sum_t \rho\sigma_1 1(X_t\pi_2 > 0)X_t'X_t/(2\pi)^{1/2} = \rho\sigma_1\hat M_x/(2\pi)^{1/2}$$
$$(1/T)\hat M^{-1}V\hat M^{-1} = (1/T)(X'X/T)^{-1}[(1/T)\sigma_1^2 X'X](X'X/T)^{-1} = \sigma_1^2(X'X)^{-1}$$
$$(1/T)\hat M^{-1}\Sigma_{12}[f(0)\hat M_x]^{-1} = (1/T)(X'X/T)^{-1}[\rho\sigma_1\hat M_x/(2\pi)^{1/2}]\{\hat M_x/[(2\pi)^{1/2}\sigma_2]\}^{-1} = \sigma_{12}(X'X)^{-1}$$
$$(1/T)[2f(0)]^{-2}\hat M_x^{-1} = (1/T)\{2/[(2\pi)^{1/2}\sigma_2]\}^{-2}[(1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)X_t'X_t]^{-1} = (\pi/2)\sigma_2^2[\textstyle\sum_t 1(X_t\pi_2 > 0)X_t'X_t]^{-1}$$

So the asymptotic covariance matrix is

$$\begin{pmatrix} \sigma_1^2(X'X)^{-1} & \sigma_{12}(X'X)^{-1} \\ \sigma_{12}(X'X)^{-1} & (\pi/2)\sigma_2^2[\sum_t 1(X_t\pi_2 > 0)X_t'X_t]^{-1} \end{pmatrix}.$$
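The final matrix of this appendix can be evaluated numerically for any fixed design matrix and error covariance. The sketch below assembles the blocks $\sigma_1^2(X'X)^{-1}$, $\sigma_{12}(X'X)^{-1}$, and $(\pi/2)\sigma_2^2[\sum_t 1(X_t\pi_2 > 0)X_t'X_t]^{-1}$ derived above; it is a minimal illustration, and the function name and argument names are assumptions rather than notation from the text.

```python
import numpy as np

def clad_asym_cov(X, pi2, sigma1_sq, sigma2_sq, sigma12):
    """Evaluate the Appendix G block matrix for fixed X and bivariate
    normal errors: equation 1 fit by least squares, equation 2 by CLAD."""
    XtX_inv = np.linalg.inv(X.T @ X)
    Xpos = X[(X @ pi2) > 0]                  # rows with X_t pi_2 > 0
    Mx_inv = np.linalg.inv(Xpos.T @ Xpos)
    top_left = sigma1_sq * XtX_inv
    off_diag = sigma12 * XtX_inv
    bottom_right = (np.pi / 2.0) * sigma2_sq * Mx_inv
    return np.block([[top_left, off_diag],
                     [off_diag, bottom_right]])
```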
APPENDIX H

THE ASYMPTOTIC COVARIANCE MATRIX OF THE SCLS ESTIMATE

The asymptotic covariance matrix of the SCLS estimate for independently and identically distributed error terms is shown in Case One of Chapter Two as follows:

$$\begin{pmatrix} (1/T)\hat M^{-1}\hat V\hat M^{-1} & (1/T)\hat M^{-1}\hat\Sigma_{12}\hat C^{-1} \\ (1/T)\hat C^{-1}\hat\Sigma_{12}'\hat M^{-1} & (1/T)\hat C^{-1}\hat D\hat C^{-1} \end{pmatrix}$$

Under our new assumption about the error terms (bivariate normal), we may calculate the asymptotic covariance matrix as follows:

$$\hat M^{-1} = \operatorname{plim}(X'X/T)^{-1} = (X'X/T)^{-1}$$
$$C = (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)[2\Phi(a_t) - 1]X_t'X_t$$
$$D = \operatorname{plim}(1/T)\textstyle\sum_t E\{1(X_t\pi_2 > 0)\min[v_{t2}^2, (X_t\pi_2)^2]X_t'X_t\}$$
$$= (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)E\{[1(-X_t\pi_2 < v_{t2} < X_t\pi_2)]v_{t2}^2 + [1(v_{t2} \ge X_t\pi_2 \text{ or } v_{t2} \le -X_t\pi_2)](X_t\pi_2)^2\}X_t'X_t$$
$$E[1(-X_t\pi_2 < v_{t2} < X_t\pi_2)]v_{t2}^2 = [2\Phi(a_t) - 1]E(v_{t2}^2 \mid -X_t\pi_2 < v_{t2} < X_t\pi_2) = \sigma_2^2[2\Phi(a_t) - 1 - 2a_t\phi(a_t)]$$
(see Appendix F for all conditional expectations)
$$E[1(v_{t2} \ge X_t\pi_2 \text{ or } v_{t2} \le -X_t\pi_2)](X_t\pi_2)^2 = 2(X_t\pi_2)^2[1 - \Phi(a_t)]$$
$$\therefore\ D = (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)\{\sigma_2^2[2\Phi(a_t) - 1 - 2a_t\phi(a_t)] + 2(X_t\pi_2)^2[1 - \Phi(a_t)]\}X_t'X_t$$
$$\Sigma_{12} = \operatorname{plim}(1/T)\textstyle\sum_t E(\xi_t v_{t1}X_t'X_t) = (1/T)\textstyle\sum_t E(\xi_t v_{t1})X_t'X_t$$
$$E(\xi_t v_{t1}) = E\{1(X_t\pi_2 > 0)\min[\max(v_{t2}, -X_t\pi_2), X_t\pi_2]v_{t1}\}$$
$$= 1(X_t\pi_2 > 0)E\{[1(v_{t2} \le -X_t\pi_2)](-X_t\pi_2)v_{t1} + [1(-X_t\pi_2 < v_{t2} < X_t\pi_2)]v_{t1}v_{t2} + [1(v_{t2} \ge X_t\pi_2)](X_t\pi_2)v_{t1}\}$$
$$E[1(v_{t2} \le -X_t\pi_2)](-X_t\pi_2)v_{t1} = -(X_t\pi_2)[1 - \Phi(a_t)]E(v_{t1} \mid v_{t2} \le -X_t\pi_2) = (X_t\pi_2)\sigma_{12}\phi(a_t)/\sigma_2$$
$$E[1(-X_t\pi_2 < v_{t2} < X_t\pi_2)]v_{t1}v_{t2} = [2\Phi(a_t) - 1]E(v_{t1}v_{t2} \mid -X_t\pi_2 < v_{t2} < X_t\pi_2) = \sigma_{12}[2\Phi(a_t) - 1 - 2a_t\phi(a_t)]$$
$$E[1(v_{t2} \ge X_t\pi_2)](X_t\pi_2)v_{t1} = [1 - \Phi(a_t)](X_t\pi_2)E(v_{t1} \mid v_{t2} \ge X_t\pi_2) = (X_t\pi_2)\sigma_{12}\phi(a_t)/\sigma_2$$
$$\therefore\ \Sigma_{12} = \sigma_{12}(1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)[2\Phi(a_t) - 1]X_t'X_t = \sigma_{12}C$$
$$(1/T)\hat M^{-1}V\hat M^{-1} = \sigma_1^2(X'X)^{-1}, \qquad (1/T)\hat M^{-1}\Sigma_{12}C^{-1} = \sigma_{12}(X'X)^{-1}$$

So the asymptotic covariance matrix is

$$\begin{pmatrix} \sigma_1^2(X'X)^{-1} & \sigma_{12}(X'X)^{-1} \\ \sigma_{12}(X'X)^{-1} & (1/T)\hat C^{-1}\hat D\hat C^{-1} \end{pmatrix}$$

where
$$\hat C = (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)[2\Phi(a_t) - 1]X_t'X_t$$
$$\hat D = (1/T)\textstyle\sum_t 1(X_t\pi_2 > 0)\{\sigma_2^2[2\Phi(a_t) - 1 - 2a_t\phi(a_t)] + 2(X_t\pi_2)^2[1 - \Phi(a_t)]\}X_t'X_t.$$
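Similarly, the SCLS blocks above can be evaluated directly once $a_t = X_t\pi_2/\sigma_2$ is computed for each observation. The following sketch is illustrative only; the matrices named C and D here are intended as the two weighted moment matrices defined in this appendix, and all identifiers are hypothetical rather than notation from the text.

```python
import numpy as np
from scipy.stats import norm

def scls_asym_cov(X, pi2, sigma1_sq, sigma2_sq, sigma12):
    """Evaluate the Appendix H block matrix under bivariate normal errors:
    continuous block sigma1^2 (X'X)^{-1}, off-diagonal blocks sigma12 (X'X)^{-1},
    and censored-equation block (1/T) C^{-1} D C^{-1}."""
    T = X.shape[0]
    sigma2 = np.sqrt(sigma2_sq)
    idx = X @ pi2                              # X_t pi_2
    a = idx / sigma2                           # a_t = X_t pi_2 / sigma_2
    pos = (idx > 0).astype(float)              # 1(X_t pi_2 > 0)
    c_w = pos * (2.0 * norm.cdf(a) - 1.0)
    d_w = pos * (sigma2_sq * (2.0 * norm.cdf(a) - 1.0 - 2.0 * a * norm.pdf(a))
                 + 2.0 * idx**2 * (1.0 - norm.cdf(a)))
    C = (X * c_w[:, None]).T @ X / T
    D = (X * d_w[:, None]).T @ X / T
    XtX_inv = np.linalg.inv(X.T @ X)
    C_inv = np.linalg.inv(C)
    return np.block([[sigma1_sq * XtX_inv, sigma12 * XtX_inv],
                     [sigma12 * XtX_inv, C_inv @ D @ C_inv / T]])
```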
APPENDIX I

THE IDENTIFIED STRUCTURAL MODELS CORRESPONDING TO REDUCED FORM EQUATIONS WITH $\pi_2 = (1, 0)'$, $(1, 1)'$, OR $(0, 1)'$

1. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{12}X_{t2} + \epsilon_{t1}$  ($\beta_{12} = 0$)
   $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$  ($\beta_{22} = 0$)
2. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$  ($\beta_{12} = 0$)
   $Y_{t2}^* = \gamma_2 Y_{t1} + \epsilon_{t2}$
3. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{12}X_{t2} + \epsilon_{t1}$
   $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$  ($\beta_{22} = 0$)
4. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$
   $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$  ($\gamma_2\beta_{12} + \beta_{22} = 0$)
5. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{11}X_{t1} + \epsilon_{t1}$
   $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
6. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{11}X_{t1} + \epsilon_{t1}$
   $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
7. $Y_{t1} = \gamma_1 Y_{t2}^* + \epsilon_{t1}$
   $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
8. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$  ($\beta_{12} = 0$)
   $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
9. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{12}X_{t2} + \epsilon_{t1}$
   $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
10. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{12}X_{t2} + \epsilon_{t1}$
    $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{21}X_{t1} + \epsilon_{t2}$
11. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$  ($\beta_{11} = 0$)
    $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{21}X_{t1} + \epsilon_{t2}$
12. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$
    $Y_{t2}^* = \gamma_2 Y_{t1} + \epsilon_{t2}$
13. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$
    $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$  ($\beta_{22} = 0$)
14. $Y_{t1} = \gamma_1 Y_{t2}^* + \beta_{11}X_{t1} + \epsilon_{t1}$  ($\beta_{11} = 0$)
    $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$  ($\beta_{21} = 0$)
15. $Y_{t1} = \gamma_1 Y_{t2}^* + \epsilon_{t1}$
    $Y_{t2}^* = \beta_{21}X_{t1} + \beta_{22}X_{t2} + \epsilon_{t2}$
16. $Y_{t1} = \beta_{11}X_{t1} + \beta_{12}X_{t2} + \epsilon_{t1}$
    $Y_{t2}^* = \gamma_2 Y_{t1} + \beta_{21}X_{t1} + \epsilon_{t2}$  ($\gamma_2\beta_{11} + \beta_{21} = 0$)

BIBLIOGRAPHY

Amemiya, T. (1979), "The Estimation of a Simultaneous Equation Tobit Model," International Economic Review 20: 169-181.

Amemiya, T. (1983), "A Comparison of the Amemiya GLS and the Lee-Maddala-Trost G2SLS in a Simultaneous-Equations Tobit Model," Journal of Econometrics 23: 295-300.

Arabmazar, A. and P. Schmidt (1982), "An Investigation of the Robustness of the Tobit Estimator to Non-Normality," Econometrica 50: 1055-1063.

Chamberlain, G. (1983), "Panel Data," in Handbook of Econometrics, Vol. 2 (edited by Z. Griliches and M.D. Intriligator), 1246-1318.

Chamberlain, G. (1987), "Asymptotic Efficiency in Estimation with Conditional Moment Restrictions," Journal of Econometrics 34: 305-334.

Cosslett, S.R. (1987), "Efficiency Bounds for Distribution-Free Estimators of the Binary Choice and the Censored Regression Models," Econometrica 55: 559-586.

Duncan, G.M. (1986), "A Semi-Parametric Censored Regression Estimator," Journal of Econometrics 32: 5-34.

Fernandez, L. (1986), "Nonparametric ML Estimation of Censored Regression Models," Journal of Econometrics 32: 35-57.

Goldberger, A.S. (1980), "Abnormal Selection Bias," Workshop Paper No. 8006, Social Systems Research Institute, University of Wisconsin, Madison, WI.

Gourieroux, C., A. Monfort and E. Renault (1987), "Consistent M-Estimators in a Semi-Parametric Model," Working Paper 8706, INSEE, Paris.

Horowitz, J.L. (1986), "A Distribution-Free Least Squares Estimator for Censored Linear Regression Models," Journal of Econometrics 32: 59-84.

Johnson, N.L. and S. Kotz (1970), Distributions in Statistics: Continuous Univariate Distributions 1, New York: John Wiley and Sons.

Johnson, N.L. and S. Kotz (1972), Distributions in Statistics: Continuous Multivariate Distributions, New York: John Wiley and Sons.

Lee, L.F. (1981), "Simultaneous Equations Models with Discrete and Censored Variables," in C.F. Manski and D. McFadden, eds., Structural Analysis of Discrete Data with Econometric Applications, Cambridge, MA: MIT Press.

Lee, L.F., G.S. Maddala, and R.P. Trost (1980), "Asymptotic Covariance Matrices of Two-Stage Probit and Two-Stage Tobit Methods for Simultaneous Equations Models with Selectivity," Econometrica 48: 491-503.

Malinvaud, E. (1980), Statistical Methods of Econometrics, 3rd ed., New York: North-Holland Publishing Company.

Nelson, F. and L. Olsen (1978), "Specification and Estimation of a Simultaneous Equation Model with Limited Dependent Variables," International Economic Review 19: 695-705.

Newey, W.K. (1985), "Semiparametric Estimators for Limited Dependent Variable Models with Endogenous Explanatory Variables," Annales de l'INSEE 59/60: 219-237.

Newey, W.K. (1987a), "Specification Tests for Distributional Assumptions in the Tobit Model," Journal of Econometrics 34: 125-146.

Newey, W.K. (1987b), "Efficient Estimation of Limited Dependent Variable Models with Endogenous Explanatory Variables," Journal of Econometrics 36: 231-250.

Powell, J.L. (1984), "Least Absolute Deviations Estimation for the Censored Regression Model," Journal of Econometrics 25: 303-325.

Powell, J.L. (1985), "Symmetrically Trimmed Least Squares Estimation for Tobit Models," MIT Working Paper.

Powell, J.L. (1986a), "Censored Regression Quantiles," Journal of Econometrics 32: 143-155.

Powell, J.L. (1986b), "Symmetrically Trimmed Least Squares Estimation for Tobit Models," Econometrica 54: 1435-1460.

Smith, R.J. (1987), "Testing the Normality Assumption in Multivariate Simultaneous Limited Dependent Variable Models," Journal of Econometrics 34: 105-124.

Tobin, J. (1958), "Estimation of Relationships for Limited Dependent Variables," Econometrica 26: 24-36.

White, H. (1980a), "A Heteroskedasticity-Consistent Covariance Matrix Estimator and a Direct Test for Heteroskedasticity," Econometrica 48: 817-838.

White, H. (1980b), "Nonlinear Regression on Cross-Section Data," Econometrica 48: 721-746.

White, H. (1984), Asymptotic Theory for Econometricians, New York: Academic Press.