This is to certify that the dissertation entitled

SOME APPLICATIONS OF THE LAGRANGE MULTIPLIER TEST IN ECONOMETRICS

presented by Tsai-Fen Lin has been accepted towards fulfillment of the requirements for the Ph.D. degree in Economics.

Major professor
Date: August 6, 1982


SOME APPLICATIONS OF THE LAGRANGE MULTIPLIER TEST IN ECONOMETRICS

By
Tsai-Fen Lin

A DISSERTATION
Submitted to Michigan State University in partial fulfillment of the requirements for the degree of
DOCTOR OF PHILOSOPHY

Department of Economics
1982


ABSTRACT

SOME APPLICATIONS OF THE LAGRANGE MULTIPLIER TEST IN ECONOMETRICS

By Tsai-Fen Lin

There are three kinds of tests for model specification: the Wald test, the likelihood ratio test, and the Lagrange multiplier test. They have the same asymptotic power; therefore, the choice among them depends on computational convenience. Since the Lagrange multiplier test is based on the restricted estimates, we choose the Lagrange multiplier test when estimation is easier in the restricted model than in the unrestricted model. Since the Lagrange multiplier test is not well known,
and the derivation of the test statistic is complicated, in this thesis I develop the Lagrange multiplier test statistic for some commonly used econometric models so that they can be used readily by applied economists. These models include distributed lag models, qualitative and limited dependent variable models, and stochastic production and cost frontiers.

In the distributed lag models, the Lagrange multiplier test statistic is shown to be asymptotically equivalent to the F statistic in testing the coefficients of the lagged explanatory variables when they are added to the restricted model. In Heckman's sample selection bias model, the Lagrange multiplier test statistic is asymptotically equal to the square of the t test statistic in testing the coefficient of the correction term for the sample selection bias when this correction term is added to the restricted model. In Poirier's partial observability model, the Lagrange multiplier test statistic is equivalent to the explained sum of squares in a regression of residuals on a set of regressors. In the stochastic production and cost frontiers models, the Lagrange multiplier test fails in some cases, and alternative tests are suggested.

In summary, the Lagrange multiplier test, except in a few cases, can be used to test the adequacy of the simple models. Since the simple model usually involves a simpler estimation method and less computational cost than the more complicated alternative, the Lagrange multiplier test can be useful.


ACKNOWLEDGMENTS

I am deeply indebted to my advisor, Professor Peter Schmidt, for sparking my interest in these topics and for all his guidance and help. I also thank the other members of my dissertation committee: William Quinn, John Goddeeris, and Paul Menchik.


TABLE OF CONTENTS

Chapter I  Introduction
  1.1 Introduction
  1.2 The LM Test

Chapter II  Distributed Lag Models
  2.1 Introduction
  2.2 The Geometric Lag Model
  2.3 The Rational Lag Model
  2.4 Conclusions

Chapter III  Qualitative and Limited Dependent Variable Models
  3.1 Introduction
  3.2 Test of the Tobit Specification against Cragg's Extension of the Tobit Model
    3.2.1 The Tobit Model and Cragg's Extension
    3.2.2 The LM Test
  3.3 Test of Sample Selection Bias
    3.3.1 Heckman's Sample Selection Bias Model and His λ Test
    3.3.2 The LM Test of Sample Selection Bias
  3.4 Test of Independence in Poirier's Partial Observability Probit Model
    3.4.1 Poirier's Partial Observability Probit Model
    3.4.2 The LM Test
  3.5 Conclusions

Chapter IV  Stochastic Production/Cost Frontiers
  4.1 Introduction
  4.2 Test of Zero Mode for the Technical Inefficiency in Stevenson's Extension of the ALS Model
    4.2.1 Stevenson's Extension of the ALS Model
    4.2.2 The LM Test
  4.3 Test of Allocative Inefficiency in SL I Model
    4.3.1 SL I Model
    4.3.2 The LM Test
    4.3.3 An Alternative Test
  4.4 Test of Systematic Allocative Inefficiency in SL II Model
    4.4.1 SL II Model
    4.4.2 The LM Test
  4.5 Test of Independence between Technical Inefficiency and Allocative Inefficiency in SL III Model
    4.5.1 SL III Model
    4.5.2 The LM Test
    4.5.3 Alternative Tests
  4.6 Conclusions

Chapter V  Summary and Conclusions

Notes
Appendix A
Appendix B
Appendix C
Appendix D
Appendix E
Bibliography


CHAPTER I
INTRODUCTION

1.1 Introduction

In statistics and econometrics, there are three basic principles for the construction of test statistics for model specification: the Wald test (Wald (1943)), the likelihood ratio (LR) test, and the Lagrange multiplier (LM) test. Suppose there are two possible model specifications, one of which is a special case of the other under some restrictions. Call the special case the restricted model, and the generalized case the unrestricted model. The Wald test is based on the estimates from the unrestricted model, while the LM test is based on the estimates from the restricted model,
and the LR test is based on both sets of estimates. These three principles yield tests which are equivalent in large samples when the restrictions are true (see Silvey (1959)). Their small-sample properties are unknown, except in special cases. Therefore the choice among them will often be based on computational convenience. The LM test is very useful in cases in which the restricted model is easier to estimate than the unrestricted model. This will often be the case when one is testing the adequacy of a particular model: the null hypothesis is that a relatively simple model is adequate, while the alternative is that a more complicated model is necessary. The LM test permits a test of this hypothesis without having to estimate the more complicated model.

Although the LM test was suggested by Aitchison and Silvey in 1958, it did not receive much attention from econometricians until recent years. Therefore, not many economists are aware of the LM test and its computational advantages in many cases. It is the responsibility of econometricians to introduce the LM test to applied economists by developing LM test statistics for common models in econometrics. The LM test has been applied successfully in testing for a liquidity trap, autocorrelation, the error components model, seemingly unrelated equation systems, and various non-nested hypotheses (see Breusch and Pagan (1980) for a survey and references). In this thesis, I report the successful application of the LM test to distributed lag models in chapter 2, to some qualitative and limited dependent variable models in chapter 3, and to stochastic production/cost frontiers in chapter 4.

1.2 The LM Test

Let $\theta = (\theta_1, \ldots, \theta_s)'$ be a set of parameters, $L(\theta)$ the log-likelihood function, $h(\theta) = [h_1(\theta), \ldots, h_r(\theta)]' = 0$ a set of $r$ restrictions, $\lambda = [\lambda_1, \ldots, \lambda_r]'$ a set of Lagrange multipliers, $I(\theta)$ the information matrix, and $n$ the sample size.
Define the Lagrangian function for the maximization of the likelihood subject to the restrictions as

$$L_R(\theta, \lambda) = L(\theta) + \lambda' h(\theta)$$

A constrained maximum of $L(\theta)$ is obtained at a stationary point of $L_R(\theta, \lambda)$. Differentiating $L_R(\theta, \lambda)$ with respect to $\theta$ and $\lambda$, we have the first-order conditions:

$$D(\theta) + H_\theta \lambda = 0, \qquad h(\theta) = 0 \tag{1.1}$$

where $D(\theta)$ is the $s \times 1$ vector $\left[\frac{\partial L}{\partial \theta_i}\right]$ and $H_\theta$ is the $s \times r$ matrix $\left[\frac{\partial h_j(\theta)}{\partial \theta_i}\right]$. Solving eq. (1.1), we obtain the restricted MLE $\tilde\theta$ and $\tilde\lambda$. When the restrictions are in fact true ($h(\theta) = 0$), the restricted estimates $\tilde\theta$ will tend to be near the unrestricted estimates, and $D(\tilde\theta)$ and $\tilde\lambda$ will tend to be near zero. It seems reasonable to decide that $h(\theta) = 0$ is true if $\tilde\lambda$ is in some sense near enough zero. Aitchison and Silvey (1958) proved that under the null hypothesis $h(\theta) = 0$, $\sqrt{n}\,\tilde\lambda$ is asymptotically normal with mean zero and covariance matrix $\left[\tilde H_\theta' [I(\tilde\theta)]^{-1} \tilde H_\theta\right]^{-1}$, where $\tilde H_\theta$ and $I(\tilde\theta)$ are $H_\theta$ and $I(\theta)$ evaluated at $\tilde\theta$, respectively. They suggested a test statistic based on the estimated Lagrange multipliers $\tilde\lambda$ and called it the Lagrange multiplier test statistic:

$$\text{LM test statistic} = \tilde\lambda' \tilde H_\theta' [I(\tilde\theta)]^{-1} \tilde H_\theta \tilde\lambda \tag{1.2}$$

This statistic asymptotically follows a chi-square distribution with $r$ degrees of freedom when $h(\theta) = 0$ is true. The region of acceptance of the null hypothesis $h(\theta) = 0$ is $\tilde\lambda' \tilde H_\theta' [I(\tilde\theta)]^{-1} \tilde H_\theta \tilde\lambda \le K$, where $K$ is determined by $\text{Prob}(\chi_r^2 \le K) = 1 - \text{significance level}$.

Note that from eq. (1.1), $\tilde H_\theta \tilde\lambda = -D(\tilde\theta)$, so eq. (1.2) can be rewritten as

$$\text{LM test statistic} = [D(\tilde\theta)]' [I(\tilde\theta)]^{-1} [D(\tilde\theta)] \tag{1.3}$$

The right-hand side of eq. (1.3) is just Rao's score statistic (Rao (1947)); hence the LM test statistic is the same as Rao's score statistic. Since eq. (1.3) is easier to use, in the following chapters eq. (1.3) will be used instead of eq. (1.2).

When $\theta$ is partitioned into two subsets, $\theta_1$ and $\theta_2$, and the restriction under test is that one of the subsets of parameters equals particular values, i.e.
$H_0\colon \theta_1 = \theta_{10}$, then we can establish a simpler form of the LM test statistic. From eq. (1.1), $D(\tilde\theta) = -\tilde H_\theta \tilde\lambda$; therefore,

$$\frac{\partial L}{\partial \tilde\theta_1} = -\sum_{j=1}^r \frac{\partial h_j(\tilde\theta)}{\partial \theta_1}\,\tilde\lambda_j, \qquad \frac{\partial L}{\partial \tilde\theta_2} = 0 \tag{1.4}$$

where $\partial h_j(\tilde\theta)/\partial \theta_1$ is $\partial h_j(\theta)/\partial \theta_1$ evaluated at $\tilde\theta$. The second equality holds because $h_j(\theta)$ is not a function of $\theta_2$. Partitioning $I(\tilde\theta)$ conformably, eq. (1.3) becomes

$$\text{LM test statistic} = \left[\frac{\partial L}{\partial \tilde\theta_1}\right]' \left[\tilde I_{11} - \tilde I_{12}\tilde I_{22}^{-1}\tilde I_{21}\right]^{-1} \left[\frac{\partial L}{\partial \tilde\theta_1}\right] \tag{1.5}$$

If $I(\tilde\theta)$ is block diagonal, the LM test statistic can be further simplified as

$$\text{LM test statistic} = \left[\frac{\partial L}{\partial \tilde\theta_1}\right]' \tilde I_{11}^{-1} \left[\frac{\partial L}{\partial \tilde\theta_1}\right] \tag{1.6}$$

When $I(\theta)$ is difficult to calculate, we can use the negative of the Hessian (matrix of second derivatives) or its limiting form to construct the LM statistic, because in many cases

$$\text{plim}\, [I(\tilde\theta)]^{-1}\left[-\frac{\partial^2 L}{\partial\theta\,\partial\theta'}\bigg|_{\tilde\theta}\right] = I_s$$

Also note that, whenever the usual regularity conditions hold, $I(\theta)$ can be obtained from the first partial derivatives of the log-likelihood function; that is,

$$I(\theta) = E\left[-\frac{\partial^2 L}{\partial\theta\,\partial\theta'}\right] = E\left[\left(\frac{\partial L}{\partial\theta}\right)\left(\frac{\partial L}{\partial\theta}\right)'\right]$$

Besides these, there is an indirect approach using the scoring algorithm (Newton-Raphson algorithm) to compute the LM statistic indirectly (Breusch and Pagan (1980)).

If $I(\tilde\theta)$ is not of full rank, say $\text{rank}[I(\tilde\theta)] = s - t < s$, then $I(\tilde\theta)$ is singular and therefore not invertible. Silvey (1959) assumes there exists an $s \times t$ submatrix $H_1$ of $H_\theta$ such that $\frac{1}{n}I(\tilde\theta) + H_1H_1'$ is positive definite. He then proposed a modified LM test statistic

$$\text{LM}^* = \frac{1}{n}\,[D(\tilde\theta)]'\left[\frac{1}{n}I(\tilde\theta) + H_1H_1'\right]^{-1}[D(\tilde\theta)] \tag{1.7}$$

which asymptotically follows a chi-square distribution with $(s - t)$ degrees of freedom. This case arises in one of our analyses of chapter II.
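The score form of eq. (1.3) can be made concrete with a small sketch. The example below is a textbook one-parameter case of my own choosing (an exponential sample), not one of the models treated in this thesis, and the function name is hypothetical; it simply evaluates the score and information at the restricted value and forms the quadratic form of eq. (1.3).

```python
import numpy as np

def lm_exponential_rate(x, theta0):
    """Score (LM) statistic of eq. (1.3) for H0: theta = theta0 in an
    Exponential(theta) sample with density theta * exp(-theta * x).
    Here D(theta) = n/theta - sum(x) and I(theta) = n/theta**2, so the
    statistic is D(theta0)**2 / I(theta0), asymptotically chi-square(1)."""
    x = np.asarray(x, dtype=float)
    n = len(x)
    score = n / theta0 - x.sum()       # D(theta) evaluated at the null value
    info = n / theta0 ** 2             # information at the null value
    return score ** 2 / info
```

Under the null the statistic is compared with a chi-square(1) critical value (3.84 at the 5% level); note that only the restricted value theta0 is needed, never the unrestricted MLE.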
CHAPTER II
DISTRIBUTED LAG MODELS

2.1 Introduction

A distributed lag model describes how the lagged independent variable affects the dependent variable over time. The length of the lag may sometimes be known a priori, but usually it is unknown, and in many cases it is assumed to be infinite. Thus we consider a distributed lag model of the general form

$$y_t = \sum_{i=0}^{\infty} \beta_i x_{t-i} + \varepsilon_t$$

where $y_t$ is the dependent variable, $x_{t-i}$ is the lagged independent variable, $\beta_i$ is the distributed lag weight, and $\varepsilon_t$ is a disturbance term. Infinite lag distributions involve an infinite number of unknown parameters, and thus it is impossible to estimate all these parameters. To make estimation possible, it is necessary to make some reasonable assumption about the pattern of the distributed lag weights.

The earliest distributed lag model is the geometric lag model proposed by Koyck (1954). He assumes that the lag weights decline geometrically, i.e.

$$\beta_i = \beta\lambda^i, \qquad i = 0, 1, 2, \ldots$$

where $0 \le \lambda < 1$. Since the lag weights of the geometric lag model decline monotonically, and this may not always be reasonable, various alternative models have been proposed. For example, the Pascal lag model proposed by Solow (1960) permits a hump in the lag weight distribution curve. In 1966, Jorgenson proposed a more general rational lag model

$$y_t = \frac{A(L)}{B(L)}\,x_t + u_t$$

where $A(L)$ and $B(L)$ are polynomials in the lag operator of orders $p$ and $v$, respectively. He also proved that any arbitrary lag model can be approximated to any desired degree of accuracy by a rational lag model with sufficiently high values of $p$ and $v$. If we take $A(L) = \beta(1-\lambda)$ and $B(L) = 1 - \lambda L$, the rational lag model is Koyck's geometric lag model. If we take $A(L) = \beta(1-\lambda)^r$ and $B(L) = (1-\lambda L)^r$, the result is the Pascal lag model.

In this chapter, two distributed lag models are discussed. The geometric lag model is discussed in section 2.2, and the rational lag model is discussed in section 2.3.

2.2 The Geometric Lag Model

Following Klein (1958),
the geometric lag model can be expressed as

$$y_t = \beta\sum_{i=0}^{\infty}\lambda^i x_{t-i} + \varepsilon_t = \beta w_t + \eta_0\lambda^t + \varepsilon_t, \qquad t = 1, \ldots, T \tag{2.2.1}$$

where $0 \le \lambda < 1$, $w_t = \sum_{i=0}^{t-1}\lambda^i x_{t-i}$, $\eta_0 = \beta\sum_{i=0}^{\infty}\lambda^i x_{-i}$, and $\varepsilon_t$ iid $N(0, \sigma^2)$. If we knew the value of $\lambda$, we could estimate eq. (2.2.1) by OLS. Usually we do not know the value of $\lambda$, and we use a search procedure to estimate eq. (2.2.1). Since the search procedure is not simple, it may be useful to test whether $\lambda = 0$ before we start the search procedure. The restriction $\lambda = 0$ is easy to impose, and the restricted model can be estimated by OLS of $y_t$ on $x_t$ only. Therefore, the LM test is very suitable in this case.

It is well known that the parameter $\eta_0$ cannot be estimated consistently, and that indeed the information matrix is singular asymptotically when $\eta_0$ is included in the list of parameters to be estimated; see Appendix A for details. However, Schmidt and Guilkey (1976) showed that it makes no difference asymptotically whether one drops or estimates the truncation remainder term in the maximum likelihood estimation of distributed lag models. Maximum likelihood estimation of eq. (2.2.1) amounts to minimizing the sum of squares

$$\sum_{t=1}^T (y_t - \beta w_t - \eta_0\lambda^t)^2$$

with respect to $\lambda$, $\beta$, and $\eta_0$. Since $\eta_0\lambda^t$ disappears asymptotically, this is equivalent to minimizing $\sum_{t=1}^T (y_t - \beta w_t)^2$; that is, to setting $\eta_0 = 0$ and applying OLS to the model

$$y_t = \beta w_t + \varepsilon_t, \qquad t = 1, \ldots, T \tag{2.2.2}$$

where $w_t = \sum_{i=0}^{t-1}\lambda^i x_{t-i}$, $0 \le \lambda < 1$, $\varepsilon_t$ iid $N(0, \sigma^2)$. Also, the estimated variances of $\lambda$ and $\beta$ resulting from estimation of eq. (2.2.2) are asymptotically the same as the ones from eq. (2.2.1). This is so because, after deleting the row and column corresponding to $\eta_0$, the resulting submatrix of the inverse of the information matrix corresponding to eq. (2.2.1) is asymptotically the same as the inverse of the information matrix corresponding to eq. (2.2.2). Therefore, we can construct our LM test statistic based on eq. (2.2.2) instead of eq. (2.2.1).
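The search procedure that the LM test lets us avoid can itself be sketched as concentrated least squares over a grid of λ values, using the recursion w_t = x_t + λw_{t-1}. This is a minimal illustration with names of my own choosing, assuming no presample x values (so the truncation remainder is dropped, as in eq. (2.2.2)).

```python
import numpy as np

def geometric_w(x, lam):
    """Truncated geometric-lag regressor w_t = sum_{i=0}^{t-1} lam^i x_{t-i},
    built with the recursion w_t = x_t + lam * w_{t-1} (starting from 0)."""
    w = np.empty(len(x), dtype=float)
    prev = 0.0
    for t, xt in enumerate(x):
        prev = xt + lam * prev
        w[t] = prev
    return w

def fit_geometric_lag(y, x, grid=np.linspace(0.0, 0.99, 100)):
    """Concentrated least squares: for each lambda on a grid, beta is the OLS
    slope of y on w(lambda); keep the (lambda, beta) pair minimizing the SSR."""
    best = None
    for lam in grid:
        w = geometric_w(x, lam)
        beta = (w @ y) / (w @ w)
        ssr = np.sum((y - beta * w) ** 2)
        if best is None or ssr < best[0]:
            best = (ssr, lam, beta)
    return best[1], best[2]
```

The grid search is crude but mirrors the point in the text: estimating the unrestricted model requires a search over λ, whereas the restricted model (λ = 0) is a single OLS regression.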
The log-likelihood function for eq. (2.2.2) is

$$L = \text{constant} - \frac{T}{2}\ln\sigma^2 - \frac{1}{2\sigma^2}\sum_{t=1}^T (y_t - \beta w_t)^2$$

The first partial derivatives are

$$\frac{\partial L}{\partial \lambda} = \frac{1}{\sigma^2}\sum_{t=1}^T \beta\,\frac{dw_t}{d\lambda}\,\varepsilon_t, \qquad \text{where } \frac{dw_t}{d\lambda} = \sum_{i=1}^{t-1} i\lambda^{i-1}x_{t-i}$$

$$\frac{\partial L}{\partial \beta} = \frac{1}{\sigma^2}\sum_{t=1}^T w_t\,\varepsilon_t, \qquad \frac{\partial L}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\sum_{t=1}^T \varepsilon_t^2$$

The elements of the information matrix are

$$I_{\lambda\lambda} = -E\left[\frac{\partial^2 L}{\partial\lambda^2}\right] = \frac{\beta^2}{\sigma^2}\sum_{t=1}^T \left(\frac{dw_t}{d\lambda}\right)^2, \qquad I_{\lambda\beta} = \frac{\beta}{\sigma^2}\sum_{t=1}^T w_t\,\frac{dw_t}{d\lambda}, \qquad I_{\beta\beta} = \frac{1}{\sigma^2}\sum_{t=1}^T w_t^2$$

$$I_{\lambda\sigma^2} = I_{\beta\sigma^2} = 0, \qquad I_{\sigma^2\sigma^2} = \frac{T}{2\sigma^4}$$

The restricted model is

$$y_t = x_t\beta + \varepsilon_t, \qquad t = 1, \ldots, T$$

Let $\tilde\theta = (\tilde\lambda, \tilde\beta, \tilde\sigma^2)$, where $\tilde\lambda = 0$, $\tilde\beta$ is the OLS estimate, $\tilde\varepsilon_t = y_t - \tilde\beta x_t$, and $\tilde\sigma^2 = \frac{1}{T}\sum_{t=1}^T \tilde\varepsilon_t^2$. Since $dw_t/d\lambda$ evaluated at $\lambda = 0$ is $x_{t-1}$, the score and the information matrix evaluated at $\tilde\theta$ are

$$\frac{\partial L}{\partial\tilde\lambda} = \frac{\tilde\beta}{\tilde\sigma^2}\sum_{t=1}^T x_{t-1}\tilde\varepsilon_t = \frac{\tilde\beta}{\tilde\sigma^2}\,X_{-1}'\tilde\varepsilon$$

$$\tilde I_{\lambda\lambda} = \frac{\tilde\beta^2}{\tilde\sigma^2}\,X_{-1}'X_{-1}, \qquad \tilde I_{\lambda\beta} = \frac{\tilde\beta}{\tilde\sigma^2}\,X_{-1}'X, \qquad \tilde I_{\beta\beta} = \frac{1}{\tilde\sigma^2}\,X'X, \qquad \tilde I_{\sigma^2\sigma^2} = \frac{T}{2\tilde\sigma^4}$$

where $X_{-1} = (x_0, x_1, \ldots, x_{T-1})'$, $X = (x_1, \ldots, x_T)'$, and $\tilde\varepsilon = (\tilde\varepsilon_1, \ldots, \tilde\varepsilon_T)'$. Let $\theta_1 = \lambda$ and $\theta_2 = (\beta, \sigma^2)$. Then we can use eq. (1.5):

$$\text{LM test statistic} = \frac{\partial L}{\partial\tilde\lambda}\left[\tilde I_{\lambda\lambda} - \tilde I_{\lambda\beta}\,\tilde I_{\beta\beta}^{-1}\,\tilde I_{\beta\lambda}\right]^{-1}\frac{\partial L}{\partial\tilde\lambda} = \frac{\tilde\varepsilon' X_{-1}(X_{-1}'M_1X_{-1})^{-1}X_{-1}'\tilde\varepsilon}{\tilde\sigma^2} = \frac{\left[(X_{-1}'M_1X_{-1})^{-1}(X_{-1}'M_1Y)\right]^2}{\tilde\sigma^2\,(X_{-1}'M_1X_{-1})^{-1}} \quad \text{(Note 1)} \tag{2.2.4}$$

where $M_1 = I - X(X'X)^{-1}X'$ and $Y = (y_1, \ldots, y_T)'$. The last equality holds because $M_1Y = Y - X(X'X)^{-1}X'Y = Y - X\tilde\beta = \tilde\varepsilon$.
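Eq. (2.2.4) is simple to compute from restricted OLS quantities only. A minimal numerical sketch (all names are mine; the regressor array is assumed to carry one presample value $x_0$ in front, so it has length $T+1$):

```python
import numpy as np

def lm_geometric_lag(y, x):
    """LM statistic of eq. (2.2.4) for H0: lambda = 0 in the geometric lag
    model. y has length T; x has length T+1 (x_0, x_1, ..., x_T). Only the
    restricted OLS regression of y_t on x_t is needed."""
    X = x[1:]                              # x_1 ... x_T
    Xlag = x[:-1]                          # x_0 ... x_{T-1}
    T = len(y)
    beta = (X @ y) / (X @ X)               # restricted OLS slope
    e = y - beta * X                       # restricted residuals
    s2 = (e @ e) / T                       # restricted MLE of sigma^2
    # M1 Xlag where M1 = I - X (X'X)^{-1} X'  (X is a single column here)
    M1Xlag = Xlag - X * (X @ Xlag) / (X @ X)
    return (Xlag @ e) ** 2 / (s2 * (Xlag @ M1Xlag))
```

Under the null the statistic is compared with a chi-square(1) critical value; large values lead to rejection of λ = 0 and justify the search procedure for the unrestricted model.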
This is an interesting result. We can test the existence of a lag (A = 0) in the geometric lag model by testing the significance of the single lagged term. X¢-1. in the OLS regression of y; on (X, .Xg-1). This provides an asymptotically optimal test. despite the fact that the geometric lag is a lag of infinite order. -12- 2.3 The Rational Lag Model The rational lag model is a rather general distributed lag model. It can be expressed as follows: HUI) (2.3.1) where L is the lag operator defined as y‘=fl£)_.zt+u‘. .t=1.....T. £1th = 2‘4, , k = 0.1. ' ° ' . L0 = I . 13‘ = 2t v o and A(L) = f aiL‘.~B(L) = 2 (7,-1.1. b0 = 1.1.1. < v. The independent variable 2, is i=0 j=0 assumed to be nonstochastic. or if stochastic. uncorrelated with the random term at. We also assume u, is independently. identically distributed as N (0.02). Dhrymes. Klein and Steiglitz (1970) suggested that this model can be estimated by maximum likelihood methods through a search procedure ( search :1... given b, ). or through an iterative procedure for all of the parameter estimates simul- taneously. Then. using the estimates for ti. and bj. one can estimate 02 easily from the first order conditions. Since the estimation of a..- and b; is not easy. we have two alternative model specifications which can be estimated by OLS. The test of B(L) = 1 is given in section 2.3.1. and the test of A(L) = a0 and B(L) = 1 is given in section 2.3.2. 2.3.1 Test of B(L) = 1 The restriction B(L) = 1 can be written as 221:0 . j=1. .11. (2.3.2) -13.. Under the restrictions. eq.(2.3.1) becomes 31. = A(L)z: + u: = iniL‘z. + u. . t = 1. -~ .T. (2.3.3) i=0 Despite the fact that there might exist high multicollinearity among x's. we still can use the OLS method to estimate this restricted model. and indeed OLS pro- vides MLE’S subject to the restriction. Let 131 = (b 1. - - ' .b..)'. 132 = (0.0.01. ' - ' .awaz)’ and 13’: (31. - - - .gv.;o.;1. - . - Epic-2y where 7' .2 211-22 b1: =b.,=o. a; = OLS estimates fora. .7: =0. 
-~ .u. and 3?: ”‘7, where 17.} = y; - X(L)x. and Z(L) = f: ELL". We can use eq.(1.5) to construct the i=0 LM test statistic. The log-likelihood function is L = - 330114210 -A(L)z,)2 The first partial derivative with respect to 132 evaluated at 1? is zero (see eq.(1.4)). The first partial derivative with respect to 13, evaluated at 5' is 61.13;; , aL l6b1'6b, 1” T.... ' = "— 217 u, A(L)2¢- _1. .ZugA(L)z;_.. -02 =1 t=l = .. .145 52 where . 34(1031—1 - ' - A(L)31-v‘ X = .A(L)zT-l ' - - A(L)zT-v. and U= [171. {Erl' -14.. The information matrix evaluated at 1; is .. _ P131 52 [(19)- Vzi 122 r11.}. 12.. o = [:5 f“ 0 '0 0 1°20; where 1:101 Eloy] 1» = [121»! ' ' ' Kuhn [r .2 A. 7' ~ ~ l 2[A(L)x._11[A(L)z.-.] . . . g21[A(L)a.-...][A 0) = Prob. (8: < 2:5) = 4:5 where (PM is cumulative distribution function of N (0.1). Therefore. a: prob.(y. =0) = 1 -proe.(y. = 1) = 1 -[ f. The probit estimate for gcan be obtained by maximizing the following likeli- hood function: xtfl I 53:31 L = y1._=I1¢[7_Lg-oll - (I>[ a . H 16.631 1110 1101 W t -24- Since we cannot identify [3. a separately. we choose the normalization 0:1 to identify 6. The classical example of the probit analysis in economics is the study of the consumer’s decision of buying a durable good. In this chapter. we consider three models which are extensions of the probit model. Section 3.2 considers the Tobit model. and Cragg's extension of it. in which the dependent variable is observable in a limited range. Section 3.3 con- siders Heckman's sample selection bias model which consists of two equations. one of which is a probit equation representing the rule for sample selection. Sec- tion 3.4 considers Poirier‘s partial observability probit model. which consists of two probit equations with a condition of partial observability. 
3.2 Test of the Tobit Specification against Cragg's Extension of the Tobit Model

3.2.1 The Tobit Model and Cragg's Extension

Tobin (1958) considered a case in which the dependent variable is observable in a limited range and the analyst is not only interested in the probability of limit and non-limit responses, but also in the value of non-limit responses. Probit analysis is not suitable for this purpose. He proposed the following model, called the Tobit model:

$$y_t^* = x_t\beta + \varepsilon_t, \quad \varepsilon_t \text{ iid } N(0, \sigma^2), \qquad y_t = \begin{cases} y_t^* & \text{if } y_t^* > 0 \\ 0 & \text{if } y_t^* \le 0 \end{cases} \qquad t = 1, \ldots, T \tag{3.2.1}$$

where $y_t^*$ is unobservable and $y_t$ is observable. $y_t$ has a lower limit, which is zero. That is, there is an event which at each observation may or may not occur. If it does occur, associated with it will be a continuous positive random variable; if it does not occur, this variable has a zero value. An example is an individual's decision whether or not to buy a new car, and the amount he spends if he does buy one.

According to eq. (3.2.1), for $y_t > 0$ the probability density function (p.d.f.) of $y_t$ is

$$f(y_t) = \frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{1}{2\sigma^2}(y_t - x_t\beta)^2\right\} \tag{3.2.2}$$

and the probability of observing $y_t = 0$ is

$$\text{Prob}(y_t = 0) = \text{Prob}(y_t^* \le 0) = \text{Prob}(\varepsilon_t \le -x_t\beta) = \int_{-\infty}^{-x_t\beta}\frac{1}{\sqrt{2\pi}\,\sigma}\exp\left\{-\frac{s^2}{2\sigma^2}\right\}ds = \Phi\!\left(-\frac{x_t\beta}{\sigma}\right) \tag{3.2.3}$$

where $\Phi(\cdot)$ is the c.d.f. of the standard normal distribution. The probability of observing $y_t = 0$ is represented by the shaded area in Fig. 3.2.1.

[Fig. 3.2.1: density of $y_t^*$, with the area to the left of zero shaded]

Note that there is one and only one $\beta$ to determine both the probability of $y_t = 0$ and the shape of the probability distribution for $y_t > 0$. That is, in the example of purchases of a durable good, the decisions on whether to acquire and on how much to spend if acquisition occurs are basically the same in this model, in the sense that the same variables and parameters occur in eq. (3.2.2) and eq. (3.2.3). Cragg (1971) argues that "In some situations the decision to acquire and the amount of the acquisition may not be so intimately related. In particular, even when the probability of a non-zero value is less than one half, one might not feel that values close to zero are more probable than ones near some larger value, given that a positive value will occur." In the case of buying a new car, this argument is certainly true. The probability of buying a new car for an individual in a particular year is probably less than one half. From Fig. 3.2.1, the Tobit model implies that, if a new car is purchased, smaller expenditures (e.g. 5 dollars) are more likely than larger expenditures (e.g. 5000 dollars). This foolish implication is due to the fact that there is only one set of parameters to determine the probability of $y_t = 0$ and the shape of the probability distribution for $y_t > 0$.
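The two pieces (3.2.2) and (3.2.3) combine into the Tobit log-likelihood, which can be maximized directly. A minimal sketch (names are mine; σ is kept positive by optimizing over log σ):

```python
import numpy as np
from scipy.optimize import minimize
from scipy.stats import norm

def tobit_negloglik(params, X, y):
    """Negative Tobit log-likelihood: limit observations (y_t = 0)
    contribute log Phi(-x_t beta / sigma), as in eq. (3.2.3); non-limit
    observations contribute the log normal density of eq. (3.2.2)."""
    beta, log_sigma = params[:-1], params[-1]
    sigma = np.exp(log_sigma)
    xb = X @ beta
    limit = y <= 0
    ll = np.sum(norm.logcdf(-xb[limit] / sigma))
    ll += np.sum(norm.logpdf(y[~limit], loc=xb[~limit], scale=sigma))
    return -ll

def tobit_mle(X, y):
    """Tobit MLE via BFGS from a zero start; returns (beta, sigma)."""
    res = minimize(tobit_negloglik, np.zeros(X.shape[1] + 1),
                   args=(X, y), method="BFGS")
    return res.x[:-1], np.exp(res.x[-1])
```

These Tobit estimates are exactly the restricted estimates needed for the LM test of the Tobit specification against Cragg's extension in section 3.2.2.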
even when the probability of a non-zero value is less than one half. one might not feel that values close to zero are more probable than ones near some larger value. given that a positive value will occur." In the case of buying a new car. this argument is certainly true. The probability of buying a new car for an individual in a particular year is probably less than one half. From Fig.3.2.1. the Tobit model implies that. if a new car is purchased. smaller expenditures(e.g. 5 dol- lars) are more likely than larger expenditures(e.g. 5000 dollars). This foolish imp min 92> ter: the Tot F1!" 210 -27- implication is due to the fact that there is only one set of parameters to deter- mine the probability of y, = 0 and the shape of the probability distribution for y¢>0. Cragg(1971) proposed a more general model which uses two sets of parame- ters. One set determines the probability of y, = 0. and the other set determines the shape of the probability distribution for y, > 0. Cragg’s extension of the Tobit model can be written as a two-stage decision process. First-stage -- decision on whether to acquire The probability of not buying a durable good is Prob. (y¢=0) = Prob. (x,fil+u1<0) = (-x,fil) (3.2.4) and the probability of buying a durable good is Prob. (y¢>0) = 1 - Prob. (yt=0) = 1 - (-z“81) = @(z‘fil) (3.2.5) where a, is normalized as 1 because we can not identify 61 and 01 separately in a probit model. Second-stage -- decision on how much to acquire if acquisition occurs The probability density function for y,. given acquisition occurs. is f (y; I y; > 0) = N (2; fig. 0%) truncated at zero 1 [_ (y; '1': 592i _ Mag , 203 J- 1 x _(yt-2882)2‘d - 0 flag p 20% y; 1 L (ye-31132)“ _ mag“? 20.2 1 (326) — q, 3:52I ' . L 02 1 since (2111- 232232)] __ 3:52 IMO—i? xPl 1113;. -Q[ 02 The unrestricted estimates for 51. 
$\beta_2$ and $\sigma_2^2$ can be obtained by maximizing the following likelihood function:

$$L(\beta_1, \beta_2, \sigma_2^2) = \prod_{y_t=0}\text{Prob}(y_t = 0)\,\prod_{y_t>0}\left[\text{Prob}(y_t > 0)\,f(y_t \mid y_t > 0)\right]$$

$$= \prod_{t=1}^T\left[\Phi(-x_t\beta_1)\right]^{1-d_t}\left[\Phi(x_t\beta_1)\,\frac{1}{\Phi\!\left(\frac{x_t\beta_2}{\sigma_2}\right)}\,\frac{1}{\sqrt{2\pi}\,\sigma_2}\exp\!\left(-\frac{(y_t - x_t\beta_2)^2}{2\sigma_2^2}\right)\right]^{d_t} \tag{3.2.7}$$

where $d_t = 1$ if $y_t > 0$ and $d_t = 0$ if $y_t = 0$. Equivalently, one can maximize the log-likelihood function

$$L^*(\beta_1, \beta_2, \sigma_2^2) = \ln L(\beta_1, \beta_2, \sigma_2^2) = \sum_{t=1}^T\Big\{(1-d_t)\ln\Phi(-x_t\beta_1) + d_t\Big[\ln\Phi(x_t\beta_1) - \ln\Phi\!\left(\tfrac{x_t\beta_2}{\sigma_2}\right) - \ln(\sqrt{2\pi}\,\sigma_2) - \tfrac{(y_t - x_t\beta_2)^2}{2\sigma_2^2}\Big]\Big\} \tag{3.2.8}$$

3.2.2 The LM Test

In order to derive the LM test statistic, it is convenient to reparametrize in a way similar to that suggested by Olsen (1978). That is, letting

$$\gamma = \frac{\beta_2}{\sigma_2}, \qquad \delta = \beta_1 - \gamma, \qquad h = \frac{1}{\sigma_2} \tag{3.2.9}$$

eq. (3.2.8) becomes

$$L^*(\delta, \gamma, h) = \sum_{t=1}^T\Big\{(1-d_t)\ln\Phi[-x_t(\delta+\gamma)] + d_t\Big[\ln\Phi[x_t(\delta+\gamma)] - \ln\Phi(x_t\gamma) - \tfrac12\ln(2\pi) + \ln h - \tfrac12(hy_t - x_t\gamma)^2\Big]\Big\} \tag{3.2.10}$$

The Tobit specification corresponds to the restriction $\delta = 0$. The information matrix can be partitioned as

$$\begin{bmatrix} M & N \\ N' & S \end{bmatrix}, \qquad M = N = X'AX, \qquad S = X'[(A+B) - CD^{-1}C]X$$

Note that $M^{-1}N = I$. Thus,

$$I^{11} = M^{-1} + M^{-1}N(S - N'M^{-1}N)^{-1}N'M^{-1} = (X'AX)^{-1} + \left\{X'[(A+B) - CD^{-1}C]X - X'AX\right\}^{-1} = (X'AX)^{-1} + \left\{X'[B - CD^{-1}C]X\right\}^{-1} \tag{3.2.16}$$

Substituting eq. (3.2.13) and eq. (3.2.16) back into eq. (3.2.15), we have the LM test statistic.

Although $I^{11}$ has no obvious interpretation, it is easily calculated from the Tobit estimates. On the other hand, $\partial L / \partial\tilde\delta$ is both easily calculated and easily interpreted as a vector of cross products between the explanatory variables and the Tobit "residuals" for the non-limit observations. That is so in the sense that

$$E[(\tilde h y_t - x_t\tilde\gamma) \mid y_t > 0] = m(x_t\tilde\gamma)$$

by eq. (C.1), and thus the term in brackets in eq. (3.2.13) can be regarded as the Tobit "residual."

3.3 Test of Sample Selection Bias

3.3.1 Heckman's Sample Selection Bias Model and His λ Test

In some cases, the dependent variable is unobservable while the corresponding independent variables are still available; that is, we have an incomplete (censored) sample. Heckman (1976, 1979) proposed a two-equation model to deal with this situation:
Heckman(1976.1979) proposed a two- equation model to deal with this situation: ya = 311131 + “11 (3-3-1) 3121 = 32152 + “21' (3-3-2) with [an I“ iid N(O. 2). i=1....,n. 2i where _ '01" pm 2 ' a, 1 since eq.(3.3.2) is a probit equation. We observe the sign of ya and we observe y u it and only if ya > 0. That is. ya > O is the sample selection rule and we have a nonrandomly selected sample. This model can be estimated by the maximum likelihood method. But if we use least squares for eq.(3.3.1) and probit analysis for eq.(3.3.2) instead . the resulting estimates of 61 will be biased. This is so because E(y“ l swsample selection rule) E(yu | :3“. ya > O) 31151 + 30111 I 3121 > 0) . ' = 31151 + P01)”: (See Johnson and Kotz (1970).p.81). where the inverse of the Mill’s ratio is _ ¢(‘32~;§) M - 14130-22132) with ¢(') and (-) being the p.d.f. and c.d.f. of a standard normal distribution - 35 - respectively. Thus. the expectation of y“ for the nonrandom sample is not equal to the expectation of yu- for the complete (random) sample unless Emu | ya.- > 0) equals zero; that is. unless p=0. Therefore. by using eq.(3.3.1).we have the equivalent of an omitted variable problem which will result in a bias in the esti- mate. This bias is called "sample selection bias." This bias will be eliminated if the conditional mean of u“ is included as a regressor. However. since 62 is a parameter to be estimated. the M’s are unknown. Heckman proposed a simple two-stage estimation procedure to estimate the parameters. Step 1 -- Probit Analysis Let 1ify2i>0 di: 0 lfyZiSO n 2de=ni 1 Gt = P705- (yeis 0 I 1'21) = ‘I’("32i52) The probit estimate 52 is obtained by maximizing the following likelihood func- 1. tion: it _ L. = II(1-G.-)““G.-‘ “ 1:1 Step 2 -- Least Squares Let a" = 54-22131!) 1414-32132) then apply OLS method to the following equation 1;“ = mum + Xe +verro'r. i=.....n (3.3.3) where c = pol. 
The OLS estimates are

$$\hat c = (\hat\Lambda' M_1 \hat\Lambda)^{-1}\hat\Lambda' M_1 Y_1, \qquad \hat\beta_1 = (X_1'X_1)^{-1}X_1'(Y_1 - \hat\Lambda\hat c)$$

where $X_1$ is the $n_1 \times k$ matrix of the $x_{1i}$, $Y_1$ is the $n_1 \times 1$ vector of the $y_{1i}$, $\hat\Lambda$ is the $n_1 \times 1$ vector of the $\hat\lambda_i$ corresponding to observed $y_{1i}$, and $M_1 = I - X_1(X_1'X_1)^{-1}X_1'$. Note that $\hat\beta_1$ is a consistent estimate of $\beta_1$.

If $\rho = 0$, $E(y_{1i} \mid x_{1i}, y_{2i} > 0) = x_{1i}\beta_1$; then there is no sample selection bias even if we apply OLS to eq. (3.3.1). Therefore, the test of sample selection bias is equivalent to the test of $\rho = 0$. Since $c = \rho\sigma_1$ in eq. (3.3.3), the test of $\rho = 0$ is equivalent to the test of $c = 0$. Heckman uses the standard t test to test the hypothesis $c = 0$ (we will refer to it as the "λ test"). The t statistic is

$$t = \frac{\hat c}{\left[\hat\sigma^2(\hat\Lambda' M_1\hat\Lambda)^{-1}\right]^{1/2}} \tag{3.3.4}$$

where $\hat\sigma^2$ is the usual variance estimate (the sum of squared errors divided by the degrees of freedom) from OLS applied to eq. (3.3.3). This model and the λ test have been widely used, especially in labor economics. Many applications have reported an insignificant value of the λ test statistic. One possible conjecture is that the λ test is not a very powerful test of sample selection bias. However, this turns out to be a false conjecture: the λ test is asymptotically equivalent to the LM test, as is shown in the next section, and thus has good asymptotic power properties.

3.3.2 The LM Test of Sample Selection Bias

Let

$$F_i = \text{Prob}(y_{1i},\, y_{2i} > 0 \mid x_{1i}, x_{2i}) = \int_{-x_{2i}\beta_2}^{\infty} h(y_{1i} - x_{1i}\beta_1,\, u_{2i})\,du_{2i} \tag{3.3.5}$$

where $h(\cdot, \cdot)$ is the p.d.f. of $N(0, \Sigma)$. Then the log-likelihood function for eq. (3.3.1) and eq. (3.3.2) is

$$L = \sum_{i=1}^n\left[d_i\ln F_i + (1 - d_i)\ln G_i\right] \tag{3.3.6}$$

The restricted model is the one in which $\rho = 0$ is imposed. When $\rho = 0$, eq. (3.3.5) becomes

$$F_i = \int_{-x_{2i}\beta_2}^{\infty} h_1(y_{1i} - x_{1i}\beta_1)\,\phi(u_{2i})\,du_{2i} = h_1(y_{1i} - x_{1i}\beta_1)\left[1 - \Phi(-x_{2i}\beta_2)\right] \tag{3.3.7}$$

where $h_1(\cdot)$ is the p.d.f. of $N(0, \sigma_1^2)$. Hence, when $\rho = 0$, eq. (3.3.6) becomes

$$L^* = \sum_{i=1}^n\left\{d_i\ln h_1(y_{1i} - x_{1i}\beta_1) + d_i\ln\left[1 - G_i\right] + (1 - d_i)\ln G_i\right\} \tag{3.3.8}$$

The restricted MLE $\tilde\theta = (\tilde\beta_1, \tilde\beta_2, \tilde\sigma_1^2, \tilde\rho)$ is obtained by maximizing $L^*$ with respect to $\beta_1$, $\beta_2$ and $\sigma_1^2$; i.e., $\tilde\rho = 0$, $\tilde\beta_1$ = the OLS estimate from eq. (3.3.1), $\tilde\beta_2$ = the probit MLE of eq. (3.3.2), and $\tilde\sigma_1^2 = \frac{1}{n_1}\tilde e_1'\tilde e_1$, where $\tilde e_1 = Y_1 - X_1\tilde\beta_1$. That is, when $\rho = 0$, we can estimate $\beta_1$ and $\sigma_1^2$ from eq. (3.3.1) by OLS and estimate $\beta_2$ from eq. (3.3.2) by probit analysis.

Letting $\theta_1 = \rho$ and $\theta_2 = (\beta_1, \beta_2, \sigma_1^2)$, we can use eq. (1.5) to construct the LM statistic. From eq. (3.3.6), the first partial derivatives evaluated at $\tilde\theta$ are (see Appendix D):
β̃1 = the OLS estimate from eq. (3.3.1), β̃2 = the probit MLE of eq. (3.3.2), and σ̃1² = e'e/n1, where e = Y1 - X1β̃1. That is, when ρ = 0 we can estimate β1 and σ1² from eq. (3.3.1) by OLS and estimate β2 from eq. (3.3.2) by probit analysis.

Letting θ1 = ρ and θ2 = (β1, β2, σ1), we can use eq. (1.5) to construct the LM statistic. From eq. (3.3.6), the first partial derivatives evaluated at θ̃ are (see Appendix D)

    ∂L/∂ρ  = λ̃'e/σ̃1,
    ∂L/∂β1 = 0,   ∂L/∂β2 = 0,   ∂L/∂σ1 = 0.                    (3.3.9)

Note that λ̃ is an n1×1 vector, not an n×1 vector. The information matrix evaluated at θ̃ (see Appendix D), with rows and columns ordered as (ρ, β1, β2, σ1), is

            [ Σ(1-G̃i)λ̃i²          Σ(1-G̃i)λ̃i x1i/σ̃1      0                 0               ]
    I(θ̃) = [ Σ(1-G̃i)x1i'λ̃i/σ̃1   Σ(1-G̃i)x1i'x1i/σ̃1²    0                 0               ]   (3.3.10)
            [ 0                     0                       Σ x2i'x2i λ̃im̃i  0               ]
            [ 0                     0                       0                 2Σ(1-G̃i)/σ̃1²  ]

where

    G̃i = Φ(-x2i β̃2),   λ̃i = φ(-x2i β̃2)/[1 - Φ(-x2i β̃2)],   m̃i = φ(-x2i β̃2)/Φ(-x2i β̃2).

Since plim (1 - G̃i) = 1 - Gi, E(di) = 1 - Gi, and plim n⁻¹ Σ [di - (1 - Gi)]·Hi = 0 for Hi a nonstochastic variable, we can replace (1 - G̃i) by di in I(θ̃) without affecting plim n⁻¹I(θ̃). Thus eq. (3.3.10) becomes

            [ λ̃'λ̃        λ̃'X1/σ̃1     0                 0         ]
    I(θ̃) = [ X1'λ̃/σ̃1    X1'X1/σ̃1²   0                 0         ]              (3.3.11)
            [ 0            0            Σ x2i'x2i λ̃im̃i  0         ]
            [ 0            0            0                 2n1/σ̃1²  ]

From eq. (3.3.9), eq. (3.3.11) and eq. (1.5), we have

    LM statistic = [λ̃'e/σ̃1, 0, 0, 0]·I(θ̃)⁻¹·[λ̃'e/σ̃1, 0, 0, 0]'
                 = (λ̃'e/σ̃1)²·[λ̃'λ̃ - λ̃'X1(X1'X1)⁻¹X1'λ̃]⁻¹
                 = [(λ̃'M1λ̃)⁻¹λ̃'e]² / [σ̃1²(λ̃'M1λ̃)⁻¹],                        (3.3.12)

where M1 = I - X1(X1'X1)⁻¹X1'. Comparing eq. (3.3.12) and eq. (3.3.4), we see that the LM test statistic is almost the square of the t test statistic used to test the coefficient of λ̃ when it is added to eq. (3.3.1). The only difference is the difference between σ̂² and σ̃1², which is asymptotically negligible when ρ = 0. In other words, Heckman's λ test is almost the LM test; the simple λ test therefore has desirable large sample properties (Note 4).
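The finite-sample link between (3.3.12) and (3.3.4) can be checked directly: since e = M1Y1 and M1 is idempotent, λ̃'M1e = λ̃'M1Y1, so LM·σ̃1² = t²·σ̂² exactly. A small numerical check (numpy/scipy assumed; the data and the λ-like regressor are arbitrary, since only the algebra matters):

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
n1, k = 400, 3
X1 = np.column_stack([np.ones(n1), rng.normal(size=(n1, k - 1))])
idx = rng.normal(size=n1)
lam = norm.pdf(idx) / norm.cdf(idx)      # plays the role of lambda-tilde
y1 = X1 @ rng.normal(size=k) + rng.normal(size=n1)

M1 = np.eye(n1) - X1 @ np.linalg.solve(X1.T @ X1, X1.T)
e = M1 @ y1                              # restricted OLS residuals
s2_tilde = e @ e / n1                    # sigma1-tilde^2 (MLE, divisor n1)

# augmented regression y1 = X1 b + lam c + error, as in eq. (3.3.3)
Z = np.column_stack([X1, lam])
coef = np.linalg.lstsq(Z, y1, rcond=None)[0]
resid = y1 - Z @ coef
s2_hat = resid @ resid / (n1 - k - 1)    # sigma-hat^2 of eq. (3.3.4)
c_hat = coef[-1]

lMl = lam @ M1 @ lam
t2 = c_hat**2 * lMl / s2_hat                   # squared t statistic, eq. (3.3.4)
LM = (lam @ M1 @ e)**2 / (s2_tilde * lMl)      # eq. (3.3.12)
```

The ratio LM/t² equals σ̂²/σ̃1², which converges to one, confirming the asymptotic equivalence claimed above.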
3.4 Test of Independence in Poirier's Partial Observability Probit Model

3.4.1 Poirier's Partial Observability Probit Model

In a recent paper Poirier (1980) has proposed the partial observability probit model:

    yi1* = xi β1 + ui1
    yi2* = xi β2 + ui2

    yi1 = 1 if yi1* > 0,   yi1 = 0 if yi1* ≤ 0
    yi2 = 1 if yi2* > 0,   yi2 = 0 if yi2* ≤ 0

    zi = yi1·yi2,   i = 1, ..., n,                             (3.4.1)

    (ui1, ui2)' iid N(0, Σ),   where  Σ = [ 1  ρ ]
                                          [ ρ  1 ].

Here yi1*, yi2*, yi1 and yi2 are unobservable. We observe only xi and zi. We observe zi = 1 if and only if yi1 = yi2 = 1, and zi = 0 if yi1 = 0 or yi2 = 0 or both. Some examples of this model are:

1) Retention of trainees (see Gunderson (1974)).
2) Two-member committee voting anonymously under a unanimity rule (see Poirier (1980)).
3) Collective bargaining between cities and municipal employees' unions in Michigan; binding arbitration is imposed if either side asks for it (see Connally (1982)).

If yi1 and yi2 were individually observed, we would simply have a system of two probit equations. Instead we observe only the product of yi1 and yi2, and estimation is correspondingly more difficult. If we define

    pi = Prob(zi = 1) = Prob(yi1 = 1 and yi2 = 1) = F(xi β1, xi β2; ρ)           (3.4.2)
    1 - pi = Prob(zi = 0) = Prob(yi1 = 0 or yi2 = 0) = 1 - F(xi β1, xi β2; ρ),   (3.4.3)

where F is the bivariate standard normal cumulative distribution function, then the log-likelihood function for this model is

    L(ρ, β1, β2) = Σ [zi ln pi + (1 - zi) ln(1 - pi)].                           (3.4.4)

It can be maximized numerically with respect to the parameters ρ, β1, β2. The main numerical difficulty involved is the accurate evaluation of the bivariate normal c.d.f. for arbitrary ρ. Furthermore, there is some (limited) experience with the model which indicates that ρ is rather hard to estimate. These problems would be avoided if the restriction ρ = 0 were imposed. Then the bivariate standard normal c.d.f. factors into the product of two univariate standard normal c.d.f.'s:

    F(xi β1, xi β2; 0) = Φ(xi β1)·Φ(xi β2).                                      (3.4.5)

Since univariate normal c.d.f.'s are fairly easy to evaluate, and since the parameter ρ need no longer be estimated, the cost savings from the restriction ρ = 0 can be substantial. Given that ρ = 0 is a potentially valuable restriction, and that estimation in the restricted model is easier, the LM test can be used to test the hypothesis ρ = 0.

3.4.2 The LM Test

In order to construct the LM test statistic, we need the first partial derivatives and the information matrix, both evaluated at the restricted MLE θ̃ = (ρ̃, β̃1, β̃2), where ρ̃ = 0 and β̃1 and β̃2 are obtained by maximizing eq. (3.4.4) with ρ = 0 imposed. From eq. (3.4.4), the first partial derivatives of the unrestricted likelihood are

    ∂L/∂ρ  = Σ [(zi - pi)/(pi(1 - pi))]·∂pi/∂ρ
    ∂L/∂β1 = Σ [(zi - pi)/(pi(1 - pi))]·∂pi/∂β1                                  (3.4.6)
    ∂L/∂β2 = Σ [(zi - pi)/(pi(1 - pi))]·∂pi/∂β2.

The information matrix is

    I(θ) = C'C,                                                                   (3.4.7)

where C is the n×(2k+1) matrix whose i-th row equals

    [pi(1 - pi)]^(-1/2)·[ f(ai, bi; ρ),  φ(ai)Φ(Ai)xi,  φ(bi)Φ(Bi)xi ],           (3.4.8)

with ai = xi β1, bi = xi β2, f(·,·; ρ) the bivariate standard normal p.d.f., Ai = (bi - ρai)/(1 - ρ²)^(1/2), and Bi = (ai - ρbi)/(1 - ρ²)^(1/2). From eq. (3.4.6) - eq. (3.4.8) and eq. (1.3), we have

    LM statistic = [∂L/∂ρ]²·(C̃'C̃)¹¹,

where (C̃'C̃)¹¹ is the upper left corner element of (C̃'C̃)⁻¹. This expression can be further simplified as follows. Since f(ãi, b̃i; 0) = φ(ãi)φ(b̃i), the first column of C̃ is the n×1 vector ξ̃ with elements

    ξ̃i = φ(ãi)φ(b̃i) / [p̃i(1 - p̃i)]^(1/2),

and, defining the standardized residuals

    ũi = (zi - p̃i) / [p̃i(1 - p̃i)]^(1/2),

we have, by eq. (3.4.6),

    ∂L/∂ρ evaluated at θ̃ = Σ [(zi - p̃i)/(p̃i(1 - p̃i))]·φ(ãi)φ(b̃i) = ξ̃'ũ.      (3.4.9)

Since the β1 and β2 scores vanish at the restricted MLE, C̃'ũ = (ξ̃'ũ, 0, 0)', and therefore, by eq. (3.4.7) and eq. (3.4.9),

    LM statistic = (ξ̃'ũ)²·(C̃'C̃)¹¹ = ũ'C̃(C̃'C̃)⁻¹C̃'ũ = ũ'C̃δ̂,                  (3.4.10)

where δ̂ = (C̃'C̃)⁻¹C̃'ũ is the OLS coefficient vector in the regression ũ = C̃δ + error. Define û = C̃δ̂ and e = ũ - û. Eq. (3.4.10) becomes

    LM statistic = (û' + e')C̃δ̂ = û'C̃δ̂   (by e'C̃ = 0)
                 = û'û,

which equals the explained sum of squares in a regression of ũ on C̃. Note that ũ can be interpreted as a vector of standardized residuals from the restricted model. This is so since zi equals one with probability pi and zero otherwise, so that E(zi) = pi and Var(zi) = pi(1 - pi).
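The factorization (3.4.5), and the resulting cheap evaluation of the restricted likelihood, can be illustrated as follows (numpy/scipy assumed; the evaluation points, parameter values and data are arbitrary):

```python
import numpy as np
from scipy.stats import multivariate_normal, norm

a, b = 0.7, -0.4
# bivariate standard normal cdf at rho = 0 versus the product of univariate cdfs
F0 = multivariate_normal(mean=[0.0, 0.0], cov=[[1.0, 0.0], [0.0, 1.0]]).cdf([a, b])
prod = norm.cdf(a) * norm.cdf(b)

# restricted (rho = 0) log-likelihood of eq. (3.4.4): no bivariate cdf needed
def loglik0(b1, b2, X, z):
    p = np.clip(norm.cdf(X @ b1) * norm.cdf(X @ b2), 1e-12, 1 - 1e-12)
    return np.sum(z * np.log(p) + (1 - z) * np.log(1 - p))

rng = np.random.default_rng(2)
X = np.column_stack([np.ones(50), rng.normal(size=50)])
z = rng.integers(0, 2, size=50)
ll = loglik0(np.array([0.2, 0.5]), np.array([-0.1, 0.3]), X, z)
```

For ρ ≠ 0 the bivariate cdf must be evaluated numerically, which is exactly the cost the restriction avoids.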
The joint density function for εt = vt - ut is

    f(εt) = (1/σ)·φ((εt + μ)/σ)·[1 - Φ(at)] / [1 - Φ(b)],                      (4.2.1)

where

    at = (1/σ)[-μ/λ + εt·λ],    b = -(μ/σ)(λ⁻² + 1)^(1/2),
    σ = (σu² + σv²)^(1/2),      λ = σu/σv,

and φ(·) is the standard normal p.d.f. Eq. (4.2.1) utilizes the following formula for integration:

    ∫₀^∞ exp[-(au² + bu + c)] du = (1/2)(π/a)^(1/2)·exp[(b² - 4ac)/4a]·erfc(b/2a^(1/2)),

where erfc(p) = (2/π^(1/2))·∫p^∞ exp(-u²) du.

The log-likelihood function for Stevenson's model is

    L(β, λ, μ, σ²) = -(T/2)ln(2π) - (T/2)ln σ² - (1/2σ²)Σt[(yt - xtβ) + μ]²
                     + Σt ln[1 - Φ(at)] - T·ln[1 - Φ(b)],

where at = (1/σ)[-μ/λ + (yt - xtβ)λ] and b = -(μ/σ)(λ⁻² + 1)^(1/2). The first partial derivatives of the likelihood function are

    ∂L/∂μ  = -(1/σ²)Σt[(yt - xtβ) + μ] + (1/σλ)Σt m(at) - (T/σ)(λ⁻² + 1)^(1/2)·m(b)
    ∂L/∂β  = (1/σ²)Σt[(yt - xtβ) + μ]xt' + (λ/σ)Σt m(at)xt'
    ∂L/∂λ  = -(1/σ)Σt[μ/λ² + (yt - xtβ)]·m(at) + (Tμ/σλ³)(λ⁻² + 1)^(-1/2)·m(b)
    ∂L/∂σ² = -T/2σ² + (1/2σ⁴)Σt[(yt - xtβ) + μ]² + (1/2σ²)Σt at·m(at) - (Tb/2σ²)·m(b),

where m(·) = φ(·)/[1 - Φ(·)]. The second partial derivatives involve the function Z(s) = s·m(s) - m²(s) = -dm(s)/ds; for example,

    ∂²L/∂μ² = -T/σ² + (1/σ²λ²)Σt Z(at) - (T/σ²)(λ⁻² + 1)·Z(b).

In the model in which technical and allocative inefficiency are correlated, the allocative disturbances εt = (ε2t, ..., εnt)' have covariance matrix

    Σεε = [ σ22 ... σ2n ]
          [  :        :  ]
          [ σn2 ... σnn ],

vt is independent of ut* and εt, and the vt are iid N(0, σv²). Note that cov(ut, εjt) = 0, but that

    cov(ut, |εjt|) = (2σu σj/π)·[(1 - ρj²)^(1/2) + ρj·arcsin(ρj) - 1] > 0   if ρj ≠ 0,

where ρj is the correlation of ut* and εjt.

The information matrix is

    I(θ) = [ Iγγ    Iγα    Iγρ0    0      ]
           [ Iαγ    Iαα    Iαρ0    0      ]
           [ Iρ0γ   Iρ0α   Iρ0ρ0   0      ]
           [ 0      0      0       Iσ²σ²  ]

Under reasonable assumptions about the explanatory variables, Iγρ0/T, Iαρ0/T and Iρ0ρ0/T all converge in probability to zero. Thus the information matrix is singular asymptotically if ρ0 is estimated as a parameter.

APPENDIX B

INVERSE OF A PARTITIONED MATRIX FOR THE RATIONAL LAG MODEL

Step 1

Partition the matrix to be inverted as

    [ P1   R1 ]
    [ R1'  Q1 ],

whose blocks are cross-product matrices of the form (1/Tσ̃²)X'X and (1/Tσ̃²)X'X0, and use the standard partitioned-inverse formula

    [ P   R ]⁻¹   [ A   B ]
    [ R'  Q ]   = [ B'  G ],   with  A = (P - RQ⁻¹R')⁻¹,
                                     B = -ARQ⁻¹,
                                     G = Q⁻¹ + Q⁻¹R'ARQ⁻¹.
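The density (4.2.1) integrates to one and, when μ = 0, reduces to the half-normal frontier density (2/σ)φ(ε/σ)Φ(-ελ/σ) of Aigner, Lovell and Schmidt (1977). A numerical check (numpy/scipy assumed; the parameter values are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

def f_eps(eps, mu, lam, sigma):
    """Density of eps = v - u, eq. (4.2.1): v ~ N(0, sv^2), u ~ N(mu, su^2)
    truncated below at zero, with sigma^2 = su^2 + sv^2 and lam = su/sv."""
    a = (-mu / lam + eps * lam) / sigma
    b = -(mu / sigma) * np.sqrt(lam**-2 + 1.0)
    return (norm.pdf((eps + mu) / sigma) / sigma) * (1 - norm.cdf(a)) / (1 - norm.cdf(b))

mu, lam, sigma = 0.5, 1.2, 1.0
total = quad(f_eps, -np.inf, np.inf, args=(mu, lam, sigma))[0]

# at mu = 0, (4.2.1) should equal the half-normal composed-error density
eps0 = 0.3
half_normal = (2 / sigma) * norm.pdf(eps0 / sigma) * norm.cdf(-eps0 * lam / sigma)
```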
Step 2

Apply the same formula to

    [ P2   R2 ]
    [ R2'  Q2 ],

whose blocks are (1/Tσ̃²)X0'M4X0, (1/Tσ̃²)X0'M4X00 and (1/Tσ̃²)X00'M4X00. Hence

    A = (P2 - R2Q2⁻¹R2')⁻¹.

Since

    X0'M4X0 - X0'M4X00(X00'M4X00)⁻¹X00'M4X0 = X̄0'[I - X̄00(X̄00'X̄00)⁻¹X̄00']X̄0 = 0,

where X̄0 = M4X0 and X̄00 = M4X00 (the second equality holds because X̄0 is contained in X̄00), the remaining blocks

    B' = -Q2⁻¹R2'(P2 - R2Q2⁻¹R2')⁻¹ = -Q2⁻¹R2'A,
    G  = Q2⁻¹ + Q2⁻¹R2'(P2 - R2Q2⁻¹R2')⁻¹R2Q2⁻¹,

reduce to expressions involving Tσ̃²(X00'M4X00)⁻¹, (X00'M4X00)⁻¹X00'M4X0 and the parameter γ10.

APPENDIX C

SOME EXPECTATIONS FOR CRAGG'S EXTENSION OF THE TOBIT MODEL

    E(dt) = 1·Prob(dt = 1) + 0·Prob(dt = 0) = Prob(yt > 0) = Φ[xt(β + δ)].

    E(dt·yt) = E(yt | yt > 0)·Prob(yt > 0)
             = [xtβ + m(xtβ/σ)·σ]·Φ[xt(β + δ)]                    by eq. (3.2.9),

where here m(·) = φ(·)/Φ(·) denotes the Mills ratio.

    E(dt·yt²) = E(yt² | yt > 0)·Prob(yt > 0)
              = {Var(yt | yt > 0) + [E(yt | yt > 0)]²}·Prob(yt > 0)
              = {σ² + (xtβ)² + (xtβ)·σ·m(xtβ/σ)}·Φ[xt(β + δ)]     by eq. (3.2.9).

    E(yt - xtβ | yt > 0) = m(xtβ/σ)·σ,

since E(yt | yt > 0) = xtβ + m(xtβ/σ)·σ.

APPENDIX D

INFORMATION MATRIX FOR THE SAMPLE SELECTION BIAS MODEL

The log-likelihood function is

    L = Σi [di·ln Fi + (1 - di)·ln Gi].

The first partial derivatives are

    ∂L/∂ρ  = Σi di·[ ρ/(1-ρ²) - ρ(y1i - x1iβ1)²/((1-ρ²)²σ1²)
                     + (1+ρ²)(y1i - x1iβ1)Ai/((1-ρ²)²σ1·Fi) - ρBi/((1-ρ²)²·Fi) ]
    ∂L/∂β1 = Σi di·[ (y1i - x1iβ1)x1i'/((1-ρ²)σ1²) - ρ·x1i'Ai/((1-ρ²)σ1·Fi) ]
    ∂L/∂β2 = Σi [ di·x2i'·hi/Fi - (1 - di)·x2i'·mi ]
    ∂L/∂σ1 = Σi di·[ -1/σ1 + (y1i - x1iβ1)²/((1-ρ²)σ1³) - ρ(y1i - x1iβ1)Ai/((1-ρ²)σ1²·Fi) ],

where hi = h(y1i - x1iβ1, -x2iβ2) and Ai = ∫ u2i·h(y1i - x1iβ1, u2i) du2i, the integral running over u2i > -x2iβ2.
Similarly,

    Bi = ∫ u2i²·h(y1i - x1iβ1, u2i) du2i,   Fi = ∫ h(y1i - x1iβ1, u2i) du2i,

both integrals running over u2i > -x2iβ2, and

    Gi = Φ(-x2iβ2),   mi = φ(-x2iβ2)/Φ(-x2iβ2).

When ρ = 0, h(u1i, u2i) = h1(u1i)·φ(u2i), so that

    Ai/Fi = ∫ u2i·φ(u2i) du2i / ∫ φ(u2i) du2i = E(u2i | u2i > -x2iβ2) = λi.       (D-1)

Hence the first partial derivatives evaluated at θ̃ are

    ∂L/∂ρ = Σi di·[(y1i - x1iβ̃1)/σ̃1]·λ̃i = λ̃'e/σ̃1,
    ∂L/∂β1 = ∂L/∂β2 = ∂L/∂σ1 = 0,

where λ̃i = φ(-x2iβ̃2)/[1 - Φ(-x2iβ̃2)], by eq. (D-1). The second partial derivatives follow by straightforward (though tedious) differentiation of the expressions above. Evaluated at ρ = 0 they simplify by means of the following facts:

    Ai/Fi = λi,                                                                    (D-1)
    hi/Fi = φ(-x2iβ2)/[1 - Φ(-x2iβ2)] = λi,                                        (D-2)
    Bi/Fi = E(u2i² | u2i > -x2iβ2)
          = [E(u2i | u2i > -x2iβ2)]² + Var(u2i | u2i > -x2iβ2)
          = λi² + (1 - ziλi - λi²) = 1 - ziλi,                                     (D-3)

where zi = x2iβ2.
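The conditional moments used in Appendix C and in (D-1) and (D-3) can be verified by direct integration (numpy/scipy assumed; the evaluation points are arbitrary):

```python
import numpy as np
from scipy.integrate import quad
from scipy.stats import norm

# Appendix C: y ~ N(m, s^2) truncated below at 0, with mills = phi(m/s)/Phi(m/s)
m_, s_ = 0.8, 1.3
P = norm.cdf(m_ / s_)                          # Prob(y > 0)
mills = norm.pdf(m_ / s_) / norm.cdf(m_ / s_)
Ey = quad(lambda y: y * norm.pdf((y - m_) / s_) / s_, 0, np.inf)[0] / P
Ey2 = quad(lambda y: y**2 * norm.pdf((y - m_) / s_) / s_, 0, np.inf)[0] / P

# (D-1) and (D-3): u ~ N(0, 1) conditional on u > -z, with lam = phi(z)/Phi(z)
z = 0.9
lam = norm.pdf(z) / norm.cdf(z)
Pz = 1 - norm.cdf(-z)                          # Prob(u > -z)
Eu = quad(lambda u: u * norm.pdf(u), -z, np.inf)[0] / Pz
Eu2 = quad(lambda u: u**2 * norm.pdf(u), -z, np.inf)[0] / Pz
```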
The elements of the information matrix evaluated at θ̃ are

    Iρρ = -E[∂²L/∂ρ²] evaluated at θ̃ = Σi (1 - G̃i)λ̃i²,

by eq. (D-1), eq. (D-3), and

    E(di) = 1·Prob(di = 1) + 0·Prob(di = 0) = 1 - Φ(-x2iβ2) = 1 - Gi               (D-4)
    E(di·u1i²) = σ1²(1 - Gi)   when ρ = 0                                          (D-5)
    E(di·u1i)  = E(u1i | di = 1)·Prob(di = 1) = ρσ1λi·Prob(di = 1) = 0  when ρ = 0. (D-6)

Similarly,

    Iρβ1  = Σi (1 - G̃i)λ̃i·x1i/σ̃1            by eq. (D-1) and eq. (D-4)
    Iρβ2  = 0
    Iρσ1  = 0                                  by eq. (D-6)
    Iβ1β1 = Σi (1 - G̃i)x1i'x1i/σ̃1²           by eq. (D-4)
    Iβ1β2 = 0
    Iβ1σ1 = 0                                  by eq. (D-6)
    Iβ2β2 = Σi λ̃i·m̃i·x2i'x2i,   where m̃i = φ(-x2iβ̃2)/Φ(-x2iβ̃2),   by eq. (D-2) and eq. (D-4)
    Iβ2σ1 = 0
    Iσ1σ1 = 2Σi (1 - G̃i)/σ̃1²                 by eq. (D-4) and eq. (D-5).

APPENDIX E

INFORMATION MATRIX FOR THE SL MODEL

From eq. (4.4.4), the first partial derivatives are

    ∂L/∂ξl  = Σt Σi (εit - ξi)·σ^il,   l = 2, ..., n,
    ∂L/∂σhk = -(T/2)·∂ln|Σεε|/∂σhk + (1/2)Σt Σi Σj (εit - ξi)(εjt - ξj)·σ^ih·σ^kj,
              h, k = 2, ..., n,   where ∂ln|Σεε|/∂σhk = σ^hk if h = k and 2σ^hk if h ≠ k,
    ∂L/∂α1  = T/α1 + (λ/σ)Σt m(at)·ln x1t + (1/σ²)Σt wt·ln x1t                     (E-1)
    ∂L/∂αh  = T/αh + (λ/σ)Σt m(at)·ln xht - Σt Σi σ^ih(εit - ξi) + (1/σ²)Σt wt·ln xht,
              h = 2, ..., n,

where σ^il is the (i,l)th element of Σεε⁻¹ and m(·) = φ(·)/[1 - Φ(·)]. The second partial derivatives are

    ∂²L/∂ξl∂ξm  = -T·σ^lm,   l, m = 2, ..., n,
    ∂²L/∂ξl∂β   = 0,   ∂²L/∂ξl∂λ = 0,   ∂²L/∂ξl∂μ = 0,   ∂²L/∂ξl∂σ² = 0,   l = 2, ..., n,
    ∂²L/∂ξl∂σhk = -Σt Σi (εit - ξi)·σ^ih·σ^kl,   l, h, k = 2, ..., n,
    ∂²L/∂λ∂α1   = -(1/σ)Σt wt·Q(at).                                               (E-2)

The remaining second partial derivatives with respect to λ, σ² and the αh, eq. (E-3) - eq. (E-12), are similar in form; representative are

    ∂²L/∂α1∂α1 = -T/α1² + (λ²/σ²)Σt Z(at)(ln x1t)² - (1/σ²)Σt (ln x1t)²            (E-13)
    ∂²L/∂α1∂αh = (λ²/σ²)Σt Z(at)(ln x1t)(ln xht) - (1/σ²)Σt (ln x1t)(ln xht)       (E-14)
    ∂²L/∂αh∂αk = (λ²/σ²)Σt Z(at)(ln xht)(ln xkt) - (1/σ²)Σt (ln xht)(ln xkt) - T·σ^hk,
                 h, k = 2, ..., n,                                                  (E-15)

where

    Z(s) = s·m(s) - m²(s)
    Q(s) = -m(s) + s²·m(s) - s·m²(s)
    P(s) = -2m(s) + s³·m(s) - s²·m²(s).

The elements of the information matrix are

    Iξlξm = T·σ^lm
    Iξlβ = 0,   Iξlλ = 0,   Iξlμ = 0
    Iλλ, Iλσ², Iλα1 and Iλαh follow from eq. (E-2) - eq. (E-8)
    Iσ²σ² and Iσ²αh follow from eq. (E-9) - eq. (E-12)
    Iσhkσlm = T·σ^hl·σ^km
    Iσhkαl = 0,   l = 1, ..., n
    Iαhαk = -∂²L/∂αh∂αk,   h, k = 1, ..., n;   see eq. (E-13) - eq. (E-15).
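The functions Z and Q are negatives of derivatives of the hazard m(s) = φ(s)/[1 - Φ(s)]: Z(s) = -dm(s)/ds and Q(s) = -d[s·m(s)]/ds. A quick numerical confirmation (numpy/scipy assumed; the evaluation point is arbitrary):

```python
import numpy as np
from scipy.stats import norm

def m(s):
    return norm.pdf(s) / (1 - norm.cdf(s))    # m(s) = phi(s)/[1 - Phi(s)]

def Z(s):
    return s * m(s) - m(s)**2                 # Z(s) = s m(s) - m(s)^2

def Q(s):
    return -m(s) + s**2 * m(s) - s * m(s)**2  # Q(s) = -m + s^2 m - s m^2

s, h = 0.4, 1e-6
dm = (m(s + h) - m(s - h)) / (2 * h)                        # central difference for m'(s)
dsm = ((s + h) * m(s + h) - (s - h) * m(s - h)) / (2 * h)   # central difference for d/ds [s m(s)]
```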
Some elements of the information matrix are difficult to find and are approximated by the negative of the second partial derivatives, since this will not affect the probability limit of the resulting "information matrix."

BIBLIOGRAPHY

Aigner, D. J., C. A. K. Lovell and P. Schmidt (1977), "Formulation and Estimation of Stochastic Frontier Production Function Models," Journal of Econometrics, 6, 21-37.

Aitchison, J. and S. D. Silvey (1958), "Maximum-likelihood Estimation of Parameters Subject to Restraints," Annals of Mathematical Statistics, 29, 813-828.

------ (1960), "Maximum-likelihood Estimation Procedures and Associated Tests of Significance," Journal of the Royal Statistical Society, Series B, 22, 154-171.

Breusch, T. S. and A. R. Pagan (1980), "The Lagrange Multiplier Test and Its Applications to Model Specification in Econometrics," Review of Economic Studies, XLVII, 239-253.

Connally, M. (1982), "The Impact of Final Offer Arbitration on the Bargaining Process and Wage Outcomes," unpublished Ph.D. thesis, Michigan State University.

Cragg, J. G. (1971), "Some Statistical Models for Limited Dependent Variables with Application to the Demand for Durable Goods," Econometrica, 39, 829-844.

Dhrymes, P. J., L. R. Klein and K. Steiglitz (1970), "Estimation of Distributed Lags," International Economic Review, 11, 235-250.

Farrell, M. J. (1957), "The Measurement of Productive Efficiency," Journal of the Royal Statistical Society (A, General), 120, 253-281.

Finney, D. J. (1971), Probit Analysis, third edition, Cambridge University Press.

Forsund, F. R., C. A. K. Lovell and P. Schmidt (1980), "A Survey of Frontier Production Functions and of Their Relationship to Efficiency Measurement," Journal of Econometrics, 13, 5-25.

Godfrey, L. G. (1978), "Testing Against General Autoregressive and Moving Average Error Models When the Regressors Include Lagged Dependent Variables," Econometrica, 46, 1293-1301.

Gunderson, M.
(1974), "Retention of Trainees: A Study with Dichotomous Dependent Variables," Journal of Econometrics, 2, 79-93.

Heckman, J. J. (1976), "The Common Structure of Statistical Models of Truncation, Sample Selection and Limited Dependent Variables and a Simple Estimator for Such Models," Annals of Economic and Social Measurement, 5, 475-492.

------ (1979), "Sample Selection Bias as a Specification Error," Econometrica, 47, 153-161.

Johnson, N. L. and S. Kotz (1970), Continuous Univariate Distributions - 1, Boston: Houghton Mifflin Company.

Jorgenson, D. W. (1966), "Rational Distributed Lag Functions," Econometrica, 34, 135-149.

Judge, G. G., W. E. Griffiths, R. C. Hill and T. C. Lee (1980), The Theory and Practice of Econometrics, New York: Wiley.

Klein, L. R. (1958), "The Estimation of Distributed Lags," Econometrica, 26, 553-565.

Koyck, L. M. (1954), Distributed Lags and Investment Analysis, Amsterdam: North-Holland.

Melino, A. (1982), "Testing for Sample Selection Bias," Review of Economic Studies, forthcoming.

Nicholls, D. F., A. R. Pagan and R. D. Terrell (1975), "The Estimation and Use of Models with Moving Average Disturbance Terms: A Survey," International Economic Review, 16, 113-133.

Olsen, R. J. (1978), "Note on the Uniqueness of the Maximum Likelihood Estimator for the Tobit Model," Econometrica, 46, 1211-1215.

Poirier, D. J. (1980), "Partial Observability in Bivariate Probit Models," Journal of Econometrics, 12, 209-217.

Rao, C. R. (1947), "Large Sample Tests of Statistical Hypotheses Concerning Several Parameters with Applications to Problems of Estimation," Proceedings of the Cambridge Philosophical Society, 44, 50-57.

Schmidt, P. (1976), "On the Statistical Estimation of Parametric Frontier Production Functions," Review of Economics and Statistics, 58, 238-239.

----- and D. K. Guilkey (1976), "The Effects of Various Treatments of Truncation Remainders on Tests of Hypotheses in Distributed Lag Models," Journal of Econometrics, 4, 211-230.
----- and C. A. K. Lovell (1979), "Estimating Technical and Allocative Inefficiency Relative to Stochastic Production and Cost Frontiers," Journal of Econometrics, 9, 343-366.

----- and C. A. K. Lovell (1980), "Estimating Stochastic Production and Cost Frontiers when Technical and Allocative Inefficiency Are Correlated," Journal of Econometrics, 13, 83-100.

Silvey, S. D. (1959), "The Lagrangian Multiplier Test," Annals of Mathematical Statistics, 30, 389-407.

Solow, R. M. (1960), "On a Family of Lag Distributions," Econometrica, 28, 393-406.

Stevenson, R. E. (1980), "Likelihood Functions for Generalized Stochastic Frontier Estimation," Journal of Econometrics, 13, 57-66.

Tobin, J. (1958), "Estimation of Relationships for Limited Dependent Variables," Econometrica, 26, 24-36.

Wald, A. (1943), "Tests of Statistical Hypotheses Concerning Several Parameters when the Number of Observations is Large," Transactions of the American Mathematical Society, 54, 426-482.