PLACE IN RETURN BOX to remove this checkout from your record.
To AVOID FINES return on or before date due.
MAY BE RECALLED with earlier due date if requested.

DATE DUE DATE DUE DATE DUE

 

 

 

 

//SEHIPARAMETRIC ESTIMATION OF MULTIVARIATE TOBIT MODEL

/

BY

Bih-Shiow Chen//

A DISSERTATION
Submitted to
Michigan State University

in partial fulfillment of the requirements
for the degree of

DOCTOR OF PHILOSOPHY

Department of Economics

1988

 

49$
OQH
\u‘T

 

ABSTRACT

SEHIPARAMETRIC ESTIMATION OF HULTIVARIATE TOBIT MODEL

BY

Bih-Shiow Chen

The purpose of this dissertation is to study
distribution-free methods of estimation for the simultaneous
equation Tobit model. The simultaneous equation model
studied here contains censored dependent variables, and
also some dependent variables of the usual continuous type.
The typical treatment of this Kind c’ model is to assume
that the error terms follow a multivariate normal
distribution and to estimate the parameters by the method of
maximum likelihood. If the normality assumption for the
error terms is correct. the normal MLE is consistent and
asymptotically efficient. However, if the normality
assumption is incorrect. the normal HLE is inconsistent. and
it is therefore desirable to have available estimators whose
consistency does not hinge on a specific distributional
assumption. In this dissertation we propose such estimators
and consider their efficiency.

Our method of estimation involves estimating the
reduced form equations by distribution-free methods. and
then deriving estimates of the structural parameters from

the estimates of the reduced form parameters by the minimum

 

'y

 

 

Blh-Shlow Chen

distance method. As we have expected, our estimator (or
other robust estimators) is inefficient relative to the
normal MLE when the error terms are indeed normal. In this
dissertation. we measure the extent of this inefficiency for
particular parameter values and sequences of exogenous
variables by comparing the asymptotic covariance matrices of
our estimator and of the normal MLE.

From our experiments of efficiency comparison. we find
three important results. First, our robust estimators
become less efficient relative to the normal HLE when the
correlation between the error terms in the different
equations is increased. Second. our estimators become less
efficient relative to the normal MLE as the degree of
censoring increases. Third, the comparison between our
estimators, which are based on Powell’s CLAD and SCLS

estimators, also depends heavily on the degree of censoring.

 

 

 

 

 

 

 

ACKNOWLEDGEMENTS

I would like to express my deepest thanks to Professor
Peter Schmidt. chairman of my dissertation committee. His
insightful suggestions, enthusiastic encouragement, and
ever—lasting help during the period of my dissertation
writing are most highly appreciated. Without his help I
could not finish my dissertation. I also want to thank all
other members of my dissertation committee - Professor
Daniel S. Hamermesh, Professor Richard T. Baillie, and
Professor Jeff E. Biddle — for their kindness and help.

Most importantly. I owe my greatest thanks to my
husband, Tzong-Rong Tsai, for his continuing support and
never-ending encouragement. Without his unselfish love and
considerate help. I could not finish my graduate study.
Also, I would like to thank my son. Jay, who brightened my
life with his Joy and love. Finally I want to thank my

parents for their love and care since my childhood.

iv

 

 

CHAPTER

ONE.

THREE.

FOUR.

FIVE.

APPENDIX A

APPENDIX B

APPENDIX C

APPENDIX D

TABLE OF CONTENTS

DISTRIBUTION-FREE ESTIMATION OF
THE SIMULTANEOUS TOBIT MODEL

Introduction ......
One Continuous and One Censored
Dependent Variable .. . . ..
Application of CLAD Estimation
Application of SCLS Estimation ..
m Continuous and n Censored

Dependent Variables .. . . .
Application of CLAD Estimation
Application of SCLS Estimation

DERIVATION OF ASYMPTOTIC COVARIANCE MATRICES
WHEN THE ERROR TERMS ARE BIVARIATE NORMAL .

RELATIVE EFFICIENCY OF THE MLE AND
SEMIPARAMETRIC ESTIMATES ...

Reduced Form Equations with

One Explanatory Variable . ............
Reduced Form Equations with
Two Explanatory Variables .. .
Structural Equations . ..... . .....

SUMMARY AND CONCLUDING REMARKS

Proof of Consistency of
Proof of Consistency of

Proof of Consistency of
of Chapter Two ...

Proof of Consistency of Estimator
of Chapter Two .

o v

Page

0‘00)

21

N
m

28

42

42

44

48

.. 56
.. 72

. 79

 

Page

APPENDIX E second Derivatives Of
the Log 11K811h00d Function ........... ..... 94

APPENDIX F Some Expectations of Truncated Univariate or
Bivariate Normal Distributions ............. 97

APPENDIX C The Asymptotic Covariance Matrix of
the CLAD Estimate .......................... 99

APPENDIX H The Asymptotic covariance Matrix Of
the SCLS Estimate ............ ............ .. 101

APPENDIX I The Identified Structural Models
Corresponding to Reduced Form Equations with
“2 = (1. 0)’. (1. 1)’. or (O, i)’ .. ........ 103

BIBLIOGRAPHY .... ........... ............................ 105

 

 

 

 

 

CHAPTER ONE

INTRODUCTION

The purpose of this study is to present and investigate
distribution-free methods of estimation for the simultaneous
equations Tobit model. The model we consider is essentially
the model of Nelson and Olsen (1978). Some equations have
censored dependent variables while others may have
continuous dependent variables, and each dependent variable
may appear as an endogenous explanatory variable in other
equations. However, Nelson and Olsen assumed normally
distributed error terms. In our model this assumption is
relaxed, and we adopt distribution-free estimation methods
whose consistency does not depend on normality.

This study is an extension to the simultaneous
equations case of a large recent literature on the robust
estimation of the (single equation) Tobit model. The Tobit
model was originally proposed by Tobin (1958). It is of the
form
(1) Yt*:XtB+€t ,t:1,...,T,
where Xt is the t—th observation on a row vector of
exogenous explanatory variable, Et is an unobservable error
term. and Yt” is the t-th realization of an unobservable
dependent variable. We are assumed to observe
(a) Yt : max(o, Yt*),
and the error terms 6t are assumed to be independently and
identically distributed (11d) as mace).

The standard method of estimation of the Tobit model

 

 

 

 

 

has been maximum likelihood. The form of the likelihood

clearly depends on the assumption of normality of the
errors. Arabmazar and Schmidt (1982) and Goldberger (1983)
have shown that the normal maximum likelihood estimator is
inconsistent when the distribution of the errors is non-
normal. Because the assumption of normality is often
questionable. there has been considerable interest in
finding estimators which are "robust" in the sense of being
consistent without having to make specific distributional
assumptions for the errors. Such distribution—free
consistent estimators have been developed by Duncan (1986).
Fernandez (1986). Gourieroux et a1. (1987). Horowitz (1986),
and Powell (1984, 1986a. 1986b). (In related work.
Chamberlain-(1987) and Cosslett (1987) have derived bounds
for the asymptotic efficiency of such estimators, and Newey
(1987a) and Smith (1987) have provided tests of the normality
assumption in the Tobit model.) However. while all of these
estimators are known to be consistent. not all have known
asymptotic distributions. and for some (e.g. the estimator
of Fernandez (1986)) T%(§ - B) does not even have an
asymptotic distribution. In this study we will therefore
focus on the methods suggested by Powell (1984, 1986b), the
"censored least absolute deviations" (CLAD) estimator and
the "symmetrically censored least squares" (SCLS) estimator.
These estimators are consistent for a wide class of
distributions. and they have known asymptotic normal

distributions. They are also consistent even in the

 

 

 

 

 

presence of heteroskedasticity (which the normal MLE is

not).

For the usual linear simultaneous equations model, it
is possible to estimate the reduced form equations by
ordinary least squares. and then to derive structural
estimates from the reduced form estimates by the minimum
distance method. This procedure was suggested by
Chamberlain (1983), who proved that it led to asymptotically
efficient estimates of the structural parameters, being
essentially an efficient version of indirect least squares.
He also showed that it is an approximate version of three
stage least squares

-In this study we propose a method of estimating the
structural parameters of a simultaneous equations Tobit
model. which combines the distribution-free estimators of
Powell with Chamberlain’s minimum distance estimator.
Specifically, we solve the simultaneous equations Tobit
model for its reduced form, and we estimate the reduced form
by ordinary least squares (for those equations whose
dependent variables are continuous) and by one of Powell’s
methods (for those equations whose dependent variables are
censored). We derive the asymptotic covariance matrix of
these reduced form estimates; this is non-trivial because
it involves the Joint distribution of two different kinds of
estimates. Following Chamberlain, we then derive structural
estimates using the minimum distance method. Subject to

suitable regularity conditions. this procedure leads to

 

 

 

 

 

consistent and asymptotically normal estimates of the

structural parameters of our simultaneous equations Tobit
model.

There is some previous literature on estimation of
simultaneous equations Tobit model by methods other than
MLE. The earliest contributions were motivated by
computational considerations. since the Joint MLE involves
integrals of a multivariate normal distribution (the number
of integrals being the number of censored variables) and is
computationally demanding.

Under the assumption of normally distributed error
terms, Nelson and Olsen (1978) suggested a two—stage
estimator. In the first stage, they estimated the reduced-
form equations separately by ordinary least squares and
single-equation Tobit maximum likelihood. Then they
estimated the structural equations separately by the same
methods, but with the predicted values for the endogenous
explanatory variables from the first stage substituted into
the equations. Amemiya (1979) derived the asymptotic
covariance for Nelson—Olsen estimator and proposed an
asymptotically more efficient estimator. He estimated the
reduced-form equations as Nelson and Olsen did. But to
estimate the structural equations, he suggested a form of
generalized least squares estimation. Lee (1981)
demonstrated that his GBSLS (generalized two stage least
squares) estimator is asymptotically more efficient than one

version of AGLS (Amemiya‘s generalized least squares)

 

 

 

estimator. Amemiya (1983) compared the AGLS estimator with

the Lee-Maddala-Trost (1980) GZSLS estimator for the
simultaneous equations Tobit model. Amemiya then proved
that his AGLS estimator is asymptotically more efficient for
most cases. Finally, Newey (1987b) provided a definitive
treatment of the efficiency of such two—step estimators by
showing that the methods which they use to move from
reduced-form estimates to structural estimates are dominated
by the minimum distance method (which he calls the "minimum
chi—square" method), though a form of the AGLS estimator is
asymptotically equivalent to minimum distance estimator. If
the minimum distance method is used, the efficiency of the
structural estimates simply depends on the efficiency of the
first-stage reduced form estimates from which they are
derived. For the case in which only one equation has a
censored dependent variable, efficient estimation of the
reduced form requires augmenting the reduced form equation
for the censored dependent variable with the residuals from
the other reduced form equations. It is not clear how this
result generalizes to models with more than one censored
variable.

Because they all rely on single-equation Tobit MLE in
the first stage, none of the estimators Just discussed is
robust to non—normality of the errors. However, if the
errors are indeed normal. these estimators may be more
efficient than our estimator because the single-equation

Tobit MLE will be more efficient than Powell’s estimators.

 

 

 

Newey (1985) is apparently the only previous treatment

of a simultaneous equations Tobit model, which does not
impose normality. His model consists of only one censored
dependent variable with endogenous explanatory variables.
He suggested estimating the reduced form equation for the
censored dependent variable by Powell’s symmetrically
censored least squares method, as we do, but he considered
asxv and AGLS estimation of the structural coefficients
rather than minimum distance. From Newey (1987b) it is
apparent that this is not a substantive difference for the
case of only one censored dependent variable. but our
treatment is more general in the sense of more readily
accommodating an arbitrary number of such variables.

If the errors actually are normal, our estimators are
less efficient than the normal MLE. This is the price one
pays for gaining robustness to non-normality. It is natural
to ask how high this price is likely to be. We attempt to
gather some evidence on this question by comparing the
asymptotic covariance matrices of our estimates with those
of the normal MLE. For given values of the parameters and
given sequences of exogenous variables, this is done by
calculating the asymptotic covariance matrices of the MLE
and of our estimators; it requires a complicated simulation
because certain expectations are analytically intractable.
An interesting finding is that the efficiency loss varies
directly with the degree of censoring. This is the

complement to the result (Arabmazar and Schmidt (1982)) that

 

 

 

 

the size Of the inconsistency 0f the normal MLE caused by

non-normal errors also varies directly with the degree of
censoring.

The structure of the dissertation is as follows.
Chapter 2 sets out the model to be considered, and it
derives the asymptotic distributions of the estimates of the
reduced form and structural parameters. It treats both the
homoskedastic and the heteroskedastic cases. For ease of
exposition, it discusses two—equation models before going on
to a treatment of the general model with m continuous
dependent variables and n censored dependent variables.
Chapter 3 derives the asymptotic covariance matrices of the
various estimators for the special case that the error terms
are bivariate normal. Chapter 4 then reports the results of
several experiments which measure the efficiency loss from
using our distribution-free estimators (rather than the
normal MLE) when the error terms are bivariate normal.

Chapter 5 gives our concluding remarks.

 

 

 

 

 

CHAPTER TWO

DISTRIBUTION—FREE ESTIMATION

OF THE SIMULTANEOUS EQUATIONS TOBIT MODEL

I. INTRODUCTION

In this chapter we discuss the distribution-free
estimation of the simultaneous equations Tobit model. We
begin in section II with the simple case of a two equation
model in which one dependent variable is continuous while
the other is censored. In section III we treat the general
case with an arbitrary number of equations and an arbitrary
number of each type of dependent variable.

The basic principle of estimation is straightforward.
We begin by estimating the reduced form. Those reduced form
equations with continuous dependent variables are estimated
by ordinary least squares, while those reduced form
equations with censored dependent variables are estimated by
the censored least absolute deviations (CLAD) estimator or
symmetrically censored least squares (SCLS) estimator of
Powell (1984. 1986b). We derive the (Joint) asymptotic
distribution of the reduced form estimates and a consistent
estimator of their asymptotic covariance matrix. The
structural coefficients are then estimated by the minimum.

distance method of Chamberlain (1983L

II. ONE CONTINUOUS AND ONE CENSORED DEPENDENT VARIABLES
In this section we consider the simple two-equation

model

 

 

 

 

(1) Y1 : y1Y2* + X81 + 61

(2) Ya” : yaY1 + X82 + 62
where Ya” is a latent Tx1 variable,
Y1 and Ya“ are Txi vectors;
X is a TxK matrix of exogenous variables:
61 and £3 are unobservable Txi error terms:
Y1 and Y2 are unknown scalar parameters to be
estimated and i - YiYa ¢ 0;
51 and 83 are Kxi parameter vectors to be estimated.
Some elements of 81 and Ba may be known to equal zero.
Note that we do not observe Ya” but we do observe the Txi
vector Y3. which is related to Ya” as follows

(3) Yet = max(0, Yet"): At = 1, a, ..., T.-

The reduced form of this model is

 

 

81 + Yiﬁa $1 + Y1€2
(4) Y1 = X +
1 - YiYa 1 ' YiYa
: xvi + V1
92 + Yaﬁi G2 + Yaei
(5) Y3” = +
1 - Y1Y2 1 - Y1Ya
= Xﬂa + Va

where Y1. Ya*, and X are the same as above;
v1 and we are unknown Kxi parameter vectors to be
estimated;

V1 and V2 are unobservable TXi error terms.

A. Application of CLAD Estimation

1. Estimation of the Reduced-Form Parameters (W1 and "2)

 

 

 

We may estimate V1 by ordinary least squares (OLS), and

we by censored least absolute deviation estimation (CLAD)
introduced by Powell (1984). The CLAD estimator of Va is
defined by Powell (1984) to be the value that minimizes the
criterion function
(6) ST(ﬂ2) : (i/T)t§1)Yat - max(o, Xtﬂa)l.
The CLAD estimator is shown by Powell to be consistent and
asymptotically normal, subJect to certain regularity
conditions. The assumptions are as follows:
(A1) The parameter vector we is an interior point of the
compact parameter space 0.
(A2) (Xt, vt1, Vta” is a sequence of independently not
(necessarily) identically distributed random vectors.
(A3) Buxtn5 < x0 for all t and some positive x0.
UT. the smallest characteristic root of the matrix
E[(1/T)E1(Xtﬂa 2 so)xt'xt],
has UT > we whenever T > To, some positive so, Do. and
To.
(A4) Defining Gt(z, *2: r) e E[1(lXtﬁgl 1 uxtuz)nxtnrl.
the function Gt is 0(2) for 2 near zero, ﬁg near we
and r : O, 1, a, 3, uniformly in t, i.e..
Gt(z, ﬁg. 1") S K12 if 0 i Z S 60. "ﬁg - "all < 80
r = O, 1, 2, 3, for some positive H1 and 60.
(A5) The conditional distribution of Vta given Xt has
median zero for all t, and the corresponding
distribution functions for the {vta} are continuously

differentiable in a uniform neighborhood of zero. with

 

 

density functions [ft(AlX)l which are uniformly bounded

above and uniformly bounded away from zero. i.e..
ft(A|Xt) < L and ft(llxt) > k > 0 whenever IAI < k,
some positive L and k, all t.
(A6) The conditional density function ft(A|Xt) of the {Vta}
is Lipschitz continuous:
lft(A1|Xt) - ft(Angt)l 1 LOIA1 - Aal some Lo > 0
(A7) (a) There exist positive finite constants 5 and A
such that, for all t, E(|vt13l1+5) < A and
E()xt3xtxli+6) < A (J. x = 1. ..., K);
(b) ﬁT = (1/T)EE(Xt’Xt) is nonsingular for (all) T
sufficiently large. such that det NT > 5 > 0.
(A8) There exist positive finite constants 6 and A
such that for all t, E(|vt13xt3xtxli+6) < A
(J.K=1....,K). ’
(A9) (a) E(Xt’vt1) = o.
(b) EIXtJ§t|3*5 < A < m for some 5 > O, J : 1, ..., k
and all t.

(1/T)’%Ext’vt1

(c) E : var is uniformly positive

(1/T)‘%§xt'zt
definite, where at : 1(XtVa > 0)[1/a - 1(vta < 0))
(A10) There exist positive constants 6 and A such that for
all t
E()xtJ3xtht1)1+6) < A (J, x. 1 = 1. .... K)
(A11) E(|1(xtwa > 0)xthtK|1+5) < A < m for some 5 > 0,
all t, and J, x = 1. ..., x.

(A12) BrgtvtixtJXtK|1+6 < A < m for some a > 0, all t. and

 

 

a. Consistency and Asymptotic Distribution of (al. ﬁg)

To consider the Joint distribution of (ﬁi. ﬁe). we
consider
e1 — w, (X'X/T)-1(T-%)(§xt'vt1) E91

(7) 1T“) =
a2 - we (cT/a)'1(T-%)(§xt'§t) F02

where (8a) CT 5 E[(1/T)Eaf(O|Xt)[1(Xtﬂa > 0)]xt'xt3
(£(01xt) = f(0) for homoscedasticity case)
(8b) 91 : (T'“)(Ext’vt1)
(8c) ea = (T-%)(Ext'gt) and
(8d) gt = [1(xtna > 0)][1/2 - 1(vt2 < 0)]
(Note that 5t is as given by Powell (1984, p.320, equation
(A.14).)

White (1980a) has proved that (T)%(a1 - «1) is
consistent and asymptotically normal under Assumptions (A2),
(A7), (A8) and (A9)(a)(c). (T)%(a2 - we) is also consistent
and asymptotically normal under Assumptions (A1). (A2),
(A3). (A4), (A5), and (A6) according to Powell (1984). The
Joint asymptotic normality of

ﬁi 1'1
(T“) is derived below.
ﬁe - "2

e1 811 212
WhiCh is uniformly

III
M
Ill

Let COV
92 z1’12 222

positive definite under Assumption (A9). 8 is calculated as

 

 

 

follows.

211 = covr(T-%)(§xt'vt1)) : (1/T)[EE(vat1Xt’Xt)] = vT
(By Assumption (A2), (Xt, Vt1) is a sequence of
independently distributed random vectors and E(Xt'vt1) : O

by Assumption (A9).)

Baa = cov{(T‘“)(EXt’Et)J = (1/T)t§cov(xt':t)1
(By Assumptions (A2), (Xt. Vta) is a sequence of
independently distributed random vectors.)
cov(xt':t) = E<§t3xt'xt) - [E(Xt’Et)]a
' gtaxt'xt = (1/4)[1(xtw2 > 0)xt*xt]
E(§t2Xt’Xt) : (1/4)E([1(tha > 0)]xt'xt)
E<xt';t) = Eixt'[1(xtwa > 0)][1/2 - 1(vta < 0)];
= Ewanlxlxt'[1(Xt"2 > 0))[1/2 - 1(vta < 0));
= Exlxt’[1(Xt"a > 0)])EVUJxr1/a - 1(Vta < 0)]
= o (v Eualx[1/a - 1(vt2 < 0)] = o by Assumption
(A5).)
A cov(Xt'§t) = (1/4)E([1(th2 > 0))xt'xt)
So, 322 : (1/T)[Ecov(xt’Et)]
: (1/4T)EE[[1(Xtﬂa > 0)]xt'xt) = (1/4)HT

where MT : (1/T)EE{[1(Xtﬂa > 0)]Xt'Xt]

213 : cov[(T'“)EXt'§t, (T'“)€Xt’vt1]

Etrtr'%)§xt’zt)r(T'%)"t'vt11'3 —
t

EI(T‘”)§Xt’EtIEI(T'“)§Xt'Vt11

El[(T'“)§Xt'Etl[(T'“)Ext'vt11’l
(7 E[(T‘“)Ext'vt1l = 0)

: (1/T)EE(EtVt1Xt’Xt)

 

 

1L.

According to the modified multivariate Liapounov Theorem
of White (1980b). if Assumptions (A2), (A5). (A8) and (A9)
are satisfied. the asymptotic distribution of

91 V 512
is N(O, ), where v : plim VT,
e2 5'1‘": Mu/4
ﬂ” = plim MT. E12 = plim 812.
E9

1 1
The asymptotic distribution of then is
F92

EVE’ ﬁﬁ1aﬁ’
N(0,
ﬁﬁ’1aﬁ’ F(ﬁ§/4)F’
where e = plim E = plim (X’X/T)'1 = ﬁ‘1

e : plim F = plim (CT/2)-1 = (6/2)‘1

Therefore,
a, — w, ﬁ-ivrri ﬁ'1512(5/a)'1
(T%) —> N(0,
ﬁe - we ﬁ'15’12(5/2)'1 6-1ﬁ.e-1

b. Consistent Estimator of Asymptotic Covariance Matrix

A consistent estimator of E'1VE'1 is
(X’X/T)'1(1/T)(Evtiaxt’xt)(X’X/T)'1, where at. : Yti - Xtﬁ1~
This has been proved by White (1980a).

A consistent estimator of 6‘1ﬁ,6"1 is éT‘iﬁTéT'i as was
proved by Powell (1984),
where 5T 2 2(TeT)-1§[1(xtaa > 0))t1(o 1 Vta s aT))xt'xt

9T 5 (1/T)Ef1(xtﬁa > 0)]xt'xt

Vta = Yta - Xtﬁa-

 

 

 

15

Here 5T is an appropriately chosen function of the data, and
it is assumed that there is some non-stochastic sequence
(cT) such that plim éT/CT : 1, cT : 0(1). cT-i = o(T%).
That is, 5T ten§s>$o zero in probability, but at a rate
slower than T‘“. (Note that 5T» NT. 5T are as given by
Powell (1984. pp.312-314. equations (5.1), (5.3), and
(5-6))~)

The consistent estimator of (1/T)EE[§tvt1Xt'Xt] is
(1/T)E[1(Xtaa > 0))[1/2 — 1(Vta < 0)]vt1xt'xt. The proof is

in Appendix A.

2. Estimation of Structural-Form Parameters (Y1. 81. ya,
sand 82)

To estimate y1, 91’ ya, and 83, we adopt the minimum
distance method. Minimum distance estimation for
simultaneous equations is a generalization of three-stage
least squares estimation according to Chamberlain (1983).
The only difference is that in the usual linear model the
reduced form is estimated by least squares, whereas we use
least squares plus a semi—parametric method.

The minimum distance estimator of a : (Y1! 91', ya,
83’)’ is derived from
3%? [a - f(a))'e-1ta — f(a)1
where w : f(a);

a is a consistent estimator of Q - the asymptotic
covariance matrix of T%(a - w)), and Q is positive

definite.

 

T : parameter space for a.

To adopt the minimum distance principle, we need to
add some assumptions. These assumptions are suggested by
Chamberlain (1983). They are as follows.

Assumption 1: T is a compact subset of R2K+a that contains
the true value do and a neighborhood of do.

Assumption 2: f(d) is continuous and has second partial
derivatives on T; f(a) : f(d°) for a E 7 implies that a

= «0; rank (F) : ax + a, where F : 6f(a°)/0a’-

If the Assumptions above are satisfied, then a is
consistent, and asymptotically normal:
The - a) —> mo. 9).

where o = (P'n-1P)-1 and P = o£(d°)/od'.

B. Application of SCLS Estimation

1. Estimation of the Reduced-Form Parameters (W1 and we)

We may estimate W1 by ordinary least square (OLS) and
we by symmetrically censored least squares (SCLS) estimation
for Tobit equation, which was introduced by Powell (1986).
The SCLS estimator of we is defined to be the value that
minimizes the criterion function
sT(w2) = Eth - max(Yt/a, Xt"2)12 + {1(Yt > extw2)((Yt/a)2 -
[max(o, Xt"2)]23

Powell shows that the SCLS estimator is consistent and
asymptotically normal, subJect to certain regularity
conditions. The assumptions are as follows:

(A1)’ Same as (A1) in II.A.1.

(A2)’

(A3)'

(A4)'

(A5)’
(A5)’
(A7)’
(A8)’

(A9)’

 

Same as (A2) in II.A.1.

Buxtu4+ﬂ < x0 for all t and some positive x0 and n.
UT. the minimum characteristic root of the matrix

AT = (1/T)EE[1(Xtﬂa z 5°)xt'xt],

has uT > no whenever T > To, some positive 20, v0, and
To.

The conditional distribution of Vta given Xt is
continuously and symmetrically distributed about zero,
with densities which are bounded above and continuous
and positive at zero, uniformly in t. That is, if
F(AlXt, t) a Ft(x) is the conditional c.d.f. of Vta
given Xt, then dFt(i) : ft(A)dA, where ft(l) : ft(—A),
ft(x) < Lo and ft(l) > so whenever IAI < 50, some
positive Lo and 50.

Same as (A7) in II.A.1.

Same as (A8) in II.A.1.

Same as (A9) in II.A.1.

Same as (A10) in II.A.1.

El‘tVtixtJth'1+6 < A < m for some 5 > 0, all t, and
J. K = 1. ..., K;

E(|1(-tha < Vta < xtw2)xthtK|1+5) < A < m for some
5 > 0, all t. and J, k : 1, ..., K;

where 5t : 1(tha > O)min[max(vta, 'Xt"2)- tha].

a. Consistency and Asymptotic Distribution of (51. 62)

To consider the Joint distribution of (a1. ﬁe), we

cons ider

 

a, - w, (X’X/T)'1(T‘“)(2Xt'vt1) A91
(TM) : t 5

ea — "a cor-1 ('r-M) ”EX" 5.) Bee
where CT E (1/T)EE[1(_Xt"2 < Vta < Xt"2)Xt’Xt3

91 = (T‘ulfxt’vt1

ea = (T‘“)(Ext'§t) and

Et = 1(Xtﬂa > O)min[max(vta, -Xtﬁa), Xtﬂa]

A s (X'X/T)‘1
B a CT’1
White (1980a) has proved that ﬁ1 is consistent and that
(T)“(a1 — W1) is asymptotically normal under Assumptions
(A2)’, (A5)’. (A6)' and (A7)'(c). Also ﬁg is consistent and
(T)“(ﬁa — n2) is asymptotically normal under Assumptions
(A1)’. (A2)', (A3)’, and (A4)’. according to Powell (1986).

The asymptotic normality of

*1 - 1"1
(T“) is derived below.
*2 - "2
e1 211 212
Let cov a E e which is uniformly
92 2-12 822

positive definite under Assumption (A7)’. 2 is calculated
as follows:

211 : VT (the same as in II.A.1.a)

232 : cov[(T'“)(EXt’Et)] = (l/T)[ECOV(Xt’Et)]

(By Assumption (Aa)', (Xt- Vta) is a sequence of
independently distributed random vectors.)

cov(xt'gt) = E(§t3Xt’Xt) — [E(xt';t))3

 

 

v gtaxt'xt : [1(XtVe > O)]min[vt23, (xtwa)2)xt'xt

E(;t3xt'xt) : E{[1(tha > O)]min[vtaa, (xtwa)31xt'xt)
E(Xt’§t) : Elxt'[1(Xth > 0)]min[max(vta, -Xth), th213

: Exixt’[1(xtwa > 0)]3Ewu|x[min[max(vt2, -Xtﬂa),

thall
If Xtﬂa 1 o, E(Xt’§t) = o.
If Xtﬁa > o. the conditional distribution of {min[max(vta,
—tha), th33 given Xt is continuously and symmetrically
distributed about zero under Assumption (A4)’. Then
E(Xt’§t) = Ex(Xt’)Ewulximin[max(vt2, -tha), Xtﬂall : o.
A cov(Xt'§t) = E([1(tha > O)]min[vtaa, (xtwa)a]xt'xt)
So, 232 = (1/T)[Ecov(xt‘§t)] .
= (1/T)EE{[1(tha > 0)]min[vtaa, (Xtﬂ2)a]Xt'th

= DT

212 : (1/T)EE(§tVt1Xt'Xt) (The calculation steps are the
same as in II.A.1.a.)
According to the modified multivariate Liapounov
Theorem by White (1980b), if Assumptions (A2)’, (A4)', and

(A7)‘ are satisfied, the asymptotic distribution of

CI

91 51a
is N(o, ), where B = plim DT.
Ga 3'12 5

V = Plim VT. E12 : plim 213. Also

A61 AVA’ Aﬁ1aﬁ’
—> N(o, ).
Beg AE'125’ ﬁbﬁ’

 

 

20

whore A : plim A = Plim (it’ll/'1‘)"1 = 3‘1

a = plim B = plim cT-i = 5-1

Therefore,

G1 - 171 ﬁ-ivﬁ-1 ﬁ'15128—1
(T94) —> mo.

a3 ' "a ﬁ'15’125‘1 6'156‘1

b. Consistent Estimator of Asymptotic Covariance Matrix

The consistent estimator of ﬁ'1Vﬁ‘1 is the same as in
II.A.1.b.

The consistent estimator of 6 is
8 = (1/T)E[1(-Xtﬁa < Vta < xtaa)lxt'xt.
Where Vta = Yta ' Xtﬁa = Xt"2’+ Vta” ' xtﬁa = Vta” ‘ XtBt-
(Vta* a maxivta, —th21, 3t = ﬁg - we). The proof is in
Appendix B.

The consistent estimation of D is
5T = (1/T)E[1(Xtﬁg > O)min{vtaa. (xtee)33xt'xt. This was
proved by Powell (1986).

The consistent estimator of (1/T)EE[§tVt1xt'th is
(1/T)E[1(Xta2 > 0)]min[max(vta, -Xtﬁa)» Xtﬁalvtixt'xt.

The proof is in Appendix B.

2. Estimation of Structural-Form Parameters (Y1. 91, ya,
and 82)
To estimate Y1. 81. ya, and 92» we adopt the minimum
distance estimator, as shown in II.A.E. The‘only difference
is in the asymptotic variance-covariance matrix Q of

T%(ﬁ - w).

 
 

 

21

111. m CONTINUOUS AND n CENSORED DEPENDENT VARIABLES

 

In this section we consider the general model
(1) Yt* : Yt*B + xtr + at t = 1, .... T
or
(a) Y” = Y*B + XF + e
where Yt* is a 1x6 row vector of endogenous variables;
Xt is a 1xK row vector of exogenous variables;
B is a 6x6 nonsingular matrix to be estimated and
I - B is nonsingular;
P is a finite unknown KxG matrix to be estimated;
6t is a 1x6 row vector of unobservable error terms;
Y” is a TxG matrix of endogenous variables;
X is a TxK matrix of exogenous variables:
6 is a TxG matrix of unobservable error terms.
We do not observe (all of) Yt*, but we observe the 1x6
vector Yt, defined as follows.
(3) Ytl : Yt1* 1 : 1, ..., m
(4) Ytl : max(0, Yt1*), i : m+1, .... G
where G : m + n. 80 we do observe the first m variables in
Yt*, but the last n variables are censored from below at
zero.
The reduced form of this model is
(5) Yt* : xtw + vt
or
(6) Y” : Xv + v
where Y*, Yt*, X, and Xt are the same as above;

w : P(I ~ B)“1 is a KxG matrix of reduced form

 

 
 

 

22

parameters;
vt = €t(I - B)“1 is a 1x6 row vector of

unobservable error terms.

A. Application of GLAD Estimation
1. Estimation of Reduced-Form Parameters w
Let vi be the 1th column of w. Then we may estimate V1
(1 : 1, ..., m) by ordinary least squares (OLS) and up (p :
m+1, ..., G) by Powell’s censored least absolute deviation
(CLAD) estimator.
The assumptions are as follows:
(A1) The parameter vectors {up} (p = m+1, ..., G) are
interior points of the compact parameter space 0.
(A2) (Xt, Vt)’ is a sequence of independently not
(necessarily) identically distributed random vectors.
(A3) Euxtu5 < k0 for all t and some positive k0.
UT, the smallest characteristic root of the matrix
E[(1/T)E1(Xt1rp 2. Eolxt'XtJ' p : m+1, ..., G
has UT > v0 whenever T > To. some positive 50, uo, and
To.
(A4) Defining th(z, ap. r) a E[1(Ixt'ep| 1 nxtnz)uxtnrl.
thq(z, ﬁp. ﬁq. r) a E[1(|Xt’ﬁpl 1 "thz.
lXt’ﬁql 1 uxtuz)nxturl
the functions th and thq are all 0(2) for 2 near
zero, 5p near up. ﬁg near ﬂq, p, q = m+1, ..., G,
p ¢ q and r = o, i, a. 3, uniformly in t, i.e..
th(z, ﬁp, r) 1 kpz, thq(z. ﬁp. ﬁq. r) 1 kpqz.

if 0 i z i 60. "I‘l’p " 17p" < 600 “‘ﬁ’q ‘ "q" < 601

 

(A5)

(A6)

(A7)

(A8)

(A9)

(A10)

 

23

r = O, 1. a. 3, for some positive kp, kpq and 50.

The conditional distribution of vtp (p : m+1, ..., G)
given Xt has median zero for all t, and the
corresponding distribution functions for the [vtpl
are continuously differentiable in a uniform
neighborhood of zero, with density functions
{ftp(AlX)l whiCh are uniformly bounded above and
uniformly bounded away from zero, i.e.. ftp(AlXt) < L
and ftp(xlxt) > k > 0 whenever Ill < k, some k > 0,
all t.

The conditional density function ftp(llxt) of the
{vtpz is Lipschitz continuous:

lftp(A1lXt) - ftp(lalxt)| 1 LOIA1 - A3! some Lo > o
(a) Erxtjxtxre < m J, x = 1. ..., K;

(b) 9 e E(X’X/T) has uniformly full column rank.

(a) E(Xt'vt1) = 0, i = 1, ..., m;

(b) EnxtJvt1|3+6 < m, i = 1, ..., m, J = 1, .... K;
(c) v : var(vecT‘%X'V) is uniformly positive definite,
where W : (v1, ..., vm, §m+1, ..., is) is a TxG

matrix.

(a

v

EIXtJEtp|2*5 < A < w

for some 5 > O, p = m+1, ..., G; J = 1, ..., k;

(b) VP : var(vecT‘KX'wp) is uniformly positive
definite, where Wp = (Em+1' ..., :6) is a Txn
matrix.

El1(xtwp > 0)xthtK|1+6 < A < m for some a > 0, all t,

P = m+1, ..., G and J, K = 1, .... K.

 

24

(A11) ElttpvtixtJXtK|1+3 < A < m for some 3 > 0, all t,
1:1,...,m,p:m+1,....GaJldJ,K:1,...,K.
ElitpithtJXtKI1+5 < A < m for some 5 > 0, all t,
P.Q=m+1,...,G.p¢qandJ,k:1,...,K.
where Etp = [1(Xtﬂp > 0)][1/2 - 1(th < 0)].

th = [1(xtwq > 0)111/2 - 1(vtq < 0)]
a. Consistency and Asymptotic Distribution of (ﬁ1, ..., am,
ﬁm+1- ..., *6)
To consider the Joint distribution of (a1. ..., am,
am+11 ..., as), we consider
J a, - w, - J (X’X/T)'1(T'%)Ext’vt1 - [ E191 -
am - Wm (X'X/Tl‘11T‘“)§Xt'vtm Emem

(TK) --------- : ------------------------ E ------
as - "a [CTa/al'itT'“)§xt'€ta Bees
*6 - 1rG J L [CTG/21'1(T'“)Ext’§te J EGeG J

 

 

 

 

 

 

where ch e 3((1/T)Eafp(ouxt)[1(xtwp > 0)]xt'xt)

p = m+1, ..., G;
E1 = (X’X/T)’1 i : 1. ..., m:
6i = (T‘“)§Xt’vti3
2p = tch/al'i:
op = (T‘%)Ext’§tp and
§tp : [1(Xt'rrp > 0)][1/2 — 1(vtp < 0)];

or = m+1.

Define a Txm matrix W, = (v1. .... vm). Then VecT-“X’Wi

 

 

 

25

is :asymptotically normal under Assumptions (A2), (A7), and
(A8). VecT-“X’Wp is also asymptotically normal under
Assumptions (A1), (A2). (A3). (A4). (A5), (A6) and (A9).

The asymptotic normality Of vecT‘KX‘W is derived below.

 

 

 

 

 

F ‘ , '
61 i 311 --~ z31m ; E1d - 816
6111 2m1 '-- Emm : Ema 2me
Let cov —--- E E E -----------------------------
ed XV011 '-- B'dm E Edd 11- EGG
L 96 J 3’61 -~- 2'Gm 1 26d 1 2Ge
_ t E11 E12
2’12 322
L
I: is calculated as below:
211 has typical element Ehi' h,i : 1, 2, ..., m, given by

an, : cov[(T‘%)EXt'vth, (T‘%)8Xt’vt1]

: (1/T)EE(thVtiXt'Xt)l
(By Assumption (A2), (Xt, vt) is a sequence of
independently distributed random vectors and E(Xt’vt1) : O
by Assumption (A8).)
322 has typical element qu, p,q : m+1, ..., G, given by

zpq = COV[(T-%)EXt’Etpg (T'%)§Xt'itql

EtttT-Mlgxt'ztpl[(T'%)§Xt'thl'l —

E[(T'%)Ext'Etp]E[(T”“)EXt’§tql

E(t(T'%)§xt':tpl[(T-%)§xt';tql'l

(r E(Xt’§tp) = Etxt'[1(xtwp > 0)][1/2 — 1(vtp < 0)];

 

26

Exqulxixtl[1(Xt"p > 0)][1/2 - 1(vtp < 0));

O by Assumption (A5))

(1/T)EE(Etp§tht'Xt)

(1/4)MTp if p = q. where MTp :

(1/T)EE([1(Xtﬁp > 0))xt'xtl.)

812 has typical element 81p, 1 : 1, ..., m; p : m+1,
given by
zip : cov[(T'%)§xt'gtp. (T-%)zxt'vt1)

El[(T'%)EXt’Etpl[(T‘“)EXt’thl’l -

EE(T'%)§Xt'Etp]E[(T’%)EXt’Vt1]

EI[(T'%)EXt‘Etp][(T'“)Ext’vtll'l

(7 EttT-%){xt'vt,l = 0)

(i/T)EE(§tth1Xt'Xt)

According to the modified multivariate Liapounov

Theorem by White (1980b), if Assumptions (A2), (A5),

and (A9) are satisfied.

(TK)

 

The typical

By calculation above,

1"1
1Tm A11 A12
--------- is N(0, ).
"a A’12 A22
1rG
A

 

(block) element of A11 is plim(Eth1E1)

|x[1/2 - 1(vtp < 0))

(A8).

the asymptotic distribution of

vecT'WX’v is asymptotically normal.

 

 

27

Q‘iﬁn19'1, where 9'1 : plim Eh or plim E1, and where Ehi =
plim Ehii h,i : 1, 2, ..., m. The typical element of A12
is plim(Eh2hpEp) : 9-ibnpgp; h = 1, ..., m; p = m+1, .... G;
where bhp = plim zhp and where ep = plim Ep : plim(ch/a)‘1.
The typical element of A22 is Epﬁpng, p,q : m+1, ..., G.

Where ﬁpq = plim qu.

b. Consistent Estimator of Asymptotic Covariance Matrix
The consistent estimator of plim CTp and plim MTp is
5Tp and ﬁTp, which was proved by Powell (1984),
where eTp e 2(TaT)-1Er1(xtap > 0)][1(O 1 vtp 1 5T))xt'xt
ﬂTp E (1/T)E[1(Xtﬁp > 0)lXt’Xt
vtp = Ytp - xtap p = m+1, ..., G
5T is an appropriately chosen function of the data,

such that plim ET/CT = 1, cT : 0(1). cT-i : o(T%).
T—>m

The consistent estimator of Phi (h,i : 1, ..., m) is
(1/T)Evtnvtlxt'xt. where vth = Yth - xtah and at, : Yti -
xtai- The proof is similar to Theorem 6.3 by White (1984).

The consistent estimator of 51p (1 = 1, ..., m; p :
m+1, .... G) is (1/T)E[1(Xtﬁp > 0)][1/2 - 1(vtp <
0)]Vtixt'xt- The proof is the same as in Appendix A.

The consistent estimator of qu (p.q = m+1, .... G) is
(1/T)E[1(Xtep > 0)][1/2 - 1(th < 0)][1(Xtﬁq > 0)][1/2 -
1(vtq < 0)]Xt’Xt, where p, q = m+1, ..., G and p ¢ q. The

proof is given in Appendix C.

 

 

28

2. Estimation of Structural-Form Parameters B : (9h
Be) and P = ( Y1. .... YG)
All steps regarding the estimation of structural-form
parameters are the same as in II.A.a. The only

difference is the asymptotic variance-covariance matrix

Q of T“(& - w)-

B. Application of SCLS Estimation
1. Estimation of Reduced-Form Parameters n
We may estimate "1 (i : 1, ..., m) by ordinary least
squares (OLS) and up (p : m+1, ..., G) by symmetrically
censored least squares (SCLS).
The assumptions are as below:
(Ai)’ Same as (A1) in III.A.1.
(A2)’ Same as (A2) in III.A.1.
(A3)' Euxtn4+n < KO for all t and some positive K0 and n.
vT, the smallest characteristic root of the matrix
NT : E[(1/T)Ei(thp 2 So)Xt’Xt1» p = m+1, .... G, has
”T > uo whenever T > To, some positive 80, uo, and To.
(A4)' The conditional distribution of vtp (p : m+1, .... G)
given Xt is continuously and symmetrically distributed
about zero, with densities which are bounded above and
continuous and positive at zero, uniformly in t. That
is, if Fp(xlXt, t) E Ftp(X) is the conditional c.d.f.
of vtp given Xt. then dFtp(A) : ftp(x)dx, where ftp(x)
: ftp(-A), ftp(X) < Lo and ftp(X) > so whenever IA! <
50, some positive Lo and 60.

(A5)’ Same as (A7) in III.A.1.

 

6 > 0.

 

=

G. P ¢ Q.

Where gtp

th '

‘36) I

1"1

{(1/T)EE[1(—thp < vtp < xtwp)xt'xt]

P

(X’X/T)'1

all t,

 

 

 

29

Same as (A8) in III.A.1.
Same as (A9) in III.A.1.
Elvtpl < A < m uniformly in t and p

ElitpvtixtJXtK|1+6 < A < m for some

m+1, ..., G and J. K

ElitpithtJthl1*5 < A < m for some

all t, and J, K = 1.

E(l1(—xtwp < vtp < xtwp)xthtK|1+6)
p = m+1, .... G and J,
1(thp > 0)min[max(vtp,

1(thq > O)min[max(vtq.

a. Asymptotic Distribution of (a1. .... am. ﬁm+1-
To consider the Joint distribution of

we consider

(X'X/T)‘1(T‘“)Ext’vt1

(X’X/T)‘1(T’%)Ext’vtm

(CTd)-1(T_%)Ext'5ta

(CTG>"<T'“)§Xt'EtG

Ian, G:

1 = 1. ..., m;

(T'%)Ext’vt1:

(ch)'1:

< A < (I) for some

-Xt‘ﬂ'p) . Xt‘lfp] .

'Xtﬂq) . Xth] o

 

 

 

 

30

6p = (T-%)§xt'§tp and
Etp = [1(thp > 0)]min[max(vta, 'Xt"2)' xtwa]

a = m+L

Define a Txm matrix W1 : (v1, ..., vm). Then VecT-“X’Wi

is asymptotically normal under Assumptions (A2)’, (A5)’, and

(A6)'. Vec(T)-%x'vp (vp : (zm,1. ..., £6). a Txn matrix) is

also asymptotically normal under Assumptions (A1)'. (A2)’.

 

 

 

 

(A3)', and (A4)'. The asymptotic normality of vec(T)'%X'V
(W = (v1. ..., vm, §m+1, ..., £6), a TxG matrix) is derived
below. I
91 i 311 -- 21m 1 31a --- 216
6:11 Eml - 3mm : Ema '-' 8me
Let cov -——— E 2 E ———————————————————————————
ed Edi '-' 80cm 3 Bad - - EGG
96 J EG1 ... EGm : BGa - EGG
L. I. _
$11 512
2'13 222

2 is calculated as below.

has typical element Ehiv h,i : 1,2, ..., m, given by
= (1/T)EE(Vtthixt’Xt)3 (same as in III.A.1.a.)

has typical element qu, p,q: m+1, ..., G, given bY
= COV[(T'“)Ext’£tp. (T'“)Ext'itq]

= E{[(T'“)§Xt’itp][(T‘%)Ext'itq]’l —

 

 

31

EL(T-%)§xt':tp1Et(T’%)§xt'th1
= Et[(T'%)§xt'§tp1ttT‘%)§xt'§tq1'1
(-.- E(xt':tp) = Etxtwuxtwp > O))min[max(vtp, -xt1rp).
thpll
: Exixt'[1(xtwp > 0)]1Ev¢lx{min[max(vtp, -thp),
Xtﬂpll

If xtwp s o, mxt'gtp) = o.
If Xtvp > O, the conditional distribution of
(mintmax(vtp, -thp), thp] given Xt is
continuously and symmetrically distributed about
zero by Assumption (A4)'. Then E(Xt’§tp) :
Ex(xt’)Ev¢,x{min[max(vtp, —thp), thpll = O.)

= (1/T)EE(§tp§tht'xt)

212 has typical element 21p. 1 : 1, ..., m; p = m+1, ..., G,

given by

zip : (1/T)EE(§tthixt'xt) (The calculation steps are the
same as in III.A.1.a.)

According to the modified multivariate Liapounov
Theorem by White (1980b). if Assumptions (A2)’, (A4)’,
(A6)’, and (A7)’ are satisfied, vecT'WX'W is asymptotically
normal.

By the calculation above, the asymptotic distribution

of

 

 

 

32

 

 

*1 - "1
6m - Trm A11 A12
(TK) --------- is N(O. L
ﬁe — “a A’ia A22
ﬁG - 1TG
L .-

The typical (block) element of A11 is plim(Ah2n1A1) =
Q‘1ﬁh19'1. where 9'1 : plim Ah or plim A1 and where ﬁhi :
plim Bhi‘ h,i : 1,2. ..., m. The typical element of A13 is
plim AnshpAp = Q‘iehpxp; h : 1..... m; p = m+1,.... G; where
Enp : plim zhp and where AP : plim Ap : plim(CTp)'1. The
typical element of A23 is Apﬁpqzq. p,q : m+1,..., G; where
ﬁpq : plim DTp (if p : q) and DTp :

(i/T)EE{[1(Xtﬂa > O)]min[vtaa, (xtwa)21xt’xtz.

b. Consistent Estimator of Asymptotic Covariance Matrix

The consistent estimator of plim CTp (p : m+1,
is ap = (1/T)E[1(-xtap < vtp < xtap)lxt'xt.
where th : Ytp - Xtap : thp + vtp* — Xtﬁp : vtp* - xtstp.
(vtp* : maxivtp, -thp), 3tp : ap - up.) The proof is the
same as in Appendix B.

The consistent estimator of plim DTp (p = m+1, ..., G) is
sTp = (1/T)E[1(xtap > 0)m1n[$tpa. (xtap)21xt'xt. This was
proved by Powell (1986).

The consistent estimator of qu is

(i/T)E[1(Xtﬁp > O)]min[max(vtp, -Xtﬁp). Xtﬁp][1(xtﬁq >

 

 

33

O)]Ihin[max(vtq, —Xtﬁq), Xtﬁqlxt'xt- The proof is given in
Appendix D.

The consistent estimator of 5hi is
(l/T)Evthvt1xt'xt. where Vth : Yth — Xtﬁh and
th : Yti - Xtﬁl. The proof is similar to Theorem 6.3 by
White (1984).

The consistent estimator of hip is
(i/T)E[1(Xti}p > O)]min[max($tp, -xtap), xtaplvtixt'xt. The

proof is the same as in Appendix B.

2. Estimation of Structural-Form Parameters B - (81,

Be) and F = ( Y1. .-.. YG)
It is the same as in II.A.e. The only difference is

the asymptotic covariance matrix Q of T“(z¢r — 1T).

 

CHAPTER THREE
DERIVATIOK OF ASYHPTOTIC COVARIAHCE MATRICES

WHEN THE ERROR TERMS ARE BIVARIATE HORHAL

The purpose of this chapter and of the next chapter is
to compare the efficiency of our semiparametric estimates to
the efficiency of the maximum likelihood estimates, when the
error terms are bivariate normal. Our estimates will be
less efficient than the MLE. and we wish to see how large
this efficiency difference is. The actual comparison of
these efficiencies will be given in the next chapter, by
comparing the asymptotic covariance matrices of our
estimates and of the MLE for given parameter values and
exogenous variables. In this chapter we perform the
preliminary task of deriving the asymptotic covariance
matrix of the MLE, and we show the simplifications that
arise in the formulae for the asymptotic covariance matrices
of our estimators in the case that the error terms are
bivariate normal.

In this chapter and in chapter four we restrict our
attention to the following two-equation model:

(1) Yti : Xtﬂ1 + Vti
(2) Yta‘ : Xtﬂa + Vta
Yta = Yta” if Yt2* > O
: 0 otherwise
We assume Vti and Vta are independently and identically
distributed as bivariate normal with zero mean and

covariance matri X

31+

 

 

IIIIIIIEaIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII||

via 012
(3) ' .
U12 322
First, we derive the asymptotic covariance matrix of
the maximum likelihood estimator (MLE). Under certain
regularity conditions, the MLE is consistent. asymptotically
normal and efficient. The asymptotic covariance matrix of
the MLE is therefore equal to the Cramer Rao lower bound,
which is the inverse 0f the information matrix. The
information matrix iS the expectation of minus the second
derivatives 0f the 108 likelihood function. That is,
aalogL(e)
I = -E ( —————————- ). where e is the vector of unknown
aeoe’
Parameters.
The log likelihood function for the model (1) - (a) is:
(4) lnL(e) = c - Tlno, - (Ext)1noa - —%—(Ext)1n(i - pa)
+ Exti‘[(Yt1 ' Xt"1)a/‘°1a
* 39(Yt1 - Xt"1)(Yta - XtVa)/Uiba
+ (Yta - Xtﬂg)2/Uaa]/[2(1 — p3)11
+ E(i — xt)[-(Yt1 - th1)3/ao133
+ 5(1 - xt)1no(wt)
where wt = [-Xtﬁa - pca(Yt1 - xtw1)/o11/[o3(1 — 93)“);
At = 1(Yta > 0);
e : (“1" "2'. 013. 623. p)’ :
§(o) is the standard univariate normal distribution

function.

With the loglikelihood function stated above, we may

calculate the elements of the information matrix as follows.

36

' (Details are provided in Appendix E and Appendix F. )
—E[alnL(e)/6w16ﬂ1’] = -E[pa - 1 — §(at)p2]Xt'Xt/[c12(1
+ EE(1 - it)paztxt'xt/[o12(1 - p3
(where at : (th‘a)/‘oa. and
2t = ¢2(wt)/§3(wt) + wt¢(wt)/§(wt).)
-E[61nL(9)/6w16wa’] : -§pi(at)xt'xt/[o1oa(i - 92)]
— EE(1 — xt)pztxt'xt/[o1oa(i - pa
—E[OlnL(e)/Ov16o12] = -§p(1 - ap3)¢(at)xt'/[2o13(1 - pa
+ EE(1 - At)Xt’{th/[2613(1 - pa)
+ pazt(Yt1 — Xtﬂ1)/[2614(1 - 93)]
(where Mt = ¢(Wt)/§(wt)-)
-E[alnL(e)/6w16o22] = -§962¢(at)xt’/[Eviba3(1 - p3)]
+ E§(1 — it)p(tha)ZtXt’
/[aciba3(1 - 93)]
-E[OlnL(e)/0W1OP) = E¢(at)xt'/[o1(1 - pa):
- EE(1 - At)Xt’[Mt/[c1(i - p2)%
+ pZt[p(Xtﬂa)/Ua + (Yt1 - xtw1)/o
/[o1(1 - p2)311
-E[a1nL(e)/awaow2'1 : E§(at)xt'xt/[o23(1 - pa)
+ EE(1 — xt)ztxt'xt/[o32(1 - 93)]
-E[61nL(e)/ow36o12] = —Ep3¢(at)xti/[ao13oa(i - p3)]
- 32(1 - xt)pzt(Yt1 - xtw1)xt'
/[ao13og(1 - p3)1
-E[61nL(e)/awaao221 = -§(pa - 2)¢(at)Xt'/[2623(1 - 93)]
- 35(1 - xt)xt'{Mt/taoa3(1 - pa)“

+ (xtwa)zt/[aoa4(1 - 93)]:

 

 

-E[BlnL(e)/6wadp]

-E[o1nL(e)/eo1aao131

37

-Ep¢(at)xt'/Ioa(1 - p2)1

+ EE(1 - At)xt’[th/[oa(1 — 93)“)

+ Zt[P(Xt"a)/Ua + (Yti — th1)/c1)
/[oa(1 - p3)311

-T/(2o14) - E[-a + 392/2 + paN1t(Xtﬂa)
/(aoaa)1§(at)/[ao14(1 - pa):

+ Eli 9(at)1[1 - paﬂat(xtw2)/Uaal/o14
+ 33(1 - it)(3pht(Yt1 - xtn1)

/[4o15(1 - 93)“) + paZt(Yt1 - thila

/[4b15(1 - 92)];

(where "it = 62¢(at)/§(at). and Kat = -Ua¢(at)/[1 - §(at)l-)

-E[61nL(6)/Ooiaooaal

—E[61nL(e)/6o126p]

-E[ainL(e)/aoaaaoaal

-E[61nL(e)/Ocaadp]

~§p3§(at)to23 — H1t(XtWa)l

/[4v1aoa4(1 - p2>1

+ E¥(i - At)p(tha)Zt(Yt1 - Xtﬂi)
/[4613oa3(1 — pa):

-§§(at)[p - 9‘92 + 1)/2

+ 9(1 - pa)n1t(xtwa)/(aoaa)1

/[o12(1 — p2)21 - EE(1 - At)(Yt1 - xtw1)
{Mt/rao13(1 - p3)¥1

+ pzt[p(xtwa)/oa + (Yti - th1)o1]
/[ao13(1 - 92,2],

~§o(at)/(ao24) — {9(at)

[-1 + ape/4 + (4 - 393)ult(xtwa)/(4oaa)1
/toa4(1 - pa):

+ E§(1 — xt)£3Mt(xtwa)/[4oa5(1 - 93)“)

+ (xtwa)azt/[4v36(1 - 93)]:

-§p§(at)[1 - N1t(tha)/Uaa]/[2baa(1 — pa

w—

38

- 35(1 - At)!p(xtwa)Ht/[2og3(1 - pa)x

+ (Xtﬂa)ZtEP(XtW2)/ba

+ (Yti - X;W1)/U1l/[3Ua3(1 - 93>311
-E[61nL(9)/OPOPJ = -§(1 + 93)§(at)/(1 - p3)3

- E§(at)[294 - 4 + (p4 - ape + 3)/uaa

/(1 - 92,3 + 35(1 - At)

{ut[(1 + 2p2)o1(xtwa) + 3po3(Yt1 - Xf

/[u1va(1 4 p3>¥1

+ zttp<xtwa)/(oa(1 - 92)“)

+ (Yt1 — xtw1)/(b1(1 — 92)“)12;
With the elements of information matrix calculated above, w
may obtain the asymptotic covariance matrix of the MLE
estimate as the inverse of the information matrix.

After this, we calculate the asymptotic covariance
matrices of our two semiparametric estimates (the CLAD
estimate and the SCLS estimate) when the error terms are
bivariate normal.

The asymptotic covariance matrix of the CLAD estimate
is:

r
o13(x'X)’1 o13(X'X)‘1

 

o1a(x'X)*1 (w/2)o33[fi(xtwa > 0)xt'xt1-1

L.

The derivation of the above matrix is in Appendix G.
The asymptotic covariance matrix of the SCLS estimate

is:

 

39

‘013(X'X)'1 '013(X'X)‘1
cov( A ) =
"2
o12(x'xr1 (1/T)6‘1DE‘1
where '6 = (1/T)E1(xtwa > onamat) - ilxt'Xt
h = (i/T)§1(xtna > 0){c23[a@(at) — 1 — 2at¢(at)]
+ 2(xtwa)2[1 — §(at)11xt'xt
The derivation of the above matrix is in Appendix H.

To make our comparison complete, we also consider the
estimator of Amemiya (1979). Amemiya adopted the method
suggested by Nelson and Olsen (1978) of estimating the first
reduced-form equation by OLS and the second reduced-form
equation by (single equation) MLE. Therefore. Amemiya's
estimate is as efficient as the Joint MLE estimate of the
reduced-form coefficients when the correlation between the
error terms in the two equations is zero. However, when the
correlation between the error terms is not zero, Amemiya’s
estimate is less efficient than the Joint MLE estimate.
Amemiya's estimate of the reduced—form coefficients should
be more efficient than semiparametric estimates when the
error terms are bivariate normal. To derive estimates of
the structural coefficients from the estimates of the
reduced-form coefficients, Amemiya adopted a method he
called generalized least squares, and which we will refer to
as "AGLS". Hewey (1987b). section 4 has shown that
Amemiya's GLS procedure is asymptotically equivalent to the

minimum distance method that we use, so that the AGLS

 

40

structural estimates are asymptotically as efficient as
Joint MLE when the correlation between the error terms is
zero. When this correlation is non-zero we should expect
the AGLS estimates to be more efficient than our
distribution-free estimates, but less efficient than the
Joint MLE. Incidentally, for models (like our present
model) with only one censored variable, Hewey (i987b,
section 5) shows that asymptotically efficient structural
estimates can be derived by AGLS (or minimum distance
method) based on conditional MLE estimates of the reduced
form equations. We do not need to consider this possibility
separately because we already include one asymptotically
efficient estimator (Joint MLE) in our comparisons.

The asymptotic covariance matrix of Amemiya’s estimate
is:

. o12(X’X)‘1 o1a(x'X)-1

W1
cov( ) :

ﬁe

o12(x'X)-1 -(I,O){E[6310gL(e)/Oeiaei’]3'1(I,0)'
where 61' = (we'. baa),
O is a kx1 column vector of zeroes,
I is an kxk identity matrix,
63103L(91) gAtxt'xt EBtXt'

E -——————————— :
69 69 ’ EB X EC
1 1 t t t t t

At -!at¢(at) - ¢3(at)/[1 - §(at)1 - 9(at)1/caa

Bt = Iat3¢(at) + ¢(at) - at¢3(at)/[1 - o<at)11/(aoa3)
ct = -tat3¢(at) + at¢(at) - ata¢2(at)/[1 - §(at)] — ao(at)1

/(4oa4).

 

 

41

Note that the off-diagonal block ['o1a(X‘X)'1] is the
same in the asymptotic covariance matrices for Amemiya’s
estimate and for the CLAD and SCLS estimates.

Among these four asymptotic covariance matrices, only
the asymptotic covariance matrix of the MLE estimate can not
be computed directly. We use a computer simulation to
approximate the noncomputable expectation terms of the
information matrix for the MLE estimate. To do the
computer simulation of these terms, we need to assume
certain value for the parameters #3, cl. oz and p and the
exogenous variables (X). The simulation was done as
follows. First, we draw T (where T : 100, our assumed
sample size) pairs of random numbers from a bivariate normal
distribution. Then we compute the values of the terms whose
expectations we require from these drawn random numbers,
assumed parameter values, and fixed X. We repeat this
procedure 20,000 times for the same parameter values and
fixed X. We compute the means of these 20,000 realizations
and use it as our approximation to the expectations we could
not take analytically. We can then form the information
matrix, whose inverse is the asymptotic covariance matrix of
the Joint MLE.

Once we have calculated the information matrix. we can
compare the efficiency of the Joint MLE and of our other
estimates. The precise design of our experiments (parameter
values and X series) and the results of these comparisons

are given in the next chapter.

 

 

CHAPTER FOUR

RELATIVE EFFICIENCY OF THE MLE AKD SEHIPARAHETRIC ESTIHATES

I. INTRODUCTION

In this chapter we report the results of our
experiments comparing the efficiencies of four different
estimators for a two—equation simultaneous Tobit model:
Joint MLE, Amemiya’s estimator, and Powell’s CLAD and SCLS
estimators. The asymptotic covariance matrices of these
estimators and the numerical methods by which they are
evaluated have been derived in the last chapter. In this
chapter we evaluate them for various values of the
parameters and explanatory variables. We seek to compare
their efficiencies and to see what parameters the relative
efficiencies depend on.

We consider three different types of exogenous
variables. First, we consider explanatory variable matrices
of the form X = [1, X2] where the T elements of X2 are iid
N(0,1). That is, we have a constant plus one normally
distributed regressor. Second. we consider explanatory
variables of the same form but where the elements of X3
equal ez, where the z are iid N(0,1). Therefore we have a
constant plus a lognormal regressor. Finally, we consider
the case of a constant plus two regressors distributed as
bivariate normal. That is. X : [1,X3,X3] where the elements
of [X2.X3] are iid as bivariate normal. However. in all of
these cases our drawings from standard normal distributions

were truncated to the range [—1.96, 1.96] to eliminate

42

 

 

#3

Outliers in X-space that might be too influential to the
results. The reduced form parameters and error terms
covariance matrix are chosen fairly arbitrarily, but in such
a way to vary the degree of censoring and cross-equation
error terms correlation in a systematic way.

Our model is as given by equations (1) and (a) of
chapter three. The first reduced form equation is estimated
by ordinary least squares in Amemiya's method and in the
CLAD and SCLS methods; only the second reduced form equation
(the one with the censored dependent variable) is estimated
differently in these different cases. Furthermore, for a
model with only one censored dependent variable. ordinary
least squares is Just as efficient as Joint MLE for the
equation with the continuous dependent variable (Newey
(1987b)). Therefore all four estimators (Joint MLE,
Amemiya’s, CLAD and SCLS) are equally efficient for the
first reduced form equation, and we will report results only
for the second reduced form equation.

In reporting our results, we will simply give the
ratios of the asymptotic variance of an inefficient estimate
to the asymptotic variance of Joint MLE, the efficient
method. These ratios must be no smaller than one, and the
extent to which they exceed one indicates the extent of the
inefficiency incurred by not using Joint MLE. Specifically,
we adopt the following notation:

AMLE: ratio between the asymptotic variance of Amemiya’s

estimate (for a particular parameter, of course) and

 

 

41+

the asymptotic variance of the Joint MLE.

ShLE: ratio between the asymptotic variance of the SCLS
estimate and the asymptotic variance of the Joint
MLE.

LNLE: ratio between the asymptotic variance of the CLAD
estimate and the asymptotic variance of the Joint
MLE.

While most of our results are for reduced form
parameters, we also present some results for structural
parameter estimates. Because the structural parameter
estimates are derived from the reduced form parameter
estimates, it is clear that more efficient reduced form
estimates lead to more efficient structural estimates.
However. while the efficiency gain from Joint MLE was
confined to one of the two reduced form equations, it
generally applies to both structural equations, and the
interesting question to investigate is the extent to which
this efficiency gain spreads itself across both structural
equations. Since any reduced form is consistent with more
than one possible structural model, the nature of the
structural model should be relevant in determining this

split.

11. REDUCED FORM EQUATIONS WITH ONE EXPLANATORY VARIABLE
In this section we present results for models with a
constant term plus one (normal or lognormal) explanatory

variable. Our results for models With two explanatory

 

 

1+5

‘Variables will be given in the next section.

We begin with some experiments designed to investigate
the effect on relative efficiencies of the correlation
between the error terms in the two reduced form equations
(9). These results are given in Tables 1-6, as follows.
Tables 1-3 cover cases in which the explanatory variable is
(truncated) normal, as described above, and each have o1 : 1
and be = 1.2. However, there are three different values
chosen for we (the regression coefficients for the second
reduced form equation): namely, (1, 0), (1, i), and (O, 1).
Tables 4-6 cover the same cases but with a lognormal
explanatory variable.

In the usual regression equation the efficiency of
estimation does not depend on the value of the regression
coefficients. In the present case. the regression
coefficients matter because they affect the degree of
censoring of the sample, which is relevant to the question
of relative efficiencies. Changes that do not affect the
degree of censoring will not affect relative efficiencies.
For example, the results for o1 : 1, oz : 1.2, we : (1. O)
are the same as for o1 : 1, ca : 6, we : (5, O): the
variance of each estimator Just goes up by a factor of five.
We will vary the degree of censoring in a systematic way
momentarily, but for now we simply want to vary p. and we
tried different values of we and different types of X to try
to be sure our answer was not specific to a single

s ituat i on.

 

 

 

#6

In considering the effects of changes in p, two things
are clear at the outset. First. when p equals zero
Amemiya's estimate is efficient, and so AMLE(waa) : 1 when
p : 0. Second, the asymptotic variances of all estimates
except the Joint NLE are independent of p. while the
asymptotic variance of the Joint MLE will decrease with p.
Therefore the other estimates will become more inefficient
relative to the Joint MLE as p increases. However,
comparisons between various inefficient estimators (e.g..
SCLS versus CLAD) will be independent of A

These expected results do indeed occur in Tables 1-6.
The additional information in Tables 1-6 which could not
have been predicted a priori is the size of the efficiency
gain of Joint MLE relative to the inefficient estimates.
This is noticeable but not paricularly large. More
specifically, it is largest in Table 3, where the efficiency
gain of the Joint MLE is approximately 29% as p increases
from zero to 0.9.

A somewhat more interesting set of experiments is
reported in Table 7. Here we set p = 0.6 and we vary the
degree of censoring systematically by varying the intercept.
More specifically. we have o1 : 1, oz = 1.2. a normal
explanatory variable, and we : [c, 1], where the intercept c
ranges from 1.5 to —1.5, causing the degree of censoring to
vary from 10.56X to 89.442.

As the degree of censoring increases. Amemiya's

estimator (the single-equation Tobit estimator) becomes less

 

 

 

 

1+7

efficient relative to the Joint MLE. This is expected

because, in the absence of censoring. there is no efficiency
gain to the Joint estimation of reduced form equations
However, this effect is apparently not very strong; the
inefficiency of Amemiya's estimator relative to the Joint
MLE is only 92 even for a sample that is almost 902
censored. This finding is also consistent with the results
already displayed in Tables 1-6. The cases considered there
differed considerably in their degrees of censoring. but the

inefficiency of Amemiya's estimator relative to the Joint

 

MLE is never more than 302.

The degree of censoring has a much stronger effect on
the relative efficiencies of the semiparametric (CLAD and
SCLS) estimators. As is apparent in Table 7, they are very
inefficient relative to the Joint MLE when the sample is
heavily censored, and the inefficiency appears to grow
without bound as the degree of censoring approaches one.
For example. ShLE(waa) grows from 1.43 to 5.49 and 1033 as
the degree of censoring grows from 10.56X to 502 to 89.44%.
and the corresponding values for LMLE(W33) are 1.55, 5.05,
and 325.

It is known that the extent of the bias caused by an
incorrect assumption of normality is larger the higher the
degree of censoring. (See, e.g.. Arabmazar and Schmidt
(1982).) It is therefore unfortunate (if not surprising)
that the efficiency cost of avoiding this bias by using a

distribution—free estimator is also larger the higher the

 

 

 

 

#8

dEgree of censoring
A further interesting result is that the comparison of
the efficiencies of the CLAD and SCLS estimators depends
strongly on the degree of censoring. In Table 7 the SCLS
estimator is more efficient than the CLAD estimator when the
censoring percentage is less than 502 but the CLAD estimator
is more efficient than the SCLS estimator when the censoring
percentage is 502 or larger. The SCLS estimator uses the
assumption that the error terms are distributed
symmetrically, and we would probably expect it to be more
efficient than the CLAD estimator when this assumption is
true. Our experiments use normal error terms. which are
symmetric. and this should favor the SCLS estimator.
However, the SCLS estimator does indeed censor the sample
symmetrically. and when the sample approaches 50% censoring,
a symmetric censoring procedure does not leave many
uncensored observations for the estimator to depend on. For
that reason it is perhaps not surprising that the CLAD

estimator should be preferable for heavily censored data.

III. REDUCED FORM EQUATIONS WITH TWO EXPLANATORY VARIABLES
In this section we consider models with a constant plus
two explanatory variables. The explanatory variables are
bivariate normal, truncated to the range [-1.96, 1.96]. One
reason for considering models with two explanatory variables
is Just to be sure that the results of the last section
still hold in a slightly different setting. However. a more

important reason is to be able to see whether the results

 

 

 

 

49

depend on the degree of multicollinearity, which we can vary
by varying the correlation in the bivariate normal
distribution from which the regressors are drawn.

All of the cases we consider in this section assume
vi = 1, ca : 1.2, and p = 0.5. Our basic set of results is
given in Table 8. Here we have "2 : [1, 1, 1]', a choice
that generates a censoring percentage of about 20%. We vary
the degree of multicollinearity by considering correlations
(in the bivariate normal distribution for the regressors) of
o, 0.5, 0.9. and 0.99.

While multicollinearity naturally affects the
efficiency of all of the estimators, it does not have very
strong effects on relative efficiencies. The efficiency of
Amemiya's estimator relative to the Joint MLE is more or
less independent of the degree of multicollinearity. The
efficiencies of the semiparametric estimators relative to
the Joint MLE improve as the degree of multicollinearity
rises. In Table 8 about half of the efficiency advantage of
Joint MLE over SCLS or CLAD is removed when we change the
correlation between regressors from 0 to 0.9, though there
is not much further change when the correlation is increased
to 0.99.

Tables 9-12 give similar results for some different
values of "23 namely. [1. o, 11’, [0, O, 11’. [1, i, 01’.

[0, 1, 01’, [0, 1, i]’, [1. o, 01' and [-i, 1. 11‘. In
these cases we considered only two correlations, zero and

0.9. The results are fairly similar to those Just described

 

 

 

 

50

Crable 8). The efficiency of Amemiya’s estimator relative
to the Joint MLE is not affected much by the degree of
multicollinearity, and in most cases the efficiencies of the
semiparametric estimators relative to the Joint MLE are
increased as the degree of multicollinearity increases.
There are. however, some exceptions to this last statement;
in a few cases the semiparametric estimators become slightly
less efficient relative to the Joint MLE as the correlation
is changed from zero to 0.9.

We can also investigate the effect of the degree of
censoring in Tables 8-12. The clearest comparison is among
the results for x2 = [1, 1. 1]' (Tables 8 or 12). “2 : [0,

i, 11’ (Table 11) and "2 : {—1. 1, 11’ (Table 12). These
parameter values yield censoring percentages of 20.33%, 50K,
and 79.67Z, respectively. Regardless of whether the
correlation between the regressors is O or 0.9, we find the
same pattern that we found in the case of a single
explanatory variable. The efficiencies of all estimators
relative to the Joint MLE decrease as the censoring
percentage rises, with the increase being substantial for
the semiparametric estimators. For example, for the
parameter ﬂag and the case of regressor correlation equal to
zero, increasing the censoring percentage as above changes
ABLE from 1.04 to 1.08 and 1.11; it changes SMLE from 2.28
to 4.07 and 7.46: and it changes LMLE from 2.71 to 4.76 and
6.76. (Similar results hold for the parameter ﬂag and for

regressor correlation equal to 0.9.) We also find, as in

 

 

51

the previous section, that SCLS is more efficient than CLAD
in lightly censored samples but that the ranking is reversed

in heavily censored samples.

IV. STRUCTURAL EQUATIONS

The two previous sections have considered the
efficiency of reduced form estimation, while this section
will consider the efficiency of estimation of structural
parameters. Our structural estimates are derived from
reduced form estimates (by the minimum distance method). so
it is clear that more efficient reduced form estimates will
yield more efficient structural estimates. However, the
nature of our reduced form leads to the efficiency gain of
Joint MLE being restricted to only one of the two reduced
form equations. Except in special cases the efficiency gain
of Joint MLE should extend to the parameter estimates of
both structural equations, and we wish to see the extent to
which this is true.

The basic problem to be faced in doing this is that a
given set of reduced form equations (with specified
parameter values) could have arisen from more than one set
of identified structural equations. In order to keep our
analysis as general as possible. we therefore will consider
three specific sets of reduced from equations, but we will
consider separately all possible structures that would be
identified and would lead to these three sets of reduced

form equations. All identified sets Of structural equations

 

 

 

52

are listed in the Appendix I.
The results shown in Tables 13—14 are for four
different sets of identified structural equations

corresponding to the reduced form equations with "2

(1. 0)’. The results shown in Tables 15-19 are for ten

different sets 0f identified structural equations

corresponding to the reduced form equations with we
(1. 1)’. The results shown in.Tables 20-21 are for four
different sets of identified structural equations
corresponding to the reduced form equations with "2 :
(0, 1)’. Furthermore, in Tables 13—21, for each model the
first column represents the case where the correlation
between 6t1 and eta (error terms of structural equations) is
zero while the second column represents the case where the
correlation between 6.1 and 6.2 is not zero. Changing the
correlation changes the implied value of certain parameters.
as indicated in the Tables. 1

From Tables 13-21. we find that, in some sets of
structural equations, the efficiency gain of Joint MLE for
the reduced form equations extends only to the parameter
estimates of one structural equation (the Yt3* structural
equation). There are two situations where this occurs. For
the first situation there are three sets of structural
equations with Yta' not appearing in the Yti structural
equation. no exogenous variables appearing in the Yt2*
structural equation, and zero correlation between 6.1 and

eta. (See Tables 13. 18. and 21 for models 2, 12, and 17).

 

 

53

For“the second situation there are six sets of structural
equations with Yt2* not appearing in the Yti structural
equation, but with exogenous variables in the Yta'
structural equation. (See Tables 14, 16, 18, 19. and 21 for
models 4, 8, 11, 13. 14, and 18). There are nine other sets
of structural equations in which the efficiency gain of
Joint MLE for the reduced form equations extends to the
parameter estimates of both structural equations. A more
detailed explanation of these results is as follows.

1. When Yta' does not appear in the Yti structural equation
and no exogenous variables appear in the Ytg” equation.
the efficiency gain of Joint MLE for the reduced form
equations extends only to the Yt3* structural equation,
if the correlation between €t1 and Eta is zero. However,
the efficiency gain of Joint MLE for the reduced form
equations applies to both structural equations if the
correlation between 5t1 and Eta is not zero. This result
can be found in Tables 13, 18 and 21 for models a, 12,
and 17. When Y2 : 0.6 (eti and eta are uncorrelated), we
find that AHLE(911)(B12) : SHLE(B11)(813) =
LHLE(811)(812) : 1 for all three models. When ya : i
(€t1 and eta are correlated), we find that (1)
AMLE(811)(813) : 1, but SNLE(B11)(813) ¢ 1 and
LNLE(B11)(813) ¢ 1 for models 2 and 17; (2)
AHLE(911)(913) = 1. SMLE(B11)(812) = 1» but
LMLE(811)(813) ¢ 1 for model 12.

2. When Yt2” does not appear in the Yt1 structural equation,

 

54

Iﬁlt there are exogenous variables in the Yta structural
equation, the efficiency gain of Joint MLE for the
reduced form equations applies only to the Ytg*
structural equation, whether the correlation between 6t1
and Eta is zero or not. This result can be found in
Tables 14. 16, 18, 19, and 21 for models 4, 8, 11. 13,

14, and 18. When Y2 = 0.6 (6.1 and eta are uncorrelated)
or Y2 = 1 (€t1 and €t2 are correlated), our study shows
that AMI..E(B11)(B12) = SHLE(B11)(B12) = LMLE(B11)(513)

= 1 for all six models.

When Yti does not appear in the Yta” structural equation,
but there are exogenous variables in the Yti structural
equation, AMLE(911)(B13). SHLE(811)(B13). LMLE(B11)(B12)
are the same whether the correlation between 6t: and Eta
is zero or not. This result can be found in Tables 13,
14, 15, 17 and 20 for models 1, 3, 6, 9, and 15. For
model 1, when y1 : 0.42 (€t1 and eta are uncorrelated) or
Y1 = 1 (€t1 and eta are correlated). we find that
AHLE(821) = 1, AMLE(822) : 1, smmeai) = 1.10, smmeaa)
: 1.10, LMLE(Ba1) : 1.49, and LHLE(822) : 1.51. The same
result can be found for models 3, 6. 9, and 15.

When Yt3* appears in the Yti structural equation, Yti
appears in the Yta” structural equation. and there are
exogenous variables in each structural equation,

AMLE(y1, a“. or B12), sumo/1, a“. or 812), LNLE(y1,
811, or 813) are the same whether the correlation between

eti and eta is zero or not. This result can be found in

55

Tables 15 and 17 for models 5 and 10. For model

Ya : —2. 1 (€t1 and eta are uncorrelated) or ya -

and eta are correlated), we find that ANLE(y1) :
AHLE(811) = 1. sumo“) = 1.02, Sl‘lLE(B“)
LNLE(y1) = 1.65, and mums“) : 1.64. The same

can be found for model 10.

= 1.04,

 

5, when
-1 (eti

1.

result

 

 

 

56

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 1
X : (1, X3), X3 is truncated normal.
We = (1 0)', ‘01 = 1, ‘03 : 1.2
p p : 0 p = 0.5 p : 0.9
censoring % 20.33% 20.33% 20.33%
AHLE wag 1.00 1.00 1.02
SMLE v22 1.10 1.10 1.12
LHLE wag 1.50 1.61 1.53
TABLE 2
X = (1. X2), X3 is truncated normal.
‘ﬂ'a = (1 1)’. '01 = 1, '03 = 1.2
p p = 0 p = 0.5 p : 0.9
censoring % 20.33% 20.332 20.33%
AMLE 1:22 1.00 1.01. 1.09
SNLE we; 2.00 2.02 2.18
LNLE ﬂea 2.39 2.42 2.61
TABLE 3
X : (1, X2), X8 is truncated normal.
172 = (O 1)’, ‘01 = 1, '02 = 1.2
p p : o p : 0.5 p : 0.9
censoring % 50% 50% 50%
AHLE wag 1.00 1.04 1.29
SMLE wag 5.27 5.49 6.79
LMLE W22 4.84 5.05 6.24

 

 

 

 

 

 

 

57

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 4
X = (1, X2), X3 is truncated lognormal.
"2 = (1 O)‘, o1 : 1. ca = 1.2
p p : O p : 0.5 p : 0.9
censoring % 20.33% 20.33% 20.33%
AMLE wag 1.00 1.00 1.02
SNLE "22 1.10 1.10 1.12
LNLE wag 1.50 1.51 1.54
TABLE 5
X = (1, X3), X3 is truncated lognormal.
"2 : (1 1)’, o. = 1, o2 : 1.2
p p :O p =0.59 :o.9
censoring % 1.36% 1 36% 1.36%
AHLE "22 1.00 1.00 1.00
SNLE W22 1.01 1.02 1.02
LMLE ﬂea 1.56 1.56 1.57
TABLE 6
X : (1, X2), X3 is truncated lognormal.
"2 : (0 1)’, vi : 1, ca : 1.2
p p :0 p :0.Sp :o.9
censoring x 8.38% 8.38% 8.38%
ANLE "ea 1.00 1.00 1.02
SMLE W22 1.12 1.12 1.14
LNLE v22 1.53 1.53 1.55

 

 

 

 

 

58

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 7
X : (1, X3), X3 is truncated normaL
.0111 02:1.2,p: .5
1r (1.5 1)’ (1 i)‘ (0 1)’ (—11)' (—1.5 i)’
censoring % 10.56% 20.33% 50% 79.67% 89.44%
AHLE V22 1.01 1.01 1.04 1.08 1.09
SNLE "22 1.43 2.02 5.49 52.29 1033.07
LHLE ﬂea 1.55 2.42 5.05 45.92 325.27
TABLE 8
X : (1, X3, X3), X2 and X3 are bivariate normal
with cov(xa, X3)' : corr
corr 1
'01: 1, = 1.2, p = 0.5.172 : (11 1)’,
censoring % : 20.33%
corr:0 corr:0.5 corr:0.9 corr:0.99
Wag 1.03 1.03 1.03 1.03
AMLE
wa3 1.04 1.03 1.02 1.03
ﬂag 1.96 1.87 1.44 1.48
SMLE
“‘23 2.26 1.67 1.68 1.45
V22 2.39 2.41 1.96 1.88
LHLE
11:23 2.71 2.17 2.12 1.64

 

 

 

 

 

 

 

 

X = (1,

with cov(Xa,

X2.

x3)!

X3)’ = [

X2

59

TABLE 9

 

and X3 are bivariate normal

corr
corr 1

 

 

 

 

 

 

 

 

 

 

 

 

 

'O1=1,'O‘a=1.3,9=0.5
1 1 0 0
772 =(0) 172 = 0 V2 2 0 We : 0
i 1 1
corr: 0 corr:0.9 corr: 0 corr:0.9
censoring % 20.33% 20.33% 50% 50%
722 1.01 1.01 1.03 1.03
AMLE
"23 1.02 1.01 1.05 1.03
1722 1. 47 1. 40 3. 02 2. 43
SHLE
W23 2.19 1.58 r 6.77 3.24
v23 1.76 1.81 3.25 2.13
LHLE
1723 2. 52 1. 95 5. 95 2. 62

 

 

 

 

60

TABLE 10

X : (1, X3, X3). X3 and X3 are bivariate normal

1 corr
with cov(xa, X3)' : corr 1

61 = 1, 02 = 1.2, P = 0.5

 

 

 

 

 

 

 

1 1 0 0
we : 1 we : 1 "2 : 1 Va = 1
0 0 0 0
corr: 0 corr:0.9 corr: 0 corr:0.9
censoring % 20.33% 20.33% 50% 50%
wag 1.02 1.01 1.06 1.03
AHLE
«23 1.01 1.01 1.03 1.02
ﬂea 2.30 1.38 6.22 3.47
SHLE
"23 1.55 1.30 3.50 3.02
ﬂea 2.51 1.73 5.72 2.90
LMLE
V23 1 78 1.59 3.79 3.01

 

 

 

 

 

 

 

 

 

 

x = (1, xa.

with cov(Xa,

X3):

X3)’ = [

X2

61

TABLE 11

1 corr
corr 1

and X3 are bivariate normal

 

 

 

 

 

 

 

U1-1.Ua=1.2.9=0.5
O 0 i 1
172 = 1 We :(1 172 : 0 1'72 1: 0
1 1 O 0
corr: 0 corr:0.9 corr: 0 corr:0.9

censoring % 50% 50% 20.33% 20.33%
Wag 1.07 1.05 1.00 1.00

AHLE
«23 1.08 1.05 1.00 1.00
W22 4.13 1.35 1.10 1.10

SHLE
«23 4.07 1.92 1.10 1.10
W22 4.30 ‘2.14 1.51 1.51

LHLE
«23 4.75 2.38 1.50 1.51

 

 

 

 

 

 

 

 

 

 

X = (1.

with cov(Xa,

X2.

X3),

X3)’ = [

62

TABLE 12

X2

1 corr '
corr 1

and X3 are bivariate normal

 

 

 

 

 

 

 

o. = 1. ca = 1.2, p = 0.5
1 1 1 -1
"2 = 1 "2 : 1 "2 : 1 we : 1
1 1 1 1
corr: 0 corr:0.9 corr: 0 corr:0.9
censoring % 20.33% 20.33% 79.67% 79.67%
"22 1.03 1.03 1.10 1.09
ANLE
«23 1.04 1.02 1.11 1.09
"22 1.96 1.44 11.36 4.03
SNLE
wag 2.28 1.68 7.46 3.95
ﬂea 2.39 1.96 8.44 4.01
LMLE
v33 2.71 2.12 6.76 3.62

 

 

 

 

 

 

 

 

63

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 13
Yt1l = Yth2* + 912xt2 + Et1 (912 = 0)
Yte = 921Xt1 + 522Xt2 + Et2 (922 = 0)
Yt1I = 911Xt1 + B12Xt2 + et1 (512 = 0)
Yta = YaYti + 6t2
X : (1, X2), X3 is truncated lognormal,
We = (1 O)’, 01 = 1, '02 : 1.2, p : 0.5
model 1 model 1 model 2 model 2
Y1: 0.42 ‘11 : 1 Ya : 0.5 Y2 : 1
Y1 1.00 1.00 .
911 1.00 1.00
812 1.00 1.00 1.00 1.00
AHLE
ye .. 1.00 1.00
931 1.00 1.00 - ..
922 1.00 1.00 . .
71 1.03 1.10
811 1.00 1.01
812, 1.03 1.11 1.00 1.01
SMLE
ya 1.11 1.10
821 1. 10 1.10
Bag 1. 10 1. 10 . .
y. 1.17 1.57
B11 1.00 1.03
812 1.17 1.58 1.00 1.05
LMLE
Y2 1.63 1.55
831 1.49 1.49 .. .
822 1.51 1.51 0' on

 

 

 

 

 

 

 

 

 

64

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 14
Yt1I = Y1Yt2' + 812Xt2 + €t1
Yt2 = 821Xt1 + Baaxta + et2 (922 = 0)
4- Ytix = B11Xt1 + B12Xt2 + €t1
Yte = VaYt1 + Baaxta + €t2 (Y2912 + 822 = 0)
X = (1, X3), X2 is truncated lognormal,
1T2 = (1 0)’. '01 = 1. '02 = 1.2, p : 0.5
model 3 model 3 model 4 model 4
Y1 = 0.42 Y1 : 1 Ya : 0.6 Y2 = 1
y. 1.00 1.00 . ...
811' 00 o 1:00 1.00
812 1.00 1.00 1.00 1.00
AHLE
ya . 1.00 1.00
821 1.00 1.00 ...
823 1.00 1.00 1.00 1.00
v] 1.03 1.11 ..
B11 . 1.00 1.00
912 1.03 1.11 1.00 1.00
SHLE
ya .. 1.12 1.11
821 1.09 1.09 .
833 1.10 1.10 1.11 1.10
y. 1.17 1.57 . .
B11 . 1.00 1.00
813 1.17 1.58 1.00 1.00
LMLE
Y2 1.65 1.57
931 1.49 1.49 .. ...
832 1.51 1.51 1.64 1.56

 

 

 

 

 

 

65

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 15
5' Yt1l = Y1Yt2' + B11Xt1 + 6t1
Yta = Y2Yt1 + 922Xt2 + et2
5' Yt1* = Y1Yt2” + B11Xt1 t €t1
Yta = 921Xt1 + Baaxta + €t2
X : (1, X2), X2 is truncated lognormal,
"2 = (1 1)’. '01 = 1. ‘02 = 1.2. p : 0.5
model 5 model 5 model 6 model 6
Y2 = -2.1 ya : -1 Y1 : 0.42 Y1 : 1
V1 1.00 1.00 1.00 1.00
B11 1.00 1.00 1.00 1.00
812 ... ... ...
AHLE
73 1.00 1.00 ...
921 ... ... 1.00 1.00.
822 1.00 1.00 1.00 1.00
Y1 1.02 1.02 1.01 1.02
811 1.04 1.04 1.01 1.04
812 . .
SHLE ,
V; 1.01 1.02 ... ...
831 . . 1.04 1.04
Baa 1.01 1.01 1.02 1.02
v1 1.65 1.65 1.19 1.65
811 1.64 1.64 1.19 1.64
812 .. . .
LMLE
ya 1.10 1.22
821 1.55 1.55
832 1.10 1.22 1.56 1.56

 

 

 

 

 

 

 

 

 

66

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 16
Yt1* = Y1Yt2‘ + et1
Yta = B21Xt1 + Beaxta + Et2
Yt1l = 911Xt1 + B12Xt2 + €t1 (912 = 0)
Yt2 = YaYt1 + B22Xt2 + 6t2
X : (1, X2). X2 is truncated lognormal,
"2 = (1 1):. 61 = 1, ca - 1.2, p 0.5
model 7 model 7 model 8 model 8
Y1 : 0.42 Y1 : 1 ya : 0.6 Ya : 1
Y1 1.00 1.00 ..
B11 . 1.00 1.00
912 . . 1.00 1.00
AMLE
ya 1.00 1.00
921 1000 1-00 on 01
522 1.00 1.00 1.00 1.00
Y1 1.00 1.01 . .
911 . 1.00 1.00
813 1.00 1.00
SHLE
ya . . 1.05 1.04
921 1.04 1.01 . .
822 1.01 1.00 1.02 1.02
Y1 1019 1066 079 U!
911 1.00 1.00
B12 1.00 1.00
LMLE
ya 1.73 1.64
621 1.46 1.10 ... .
Baa 1.49 1.22 1.75 1.65

 

 

 

 

 

 

 

 

 

 

67

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 17
9- Yt1* = Y1Yt2' + 912Xt2 + €t1
Yta = 921Xt1 + 922Xt2 + €t2
10. Ytl.‘ : V1Yta* + 81.2th + €t1
Yta = Y2Yt1 + Balxti t €t2
X : (1, X3), X2 is truncated lognormal,
"a = (1 1):. 61 : 1, oz : 1.2, p 0.5
model 9 model 9 model 10' model 10
Y1 = 0. 42 Y1 = 1 Y2 = -E.1 Ya : -1
y, 1.00 1.00 1.00 1.00
911 . ..
812 1.00 1.00 1.00 1.00
AMLE
ya . . 1.00 1.00
821 1.00 1.00 1.00 1.00
923 1.00 1.00 ..
v1 1.01 1.04 1.04 1.04
311 ~
513 1.01 1.04 1.04 1.04
SMLE
ya 1.00 1.01
831 1.04 1.04 1.00 1.01
923 1.02 1.02 ..
91 1.16 1.64 1.64 1.64
911 . -
812 1.19 1.64 1.64 1.64
LMLE
Ya ... . 1.10 1.82
821 1.55 1.55 1.10 1.22
923 1.56 1.56 . .

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

68

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 18
11- Yt1i = 911Xt1 + B12Xt2 + €t1 (811 = 0)
Yt2 = Y2Yt1 + 921Xt1 + €t2
13' Ytix = B11Xt1 + Biaxta + €t1
Yta = Y2Yt1 + €t2
X : (1, X3), X2 is truncated lognormal,
172 = (1 1)’, '01 = 1. '02 = 1.2, p 0.5
model 11 model 11 model 12 model 12
Ya=0.6 Yaz1 Ya:0.6 Yazi
Y1 .
811 1.00 1.00 1.00 1.00
B12 1.00 1.00 1.00 1.00
AMLE
ya 1.00 1.00 1.00 1.00
831 1.00 1.00 . .
Baa . . .
Y1 .
811 1.00 1.00 1.00 1.00
812 1.00 1.00 1.00 1.00
SHLE
ya 1.02 1.02 1.01 1.01
821 1.05 1.04
822 ... .
Y1 . .
911 1.00 1.00 1.00 1.05
912 1.00 1.00 1.00 1.04
LMLE
ya 1.75 1.65 1.75 1.66
821 1.73 1.64
822 -

 

 

 

 

 

 

 

 

 

 

 

 

69

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 19
Yt1* = B11Xt1 + B12Xt2 + Et1
Yt2 = Y2Yt1 + 321Xt1 + €t2
14- Yt1 = B11Xt1 + 912Xt2 + €t1
Yt2 = V2Yt1 + 822Xt2 + €t2
X = (1, X3), X2 is truncated lognormal,
72 = (11):, 01 = 1, ca : 1 a, o. 5
model 13 model 13 model 14
Y3 : 0.6 Y3 = 1 0.6
Y1 ... ...
311 1.00 1.00 .00
912 1.00 1.00 .00
AMLE
ya 1.00 1.00 .00
821 1.00 1 00
Bag ... ... .00
Y1 . .
811 1.00 1.00 .00
913 1.00 1.00 .00
SHLE
ya 1.02 1.02 .05
921 1.05 1.04 .
Baa . .04
Y1
811 1.00 1.00 .00
912 1.00 1.00 .00
LMLE
72 1.74 1.65 .73
821 1.73 1 64
922 1.74

 

 

 

7O

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 20
15~ Ytil = Y1Yt2* + B11Xt1 + €t1
Yta = 921Xt1 + Baaxta + €t2 (921 = 0)
16~ Yt1I = Y1Yt2* + Et1
Yta = B21Xt1 + 922Xt2 + €t2 (921 = 0)
X : (1, X2), X2 is truncated lognormal,
We = (0 1)’. 01 = 1. : 1.2. p : 0.5
model 15 model 15 model 16 model 16
Y1 : 0.42 Y1 : 1 Y1 = O. 42 Y1 : 1
v1 1.00 1.00 1.00 1.00
811 1a 00 1.01 I to.
912 ...
AMLE
Ya .. .
831 1.01 1.01 1L01 1.00
922 1.00 1.00 1.00 1.00
Y1 1.04 1.14 1.01 1.02
911 ' 1.10 1.32 .
B12 .
SMLE
Y3 .
921 1.28 1.28 1.24 1.03
933 1.12 1.12 1.10 1.01
Y1 1.18 1.61 1.19 1.65
811 1. 15 1.51
812 til I
LMLE
Ya . .
821 1.44 1.44 1.36 1.04
922 1.53 1.53 1.48 1.35

 

 

 

 

 

 

 

 

 

 

 

 

 

71

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

TABLE 21
17- Yt1* = 911Xt1 + B12Xt2 + et1 (811 = 0)
Yta = YaYt1 + €t2
15 Yt1* = B11Xt1 + B12Xt2 + €t1
Yt2 = Y2Yt1 + 921Xt1 + €t2 (Y2911 + 921 = 0)
X : (1, X3), X3 is truncated lognormal,
"2 = (0 1)’. 01 = 1. 03 = 1.2, p : 0.5
model 17 model 17 model 18 model 15
Ya:0.6 Y2=1 ya:0.5 Vazi
Y1 v: c o
811 1.00 1.00 1.00 1.00
812 1.00 1.00 1.00 1.00
AMLE
ya 1.00 1.00 1.00 1.00
821 ... ... 1.01 1.01
B22
‘11 ... ... ...
811 1.00 1.03 1.00 1.00
812 1.00 1.02 1.00 1.00
SMLE
ya 1.02 1.02 1.16 1.14
821 ... ... 1.34 1.33
922
Y1
911 1 00 1.04 1 00 1 00
812 1.00 1.02 1.00 1.00
LMLE
V; 1.74 1.65 1.69 1.61
921 1.57 1 50
Bag ... ... ... ...

 

 

 

 

 

 

 

 

 

72

Footnote
As our structural parameter estimates are derived from
reduced form estimates by the minimum distance method,
the asymptotic covariance matrix 0f the structural
parameter estimates is (1/T)§, where o : (P’Q‘iP)'1. P =
0f(do)/Cd’, d is the vector Of structural parameters. and
Q is the asymptotic covariance matrix of the reduced form
estimates. For different reduced form estimates (MLE,
Amemiya's, CLAD. SCLS). Q would be different and hence Q
would be different.

To show the process of calculation, let‘s use our
first set of structural equations as an example. These
correspond to the reduced form equations with Va =
(1, ol'.

STRUCTURAL EQUATIONS:

Ytl = B11Xt1 + 912 Xt2 + et1 (912 = 0)

Yta” = Y2Yt1 + 6t2

CORRESPONDING REDUCED FORM EQUATIONS:

Yt1 = Blixti + 912Xt2 + €t1 (912 = 0)

Yta” = Y2811Xt1 + Y2912Xt2 + V2€t1 + €t2

Let f1“) = (911- 912' Y2911- V2812)’- a = (911: B12~

Ya)’. Then P : 8f(d)/8d’ :

1 O O
O 1 0
Ya 0 B11
0 Va 912

and o = (P'Q-1P)-1.
To calculate P, we need the value of ya and 811 (812

: 0 in this model). Ya could be derived as follows.

 

 

 

vt1 = 6t1

vt2 = Y2€t1 + Et2
If the correlation between 6t1 and €12 is zero, then
var(vu) = var(€t1),

cov(vti, Vt2) = yavar(€t1).

/..
S°- Y2 = c°V(Vt1- vta)/var(vt1) = 013/613. In our :f:,ggﬁ
experiments, 61 = 1, ca : 1.2 and p = 0.5, so ya =

912/912 = 0~6. As v3 = (1 0)'. it is clear that val = Q'

V2911 : 1. 811 is therefore equal to 1/Ya = 1/0.6 : 3-

1.67. “
For all 18 identified sets of structural equations

in this chapter, we use the same calculations as above to

derive the values of the structural parameters which

appear in P. When the correlation between €t1 and eta is

zero. yi (or ya) is equal to a certain value. If Y1 (or

ya) were not equal to this value, the correlation between

€11 and eta would not be zero.

if“

 

 

CHAPTER FIVE

SUHHARY AID COHCLUDIHG REMARKS

The purpose of this dissertation was to study
distribution—free methods of estimation for the simultaneous
equations Tobit model. This is a simultaneous equation
model which contains censored dependent variables, and which
may also contain some dependent variables of the usual
continuous type. The typical treatment of this model
assumes that the error terms follow a multivariate normal
distribution and estimates the parameters by the method of
maximum likelihood. If the normality assumption is correct,
the normal MLE is consistent and asymptotically efficient.
However, if the normality assumption is incorrect, the
normal MLE is inconsistent, and it is therefore desirable to
have available estimators whose consistency does not hinge
on a specific distributional assumption. In this
dissertation we propose such estimators and consider their
efficiency.

Our method of estimation involves estimating the
reduced form equations by distribution—free methods. and
then deriving estimates of the structural parameters from
the estimates of the reduced form parameters by the minimum
distance method. We choose to estimate the reduced form
equations with continuous dependent variables by ordinary
least squares, and we estimate the reduced form equations
with censored dependent variables by Powell’s "censored
least absolute deviations" (CLAD) or "symmetrically censored

74

 

 

 

 

75

least squares" (SCLS) estimator. These estimators are

robust to non-normality and therefore so is our derived

estimator of the structural parameters. In a model with no
\

censored dependent variables, our method reduces to an

approximate version Of three stage least squares.

The disadvantage of our estimator (or other robust
estimators) is that it is inefficient relative to the normal
MLE when the error terms are indeed normal. We measure the
extent of this inefficiency for particular parameter values
and sequences of exogenous variables by comparing the
asymptotic covariance matrices of our estimator and of the
normal MLE. We present a simplified form of the covariance
matrix of our estimator for the special case that the error
terms are iid multivariate normal, and we also calculate the
information matrix, whose inverse is the covariance matrix
of the normal MLE. The information matrix contains some
terms with expectations that we cannot evaluate
analytically, so we resort to a simulation to evaluate these
terms. We find three important results as follows.

First, our robust estimators become less efficient
relative to the normal MLE when the correlation between the
error terms in the different equations is increased. This
occurs because the MLE implicitly estimates all reduced form
equations simultaneously while our estimator estimates each
reduced form equation separately. In the typical case of a
linear model with continuous dependent variables, separate

estimation of each reduced form equation is Just as

 

 

 

 

76

efficient as Joint (SUR) estimation, but this is not true
when some dependent variables are censored. Thus our
estimator's efficiency relative to the normal MLE falls as
the across-equation error terms correlation rises.

Second, our estimators become less efficient relative
to the normal NLE as the degree of censoring increases.
This is unfortunate because the size of the inconsistency of
the normal MLE caused by an incorrect assumption of
normality also increases with the degree of censoring. In
heavily censored samples the normality assumption becomes
more dangerous (because of the possibility of serious bias
if it is wrong) but also more valuable (because of the
larger efficiency gain from imposing it). It is also
reasonable to conJecture that the ability to detect non-
normality (the power of the relevant tests of normality)
decreases as the sample becomes more heavily censored. The
practical implication of all this is apparently Just that we
should not hope to learn very much from heavily censored
data.

Third. the comparison between our estimators based on
Powell’s CLAD and SCLS estimators also depends heavily on
the degree of censoring. The SCLS estimator depends on the
symmetry of the error terms distribution and therefore may
be expected to be more efficient than the CLAD estimator
when the error terms are symmetric, as they are in our
experiments. This is in fact true for lightly censored

samples, but it fails to be so for more heavily censored

 

 

 

 

77

samples. In particular, the CLAD estimator dominates the
SCLS estimator in our experiments when the degree of
censoring is 50% or more. This is intuitively reasonable
because the process of symmetric censoring implicitly
removes virtually all of the observations in such cases.

We conclude by noting some possible avenues for further
research on this topic. which are suggested by recent work
in distribution-free estimation. One such possibility is
based on the recent results of Chamberlain (1987) and
Cosslett (1987), who establish efficiency bounds for
distribution-free estimators for the censored regression
model. In our context these would correspond to upper
bounds for the efficiency of distribution-free estimates of
the reduced form parameters, and they could be converted
into upper bounds for the efficiency of minimum distance
estimates of the structural parameters. It would be
interesting to see how our estimates (based on Powell's
estimates) compare to these efficiency bounds. Indeed, such
a calculation would be of interest even in the single-
equation context.

A second obvious possibility is simply to expand the
set of distribution-free estimators considered for the
reduced form equations. In particular, there has been
considerable interest in the quasi-MLE estimators suggested
by Gourieroux et a1. (1987). In the usual linear model, the
normal MLE is least squares and it is robust to non-

normality. However, given censoring the normal MLE is not

 

 

 

 

 

 

78

TObUSt to the failure 0f its distributional assumption.

Gourieroux et al. have considered the properties of MLE’s

based on distributions other than the normal, and have

categorized the distributions whose MLE‘s are robust to the
failure of their distributional assumptions. In other

words, they find other distributions that one can assume

such that the resulting MLE is consistent even when the
error terms do not follow the assumed distribution. The
efficiency properties of these robust quasi—MLE‘S are

Clearly worth studying.

 

 

APPENDIX A

Proof of Consistency of Estimator of E13 in II.A.1.b. of

Chapter Two

The consistent estimator of E13 is
(1/T)E[1(Xtaa > 0)][1/2 - 1(Vt2 < 0))vt1xt’xt. The proof is
as below.
a = (1/T)E[1(Xtaa > 0)][1/2 — 1(vt2 < 0)]vt1xt'xt -

(1/T)E[1(Xtﬂa > 0)][1/2 - 1(vta < 0)]vt1Xt'Xt

IA

(1/T)El[1(Xtﬁa > 0)][1/2 - 1(vta < 0)]Vt1Xt’Xt -

[1(Xt"2 > 0)][1/2 - 1(vta < 0)]vt1Xt'th

IA

(1/T)E(1/2)l1(Xtﬁa > 0)Vt1Xt’Xt — 1(Xt"2 > 0)Vt1Xt'Xt| +
(1/T)El[1(Xtﬁa > 0)][1(Vta < 0)]Vt1Xt’Xt —

[1(Xtﬂa > 0)][1(Vta < 0)]vt1Xt’th

G1 + 02

IA

(3/2T)E[1(1thal 1 uxtuueg — wau)llvt1lﬂxtﬂa +
(3/2T)§[1(Ixtwel 1 uxtuuea - wgﬂ)]lvt1lﬂxtﬂa +
(1/T)E[1(Ivtal 1 uxtuuaa — ﬂau)llvt1lﬂxtﬂa +

(1/T)E[1(Ivtal 1 “Xtﬂﬂﬁg — ﬂaH)]|Vt1lHXtﬂa

(3/2)C + (3/2)C’ + D + D’

(T 1. d1 part

Xtﬁa tha a1
(1) + + ¢ 0
(ii) + - ¢ 0
(111) - + ¢ 0
- - - 0

Case (1)? Xtﬁa > 0, Xtﬂa >70.

79

 

 

 

Case (11):

 

80

d1 1 (1/T)§(1/2)lvt1Xt’Xt - Vt1Xt’th
= (i/T)E(1/2)lXt(W1 - ﬁ1)Xt’th
1 (1/T)EHW1 - ainuxtn3
L 0 (by consistency of W1 and Assumption (A2))
xtaa > o, Xtﬂa 1 0.
d1 1 (1/T)E[1(lxtwal S HXtHﬂﬁg - ﬂau)]lvt1lHXtHa

: (1/2)C'

Case (111): Xtﬁa 1 0, Xtﬂa > o.

(1)
(2)

(3)

(4)

(5)

Xtﬁe
+

+

61 1 (1/T)E[1(1thal 1 uxtuuaa - vaH)]lvt1lHXtua

: (1/2)C
xt“2 Vta Vta 0‘2
+ + + : 0
+ + - ¢ 0
+ - + ¢ 0
— + + : 0
+ + + : 0
- + + : o
+ - + = O
+ + - ¢ 0
_ — + ¢ 0
_ + — : O
+ - - ¢ 0 (the same as Case (1)
above)
- - + = 0
_ + - = o

 

 

 

(6) +

Case (1):

Case (2):

Case (3):

Case (4):

81

_ — . - ¢ 0

Vtg l O, Vta < O. Xtﬁa > O. Xtﬂa > O.

lvtal 1 lvta - Vt2' ‘

Vta = Yt2” - xt"2

Vte = Yta - Xtﬁa

Yta - Xtﬁa l o <=> maxto, Ytg*] - Xtﬁa l 0

<=> 712* > 0 (7 Xtﬁa > 0) <=> Yt2 : Yt2'

5 th2 ' Vtai = IYt2 ' XtTTa - Yt2 + XtﬁaI

= IXt(ﬁg - We)!

«a 1 (1/T)E[1(|Vtal 1 uxtuuaa - ﬁgﬂ)llvt1lﬂxtna
: D

Vta < 0, Vta l o. Xtﬁa > 0. xtwa > o.

v12 : Yt2” - thg l 0

Yt2* z Xtva > o

1 Yt2 = Yt2”

thal 5 tha - Vtal = 'Yta - Xt"2 - Yta + Xtﬁ'al

= IXt(ﬁa - Wa)|

«a s (1/T)E[1(lvtal 1 uxtuuaa - wanlllvt1luxtu3
: D’

Vta l 0, Vta < o, Xtﬁa 1 0, xtwa > 0.

«a 1 (1/T)§[1(Ixtwan 1 uxtuuaa - ﬂaﬂ)llvt1lﬂxtna
= C

Vta < o, v12 1 o, xtaa > o, XtWa 1 0.

ca 1 (i/T)E[1(Ivtal 1 uxtnuaa - we")llvt1|uxtua
= D’ or

62 1 (i/T)E[1(lxtwel 1 nxtuuea - waulllvt1luxtu3

: C'

 

 

 

82

case (5): Vtg < O, Vta < 0, Xtﬁg 1 0, tha > 0.
ca 1 (1/T)§[1(Ixtwa| 1 uxtuuaa - waﬂ)llvt1lnxtﬂa
= C
Case (6): Vte < o, Vta < 0, Xtﬁa > o, Xt"2 1 0.
Ga 1 (1/T)§[1(1xtwal 1 uxtnuaa — ﬂaﬂ)llvt11ﬂxtﬂa

:C’ )

plim C : 0. The proof is similar to that of p.323
(A27) in Powell’s [1984] paper. We may also prove plim C' :

plim C = O as below.

°t1 = Yt1 - Xtﬁ1 = xtﬂ’1 + Vt1 - xtr‘ri = vt1 + Xt("1 ' 51)

19111 = Ivt1 + xt(w1 - £1)! 1 Ivt1l + lxt(a1 - "1)1

(1/T)§[1(lxtﬂgl 1 uxtuuag - waﬂ)llvt1luxtua

— (1/T)E[1(|thgl 1 "Xtﬂﬂﬁg - nau)l|vtinuxtn3

= (1/T)E[1(lxtwal 1 nxtuuaa - ﬂaﬂ)l(lvt1l - lvt1l)HXtN2

1 (i/T)§[1(|thal 1 uxtnuaa — ﬂaﬂ)]HXtH3N&1 - «in

By Assumption (A4) and the consistency of ai,

(1/T)§[1(IX.wZI 1 uxtuuaa - we")]HXtH3Hﬁ1 - win —> 0(1)

a (1/T)§11(Ixtwgl 1 “Xtﬂﬂﬁa — waﬂlllvt1lﬂxtﬂa L
(1/T)E[1(|Xtﬂgl 1 uxtnuaa - ﬂaﬂ)llvt1luxtua = 0

SO, plim C’ = plim C : 0.

By the same way as above, D and D’ could be shown to
converge to zero in probability. So,
(1/T)E[1(Xt%2 > 0)][1/2 -1(Vt2 < 0)]vt1xt'xt L
(1/T)E[1(tha > 0)][1/2 -1(vta < 0))Vt1Xt’Xt

By Assumption (A12), (1/T)§§tvtixtlxt R (1/T)EE[§tvt1Xt’Xt].

 

 

 

APPENDIX B

Proof of Consistency of Estimator of 6 and E12 in II.B.1.b.

0f Chapter Two

.The consistent estimator Of 5 is
6 = (1/T)E[1(-Xtﬁa < v.2 < Xtﬁg)]Xt’Xt.
Where Vta = Yta ' Xtﬁa = Xtﬂa + Vta” - Xtﬁa = Vt2” ‘ tht-
(Vt2* = maXIVt2' 'Xtﬂal. 5t = 72 ‘ We)
l11/T)E[1('Xtﬁa < v13 < Xtﬁa)1XtJth -
(1/T)E[1(-th2 < Vt2 < Xtﬂa)1XtJthl
1 (1/T)E|[1(-xtwa — xtst < vt2* - xtst < Xtﬂa + xtst)] —
[11'Xt72 < vta < tha)]luxtna
: (1/T)§1[1(-Xt"2 < vt2* < Xtva + extst)l -
[11'Xtﬂg < vta < Xth)]lHXtH2
= (i/T)E|[1(—tha < vta < xtwa + extst)] —
[1(‘Xtﬂg < vtg < xtwa)lluxtu2
(v v13” : max(vt3. —th2)
: Vta' : Vt2 if vt2* > “Xt"2-)
Since Euth4+ﬂ is uniformly bounded. this term converges to
zero almost surely by the strong consistency of ﬁg-
A (i/T)§[1(-Xtﬁa < eta < Xtﬁa)]Xt'Xt L

(1/T)E[1(-tha < vta < xtw2)lxt'xt

(1/T)E[1(-Xtﬂa < v12 < Xtﬂa)]Xt’Xt L
(1/T)EE[1(-Xtﬂe < vta < xtﬂa)]Xt'Xt

by Assumption (A9)' and the law of large numbers.

85

 

 

 

84

‘froof of Consistency of E12 in.II.B.1.b. of Chapter Two

The consistent estimator of (1/T)EE[§tv11Xt‘Xt] is
(1/T)E[1(Xtﬁa > 0)]min[max(vta, -Xtﬁa)» xtaalvtixt'xt.
The proof is as follows.
1(1/T)E[1(Xtﬁa > 0)]min[max(vtg, -Xtﬁa), Xtﬁalvtlxthtk -

(1/T)E[1(Xtﬂa > 0)]m1n[max(vta, ~Xtﬂa), Xtﬂalvt1XtJthl

IA

(1/T)E|[1(Xtﬁa > 0)]min[max(vt2, *Xtﬁa). Xtﬁalvt1XtJth -

[1(Xtﬂa > 0)]min[max(vta, -Xtﬂa), xtwalvt1xtJXtK|

IA

(1/T)E1[1(tha > -Xt3t)]min[max(vta* — xtst,

'Xtﬂa ‘ Xt3t)1 Xt"2 + Xtﬁtl '

[1(Xt72 > 0)]m1n[max(vta, -Xt72)- tha11vt1lﬂxtﬂa +
(1/T)E|[1(tha > -Xtat)]min[max(vta' - xtxt,

-xtwa - xtst). tha + xtatllna1 - wiuuxtu3

: «1 + d2

Let r s l[1(xtn2 > —Xt3t)]min[max(vta* — x131,

—th3 — tht), tha + X131] -

[1(Xt"2 > 0)]min[max(vta, 'Xt32): thall

Case 1. tha > -xtst and Xtﬁa > 0.
(a) max(vta, -tha) < Xt"2
(i) -Xt72 < Vt2 < xtva
Vt2* : max(vta, ‘Xt"2) : Vt2
P S 'Vt2 — tht - Vtal = IXtBtl
(ii) v12 < 'Xt"2 < Xt"2
vt2* : max(vta, 'XtV2) : 'Xt"2
P 1 l-tha - tht - (-xtw2)| = Ithtl

(b) max(vta, -tha) > Xtﬂa

 

 

 

85

P 1 Ixtwa + Xt3t - th2l = lxtstl
Case a: xtVa s -xt3t and tha > o.
r : |-min[max(vta. -tha), XtVEJI
s lxtwal s lXtStl
Case 3: tha > —xt3t and XtVE S O.
This implies lthal < lxtstL
r :’|min[max(vta* — xtst, -tha — xtst), Xtﬂa + thtJI
S lxtval + IXtStl
s alxtstl
From cases 1.2 and 3. we Know «2 s (a/T)§nxtn3uatulvt1l.
Since Euxtu4+ﬂ is uniformly bounded, and E(Xt’vt1) = o,
(2/T)Enxtu3ﬂ8tﬂlvt1l converges to zero almost surely by the

strong consistency of ﬁg. So. plim a1 = 0.

a2 (1/T)§'[1(Xt"a > —Xt3t)]min[max(vta* - Xtat,

-XtW2 - xt3t)- Xtva + thtllﬂﬁi — wiuuxtn3

IA

(1/T)E(lxtﬂ3l + lXt8t|)ﬂﬁ1 - «1nuxtn3

= L

Case 1: Xtﬂa > ‘xt3t and Xt"2 S 0.
This implies lxtwal < 'Xt3t'-
L < (a/T)§ua1 ~ w1uustnuxtu4
Since Euxtn4+ﬂ is uniformly bounded.
(a/T)EH%1 - w1HH8tHHXtH4 converges to zero almost surely by
the strong consistency of %1 and ﬁg.
Case a: Xtva > -xt3t and Xtﬂa > o.
This implies lthgl > lthtl.

L < (a/T)§uai - «1nuwaunxtu4

 

 

 

86

Since Ellxtu4H1 is uniformly bounded, and we is finite.
(E/T)§Hﬁ1 - wiunwguuxtu4 converges to zero almost surely by

the strong consistency of ﬁi and ﬁg.

(1/T)E[1(Xt€ya > 0)]mintmaxwta. ~Xtﬁa). Xt%2]9t1Xt’Xt z.
(1/T)§[1(xtwa > O)]min[max(vt3, -xtw2). xtwawuxt'xt
By Assumption (A9)' and the law of large numbers.

(i/T)E[1(tha > 0)]min[max(vtg. 'XtVE)’ xthJvtixt'xt L

(i/T)EE[1(tha > 0)]min[max(vt2. —tha), thalvtixt'xt

z (i/T)EE(EtVt1Xt'Xt)

 

 

 

APPENDIX C

Proof of Consistency of Estimator of ﬁpq 1n III.A.i.b. of

Chapter TWO

The consistent estimator of ﬁpq is

(1/T)§[1(xtap > 0)1[1/a - 1(vtp < 0)1[1(xtaq > 0)1[1/a -

1(vtq < 0)]xt'xt. The proof is as below.

a = (i/T)E[1(Xtﬁp > 0)1[1/a — 1(vtp < 0)][1(xtaq > 0)][1/2 -
1(vtq < 0)]xt'xt - (1/T)§[1(xtwp > 0)][1/2 - i(vtp <
0)][1(xtwq > 0)][1/2 - 1(vtq < 0)]Xttxt

s (i/T)El[1(xtﬁp > o. xtaq > 0)]{1/4 - [1(vtp < 0)] -
[1(vtq < 0)] - [1(vtp < o, vtq < 0)]: — [1(xtwp > o,
thq > 0)]{1/4 — [1(vtp < 0)] — [1(vtq < 0)] - [1(vtp <
o. vtq < c)13qutu3

s (i/4T)El[i(Xtﬁp > o. xtaq > 0)] -

[1(xtwp > o, xtwq > 0)]!"th3 +

(1/T)El[i(Xtﬁp > o. xtaq > 0)][1(vtp < 0)] -

[1(xtwp > o, xtwq > 0))[1(vtp > 0)]qutu3 +
(1/T)§I[1(xtap > 0. xtaq > 0)1[1(vtq < 0)] —

[1(xtwp > 0. thq > 0)J[1(vtq > onnuxtua +
(1/T)EI[1(xtap > o. xtaq > 0)][1(vtp < o. vtq < 0)] -
[1(xtwp > o, xtwq > 0)][1(vtp > o. vtq > onthna

:a1+a3+a3+a4

«1 : (1/4T)E|[1(Xtﬁp > O, Xtaq > 0)] —
[1(xtvp > o, xtwq > onluxtua
s (1/T)§[1(Ixtwp| s uxtnuap — up". Ixtwa s
uxtunaq - «qu)1ux£n3
87

 

 

 

88

For any n > o,
Pri(1/T)§[1(Ixtwpl i uxtunap - up". lxtwa 5

uxtuuaq - qu)]nxtu3 > n!

IA

Prt(1/T)§[1(Ixtwpn s uxtnz). lthql s nxtuz)1uxtu3 > n; +

Pr(uap - up" > 2. "sq - qu > 2)

IA

(1/n)(1/T)§Pr(uxtwpu s "XtHz). lxtwa s uxtnz)nxtn2 +

Pr(ﬂﬁp - up" > 2. "ﬁg - wqu > z)(by Markov’s inequality)

IA

(1/n)K32 + Pr(ﬂﬁp - ﬂp" > 2, "ﬁg - "q" > Z)
By choosing z sufficiently small. Pr(||ﬁP - up" > 2.
"ﬁg - wqu > 2) can be made arbitrarily small for large T by

the consistency of ap and aq. So. plim a1 : 0.

a4 : (i/T)E|[1(Xtap > o, xtaq > 0)][1(vtp < o. vtq < 0)] -

[1(xtwp > o, xtwq > 0)1[1(vtp > o. vtq > o>1luxtua

IA

(1/T)§[1(Ixtwpl s uxtnuap — wpn)1uxtu2 +
(1/T)§[1(Ixtwq| s uxtuuaq - qu))uxtu3 +
(1/T)§[1(Ivtpl s uxtunap - wpn)1uxtua +
(1/T)E[1(|thl s nxtunaq - «qu)1uxtu3
:A1+A2+A3+A4

('.' Xtﬁp Xtﬁ'q th th Xt‘ﬂ’p Xtﬂq th th (X4

+ + — - - + - — A1
+ + — - + - - - A2
+ + - - + + + — A3
+ + - - + + - + A4
— + - - + + — —- A1
+ — — - + + — - A2

+ + + - + + — - A3

 

 

89

+ + - + -+ + — - A4 )
These terms can each be shown to converge to zero in
probability by that of p. 323 (A27) in Powell’s [1984] paper.
So. plim a4 : 0. By the same way. we can show that plim «a
= 0 and plim a3 = 0. Therefore.

(1/T)E[1(Xt€rp > 0)][1/2 - “th < 0)][1(Xta'q > 0)][1/2 -
1(vtq < 0)]xt'xt a (1/T)E[i(xtwp > 0)][1/2 — 1(vtp <
0)][1(xt1rq > 0)][1/2 - 1(vtq < 0)]xt'xt = (1/T)§Etp§tht’xt
By Assumption (A9).

(umgztpthxt'xt .r. (umgmztpthxt'xt)

 

 

APPENDIX D

Proof of Consistency of Estimator of ﬁpq in III.B.1.b of

Chapter TWO

The consistent estimator of (i/T)§E[§tp§tht’xtl is

(1/T)§[1(xtap > 0)1m1n[max(vtp. -xtap). xtap1[1(xtaq >

0)]mintmax(vtq. -Xtﬁq). xtanXt'xt. The proof is as below.

a = 1(1/T)§[1(xtap > 0)1m1n[max(vtp. -xtap). xtap1[1(xtaq >

O)]min[max(vtq. -Xtﬁq). XtﬁqIXtJth -
(i/T)E[1(thp > 0)]min[max(vtp. —xtwp). xtwp][1(xtwq >

0)]min[max(vtq. -Xtﬂq), Xtﬂq1XtJXtK'

IA

O)]min[max(vtq. -Xtﬁq). Xtﬁqlxtjxtki- [1(Xtﬂp >
0) )min [maX(th, —Xt1rp) . Xtﬂ'p] [1 (th'q > 0) ]min [maX(th,

-Xtﬂq), XthlxtJthl

(1/T)E'[1(xt"p > -Xt3tp)]min[max(vtp* - thtp,
-thp - Xtﬂtp). thp + thtp][1(Xtﬂq >

‘thtq)]m1n[max(vtq* ' xtstq. -Xtﬂq - thtq), Xtﬂq +

(1/T)§I[1(xtap > 0)1m1ntmax(vtp. -xtap). xtap1[1(xtaq >

> 0)]mintmax(vtq. -thq). xtwaluxtua

rnxtna

Case 1. Xt‘n’p > ‘Xtatp. Xth > ‘thtq and Xtﬂp > 0. Xtﬂq > O.

(a) max(vtp, -Xtﬂp) < thp and max(vtq. —thq) < Xt"2
(i) -thp < vtp < xtvp and —thq < vtq < xtwa
vtp* : max(vtp. ~thp) : vtp
vtq' : max(vtq. -xtwq) - vtq

r s |(th - thtp)(th - Xt$tq> - thvtq'

9O

 

IA

IA

 

91

‘thXtStq - thxt3tp + xtstpxtstql
thxtﬁtql + thqxtstpl + 'Xt3tpxt3tq'

vtpuuxtuustqn + thqIHXtuﬂotp" +

natpunstqnuxtua

F1
Since EHXtH4*“. Elvtpl. and Elvtql are bounded

uniformly in t. riuxtna converges to zero almost surely by

the strong consistency of ﬁp and ﬁq.

(11) th < 'Xtﬂp < Xtﬂp and th < 'Xtﬂq < xtﬂq

th*

th"

P

(

: max(vtp. -Xtvp) : -thp

: max(vtq. -thq) : -thq

|(xtvp + xtstp)(xt"q + Xt3tq) - (xtwp)(xtuq)|
|(Xtﬂp)(Xt3tq) + (xtltp)(Xth) +
(xtatp)(xtstq)l

unpuuatqnuxtua + qunnstpunxtna +
ustpunthuuxtua

Pa

Since Euxtu4+ﬂ is uniformly bounded and up. wq are

finite. Paﬂxtﬂa converges to zero almost surely by the

strong consistency Of ﬁp and ﬁq.

th'

F

(

(

(

it

: max(vtp, -thp) : vt1

: max(vtq. -thq) : -thq

|(th - XtStp)(-Xtﬂq - thtq) - th(-Xtﬁq)l
Ivtpxtatql + lththtﬂql + 'Xt3tpxt3tq'
Ivtpluxtuuatqn + nstpunnquuxtua +

natpnustqunxtua

 

 

92

: r3
Since Euxtn4+“. Elvtpl are uniformly bounded and wq is
finite, r3nxtu3 converges to zero almost surely by the
strong consistency of ﬁp and ﬁq.
(iv) vtp < -thp < thp and -thq < vtq < tha
The proof is the same as (111%
(b) max(vtp. -thp) > thp and max(vtq. —Xtvq) > XtVE
r s 1(xtwp + xtatp)(xtwq + xtgtq) - xtwpxtwa
i lthpXtthl + IXtSththl + IXtSthtStql
s unpuustquuxtua + unquuztpuuxtua + natpuuatqnuxtua
= r4
Since Eﬂxtﬂ4*“ is uniformly bounded and both up and ﬁg
are finite. r4uxtu3 converges to zero almost surely by the
strong consistency of ﬁp and ﬁq.
Case a: xtwp s -xt3tp or xtwq s —xt3t and
thp > 0, thq > 0. This implies that Ithpl i
Ixtstpl or lxtvql s lthtql.
P : I—{min[max(vtp. —Xtvp). Xtﬂpllimin[max(vtq.

‘xt"q)n XthJII

IA

[(thp)(xtwq)l

IA

IXtEtpllxtwa (or Ixtwpllxtstqi)

IA

qunnxtpunxtua (or uwpnustqnuxtua)
: r5
Since Ellxtll‘l““fl is uniformly bounded and wq (or WP) is
finite, rsuxtna converges to zero almost surely by the
strong consistency of ﬁp (or ﬁq).

Case 3. thp > 'Xt3tp' Xth > -Xt3tq and

 

 

93

thp i O or thq-i O. This implies that lXtﬂpl <

r‘ :

IA

IA

IA

IA

limin[max(vtp* - Xt3tp' —thp - Xt3tp)v thp +

xtstplliminEmax(vtq* — Xt3tq- —thq - xtﬁtq)-

thq + thtqlll

|(th1 + XtStp)(Xtﬂq + Xtﬁtq)l

lthpthql + lxtwpxtgtql + lxtstpxtwa +

lXt3tht3tql

IXtEtPIIXthI + lXtStpllxtﬂql + ZIthtpllXthql

(or IXtﬂpllXtthl + IXtﬂpllXtStql +
Zlthtpilxthql)

aqunustpnuxtua + austpnuatquuxtua

(or auwpuustqnuxtua + anatpuuatquuxtua)

P6

Since Enxtu4+n is uniformly bounded and vq (or up) is

finite, renxtua converges to zero almost surely by the

strong consistency of Sp and ﬁq-

; (l/T)E[1(Xtﬁp > 0))m1n[max(vtp. -xtap), xtap][1(xtaq >

0)]min[max(vtq, -Xtﬁq). Xtﬁq1Xt'xt L

(1/T)E[1(thp > 0)]min[max(vtp, —Xtﬂp). thp][i(xtwq >

O)]min[max(vtq. -Xt1Tq), Xt‘ﬂq] Xt'Xt

: (1/T)E§tpgtht'xt

By Assumption (A8)’ and law of large numbers.

(1/T)§ztp§tht'xt s (1/T)EE(§tp§tht'xt)

 

 

 

APPIHBIX E

SECOND DERIVATIVES OF THE LOG LIKELIHOOD FUNCTION

To derive the second derivatives of the log likelihood
function. we use the following relationship frequently:
5§(Wt)/6Wt = ¢(Wt)
5¢(Wt)/5Wt = 'Wt¢(wt)
0[¢(Wt)/Q(Wt)]/6Wt = -[¢2(Wt)/§a(wt) + Wt¢(Wt)/§(Wt)]
where §(o) is the standard univariate normal distribution

function, and

¢(.) is the univariate normal density function.

The second derivatives of 108 likelihood function are

as below:
61nL(e)/aw16w1’ : -Ext(xt'xt)/[oia(1 - 92)]

— 5(1 - At)(Xt’Xt)/C12

— §(1 - xt)pa<xt'xt)zt/tu13(1 - 93)]
(where zt = ¢2<wt)/§3(wt) + wt¢(wt)/§(wt) ).
61nL(e)/6w16wa' = Ektf(xt’xt)/[U1Ua(1 - 93)]

+ 5(1 - At)P(Xt'Xt)Zt/[U1Ua(1 — p3)1
o1nL(e)/awieo13 : Extxt’[-(Yt1 - xtw1)/o13

+ p<Yta — tha)/(ZU1Ua)]/[Uia(1 — 93))

' {(1 ~ Atlxt’(Yt1 - XtV1)/U14

- §(1 - At)Xt’{th/[EU13(1 - 92)“)

+ pazt(Yt1 — xtw1)/tac14(1 - 92)]
(where Mt = ¢(Wt)/§(Wt) L
alnL(e)/Ow16oaa : Extp(Yta - xtw3)xt'/[2o1oa3(1 - 92)]

- E<1 - kt)9(Xtﬂa)ZtXt'/IZU1623(1 - 93)]

94

 

 

 

OlnL(e)/a1riap :

olnL(e)/6wa6ﬂa’ =

alnL(e)/euaae12 =

oinL(e)/owaao38 :

61nL(e)/Oﬂadp :

olnL(9)/OU136C13:

01nL(e)/ociaocaaz

95

Extxt'tap(Yt1 - xtw1)/[o13(1 - 92,2]

- (1 + pe)(Yta — xtwa)/[c1va(1 - 92,2],
+ §(1 - xt)xt'{Mt/[b1(1 — pa,x

+ pZt[p(Xtﬂa)/Ua + (Yti - xtw1)/oil
/[e1(1 - p3)311

-§xt(xt’xt)/tcaa(1 — p2,]

- E‘l - xt)(xt'xt)zt/[caa(1 — 92)]
EXtP<Yt1 - Xtﬂ1)Xt’/[ZU13U2(1 — 93)]

+ {(1 - At)pzt(Yt1 — xtw1)xt'
/[2613va(1 — 92)]

Extxt't-<Yta — xtva)/Uaa

+ p<Yt1 - xtw1)/(acica)1/tvae(1 - p2)1
+ 3(1 - At)Xt’[Mt/[2U33(1 - 93)“)

+ (xtwa)zt/[2ea4(1 - 93))!
Ektxt'fap(Yta - XtWaV‘Oaa

— (pa + 1)<Yt1 - xtw1)/(viba)1/(1 — 92,2
_ E(1 — At)Xt’{th/[oa(1 - pa)%]

+ Zt[p(Xtv2)/oa + (Yti - xtw1)/o1]
/wgu -9ma];

T/(aoi4) + §At[-2(Yt1 - xtw1)2/o13

+ 39(Yt1 ~ Xt"1)(Yta - XtV2)/(BU1U2)J
/tau14<1 - p2)1

' E‘i - AtHYti — XtV1)a/016

- {(1 — xt)i3th(Yt1 - xtw1)/[4U15(1 - 93)“)
+ PaZt(Yt1 - xtw1)3/[4o15(1 - 92)];

ExtP(Yti * XtV1)(Yt2 — XtVE)

/[4U13633(1 — 93)]

 

 

alnL(e)/ao126p

a1nL(e)/ooaeae38=

alnL(9)/Ooaaop

96

— 3‘1 - xt)p(xtwa)zt(vt1 - xtwi)
/[4U13623(1 - pa):

{AtEP(Yt1 ' Xt"1)2/Uia

— (pa + 1)(Yt1 - xtw1)(Yta - xtwa)/(au1ca)1
/[613(1 — 92)?)

+ §(1 - At)(Yt1 - th1)[Mt/[ao13(i - pa)“:
+ pZt[p(Xt‘n’a)/‘o‘a + (Ytl — Xt‘n'i)/'01]
/[av13(1 — 92)311

(§*t)/(3U24) + 8%[-(Yta - xtwa)3/o33

+ 3p(Yt1 — th1)(vt2 - XtV2)/(4biba)1
/tva4(1 - 93)]

- §(1 - xt)13Mt(xtv2)/[4ca5(1 - 93)“)

+ (xtwa)azt/[4o35(1 - 93)];

§*t[P(Yta — Xtﬂg)a/baa

' (Pa + 1)(Yt1 - Xt"1)(Yt2 ' xtwa)
/(2b1vg)1/[b23(1 — 93)?)

+ 5(1 - At)1p(xtwa)Ht/taua3(1 — p3)%1

+ (Xt‘n'a)Zt[P(Xt1fa)/'Oa + (Ytg - XtTT1)/'O1]
/[ava3(1 - 92,2];

(1 + pa)(§xt)/(1 - pa)?

+

Ext[—(3pa + i)(Yt1 - xtw1)3/o13

...

29(93 + 3)(Yt1 - Xt"1)(Yt2 - Xt"2)/<U1U2’
- (3p2 + 1)(Yt2 - xtwa)3/v231/(1 - 92,3

- §(1 - xt)nt[(1 + ap3)o1(xtﬂa,

+

3963(Yt1 - xtw1)1/[o1oa(1 - 93)“)

- {(1 - At)ZtIP(Xth)/[Ua(1 - 93)“)

+

(Yt1 - xtw1)/[v1(1 - p2)¥113

 

 

 

Let

(1)
(2)
(3)
(4)
(5)
(6)

(7)
(8)

(9)

(10)

(11)

(13)

At -

Z N
e ¢
¢

II II

2
N
6"

u

Ext = §(at)

Ekt(Yt1 - Xtﬂi)

Elt(Yt1
EAt(Yta
EAt(Yt2

Elt(Yt1

E(1 - A

E(1

E(1

8(1 - At)(Yt2 -

E(1 - At)(Yt1 -

NORHAL DISTRIBUTIONS

 

APPﬁHnIX F

SOHE EXPECTATIONS OF TRUNCATED UNIVARIATE OR BIVARIATE

1(Vta > -Xtﬂa) : 1(Yt2 > O),

1(Vt2

~Xtﬁa) : 1(Ytg : 0),

ca¢(at)/§(at).

-Ua¢(at)/[1 - @(at)L

t)

AtHYti -

E(1 - Xt)(Yt1 -

AtHYta -

= PU1¢(at)
= §(at)to13 - 92012N1t(xtﬂg)/Caal
= va¢(at)

= §(at)[caa - N1t(xtwa)1

XtV1)(Yta - tha)

= §(at){PU1[U22 - N1t(XtVa)]/Ual

§(at)

thi) : —po1¢(at)

th1)3

- §(at)][b13 — 93613Nat(xtw3)/o331
tha) = -Ua¢(at)

Xt"2)2

— 6(atlllbaa — Ngt(xtﬂa)]

XtV1)(Yta - XtVa)

[1 — 9(at)]{pv1[baa - N2t(Xtﬂg)1/Ual

97

 

 

 

IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIlIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII‘

II.

III.

98

'Truncated univariate7normal distribution
(Johnson and Kotz 1970 pp.8i-83)
(1) x ~ H(O. be)

x z oci.

c¢(c1)/[1 - o(ci)1 = M1

o2 — M1(M1 - ecl)

E(X)

V(X)

E(x3) = c3 + M1(CC1)
(2) x ~ N(O. U?)
x s oca

E(X) 'O[-¢(Ca)/§(Ca)] = 1‘13

V(X) = o3 — M2(Hg - oca)
E(x3) = c3 + na(oca)
(3) x ~'n(o. be)

cc, 1 x s ace

30‘) 'O[¢(C1) - ¢(Ca)]/[§(Ca) - @(C1H = M

V(X)

o3 - n2 + v3101¢(c1> - Ca¢(Ca)l/[§(Ca) - @(01)1

E(x2) = be + vatc1¢(c1) - ca¢(cg)1/[o(cg) - o(c1)1

Truncated bivariate normal distribution

(Johnson 3 K0122 1972 p. 113)

V1 U12 V12

V2 612 U23

v3 > n (or v3 < K or n < v3 < K)
E(V1) : p(o1/oa)E(Va I Va > h)
E(V1V2) = p(o1/og)E(v33 1 v2 > n)

E(V13) = p2(o13/o33)s(v23 I v3 > n) + e13(1 - p3)

 

 

 

APPENDIX G

THE ASYNPTOTIC COVARIANCE MATRIX OF THE CLAD ESTIMATE

The asymptotic covariance matrix of the CLAD estimate
for the case of independently and identically distributed

error terms is shown in Case One of Chapter Two as follows:

1 — 1 — 1 1 --1— — —1
(-——)H' ya“ (-——)u n1a[f(0)uul

T T

1 - 1- - 1 1 2 1
(-¥—)H' D’1a[f(0)Hx]- (—;—)[2f(0)]' Eu-

How. we assume Vt1 and Vta are independently and identically
distributed as bivariate normal with mean zero and

covariance matrix

”12 U12
U12 022

Treating X as fixed. and using the assumption of normality.

we have:

ﬁ" = pllm(X'X/T)'1 = (X’X/T)'1

v = plim (1/T)E(vt13xt'xt) : (i/T)012(X’X)

ﬁ. = plim Ei(1/T)§1(Xt"a > 0)xt'xt1 = (l/T)E1(Xtﬂa > 0)xt'xt

f(0) = 1/[(aw)%c21
512 : plim (i/T)EE((Vt1)(§t)Xt’Xt} = (1/T)E[E(Vt1)(§t)lxt'xt
Ewunzt) = E(vt1)[1(xtwa > onm — mite < 0)]
: -1(Xt"a > O)E[Vt1ui(vta < 0)]
E[Vt1I1(Vta < 0)] = “2(vt1 I Vta < 0)
: %p(b1/CE)E(Vta I Vta < 0)
99

 

 

 

TOO

: -p[‘c1/(21r)%] (see appendix F)
2 E13 = (1/T)§po11(xtwg > o1xt'xt/(2w)“

= PU1ﬁx/(2W)%

(1/T)ﬁ‘1Vﬁ‘1
= (i/T)(X’X/T)'1[(1/T)613X'XJ(X’X/T)'1

= 612(X’X)'1

(1/T)ﬁ‘1a1a[£(0)ﬁ.1‘1

= (1/T)(x'X/T)'1[pv1n./(2W)%1(ﬁ./[(aw)%u311'1

: '012(X'X)—1

(1/T)[3f(0)]_aﬁx-1

= (1/T)12/1(aw)%v311'2[(1/T)§1(xtwa > 0)xt'xt]-1
: (w/a)o23[§1(xtwa > 0)xt'xt1-1

U13(XIX)-1 C12(X’X)'1

612(X’X)'1 (w/a)o23[f1(xtwa > 0)xt'xt]‘1

 

 

APPENDIX H

THE ASYHPTOTIC COVARIANCE MATRIX OF THE SCLS ESTIHATE

The asymptotic covariance matrix of the SCLS estimate
for independently and identically distributed error terms is
shown in Case One of Chapter Two as follows:

(i/T)ﬁ'1Vﬁ‘ (1/T)ﬁ'1ﬁ125‘1

(1/T)ﬁ'1ﬁ’125’1 (1/T)5'156'1

Under our new assumption about the error terms
(bivariate normal). we may calculate the asymptotic
covariance matrix as follows:

ﬁ'1 = plim(X’X/T)‘1 = (x'X/T)‘1

<I

: plim (1/T)§E(vt13xt'xt) = (i/T)o13X’X

m
u

plim (1/T)EE[1(-Xt"2 < Vta < xtﬂa)Xt'Xt]

(i/T)E1(tha > 0)[a§(at) - 11xt3x

r = plim (i/T)EE{1(tha > O)min[Vtaa. (xtwa)21xt'xt1

(i/T)E1(tha > O)E[[i(-Xtﬂ2 < Vta < Xt"2)1Vtaa

+ [1(Vta Z Xtﬂa or Vta S -tha)](xtwa)3lxt'xt

E[1(—th2 < Vta < Xt"2)]Vtaa
: [2§(at) - 1]E(Vtaa I -Xtﬂa < Vta < Xtﬂa)
= caatao(at) - 1 - aat¢(at)]

(See Appendix F for all conditional expectations)

E[1(Vta z XtVZ or Vta 1 'Xt"2)1(XtVa)2
= 2(xtwa)2[1 - 6(at)]

; n = (l/T)Ei(tha > 0)1e22[2o(at) - 1 — aat¢(at)l

101

 

102

+ 2(xtwa)at1 - §(at1]3Xt’Xt
512 = plim (1/T)§E(;tvt1xt'xt)
= (l/T)EE(§tVt1)Xt’Xt
E(§tVt1) = E{1(tha > 0)min[max(Vta. 'Xt"2)1 Xtﬁ213Vt1
= 1(tha > O)E{[i(Vtg 1 -Xtﬂa)l(-Xt"a)Vt1
+ [1(’Xt"a < Vta < Xtﬂa)]Vt1Vta
+ [1(Vt2 2 Xtﬂa)1(tha)Vt1l
E[1(Vta 1 -xtwa)J(-xtwa)vt1
= -(Xt"a)£1 - 0(at)lE(Vt1 l Vta 1 ‘Xt"a)
= (Xt"2)U1a¢(at)/Ua
E[i(-tha < Vta < Xtﬂa)]Vt1Vta
= [2§(at) - i]E(Vt1Vta I 'XtWa < Vta < xtwa)
= b1a[2§(at) - 1 - aat¢(at)l ‘
E[i(Vt2 z xtwa)](xtw2)vt1
: [1 — §(at)](xtwa)E(Vt1 I Vta zxtwa)
= (Xt"2)512¢(at)/Ua
1 E13 = oia(l/T)E1(Xtﬂa > 0)[2§(at) - 11xt'xt

= .0125

(i/T)ﬁ'1Vﬁ"1 = (x'X/T)‘1
(i/T)ﬁ‘15136‘1 = 613(X’X)‘1

o13(x'xI‘1 013(X’X)‘1

o12(x’xI-1 (i/T)5‘1DE'1

where a = (i/T)E1(tha > 0)[2§(at) - 1Ixt'xt
5 = (i/T)Ei(xtw2 > O){o22[2@(at) - 1 - aat¢(at)]

+ 2(xtwg)3[1 — §(at)]!Xt’Xt

   

 

APPENDIX I

THE IDENTIFIED STRUCTURAL HODELS CORRESPONDING TO REDUCED

FORK EQUATIONS WITH “'2 = (1, O)’, (1. 1)’. 0R (0,1)'.

1' Yti = Y1Yta* + B12Xta + €t1 (912 = 0)
Yta” = 821Xt1 + Baaxta + €t2 (922 = 0)

3~ Yti = B11Xt1 + 91aXta + €t1 (912 = 0)
Yt2” = YaYt1 + eta

3- Yti = Y1Yta* t 912Xt2 + et1

Yta = Ba1Xt1 + Baaxta + eta (922 = 0)

4' Yt1 = 911Xt1 * B12Xta + 61.1

Yta” = YaYt1 + Baaxta + eta (YaB1a + 922 = 0)
5- Yt1 = Y1Yt2” + B11Xt1 + €t1

Yta” = YaYt1 + Baaxta + eta

5' Yt1 = Y1Yt2” + B11Xt1 + €t1

Yta” = 821Xt1 + Baaxta + eta
7- Yti = Y1Yta* + €t1
Yta” = 921Xt1 + Baaxta + €t2
5- Yt1 = B11Xt1 + B12Xt2 + €t1 (812 = 0)
Yta” = YaYt1 + Baaxta + eta
9' Yt1 = Y1Yt2” + 912Xt2 + €t1
Yta” = Ba1Xt1 + BaaXt1 + €ta
10- Yt1 = Y1Yt2' + Biaxta + Et1

103

 

 

11.

12.

13.

14-.

15.

16.

Yt1

Yta

Yti

Yti

Yta

Yti

Yta

104

YaYt1 + Ba1Xt1‘+ 6ta

B11Xt1 + B1axta + €t1

YaYt1 + 821Xt1 + €ta

B11Xt1 + B12Xta + €t1

YaYt1 t €t2

B11Xt1 + B12Xta + €t1

YaYt1 + B12Xt1 + Eta

B11Xt1 + B1axt2 + €t1

YaYt1 + Baaxta + et2

Y1Yt2*
921Xt1
Y1Yta'
921Xt1

911Xt1

YeYt1

+

+

+

+

+

B11Xt1 + €t1

Baaxta + eta

et1

Baaxta + eta

B1axt2 + €t1

€t2

911Xta + B12Xta + €t1

YaYt1 + 921Xt1 + €ta

 

(911 = 0)
(521 = 0)
(Bag = 0)
(511 = 0)

(YaB11 + 921 = 0)

 

,..s..1.1

 

BIBLIOGRAPHY

Amemiya, T. (1979), "The Estimation of a Simultaneous
Equation Tobit Model." International Economic Review

20: 169—161._

Amemiya. T. (1963), "A Comparison of the Amemiya GLS and the
Lee—Haddala-Trost GZSLS in a Simultaneous-Equations

Tobit Hodel,” Journal of Econometrics 23:295-300.

Arabmazar, A and P. Schmidt (1962), "An Investigation of the
Robustness of the Tobit Estimator to Hon-Normality,”

Econometrica 50: 1055-1061

Chamberlain, G. (1983). "Panel Data." in Handbook of
Econometrics. V01.2. (edited by Griliches. Z. and M.D.

Intriligator), 1246-1318.

Chamberlain, G. (1987), "Asymptotic Efficiency in Estiamtion
with conditional Moment Restrictions," Journal of

Econometrics, 34: 305-334.

Cosslett. S.R. (1987), "Efficiency Bounds for Distribution-
Free Estimators of the Binary Choice and the Censored

Regression Models," Econometrica, 55: 559-586.

Duncan, G.M. (1986), "A Semi-Parametric Censored Regression

Estimator, " Journal of Econometrics 322 5-34.

Fernandez, L. (1986) "Nonparametric ML Estimation of

Censored Regression Models."Journal of Econometrics 32:

 

105

 

 

 

106

35-57.

Goldberg, A.S. (i980), " Abnormal Selection Bias," Workshop
Paper No. 6006 (Social Systems Research Institute,

University Of Wisconsin, Madison, WI).

Gourieroux. C., A. Monfort and E. Renault, "Consistent M-
Estimators in a Semi-Parametric Model,” Working Paper

8706, INSEE, Paris.

Horowitz, J.L. (1986), "A Distribution-Free Least Squares
Estimator for Censored Linear Regression Models,"

Journal of Econometrics 32: 59-84.

Johnson, N.L. and S. Kotz (1970), Distributions in
Statistics: Continuous Univariate Distributions 1,

New York: John Wiley a Sons, Inc.

Johnson, H.L. and S. Kotz (1970). Distributions in
Statistics: Continuous Multivariate Distributions 1,

Mew York: John Wiley a Sons, Inc.

Lee, L.F. (1981), "Simultaneous Equations Models with
Discrete and Censored Variables," in: C.F. Manski and
D. McFadden, eds., Structural Analysis of Discrete Data

with Econometric Applications (MIT Press, Cambridge,
MA).

Lee, L.F., G.S. Maddala, and R.P. Trost (1980), "Asymptotic
I
Covariance Matrices of Two-Stage Probit and Two-Stage

Tobit Methods for Simultaneous Equations Models with

 

 

107

Selectivity," Econometrica 48: 491-503.

Malinvaud, E. (1980), Statistical Methods of Econometrics,

3rd ed., New York: North-Holland Publishing Company.

Nelson, F. and L. Olsen (1978), "Specification and
Estimation of a Simultaneous Equation Model with
Limited Dependent Variables,” International Economic

Review 19: 695—705.

Hewey, W.K. (1985), "Semiparametric Estimators for Limited
Dependent Variable Models with Endogenous Explanatory

Variables," Annales de L’INSEE 59/60: 219-237.

Newey, W.K. (1987a), "Specification Tests for Distributional
Assumptions in the TObit Model, " Journal Of

Econometrics, 34: 125-146.

Newey, W.K. (1987b), "Efficient Estimation of Limited
Dependent Variable Models with Endogenous Explanatory

Variables," Journal of Econometrics 36: 231—250.

Powell, J.L. (1964), "Least Absolute Deviations Estimation
for the Censored Regression Model," Journal of

Econometrics 25: 303-325.

Powell, J.L. (1985), "Symmetrically Trimmed Least Squares
Estimation for Tobit Models," MIT Working Paper,

No.36&

Powell, J.L. (1986a), "Censored Regression Quantiles,"

 

I

 

 

108

Journal of Econometrics 32: 143-155.

Powell, J.L. (1966b), "Symmetrically Trimmed Least Squares
Estimation for Tobit Models," Econometrica 54: 1435-

1460.

Smith. R.J. (1987), "Testing the Normality Assumption in
Multivariate Simultaneous Limited Dependent Variable

Models," Journal of Econometrics, 342 105-124.

Tobin, J. (1958), "Estimation of Relationships for Limited

Dependent Variables," Econometrica, 26: 24-36.

White, M. (1980a), "A Heteroskedasticity-consistent
Covariance Matrix Estimator and a Direct Test for

Heteroskedasticity," Econometrica 50: 1055-1063.

White, M. (1980b), "Nonlinear Regression on Cross-section

Data," Econometrica 48: 721-746.

White, M. (1984), Asymptotic Theory for Econometricians,

New York: Academic Press, Inc.

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

(.23. I.
3:11.111...
.11.... ......
...... a :

I who: I 3
, . a ”Emir.
..i..we...~§..n. ,s
. . i 5:...

1. an.

...... t.
“ran .
....

...:
in
1 v
7

«Rant
.x